skip to Main Content

I am trying to fetch this query in batches at it contains calculated values before I insert it into another table. So I declare it to fetch in batches.

It seems there might be an issue with cursor naming conventions in my PostgreSQL environment. I tried using it in double quote and it didn’t work too.

When I run the code I got the error message

ERROR: type "cur" does not exist on OPEN cur;

DO $$ 
DECLARE 
    BatchSize INT := 1000;
    RowCount INT := 0;
    Done BOOLEAN := FALSE;
BEGIN
    -- Declare a cursor for your SELECT statement
    DECLARE cur CURSOR FOR
        SELECT
            td.date,
            so.SoilId,
            cr.CropId,
        -- Add more more
        FROM
            Time td
        LEFT JOIN
            SoilDimension so ON DATE(so.timestamp) = td.date
        LEFT JOIN
            CropDimension cr ON DATE(cr.timestamp) = td.date
        LEFT JOIN
            SensorDimension se ON DATE(se.timestamp) = td.date
        LEFT JOIN
            IrrigationDimension ir ON DATE(ir.timestamp) = td.date
        LEFT JOIN
            WeatherDimension we ON DATE(we.timestamp) = td.date;

    -- Declare variables to hold row values
    DECLARE
        date_col DATE;
        SoilId_col INT;
        CropId_col INT;
        -- Add more variables for other columns as needed

    -- Open the cursor
    OPEN cur;

    -- Loop to fetch and insert data in batches
    LOOP
        FETCH cur INTO
            date_col,
            SoilId_col,
            CropId_col;
            -- Fetch other column values as needed

        -- Check if any rows were fetched, and break the loop if none were fetched
        IF NOT FOUND THEN
            Done := TRUE;
            EXIT;
        END IF;

        -- Insert the fetched row into YourTargetTable
        INSERT INTO YourTargetTable (
            date,
            SoilId,
            CropId,
            -- Add other columns here
        ) VALUES (
            date_col,
            SoilId_col,
            CropId_col,
            -- Add other values here
        );

        -- Increment the row count
        RowCount := RowCount + 1;

        -- Add a delay if necessary to avoid overwhelming the system
        -- PERFORM pg_sleep(1); -- Uncomment this line and adjust the delay as needed

        -- Exit the loop if the desired batch size is reached
        IF RowCount >= BatchSize THEN
            EXIT;
        END IF;
    END LOOP;

    -- Close the cursor
    CLOSE cur;
END $$;

2

Answers


  1. You are mixing SQL command DECLARE and plpgsql command OPEN. It doesn’t work. You should to use only plpgsql syntax. The cursor’s declaration should be before BEGIN.

    DECLARE
        curs1 refcursor;
        curs2 CURSOR FOR SELECT * FROM tenk1;
        curs3 CURSOR (key integer) FOR SELECT * FROM tenk1 WHERE unique1 = key;
    BEGIN
      OPEN curs2;
      ...
    

    It is little bit messy, because inside PL/pgSQL we can work with SQL commands, and SQL supports cursors, but PL/pgSQL supports cursors little bit different way and uses own implementation. SQL cursors and PL/pgSQL cursors are interface over some API inside Postgres, but these interfaces cannot to cooperate. So inside PL/pgSQL use only PL/pgSQL cursors.

    Login or Signup to reply.
  2. Plpgsql is a block structured language consisting of 3 sections:

    • Declaration (declare),
    • Execution (begin … end;)
    • and Exception handler (exception)

    With only Execution section required. The Declaration sections contains local variable declarations including cursors. Further blocks are nest-able with declare or begin starting a new block. Anything declared within an inner block is not visible in an outer block.Your structure contains nested blocks. It can/should(?) be reduced to a single block. So:

    do $$ 
    declare 
       -- declare local control variables
        batchsize int := 1000;
        rowcount int := 0;
        done boolean := false;
    
        -- declare a cursor for select statement
        cur cursor for
            select
                td.date,
                so.soilid,
                cr.cropid,
            -- add more more
            from
                time td
            left join
                soildimension so on date(so.timestamp) = td.date
            left join
                cropdimension cr on date(cr.timestamp) = td.date
            left join
                sensordimension se on date(se.timestamp) = td.date
            left join
                irrigationdimension ir on date(ir.timestamp) = td.date
            left join
                weatherdimension we on date(we.timestamp) = td.date;
            
        -- declare variables to hold row values
            date_col date;
            soilid_col int;
            cropid_col int;
    begin
    
        -- open the cursor
        open cur;
    
        -- loop to fetch and insert data in batches
        loop
            fetch cur into
                date_col,
                soilid_col,
                cropid_col;
                -- fetch other column values as needed
    
            -- check if any rows were fetched, and break the loop if none were fetched
            if not found then
                done := true;
                exit;
            end if;
    
            -- insert the fetched row into yourtargettable
            insert into yourtargettable (
                date,
                soilid,
                cropid,
                -- add other columns here
            ) values (
                date_col,
                soilid_col,
                cropid_col,
                -- add other values here
            );
    
            -- increment the row count
            rowcount := rowcount + 1;
    
            -- add a delay if necessary to avoid overwhelming the system
            -- perform pg_sleep(1); -- uncomment this line and adjust the delay as needed
    
            -- exit the loop if the desired batch size is reached
            if rowcount >= batchsize then
                exit;
            end if;
        end loop;
    
        -- close the cursor
        close cur;
    end $$;
    

    FYI, this entire block can be reduced to a single SQL statement as:

    insert into yourtargettable (
                date,
                soilid,
                cropid,
                -- add other columns here
                ) 
        select
            td.date,
            so.soilid,
            cr.cropid,
            -- add more more
          from time td
          left join soildimension       so on date(so.timestamp) = td.date
          left join cropdimension       cr on date(cr.timestamp) = td.date
          left join sensordimension     se on date(se.timestamp) = td.date
          left join irrigationdimension ir on date(ir.timestamp) = td.date
          left join weatherdimension    we on date(we.timestamp) = td.date
          limit 1000;
    

    Issue Your select ... statement does not have an order by clause which makes the statement nondeterministic. If you run it multiple times you will get different results. This is so for both the do block and the single statement.

    Note: Not tested no test data supplied.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search