
I made a CSV backup from a SELECT query and am now trying to import it back into the database, but I am getting this error:

COPY doc FROM '/tmp/doc.csv' DELIMITER ',' CSV HEADER;


ERROR: invalid input syntax for type uuid: "null"

As you can see, NULL values appear as "null" in my file.

This happens on the optional fields that were empty before.

I found this solution: https://stackoverflow.com/a/40428667/8443131

But it is not working for me:

COPY doc FROM '/tmp/doc.csv' DELIMITER ',' CSV HEADER QUOTE '"null"' NULL '';

ERROR:  COPY quote must be a single one-byte character

How do I import this file?

UPD: I tried replacing the nulls with empty quoted strings ("").

Command tried:

COPY doc FROM '/tmp/null.csv' DELIMITER ',' CSV HEADER QUOTE '"' NULL '';

ERROR:  invalid input syntax for type uuid: ""

Short version of the file:

"id","removed","modified_at","root_id","parent_id","acl","properties","data","file_meta"
"f6a16ff7-4a31-11eb-be7b-8344edc8f36b","false","2021-01-04 00:00:12.347988","","","IS_PUBLIC","","",""
"2fdd0b8b-4a70-11eb-99fd-ad786a821574","false","2021-01-04 00:00:06.87298","","","IS_PUBLIC","","",""
"2c6d5fd1-4a70-11eb-99fd-ad786a821574","false","2021-01-04 00:00:07.536212","","","IS_PUBLIC","","",""
"fd645c21-4a6f-11eb-99fd-ad786a821574","false","2021-01-04 00:00:11.892367","","","IS_PUBLIC","","",""
"35c1fc53-4a70-11eb-99fd-ad786a821574","false","2021-01-04 00:00:05.517109","","","IS_PUBLIC","","",""
"35d165a4-4a70-11eb-99fd-ad786a821574","false","2021-01-04 00:00:01.72546","","","IS_PUBLIC","","",""
"fd40806d-4a6f-11eb-99fd-ad786a821574","false","2021-01-04 00:00:09.173726","","","IS_PUBLIC","","",""
"30ba4b45-4a70-11eb-99fd-ad786a821574","false","2021-01-04 00:00:04.655073","","","IS_PUBLIC","","",""

Table creation:


-- Dumped from database version 13.0 (Debian 13.0-1.pgdg100+1)
-- Dumped by pg_dump version 13.0 (Debian 13.0-1.pgdg100+1)



CREATE TABLE public.doc (
    id uuid NOT NULL,
    removed boolean,
    modified_at timestamp without time zone,
    root_id uuid,
    parent_id uuid,
    acl jsonb,
    properties jsonb,
    data jsonb,
    file_meta jsonb
);



ALTER TABLE ONLY public.doc
    ADD CONSTRAINT doc_pkey PRIMARY KEY (id);

ALTER TABLE ONLY public.doc
    ADD CONSTRAINT fk_document_entity FOREIGN KEY (id) REFERENCES public.main_table(id);

ALTER TABLE ONLY public.doc
    ADD CONSTRAINT fk_document_parent FOREIGN KEY (parent_id) REFERENCES public.doc(id);

3 Answers


  1. Chosen as BEST ANSWER

    Even though @Abelisto's command was working, I still wasn't able to import some jsonb lines.

    But I also had a .json alternative of my file, like this:

    [
        {
            "c0": "f6a16ff7-4a31-11eb-be7b-8344edc8f36b",
            "c1": false,
            "c2": "2021-01-04 00:00:12.347988",
            "c3": null,
            "c4": null,
            "c5": "IS_PUBLIC",
            "c6": null,
            "c7": null,
            "c8": null
        },
        ...
    ]
    

    So I ended up writing this Python script, which worked for me:

    import json
    import psycopg2
    from datetime import datetime
    import uuid
    
    
    connection = psycopg2.connect(user="admin",
                                  password="admin",
                                  host="127.0.0.1",
                                  port="5432",
                                  database="postgres")
    cursor = connection.cursor()
    
    
    def insertLine(line):
        # required columns
        id = uuid.UUID(line['c0']).hex
        removed = bool(line['c1'])
        modified_at = datetime.strptime(line['c2'], '%Y-%m-%d %H:%M:%S.%f')
    
        # optional uuid columns: JSON null becomes SQL NULL
        root_id = uuid.UUID(line['c3']).hex if line['c3'] else None
        parent_id = uuid.UUID(line['c4']).hex if line['c4'] else None
    
        # jsonb columns: re-serialize so bare strings like "IS_PUBLIC" become valid JSON
        acl = json.dumps(line['c5']) if line['c5'] else None
        properties = json.dumps(line['c6']) if line['c6'] else None
        data = json.dumps(line['c7']) if line['c7'] else None
        file_meta = json.dumps(line['c8']) if line['c8'] else None
    
        record_to_insert = (id, removed, modified_at, root_id,
                            parent_id, acl, properties, data, file_meta)
    
        try:
            postgres_insert_query = """INSERT INTO doc (id, removed, modified_at, root_id, parent_id, acl, properties, data, file_meta) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)"""
            cursor.execute(postgres_insert_query, record_to_insert)
            connection.commit()
        except psycopg2.Error as error:
            print("ERROR:" + str(error))
            # roll back so the failed transaction does not block later inserts
            connection.rollback()
    
    
    file = 'table.json'
    
    with open(file) as json_file:
        data = json.load(json_file)
        for p in data:
            insertLine(p)
    
    
    if connection:
        cursor.close()
        connection.close()
        print("PostgreSQL connection is closed")
    
    

    So I guess it is just bad practice to back up jsonb fields in CSV.
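
    For future backups, a sketch of an alternative that avoids the quoting problem entirely (the file path is illustrative): COPY's default text format writes NULL as \N and escapes special characters inside jsonb values, so it round-trips this table cleanly.

    -- export in the default text format: NULLs become \N, jsonb is escaped
    COPY doc TO '/tmp/doc.dat';

    -- re-import: \N is read back as a real NULL
    COPY doc FROM '/tmp/doc.dat';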


  2. There is no way to load this file with a plain COPY, since "null" is quoted with double quotes and consequently cannot be used as a NULL placeholder: a quoted value is always interpreted as a string.

    The best you can do is to load the file into a table where the respective columns are defined as text, then do something like

    ALTER TABLE doc ALTER uuidcol TYPE uuid USING CAST(nullif(uuidcol, 'null') AS uuid);
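
    An expanded sketch of that approach (the staging table name and the INSERT ... SELECT variant are illustrative; the jsonb casts still require the staged text to be valid JSON):

    -- stage everything as text so COPY accepts the literal "null" strings
    CREATE TABLE doc_staging (
        id text, removed text, modified_at text,
        root_id text, parent_id text,
        acl text, properties text, data text, file_meta text
    );

    COPY doc_staging FROM '/tmp/doc.csv' DELIMITER ',' CSV HEADER;

    -- turn the 'null' placeholders into real NULLs while casting to the target types
    INSERT INTO doc
    SELECT id::uuid,
           removed::boolean,
           modified_at::timestamp,
           nullif(root_id, 'null')::uuid,
           nullif(parent_id, 'null')::uuid,
           nullif(acl, 'null')::jsonb,
           nullif(properties, 'null')::jsonb,
           nullif(data, 'null')::jsonb,
           nullif(file_meta, 'null')::jsonb
    FROM doc_staging;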
    
  3. I reproduced your case with the following, assuming the second column is a boolean and the third a timestamp:

    create table test (
        col1 varchar, col2 boolean, col3 timestamp,
        col4 varchar, col5 varchar, col6 varchar,
        col7 varchar, col8 varchar, col9 varchar
    );
    

    If I now use

    copy test from STDIN delimiter ',' CSV QUOTE '"' NULL 'null';
    

    and pass the string you mentioned

    "f6a16ff7-4a31-11eb-be7b-8344edc8f36b","false","2021-01-04 00:00:12.347988","null","null","IS_PUBLIC","null","null","null"
    

    The data is parsed correctly

    COPY 1
    

    and the output from the table looks correct.

    defaultdb=> select * from test;
                     col1                 | col2 |            col3            | col4 | col5 |   col6    | col7 | col8 | col9 
    --------------------------------------+------+----------------------------+------+------+-----------+------+------+------
     f6a16ff7-4a31-11eb-be7b-8344edc8f36b | f    | 2021-01-04 00:00:12.347988 | null | null | IS_PUBLIC | null | null | null
    (1 row)
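
    Note that because the values are quoted, the "null" strings are stored as literal text in the varchar columns (visible in the output above), so a conversion along the lines of the previous answer is still needed before casting, for example:

    SELECT nullif(col4, 'null')::uuid FROM test;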
    