I am trying to prevent duplicate entries in the table based on type and names, but the unique constraint is not being applied.
CREATE TABLE IF NOT EXISTS test
(
id integer NOT NULL DEFAULT nextval('threshold_retry.threshold_details_id_seq'::regclass),
component_type text COLLATE pg_catalog."default" NOT NULL,
component_names text[] COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT test_pk PRIMARY KEY (id)
)
Data to be inserted:
INSERT INTO test(
id, component_type, component_names)
VALUES (1, 'INGESTION', '{ingestiona,atul, ingestiona, ingestionb}'),
(2, 'INGESTION', '{test_s3_prerit, atul}'),
(3, 'DQM', '{testmigration}'),
(4, 'SCRIPT', '{scripta}'),
(5, 'SCRIPT', '{testimportscript, scripta}'),
(6, 'SCRIPT', '{Script_Python}'),
(7, 'BUSINESS_RULES', '{s3_testH_Graph}'),
(8, 'EXPORT', '{Export2}')
;
I want the result to look like this:
component_type    component_names
INGESTION         {ingestiona,atul,ingestionb}
INGESTION         {test_s3_prerit}
DQM               {testmigration}
SCRIPT            {scripta}
SCRIPT            {testimportscript}
SCRIPT            {Script_Python}
BUSINESS_RULES    {s3_testH_Graph}
EXPORT            {Export2}
Here the duplicate occurrences of atul, ingestiona, and scripta have been removed.
I tried an exclusion constraint using GiST, but it does not work for the combination of text and text[]:
ALTER TABLE threshold_retry.test
ADD CONSTRAINT exclude_duplicate_names EXCLUDE USING gist (component_type with =, component_names with &&);
I also tried creating an operator class, but that did not work either.
How can I achieve this while inserting or updating values?
2 Answers
I would suggest handling the deduplication logic in the application layer, or in a function that runs before data is inserted or updated in the table, and then creating a trigger that calls this function.
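The answer's original function and trigger are not reproduced here. As a minimal sketch of the idea, assuming the table is reachable as test (schema-qualify it if needed) and using the hypothetical names test_dedupe_component_names and test_dedupe_component_names_trg, something like the following could work. It drops names repeated inside the new array and names already stored in another row of the same component_type.

```sql
-- Sketch only: function and trigger names are hypothetical.
CREATE OR REPLACE FUNCTION test_dedupe_component_names()
RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    NEW.component_names := COALESCE(
        (
            -- Keep each name once, in order of first appearance.
            SELECT array_agg(d.elem ORDER BY d.first_pos)
            FROM (
                SELECT u.elem, min(u.pos) AS first_pos
                FROM unnest(NEW.component_names) WITH ORDINALITY AS u(elem, pos)
                GROUP BY u.elem
            ) AS d
            -- Drop names already present in another row of the same type.
            WHERE NOT EXISTS (
                SELECT 1
                FROM test AS t
                WHERE t.component_type = NEW.component_type
                  AND t.id <> NEW.id
                  AND d.elem = ANY (t.component_names)
            )
        ),
        '{}'::text[]  -- component_names is NOT NULL, so fall back to an empty array
    );
    RETURN NEW;
END;
$$;

-- EXECUTE FUNCTION needs PostgreSQL 11 or later; older versions use EXECUTE PROCEDURE.
CREATE TRIGGER test_dedupe_component_names_trg
BEFORE INSERT OR UPDATE ON test
FOR EACH ROW
EXECUTE FUNCTION test_dedupe_component_names();

-- Check for leftovers: returns no rows when no name repeats within a component_type.
SELECT component_type, elem, count(*)
FROM test, unnest(component_names) AS elem
GROUP BY component_type, elem
HAVING count(*) > 1;
```

Note that a row-level BEFORE trigger sees rows already processed by the same statement, but not uncommitted rows from concurrent transactions, so this sketch does not guard against two sessions inserting the same name at the same time.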
With the trigger in place, you can insert data without checking for duplicates yourself, because the function handles deduplication. After inserting the data, query the table to confirm that duplicates within the same component_type have been handled appropriately.

You can do this using a function and a CHECK constraint. This is likely more efficient than a trigger. The function checks that all values are unique: in other words, after grouping by value, no group contains more than one element.
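The function from this answer is not reproduced here. A minimal sketch of the check it describes, using the hypothetical name component_names_are_unique, might look like this:

```sql
-- Sketch only: the function name is hypothetical.
-- Returns true when no element occurs more than once in the array.
CREATE OR REPLACE FUNCTION component_names_are_unique(names text[])
RETURNS boolean
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT NOT EXISTS (
        SELECT 1
        FROM unnest(names) AS u(elem)
        GROUP BY u.elem
        HAVING count(*) > 1   -- any group with more than one member is a duplicate
    );
$$;

SELECT component_names_are_unique('{scripta,testimportscript}');  -- true
SELECT component_names_are_unique('{ingestiona,atul,ingestiona}'); -- false
```

A function like this only inspects the array it is given, so it detects repeats within a single row's component_names rather than across rows.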
Then:
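The constraint definition itself is not reproduced here. Assuming a boolean function with the hypothetical name component_names_are_unique(text[]) already exists and the table is reachable as test, the constraint could be attached roughly like this:

```sql
-- Sketch only: constraint and function names are hypothetical.
ALTER TABLE test
    ADD CONSTRAINT component_names_unique_chk
    CHECK (component_names_are_unique(component_names));
```

Unlike the desired output in the question, a CHECK constraint rejects an offending row instead of silently removing the duplicate names. Adding it will also fail if existing rows already violate it, as the first sample row would with ingestiona listed twice.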
db<>fiddle