skip to Main Content

In postgres, what data type should I use to store a unique set of strings?

Is using a simple array then when I want to write, I read the current set of strings, determine what the final set should be considering uniqueness then write it back the best way? Seems somewhat inefficient.

I likely will be commonly do = ANY queries too to determine if the array contain an element

A possible data model looks like:

BlogPost
- title
- tags: string[]

2 common operations:

  • Find blog posts with a given tag
  • Update the list of tags (while ideally keeping it a unique set)

3

Answers


  1. In postgreSQL, we can utilize the ‘ text[] ‘ information type to store a variety of strings. If we want uniqueness, we can utilize cluster capabilities like unnest and array_agg to dispose of copies. For effective questions, think about utilizing fitting ordering on the cluster segment, e.g., GIN or Substance files, while performing activities like = ANY. The selection of information model and order relies upon the particular use case and information circulation.

    Login or Signup to reply.
  2. I would suggest to use a separate table for the tags. The advantage is that the tags can be indexed and efficiently accessed. The index further guarantees that the tags are unique per post.

    create table blog_post (
      id int8 primary key,
      title varchar(255)
    );
    
    create table blog_tag (
      tag varchar(255),
      blog_post_id int8,
      constraint fk_blog_post foreign key(blog_post_id) references blog_post(id)
    );
    
    create unique index blog_tag_uk on blog_tag(tag, blog_post_id);
    
    insert into blog_post values(1, 'title1');
    insert into blog_tag values('tag-a', 1);
    insert into blog_tag values('tag-b', 1);
    
    insert into blog_post values(2, 'title2');
    insert into blog_tag values('tag-a', 2);
    insert into blog_tag values('tag-c', 2);
    
    select *
    from blog_post p left join blog_tag t on t.blog_post_id = p.id;
    
    Login or Signup to reply.
  3. What I understood from your question that you want to assure you have unique set of strings for each post, and if so then you can use TEXT with UNIQUE to assure that like so:

    CREATE TABLE BlogPost (
      id SERIAL PRIMARY KEY,
      title TEXT,
      tags TEXT[] UNIQUE
    );
    

    UNIQUE will assure every set you have in the table is unique. If you tried to insert the same set twice you will get an error like so:

    postgres=# INSERT INTO BlogPost (tags) VALUES ('{"sql", "postgresql"}');
    INSERT 0 1
    postgres=# INSERT INTO BlogPost (tags) VALUES ('{"sql", "postgresql"}');
    ERROR:  duplicate key value violates unique constraint "blogpost_tags_key"
    DETAIL:  Key (tags)=({sql,postgresql}) already exists.
    

    I hope this answers your question, and let me know how it goes with you!

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search