I have a "trademarks" table like this:
create table public.trademarks (
    id bigint generated always as identity,
    owner_id text null,
    trademark text null,
    constraint trademarks_pkey primary key (id)
) tablespace pg_default;
My goal is to pass a text to a query and get all trademarks that the text contains. For example, passing:
Walt Disney was an American entrepreneur, animator, and film producer
who became one of the most influential figures in the entertainment
industry. He co-founded the Disney Brothers Studio, which later became
The Walt Disney Company, and created beloved characters such as Mickey
Mouse and Donald Duck. Through his innovative and enduring works,
Disney left an indelible mark on the world of animation, theme parks,
and television, capturing the imagination of audiences for
generations.
should return the rows with "Walt Disney", "Mickey Mouse", "Donald Duck", etc.
I came up with this solution which works, but I wanted to hear some expert opinions on this problem:
SELECT *
FROM trademarks
WHERE 'Walt Disney was an American entrepreneur, animator, and film producer who became one of the most influential figures in the entertainment industry. He co-founded the Disney Brothers Studio, which later became The Walt Disney Company, and created beloved characters such as Mickey Mouse and Donald Duck. Through his innovative and enduring works, Disney left an indelible mark on the world of animation, theme parks, and television, capturing the imagination of audiences for generations.' ILIKE '%' || trademark || '%';
2 Answers
My understanding:
It does not sound advisable to do this directly in the DB.
trademarks in sorted order.
to
But for a real-life, maintainable solution, I think you should do it in the application layer, using a tree-based model and parallelizing the searches involved.
Create a Full-Text Search Index:
First, you need to create a full-text search index on the column you want to search within. Let’s assume you have a table named documents with a content column that contains the long text data. Here’s how you can create the index:
(This index uses the to_tsvector function to convert the text data into a format optimized for full-text searching.)
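A minimal sketch of such an index, assuming the documents table and content column described above. The index name documents_content_fts_idx and the 'english' text search configuration are assumptions made for illustration.

```sql
-- Hypothetical index name; 'english' is an assumed text search configuration.
-- GIN is the index type commonly used for tsvector data in PostgreSQL.
CREATE INDEX documents_content_fts_idx
    ON documents
    USING GIN (to_tsvector('english', content));
```

For the planner to use an expression index like this one, queries need to repeat the same expression, including the configuration name: to_tsvector('english', content).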
Perform Full-Text Searches:
This query uses the @@ operator to match the text in the content column against the tsquery created from the search term.
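A sketch of what such a query might look like, again assuming the documents table and content column. The search term 'Mickey Mouse' and the 'english' configuration are illustrative assumptions.

```sql
-- plainto_tsquery turns plain text into a tsquery that requires every word.
-- The left-hand expression matches the indexed expression so the index can apply.
SELECT *
FROM documents
WHERE to_tsvector('english', content) @@ plainto_tsquery('english', 'Mickey Mouse');
```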
To Optimize Performance further:
When dealing with a large number of records, you can further optimize performance by using features like pagination (LIMIT and OFFSET) to retrieve results in smaller chunks like:
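A hedged example of paginating the same kind of search. The id column used for ordering, the page size of 20, and the search term are assumptions, not part of the original description.

```sql
-- Fetch the third page of 20 rows (rows 41 to 60 in id order).
SELECT *
FROM documents
WHERE to_tsvector('english', content) @@ plainto_tsquery('english', 'Mickey Mouse')
ORDER BY id          -- assumed column; a stable order keeps pages consistent
LIMIT 20 OFFSET 40;
```

Without an ORDER BY, PostgreSQL does not guarantee row order, so consecutive pages could overlap or skip rows.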
Additionally, consider using tools like caching and indexing strategies to speed up queries even further.
Remember that full-text search is a powerful tool, but its performance depends on various factors, including the complexity of the search, the amount of data, and the server’s resources. It’s also a good practice to regularly analyze and monitor the performance of your queries to ensure they remain efficient as your data grows.
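Applied to the trademarks table from the question, the roles are reversed: the long input text becomes the tsvector and each stored trademark becomes the query. The following is only a sketch under stated assumptions: it uses the 'simple' configuration and a shortened sample text, and it relies on phraseto_tsquery, which is available from PostgreSQL 9.6.

```sql
-- Assumption: the 'simple' configuration, which lowercases words but does not
-- stem them or drop stop words, so trademark names are compared word for word.
SELECT t.*
FROM trademarks AS t
WHERE to_tsvector('simple', 'He created Mickey Mouse and Donald Duck.')
      @@ phraseto_tsquery('simple', t.trademark);
```

Unlike the ILIKE version, this matches whole words in sequence, so a trademark "Disney" would not match inside "Disneyland". Because every row supplies its own tsquery, an index on the table is unlikely to help here, and the query still visits each trademark.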