skip to Main Content

I am new to PostgreSql, I am having a table like below,

enter image description here

I need to get output as count of the given word (test) in text_col like below table using ts_vector function,
Please give me some suggestion, Thanks !.

enter image description here

3

Answers


  1. What about counting spaces?

    The following takes the length of the original string, calculates the length of the string removing the spaces and then sums 1

    with first_sel as (
        select 'Ciao io sono Ugo' mytext union all
        select 'Ciao io non sono Ugo' 
    )
    select mytext, length(mytext) - length(replace(mytext, ' ','')) + 1 nr_words from first_sel;
    

    result

            mytext        | nr_words
    ----------------------+----------
     Ciao io sono Ugo     |        4
     Ciao io non sono Ugo |        5
    (2 rows)
    

    If it’s not only about spaces you can try removing the needed characters with regex_replace

    Login or Signup to reply.
  2. SELECT filename, cardinality(string_to_array(text_col,'test'))-1 
    FROM documents;
    

    dbfiddle

    Login or Signup to reply.
  3. Using tsvector is like (globally)

    select * from ts_stat('select to_tsvector(text_col) from document_table')
    where word='test';
    

    where ndoc is number of rows (documents) and nentry number of occurrences of the word

    Or by row

    select file_name, text_col, to_tsvector(text_col), 
    substring(to_tsvector(text_col)::text, '''test'':(d+(,d+)*)') positions_in_text,
    length(regexp_replace(substring(to_tsvector(text_col)::text, '''test'':(d+(,d+)*)'), '[^,]', '', 'g'))+1 occurrences 
    from document_table
    where text_col @@ to_tsquery('test');
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search