skip to Main Content

I have a COLUMN in a Postgres table which has values with the following pattern.

INPUT-X

INPUT-X-Y

INPUT-X-Y-Z

INPUT-X-Y-Z-A-...

I am trying to count the number of entries in COLUMN where the value begins with INPUT and contains a given number of hyphens. So if the input value is 2023-08-04T09:00:32.822+00:00, I want to count all of the entries in COLUMN which begin with 2023-08-04T09:00:32.822+00:00 and contain 3 hyphens. If values in COLUMN include:

2023-08-04T09:00:32.822+00:00-1 
2023-08-04T09:00:32.822+00:00-1-1 
2023-08-04T09:00:32.822+00:00-2 
2023-08-04T09:00:32.822+00:00-2-1 
2023-08-04T09:00:32.822+00:00-3
2023-08-04T09:00:32.822+00:00-4

The count of this operation should equal 4. (The purpose of this is to calculate the index for the next value of the series, which in this case would be 2023-08-04T09:00:32.822+00:00-5. Each of the values are a reference to a parent entry e.g. 2023-08-04T09:00:32.822+00:00-5 signifies the 5th reply to entry 2023-08-04T09:00:32.822+00:00. Likewise, 2023-08-04T09:00:32.822+00:00-2-3 signifies the 3rd reply to entry 2023-08-04T09:00:32.822+00:00-2.)

The code I’m currently using is:

select count(*) from messages_dev where ref like input || '%-' || repeat('-', hyphens) || '%' into result;

but that is returning 0, for counts that should have higher numbers.

Can anybody recommend a way to get this count in Postgres?

2

Answers


  1. As you have a string, you can use string functions, to get the next number.

    but betst would be not to store it that way, as string functions are usually ver slow, and a normalized table structure suits better for relatinal databases

    This query would find all sequences and get the higest number, if you want only specific ones, you need to add which dates you aczaully want.

    when you want to use it in parallel systems, you should read about concrency.

    CREATE TABLE mytable
        ("mycolumn" varchar(33))
    ;
        
    INSERT INTO mytable
        ("mycolumn")
    VALUES
        ('2023-08-04T09:00:32.822+00:00-1'),
        ('2023-08-04T09:00:32.822+00:00-1-1'),
        ('2023-08-04T09:00:32.822+00:00-2'),
        ('2023-08-04T09:00:32.822+00:00-2-1'),
        ('2023-08-04T09:00:32.822+00:00-3'),
        ('2023-08-04T09:00:32.822+00:00-4')
    ;
    
    
    WITH CTE AS (select Substring(mycolumn,0,30) datetxt,MAX(split_part(mycolumn, '-', 4))::int +  1 newnum
    FROM mytable
    GROUP BY 1)
    SELECT CONCAT(datetxt,'-', newnum::text)
      FROM CTE
    
    concat
    2023-08-04T09:00:32.822+00:00-5

    fiddle

    Login or Signup to reply.
  2. This is a good question.

    Your approach seems to be heading in the right direction, but it falls short due to a few errors in the query input || '%-' || repeat('-', hyphens) || '%'. Let us address the small pattern errors and then move onto the query in itself.

    1. There is an issue with the repeat('-', hyphens) pattern. This would mean --- like this when hyphens = 3. Note that we need to have integers after a - as well. Also, having the pattern '%- is not included in the count as well. Changing the pattern to input || repeat('-%') brings us 1 step closer to the solution.
    2. The query does not count the number of - in the input string, because you have input concatenated with a pattern of - repeated hyphen times. In case the input string is always 2023-08-04T09:00:32.822+00:00, changing it to repeat('-%', hyphens-2) would bring us closer to the answer but there is another problem here –
      % symbol represents any sequence of characters (including zero characters). It can match any substring of a string.
      So, the query '2023-08-04T09:00:32.822+00:00' || repeat('-%', 1) would match with all 6 values.

    Solution:

    1. To mitigate all these issues, I would suggest using other string functions – namely the LENGTH and REPLACE Functions.
    2. We can find the length of the entire string (let us call this l1) and find the length of a new string – which has all its hyphens removed (l2). Substracting l2 from l1, would give us the number of - in the string.
    3. We can remove the hyphens in the string by replacing the - with a null character ''.

    Overall, the query would look something like this:

    SELECT COUNT(*) AS result
    FROM messages_dev
    WHERE ref LIKE  input || '-%' 
      AND LENGTH(ref) - LENGTH(REPLACE(ref, '-', '')) = hyphens; 
    

    You could try this out here

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search