skip to Main Content

I have a list of alphanumeric strings that includes two special characters + and -. The input, as shown below, represents part numbers for our company’s products. I would like to extract the individual components as an output. There are about 45,000 products, and the data below is a sample of the part numbers. How can I parse this string to distinguish between + and - and provide the correct output?

enter image description here

I am unsure how to distinguish between range indicated by - and addition indicated by +.

2

Answers


  1. Create procedure:

    DELIMITER //
    
    CREATE PROCEDURE ConvertRangesToList(input_string VARCHAR(255))
    BEGIN
        DECLARE output_list VARCHAR(255) DEFAULT '';
        DECLARE current_part VARCHAR(255);
        DECLARE range_start INT;
        DECLARE range_end INT;
        DECLARE range_separator_pos INT;
        DECLARE plus_separator_pos INT;
        DECLARE current_number INT;
        DECLARE delimiter_pos INT;
    
        -- Replace '+' with ','
        SET input_string = REPLACE(input_string, '+', ',');
    
        -- Add a comma at the end to ensure the last number is processed
        SET input_string = CONCAT(input_string, ',');
    
        -- Process each part of the input string
        WHILE LENGTH(input_string) > 0 DO
            -- Find the position of the next comma delimiter
            SET delimiter_pos = INSTR(input_string, ',');
            -- Extract the current part
            SET current_part = TRIM(SUBSTRING(input_string, 1, delimiter_pos - 1));
            -- Remove the processed part from the input string
            SET input_string = SUBSTRING(input_string, delimiter_pos + 1);
    
            -- Find the position of the range separator '-'
            SET range_separator_pos = INSTR(current_part, '-');
            IF range_separator_pos > 0 THEN
                -- If a range separator is found
                SET range_start = CAST(SUBSTRING_INDEX(current_part, '-', 1) AS UNSIGNED);
                SET range_end = CAST(SUBSTRING_INDEX(current_part, '-', -1) AS UNSIGNED);
    
                -- Append the range to the output list
                WHILE range_start <= range_end DO
                    SET output_list = CONCAT(output_list, IF(output_list = '', '', ','), range_start);
                    SET range_start = range_start + 1;
                END WHILE;
            ELSE
                -- If no range separator is found, it's a single number
                SET current_number = CAST(current_part AS UNSIGNED);
                SET output_list = CONCAT(output_list, IF(output_list = '', '', ','), current_number);
            END IF;
        END WHILE;
    
        SELECT output_list AS Output;
    END//
    
    DELIMITER ;
    

    and call it:

    CALL ConvertRangesToList('1-12+20');
    

    Gisted it: MySQL procedure to convert formula to list of numbers

    Login or Signup to reply.
  2. See example.
    Disassemble a string, generate series and reassemble it using two recursive subqueries and grouping back.

    with recursive splt as( -- parsing formula string to parts
    select id,0 lvl,inFormula,outTest
      ,case when regexp_instr(inFormula,'[0-9]{1,}')>0 then
             cast(regexp_substr(inFormula,'[0-9]{1,}') as signed) 
       end n1
      ,case when regexp_instr(inFormula,'[0-9]{1,}')>0 then
             cast(regexp_substr(inFormula,'[0-9]{1,}') as signed) 
       end n2
      ,case when regexp_instr(inFormula,'[0-9]{1,}')>0 then
          substr(inFormula,length(regexp_substr(inFormula,'[0-9]{1,}'))+1,100)
       else ''
       end rest
    from test
    union all
    select id,lvl+1 lvl,inFormula,outTest
      ,case when left(rest,1)='-' then n2+1
            when regexp_instr(rest,'[0-9]{1,}')>0 then
         cast(regexp_substr(rest,'[0-9]{1,}') as signed) 
       end n1
      ,case when left(rest,1)='-' then 
              case when regexp_instr(rest,'[0-9]{1,}')>0 then
                cast(regexp_substr(rest,'[0-9]{1,}') as signed)
              end
            when regexp_instr(rest,'[0-9]{1,}')>0 then
              cast(regexp_substr(rest,'[0-9]{1,}') as signed) 
       end n2
      ,case when regexp_instr(rest,'[0-9]{1,}')>0 then
          substr(rest,length(regexp_substr(rest,'[0-9]{1,}'))+2,100)
       else ''
       end rest
    from splt
    where regexp_instr(rest,'[0-9]{1,}')>0  -- and lvl<5  <- for debug only
    )
    ,series as( -- generate additional rows for diapasons like 4-6
       select id,lvl,0 lvl2,inFormula,outTest,n2 lastN
          ,n1
       from splt
       union all
       select id,lvl,lvl2+1 lvl2,inFormula,outTest,lastN
          ,n1+1
       from series
       where (n1+1)<=lastN
    )
    -- assembling all back
    select id,inFormula,outTest
      ,group_concat(n1 order by n1) newList
    from series 
    group by id,inFormula,outTest
    order by id;
    

    Output

    id inFormula outTest newList
    1 1-3 1,2,3 1,2,3
    2 1+3 1,3 1,3
    3 4+7+8 4,7,8 4,7,8
    4 1-3+5-6 1,2,3,5,6 1,2,3,5,6
    5 2+4-6 2,4,5,6 2,4,5,6
    6 2-4+6 2,3,4,6 2,3,4,6
    7 2-4-6 2,3,4,5,6 2,3,4,5,6
    8 23 23 23
    9 19- 19 19
    10 null
    11 27-28-30-31 27,28,29,30,31 27,28,29,30,31
    12 999-1001+10004 999,1000,1001,1004 999,1000,1001,10004

    Fiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search