skip to Main Content

MySQL database has wp_post table in which the post_content column is located.
I need to build a SQL request so that in all posts of the post_content links of the type:

http://example.com/component/tags/tag/15-text.html

Replace with links of the type:

http://example.com/tag/text.html

That is, I need to replace all the characters that are between ‘example.com/‘ and ‘text.html‘, with ‘tag/
taking into account the fact that the number in the link can be any one-, two- or three-digit (from 1 to 999 ).

If the wildcard ‘%‘ could be used in the REPLACE function, then the request could look something like this:

UPDATE wp_posts SET post_content = REPLACE(post_content, 'component/tags/tag/%-', 'tag/');

Unfortunately, this does not work. I have no experience with databases,
but it seems to me that the desired request can be created using SUBSTRING_INDEX,
but I didn’t succeed in doing it myself. I would be grateful for any help.

More examples:

There are -> should be

http://example.com/component/tags/tag/15-text.html  ->  http://example.com/tag/text.html

http://example.com/component/tags/tag/1-sometext.html  ->  http://example.com/tag/sometext.html

http://example.com/component/tags/tag/2-anothertext.html  ->  http://example.com/tag/anothertext.html

and so on

http://example.com/component/tags/tag/999-moreanothertext.html  ->  http://example.com/tag/moreanothertext.html

What exactly needs to be done: to remove from these links part Component/Tags/Tag/Somenumber-,
Where is Somenumber is only Placeholder (Variable),

but everything else does not need to be changed

3

Answers


  1. You need not just a replace, but a replace with regexp. Look at the request below.

    SELECT REGEXP_REPLACE('http://example.com/component/tags/tag/15-text.html', 'http://([^/]+)[^-]+-(text.html)', 'http://$1/tags/$2');
    
    Login or Signup to reply.
  2. This can be done via straightforward string replacement methods – CONCAT, SUBSTRING, LOCATE, REPLACE. The first 3 queries show the step-by-step logic, while query #4 puts it all together.

    These queries assume http://example.com/component/tags/tag/ is the same for all URLs. If that’s not the case, you could use similar logic to detect the position of /tag/ and use SUBSTRING from there.

    The final query to actually update the table (query #6; note that duplicate URLs might exist after this and can be detected ahead of time with query #5 below):

    UPDATE
      test
    SET
      url = CONCAT(
        'http://example.com/tag/',
        SUBSTRING(
          REPLACE(
            url,
            'http://example.com/component/tags/tag/',
            ''
          ),
          LOCATE(
            '-',
            REPLACE(
              url,
              'http://example.com/component/tags/tag/',
              ''
            )
          ) + 1
        )
      );
    

    Schema (MySQL v5.7)

    CREATE TABLE test (
      id INT,
      url TEXT
    );
    INSERT INTO test (id, url) VALUES (1, 'http://example.com/component/tags/tag/15-thing.html');
    INSERT INTO test (id, url) VALUES (2, 'http://example.com/component/tags/tag/16-another.html');
    INSERT INTO test (id, url) VALUES (3, 'http://example.com/component/tags/tag/9999-texttext.html');
    INSERT INTO test (id, url) VALUES (4, 'http://example.com/component/tags/tag/9999-more-and-more-and-more-and-more-text.html');
    INSERT INTO test (id, url) VALUES (5, 'http://example.com/component/tags/tag/6534562-thing.html');
    INSERT INTO test (id, url) VALUES (6, 'http://example.com/component/tags/tag/0-thing.html');
    INSERT INTO test (id, url) VALUES (7, 'http://example.com/component/tags/tag/1-sometext.html');
    INSERT INTO test (id, url) VALUES (8, 'http://example.com/component/tags/tag/2-anothertext.html');
    INSERT INTO test (id, url) VALUES (9, 'http://example.com/component/tags/tag/999-moreanothertext.html');
    

    Query #1 – remove the known/standard text before the filename (e.g. 15-thing.html)

    SELECT
      REPLACE(
        url,
        'http://example.com/component/tags/tag/',
        ''
      ) as variable_text
    FROM test;
    
    variable_text
    15-thing.html
    16-another.html
    9999-texttext.html
    9999-more-and-more-and-more-and-more-text.html
    6534562-thing.html
    0-thing.html
    1-sometext.html
    2-anothertext.html
    999-moreanothertext.html

    Query #2 – find the index of – in the remaining string (e.g. 3)

    SELECT
      LOCATE(
        '-',
        REPLACE(
          url,
          'http://example.com/component/tags/tag/',
          ''
        )
      ) as hyphen_index
    FROM test;
    
    hyphen_index
    3
    3
    5
    5
    8
    2
    2
    2
    4

    Query #3 – remove everything after that index (e.g. thing.html)

    SELECT
      SUBSTRING(
        REPLACE(
          url,
          'http://example.com/component/tags/tag/',
          ''
        ),
        LOCATE(
          '-',
          REPLACE(
            url,
            'http://example.com/component/tags/tag/',
            ''
          )
        ) + 1
      ) as filename
    FROM test;
    
    filename
    thing.html
    another.html
    texttext.html
    more-and-more-and-more-and-more-text.html
    thing.html
    thing.html
    sometext.html
    anothertext.html
    moreanothertext.html

    Query #4 – construct the full URL (e.g. http://example.com/tag/thing.html)

    SELECT
      CONCAT(
        'http://example.com/tag/',
        SUBSTRING(
          REPLACE(
            url,
            'http://example.com/component/tags/tag/',
            ''
          ),
          LOCATE(
            '-',
            REPLACE(
              url,
              'http://example.com/component/tags/tag/',
              ''
            )
          ) + 1
        )
      ) as new_url
    FROM test;
    

    Query #5 – detect duplicates ahead of actually updating the table

    SELECT GROUP_CONCAT(id) as duplicate_ids, new_url FROM (
      SELECT
        id,
        CONCAT(
          'http://example.com/tag/',
          SUBSTRING(
            REPLACE(
              url,
              'http://example.com/component/tags/tag/',
              ''
            ),
            LOCATE(
              '-',
              REPLACE(
                url,
                'http://example.com/component/tags/tag/',
                ''
              )
            ) + 1
          )
        ) as new_url
      FROM test) as new_urls
    GROUP BY new_url
    HAVING COUNT(*) > 1;
    
    duplicate_ids new_url
    1,5,6 http://example.com/tag/thing.html

    Query #6 – actually update the table

    UPDATE
      test
    SET
      url = CONCAT(
        'http://example.com/tag/',
        SUBSTRING(
          REPLACE(
            url,
            'http://example.com/component/tags/tag/',
            ''
          ),
          LOCATE(
            '-',
            REPLACE(
              url,
              'http://example.com/component/tags/tag/',
              ''
            )
          ) + 1
        )
      );
    

    There are no results to be displayed.


    Query #7 – view the new URLs

    SELECT * FROM test;
    

    View on DB Fiddle

    Login or Signup to reply.
  3. UPDATE wp_posts
    SET post_content = CONCAT(
        SUBSTRING_INDEX(post_content, 'example.com/', 1), -- Extract the part before 'example.com/'
        'example.com/tag/',                             -- Append the new base URL
        SUBSTRING_INDEX(SUBSTRING_INDEX(post_content, 'example.com/', -1), '-text.html', 1), -- Extract the number between 'example.com/' and '-text.html'
        '-text.html',                                   -- Append the constant '-text.html'
        SUBSTRING_INDEX(post_content, '-text.html', -1) -- Extract the part after '-text.html'
    )
    WHERE post_content LIKE '%example.com/component/tags/tag/%-text.html%';
    

    With the examples given:

    UPDATE wp_posts
    SET post_content = CONCAT(
        SUBSTRING_INDEX(post_content, 'example.com/', 1), -- Extract the part before 'example.com/'
        'example.com/tag/',                             -- Append the new base URL
        REGEXP_REPLACE(
            REGEXP_SUBSTR(post_content, 'example.com/component/tags/tag/[0-9]{1,3}-([^"]+).html'),
            'example.com/component/tags/tag/[0-9]{1,3}-',
            ''
        )
    )
    WHERE post_content REGEXP 'example.com/component/tags/tag/[0-9]{1,3}-([^"]+).html';
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search