skip to Main Content

I have a table with webpage data that looks somewhat like this:

row_id  ⁞ page_id ⁞ lang ⁞ title              ⁞ slug
────────┼─────────┼──────┼────────────────────┼────────────────
1       ⁞1        ⁞ en   ⁞ Welcome!           ⁞ begin
2       ⁞1        ⁞ fr   ⁞ Bienvenue!         ⁞ bienvenue
3       ⁞2        ⁞ pl   ⁞ Podstrona          ⁞ podstrona
4       ⁞2        ⁞ en   ⁞ Subpage            ⁞ subpage
5       ⁞3        ⁞ pl   ⁞ Podstrona 2        ⁞ podstrona-2
6       ⁞4        ⁞ fr   ⁞ Coordonnées        ⁞ coordonnees
7       ⁞5        ⁞ pl   ⁞ Podstrona poziom 2 ⁞ podstrona-lvl-2
8       ⁞5        ⁞ en   ⁞ Subpage Lvl 2      ⁞ subpage-lvl-2
9       ⁞6        ⁞ pl   ⁞ Poziom 3           ⁞ poziom-3
10      ⁞6        ⁞ en   ⁞ Level 3            ⁞ lvl-3
11      ⁞7        ⁞ pl   ⁞ Błąd 404           ⁞ 404
12      ⁞7        ⁞ en   ⁞ Error 404          ⁞ 404
13      ⁞7        ⁞ fr   ⁞ Erreur 404         ⁞ 404

I want to fetch a single language version of each page in a prioritized order. Let’s say I’d like to fetch all pages of fr language, but if some pages lack that language version, then get the en version instead, but again if there’s no en, try pl, and so on; so in this case the result would look like this:

row_id  ⁞ page_id ⁞ lang ⁞ title              ⁞ slug
────────┼─────────┼──────┼────────────────────┼────────────────
2       ⁞1        ⁞ fr   ⁞ Bienvenue!         ⁞ bienvenue
4       ⁞2        ⁞ en   ⁞ Subpage            ⁞ subpage
5       ⁞3        ⁞ pl   ⁞ Podstrona 2        ⁞ podstrona-2
6       ⁞4        ⁞ fr   ⁞ Coordonnées        ⁞ coordonnees
8       ⁞5        ⁞ en   ⁞ Subpage Lvl 2      ⁞ subpage-lvl-2
10      ⁞6        ⁞ en   ⁞ Level 3            ⁞ lvl-3
13      ⁞7        ⁞ fr   ⁞ Erreur 404         ⁞ 404

The language count is not predifined, however each page is bound to have at least one language version.

It’s possible to do with a whoopload of queries, or by fetching all the data and then process it programatically by PHP, but I was wondering if there is an elegant SQL (SQLite) solution that would solve this in one query?

2

Answers


  1. With conditional aggregation:

    select t.* 
    from (
      select 
        page_id,
        max(lang ='fr') fr,
        max(lang ='en') en,
        max(lang ='pl') pl
      from tablename
      group by page_id
    ) g inner join tablename t
    on t.page_id = g.page_id
    and t.lang = case 1
      when g.fr then 'fr' 
      when g.en then 'en'
      when g.pl then 'pl'
    end
    

    See the demo.

    Or if your version of SQLite allows the use of window functions:

    with 
      cte(id, lang) as (
        select * from 
        (values (1, 'fr'), (2, 'en'), (3, 'pl')) 
      ),
      langs as (   
        select t.*,
          row_number() over (partition by t.page_id order by c.id) rn
        from cte c inner join tablename t
        on t.lang = c.lang
      )  
    select row_id, page_id, lang, title, slug 
    from langs  
    where rn = 1
    order by page_id 
    

    See the demo.

    Results:

    | row_id | page_id | lang | title         | slug          |
    | ------ | ------- | ---- | ------------- | ------------- |
    | 2      | 1       | fr   | Bienvenue!    | bienvenue     |
    | 4      | 2       | en   | Subpage       | subpage       |
    | 5      | 3       | pl   | Podstrona 2   | podstrona-2   |
    | 6      | 4       | fr   | Coordonnées   | coordonnees   |
    | 8      | 5       | en   | Subpage Lvl 2 | subpage-lvl-2 |
    | 10     | 6       | en   | Level 3       | lvl-3         |
    | 13     | 7       | fr   | Erreur 404    | 404           |
    
    Login or Signup to reply.
  2. This is a type of prioritization query. I would just use a case expression with row_number():

    select t.*
    from (select t.*,
                 row_number() over (case lang when 'fr' then 1 when 'en' then 2 when 'pl' then 3 else 4 end) as seqnum
          from t
         ) t
    where seqnum = 1;
    

    In older versions of SQLite, you can use aggregation and coalesce():

    select page_id,
           coalesce( max(case when lang = 'fr' then title end),
                     max(case when lang = 'en' then title end),
                     max(case when lang = 'po' then title end),
                     max(title)  -- catchall for everything else
                   ) as title,
           coalesce( max(case when slug = 'fr' then title end),
                     max(case when slug = 'en' then title end),
                     max(case when slug = 'po' then title end),
                     max(slug)  -- catchall for everything else
                   ) as slug
    from t
    group by page_id;
    

    Or, use a correlated subquery:

    select t.*
    from t
    where t.lang = (select t2.lang
                    from t t2
                    where t2.page_id = t.page_id
                    order by (case t2.lang when 'fr' then 1 when 'en' then 2 when 'po' then 3 else 4 end)
                    limit 1
                   ) ;
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search