
Basically, I have a lot of language codes (it, en, en-GB, de, de-CH, and so on), and from these I need to get a full locale code in the format LANGCODE-COUNTRYCODE, using the language's default country when the country code is not already specified.

An example of what I mean/need:

INPUT      OUTPUT  

it     ->  it-IT  
it-IT  ->  it-IT  
en-GB  ->  en-GB  
en     ->  en-US  
es-AR  ->  es-AR   
es-MX  ->  es-MX 
es     ->  es-ES  

Is there any library I’m unaware of, or a simple way of achieving this in PHP?

I’ve searched Google a lot, but either a solution doesn’t exist or I’m just using the wrong keywords…
Do I really have to build this array by hand? There must be a better way, I’m sure!

2 Answers


  1. Chosen as BEST ANSWER

    Thanks to the help of iso.org and localeplanet.com, plus some good old googling and a lot of elbow grease, I came up with the list below. It might not be perfect, but it will do the job for me... I hope it can be of help to others!

    <?php
        
        $defaultLocales = [
            'aa' => 'aa-ET',
            'ab' => 'ab-GE',
            'af' => 'af-ZA',
            'am' => 'am-ET',
            'ar' => 'ar',
            'as' => 'as-IN',
            'ay' => 'ay-BO',
            'az' => 'az-AZ',
            'ba' => 'ba-RU',
            'be' => 'be-BY',
            'bg' => 'bg-BG',
            'bh' => 'bh-IN',
            'bi' => 'bi-VU',
            'bn' => 'bn-IN',
            'bo' => 'bo-CN',
            'br' => 'br-FR',
            'ca' => 'ca-ES',
            'co' => 'co-FR',
            'cs' => 'cs-CZ',
            'cy' => 'cy-GB',
            'da' => 'da-DK',
            'de' => 'de-DE',
            'div' => 'div-MV',
            'dz' => 'dz-BT',
            'el' => 'el-GR',
            'en' => 'en-US',
            'eo' => 'eo',
            'es' => 'es-ES',
            'et' => 'et-EE',
            'eu' => 'eu-ES',
            'fa' => 'fa',
            'fi' => 'fi-FI',
            'fj' => 'fj-FJ',
            'fo' => 'fo-FO',
            'fr' => 'fr-FR',
            'fy' => 'fy-NL',
            'ga' => 'ga-IE',
            'gd' => 'gd-GB',
            'gl' => 'gl-ES',
            'gn' => 'gn-PY',
            'gu' => 'gu-IN',
            'ha' => 'ha-NG',
            'he' => 'he-IL',
            'hi' => 'hi-IN',
            'hr' => 'hr-HR',
            'hu' => 'hu-HU',
            'hy' => 'hy-AM',
            'ia' => 'ia',
            'id' => 'id-ID',
            'ie' => 'ie',
            'ik' => 'ik-US',
            'in' => 'in-ID',
            'is' => 'is-IS',
            'it' => 'it-IT',
            'iw' => 'iw-IL',
            'ja' => 'ja-JP',
            'ji' => 'ji',
            'jw' => 'jw-ID',
            'ka' => 'ka-GE',
            'kk' => 'kk-KZ',
            'kl' => 'kl-GL',
            'km' => 'km-KH',
            'kn' => 'kn-IN',
            'ko' => 'ko-KR',
            'kok' => 'kok-IN',
            'ks' => 'ks-IN',
            'ku' => 'ku-IQ',
            'ky' => 'ky-KG',
            'kz' => 'kz-KG',
            'la' => 'la',
            'ln' => 'ln-CD',
            'lo' => 'lo-LA',
            'ls' => 'ls-SI',
            'lt' => 'lt-LT',
            'lv' => 'lv-LV',
            'mg' => 'mg-MG',
            'mi' => 'mi-NZ',
            'mk' => 'mk-MK',
            'ml' => 'ml-IN',
            'mn' => 'mn-MN',
            'mo' => 'mo-MD',
            'mr' => 'mr-IN',
            'ms' => 'ms-MY',
            'mt' => 'mt-MT',
            'my' => 'my-MM',
            'na' => 'na-NR',
            'nb' => 'nb-NO',
            'ne' => 'ne-NP',
            'nl' => 'nl-NL',
            'nn' => 'nn-NO',
            'oc' => 'oc-FR',
            'om' => 'om-ET',
            'or' => 'or-IN',
            'pa' => 'pa-PK',
            'pl' => 'pl-PL',
            'ps' => 'ps-AF',
            'pt' => 'pt-PT',
            'qu' => 'qu-PE',
            'rm' => 'rm-CH',
            'rn' => 'rn-BI',
            'ro' => 'ro-RO',
            'ru' => 'ru-RU',
            'rw' => 'rw-RW',
            'sa' => 'sa-IN',
            'sb' => 'sb-DE',
            'sd' => 'sd-PK',
            'sg' => 'sg-CF',
            'sh' => 'sh-BA',
            'si' => 'si-LK',
            'sk' => 'sk-SK',
            'sl' => 'sl-SI',
            'sm' => 'sm-WS',
            'sn' => 'sn-ZW',
            'so' => 'so-SO',
            'sq' => 'sq-AL',
            'sr' => 'sr-RS',
            'ss' => 'ss-ZA',
            'st' => 'st-ZA',
            'su' => 'su-ID',
            'sv' => 'sv-SE',
            'sw' => 'sw-TZ',
            'sx' => 'sx-ZA',
            'syr' => 'syr',
            'ta' => 'ta-IN',
            'te' => 'te-IN',
            'tg' => 'tg-TJ',
            'th' => 'th-TH',
            'ti' => 'ti-ER',
            'tk' => 'tk-TM',
            'tl' => 'tl-PH',
            'tn' => 'tn-ZA',
            'to' => 'to-TO',
            'tr' => 'tr-TR',
            'ts' => 'ts-ZA',
            'tt' => 'tt-RU',
            'tw' => 'tw-GH',
            'uk' => 'uk-UA',
            'ur' => 'ur-PK',
            'us' => 'us-US',
            'uz' => 'uz-UZ',
            'vi' => 'vi-VN',
            'vo' => 'vo',
            'wo' => 'wo-SN',
            'xh' => 'xh-ZA',
            'yi' => 'yi',
            'yo' => 'yo-BJ',
            'zh' => 'zh-CN',
            'zu' => 'zu-ZA'
        ];
        
        function getLocaleFromLang($lang) {
            global $defaultLocales;
            return $defaultLocales[$lang] ?? $lang;
        }
        
    ?>
    

    If you have any suggestions on how I might improve it, feel free to comment below!
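
Since the list was compiled by hand, a quick shape check can catch some typos. This is only a sketch: `findSuspectLocales()` is a hypothetical helper, not part of the answer above, and it assumes every value is either the bare language code or `<lang>-<REGION>` with a two-letter uppercase region.

```php
<?php
// Hypothetical helper: return the entries of a language => locale map
// whose value does not have the expected shape.
function findSuspectLocales(array $map): array
{
    $suspects = [];
    foreach ($map as $lang => $locale) {
        $lang = (string) $lang;
        // A bare language code as the value is allowed (e.g. 'eo' => 'eo').
        if ($locale === $lang) {
            continue;
        }
        // Otherwise expect "<lang>-<REGION>" with a two-letter uppercase region.
        $pattern = '/^' . preg_quote($lang, '/') . '-[A-Z]{2}$/';
        if (!preg_match($pattern, $locale)) {
            $suspects[$lang] = $locale;
        }
    }
    return $suspects;
}

// Small sample map with two deliberate mistakes.
$sample = [
    'it' => 'it-IT',
    'eo' => 'eo',
    'en' => 'em-US', // prefix does not match the key
    'es' => 'es-es', // region is not uppercase
];

print_r(findSuspectLocales($sample)); // lists the 'en' and 'es' entries
```

Note that this only checks the format of each entry. It cannot tell whether the region is a sensible default for the language, so that part still needs a manual review.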


  2. The rules you describe in your follow-up comment are nothing more than your subjective opinion about which countries are "more important" for a given language. You pick Spain on the grounds that Spanish originated there, but then pick the United States for English (German is not mentioned, but the language probably pre-dates Germany itself). You won’t find an algorithm for such a ruleset, so there’s no other way than composing your own hard-coded list. Once you do that, the PHP portion can be as simple as an array lookup:

    $defaults = [
        'it' => 'it-IT',
        'en' => 'en-US',
        'es' => 'es-ES',
        // ...
    ];
    $requests = [
        'it',
        'it-IT',
        'en-GB',
        'en',
        'es-AR',
        'es-MX',
        'es',
    ];
    foreach ($requests as $input) {
        $output = $defaults[$input] ?? $input;
        echo "$input -> $output\n";
    }
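
The lookup above assumes the input is already lowercase and hyphen-separated. If codes may arrive as `EN` or `en_gb`, a small normalising wrapper could look like the sketch below. `toFullLocale()` is a hypothetical name, and the sketch assumes tags contain only a language or a language plus region (no script subtags such as `zh-Hant`).

```php
<?php
// Hypothetical helper: normalise a tag, then fill in the default region if missing.
function toFullLocale(string $tag, array $defaults): string
{
    // Accept "en_gb", "EN-gb", " en " and similar variants.
    $parts = explode('-', str_replace('_', '-', trim($tag)));
    $lang = strtolower($parts[0]);

    if (isset($parts[1]) && $parts[1] !== '') {
        // A region is already present: keep it and only fix the casing.
        return $lang . '-' . strtoupper($parts[1]);
    }

    // No region: use the table, or return the bare language if it is unknown.
    return $defaults[$lang] ?? $lang;
}

$defaults = ['it' => 'it-IT', 'en' => 'en-US', 'es' => 'es-ES'];

foreach (['it', 'EN', 'en_gb', 'es-AR', 'fr'] as $input) {
    echo $input, ' -> ', toFullLocale($input, $defaults), "\n";
}
```

Passing the table as a parameter rather than reading a global keeps the function easy to test with a small map.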
    
