How can I group strings by longest shared ending substring in JavaScript?

AliHasan
May 18, 2023
184 views
0 votes
2 Answers

I have a strange problem. I am trying to write code that groups strings by the longest ending substring they share.

For example, I have a collection of strings:

["samsung.phone.com", "lg.phone.com", "phone.com", "camera.dsrl.nikon.com", "amd.gpu.com", "intel.cpu.com" ]

I want to create a dictionary in which the strings are grouped by their longest shared ending (not counting the last part, .com, on its own).

The dictionary should look like this:

{
"phone.com" : ["lg.phone.com", "samsung.phone.com"],
"camera.dsrl.nikon.com" : [], 
"amd.gpu.com": [], 
"intel.cpu.com": []
}

And if I add "cpu.com" to the collection, the new dictionary should look like this:

{
"phone.com" : ["lg.phone.com", "samsung.phone.com"],
"camera.dsrl.nikon.com" : [], 
"amd.gpu.com" : [], 
"cpu.com": ["intel.cpu.com"]
}

And if I add "hello.samsung.phone.com" to the collection, the new dictionary should look like this:

{
"phone.com": ["lg.phone.com"],
"samsung.phone.com": ["hello.samsung.phone.com"],
"camera.dsrl.nikon.com": [], 
"amd.gpu.com": [], 
"cpu.com": ["intel.cpu.com"]
}

Any ideas?

Tags: javascript reactjs

Answers

Not the prettiest code, but it works (I think):

Make each element of the initial list x a key of a dictionary dict.
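For each element A of x, find all other elements B whose parent domain is A (that is, B with its first label removed equals A).
Push each such B onto the corresponding list in dict.

Then deal with the corner cases: delete keys that were grouped under another key and have no children of their own, and remove from every list any element that is itself still a key.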

const x = [
  "samsung.phone.com",
  "lg.phone.com",
  "phone.com",
  "camera.dsrl.nikon.com",
  "amd.gpu.com",
  "intel.cpu.com",
  "cpu.com",
  "hello.samsung.phone.com"
];
const dict = {};
x.forEach((el) => (dict[el] = []));
// console.log(dict);

for (let i = 0; i < x.length; i++) {
  for (let j = 0; j < x.length; j++) {
    if (i !== j) {
      // Strip the first label of x[j] to get its immediate parent domain.
      const parent = x[j].substring(x[j].indexOf(".") + 1);
      if (parent === x[i]) {
        // x[j] is a direct subdomain of x[i]
        dict[x[i]].push(x[j]);
      }
    }
  }
}

// Every name that was pushed into some list is a candidate for removal as a key.
const deleteKeys = Object.values(dict).flat();

// Drop grouped names that have no children of their own.
deleteKeys.forEach((name) => {
  if (dict[name]?.length === 0) delete dict[name];
});

// A name that is still a key must not also appear inside another key's list.
const toRemove = Object.keys(dict);
for (const [key, value] of Object.entries(dict)) {
  dict[key] = value.filter((el) => !toRemove.includes(el));
}

console.log(dict);

It’s best to step through the above algorithm with breakpoints and console logs to understand what’s happening.
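The loops above compare each name only against its immediate parent (the name with one label stripped). As a minimal sketch of a variant, assuming the goal is to group each name under its nearest ancestor that is actually present in the list, the function below walks up one label at a time. The name groupByNearestParent is hypothetical, and the rule that a name with children of its own becomes a key and is dropped from its parent's list is inferred from the question's third example.

```javascript
// Hypothetical helper, not part of the answer above.
// Groups each name under the nearest ancestor that is also in the input.
function groupByNearestParent(names) {
  const present = new Set(names); // also removes duplicates
  const children = new Map(); // parent -> names grouped directly under it
  const hasParent = new Set();

  for (const name of present) {
    // Walk up one label at a time: "a.b.c.com" -> "b.c.com" -> "c.com" -> "com".
    let dot = name.indexOf(".");
    while (dot !== -1) {
      const candidate = name.slice(dot + 1);
      if (present.has(candidate)) {
        if (!children.has(candidate)) children.set(candidate, []);
        children.get(candidate).push(name);
        hasParent.add(name);
        break;
      }
      dot = name.indexOf(".", dot + 1);
    }
  }

  const result = {};
  for (const name of present) {
    // A name becomes a key if it has children or if nothing in the list contains it.
    if (children.has(name) || !hasParent.has(name)) {
      // Children that are keys themselves are listed only under their own key.
      result[name] = (children.get(name) ?? []).filter((c) => !children.has(c));
    }
  }
  return result;
}

console.log(
  groupByNearestParent([
    "samsung.phone.com",
    "lg.phone.com",
    "phone.com",
    "intel.cpu.com",
    "cpu.com",
    "hello.samsung.phone.com"
  ])
);
```

Because it looks names up in a Set instead of comparing every pair, this does one pass over the list plus one lookup per label, and it still groups "a.b.phone.com" under "phone.com" when "b.phone.com" is missing.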

You can do this with a regex like the one below:

(?<!\.)([^.]+)(?=\.)\.(.*)

It simply splits the given domain into its first label (the subdomain) and the rest (the main domain).

The code would look like this:

const regex = /(?<!\.)([^.]+)(?=\.)\.(.*)/;

const list = ["samsung.phone.com", "lg.phone.com", "phone.com", "camera.dsrl.nikon.com", "amd.gpu.com", "intel.cpu.com"];

const result = {};

list.forEach((d) => {
  const matches = d.match(regex);
  if (!matches) return; // skip names the regex does not match

  // matches[1] is the subdomain and
  // matches[2] is the main domain.
  // If an entry for the main domain already exists,
  // push the new name onto it; otherwise create a new entry.
  if (result[matches[2]]) {
    result[matches[2]].push(d);
  } else {
    result[matches[2]] = [d];
  }
});

console.log(result);

The output would be like this:

{
    "phone.com": ["samsung.phone.com", "lg.phone.com"],
    "com": ["phone.com"],
    ...
}
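The same split can also be done without a regular expression. This is a minimal sketch, assuming that names without a dot should simply be skipped; splitFirstLabel and groupByParent are hypothetical names, not part of the answer above. Like the regex version, it groups by immediate parent only, so it also produces a "com" key.

```javascript
// Hypothetical helper: "samsung.phone.com" -> ["samsung", "phone.com"].
// Returns null when the name contains no dot.
function splitFirstLabel(name) {
  const dot = name.indexOf(".");
  if (dot === -1) return null;
  return [name.slice(0, dot), name.slice(dot + 1)];
}

// Hypothetical helper: groups every name under its immediate parent domain.
function groupByParent(names) {
  const groups = new Map();
  for (const name of names) {
    const parts = splitFirstLabel(name);
    if (parts === null) continue; // assumption: ignore names without a dot
    const parent = parts[1];
    if (!groups.has(parent)) groups.set(parent, []);
    groups.get(parent).push(name);
  }
  return Object.fromEntries(groups);
}

console.log(groupByParent(["samsung.phone.com", "lg.phone.com", "phone.com"]));
```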
