skip to Main Content

I need help with complex tree walking.

I should find a match in a tree, then find another item at the SAME path depth. EG it will ALWAYS be at the same amount of layers deep, but will appear in SOME parent array in a different position

  1. Search the object recursively to find the TEXT_TO_MATCH
  2. Reverse walk though every array it finds, until it finds another item
  3. Search that item to see if it has a path SIMILAR to the TEXT_TO_MATCH path but just in a different position in some PARENT array

EG: TEXT_TO_MATCH exists at .a[0].b.c.[0].value,

It should figure out the next value could exist at:

.a[ANY_NUMBER].b.c.[ANY_NUMBER].value

findObjectInTree(DATA, TEXT_TO_MATCH)

For example, my input data looks like this:

{
  a:[
    {b:{c:[{value:'a'}]}},
    {b:{c:[{foo:1},{value:'aa'}]}},
    {b:{c:[{value:'aaa'}]}}
  ]
}

The result would be:
['a','aa','aaa']

I’m open for any tools or libraries that might help- this feels very complex. The actual data my be larger and more nested- but a text match will always exist on the SAME path but at a different index of a ARRAY.

2

Answers


  1. A solution to gather values at the same level regardless of key names:

    const obj = {
        a: [
            { b: { c: [{ value: 'a' }] } },
            { b: { c: [{ foo: 1 }, { value: 'aa' }] } },
            { b: { c: [{ value: 'aaa' }] } }
        ]
    };
    
    let level;
    const collect = [];
    
    walk(obj, 'a');
    
    console.log(collect);
    
    function walk(obj, str, deep = 0) {
    
        for (const val of Object.values(obj)) {
    
            if (Array.isArray(val) || val.__proto__.constructor.name === 'Object') {
                walk(val, str, deep + 1);
                continue;
            }
            if (typeof val === 'string' && val.includes(str)) {
                level ??= deep;
                if (level === deep) {
                    collect.push(val);
                }
    
            }
        }
    }
    Login or Signup to reply.
  2. This looks like a depth-first search (DFS), with some modifications to keep track of the path we have taken while traversing the object. When we find a match, we can then iterate through all the other elements at the same depth (i.e., in the same parent array) to find other matches.

    This solution does not require any external libraries and should be able to handle large and deeply nested data structures, as long as they fit in memory. However, if you are dealing with very large data structures, you might need to look into more efficient algorithms or data structures.

    That also needs to store the path of the first found match, and then looking for matches at that path (ignoring the array index) in all subsequent array elements we encounter in our traversal.
    Something like:

    function findObjectInTree(data, textToMatch) {
        let result = [];
        let matchPath = null;
    
        function DFS(node, path = []) {
            if (Array.isArray(node)) {
                for (let i = 0; i < node.length; i++) {
                    DFS(node[i], [...path, i]);
                }
            } else if (typeof node === 'object' && node !== null) {
                for (let key in node) {
                    DFS(node[key], [...path, key]);
                }
            } else if (node === textToMatch) {
                if (matchPath === null) {
                    // Store the path of the first match we find
                    matchPath = path;
                    result.push(node);
                } else if (pathsMatch(path, matchPath)) {
                    // If we have found a match before and the paths match (ignoring array indices), 
                    // add the current node to the result
                    result.push(node);
                }
            }
        }
    
        function pathsMatch(path1, path2) {
            if (path1.length !== path2.length) {
                return false;
            }
    
            for (let i = 0; i < path1.length; i++) {
                if (!Number.isInteger(path1[i]) && path1[i] !== path2[i]) {
                    return false;
                }
            }
    
            return true;
        }
    
        DFS(data);
        return result;
    }
    

    The DFS function stores the path of the first match it finds in the matchPath variable. For all subsequent nodes, it compares their path to the matchPath (ignoring array indices) and adds them to the result if they match. The comparison is done in the pathsMatch function, which returns false if the paths do not match or if they are different lengths.

    The function should work for larger and more deeply nested data structures, as it uses a depth-first search (DFS) algorithm, which is capable of traversing such structures. However, it is important to note that for very large data structures, you may run into performance issues or limitations with JavaScript’s recursion depth.

    If performance becomes an issue, there are a few strategies you could consider:

    1. Iterative DFS: Instead of using recursion for the DFS, you could use an iterative approach with a stack. This would avoid the potential for a stack overflow error with deep recursion, but would require some modification to keep track of the path to each node.

    2. Parallelization: If your data is very large and you have access to a machine with multiple cores, you could consider using a parallel algorithm to search different parts of the data concurrently. This would be more complex to implement and would only be beneficial if the data is large enough.

    3. Indexing: If you are frequently searching the same data structure, it might be beneficial to build an index that allows you to quickly find all nodes at a given path. Building the index would have an upfront cost, but would make subsequent searches much faster.

    Remember, these are more advanced strategies, and the best approach depends on the specifics of your use case and data. The function I provided should work well for most reasonably-sized data structures, and it is always a good idea to start simple and only add complexity if necessary.

    A library like EmilStenstrom/json-traverse provides a way to parse JSON data using a simple query syntax. It makes parsing JSON easy and compact. You could install it using pip and then use the JsonTraverseParser object to traverse the JSON data.

    This library can find all elements matching a specific path in the JSON structure and return their values. However, it seems to lack the ability to dynamically adjust the path to match similar structures at different indices of arrays, which is part of your requirement.

    Most JSON traversal libraries provide mechanisms to navigate and extract data from a JSON object, but they usually require exact paths or keys to find the data.

    You might need to create a custom function or extend an existing library to perform the exact operation you’re looking for. The custom function I provided earlier can be a starting point.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search