skip to Main Content

I have the following json that is quite complex and detailed.

I need to extrapolate information (text) from the json only if the resource-id contains the following text "com.shazam.android:id/" and if it contains the following props (["title", "subtitle", "datetime"]).

If last element does not contain a datetime then resorceIdFinal is not inserted.

Below is an example json I want to get.

Here you can find the json I use, I couldn’t insert it here due to character space.

Link: https://pastebin.com/raw/hFLmP2T8

const check = (obj, keys) => !keys.some((el) => !obj.hasOwnProperty(el));

const isObjectEmpty = (objectName) => {
  return (
    objectName &&
    Object.keys(objectName).length === 0 &&
    objectName.constructor === Object
  );
};

const getChildren = (el, replaceText, props, resorceIdFinal) => {
  var arr = [];

  el.forEach((el) => {
    let resorceId = getResorceId(el.attributes, replaceText, props, resorceIdFinal);
    if(!isObjectEmpty(resorceId)) arr.push(resorceId);   
    //console.log(arr);
    getChildren(el.children, replaceText, props, resorceIdFinal);
  });
  
  return arr;
};


const getResorceId = (el, replaceText, props, resorceIdFinal) => {
  var o = {};
  if (el["resource-id"] !== undefined) {
    var resorceId = el["resource-id"].replace(replaceText, "");
    if (props.includes(resorceId)) {
      o = { [resorceId]: el.text };
    }
  }
  return o;
};

function readPro(json, replaceText, props, filterA, resorceIdFinal) {
  var arr = [];

  json.forEach((el) => {
    arr.push(getChildren(el.children, replaceText, props, resorceIdFinal));
  });
  
  console.log(arr)

  filtered = arr.filter(
    ((s) => (o) =>
      ((k) => !s.has(k) && s.add(k))(filterA.map((k) => o[k]).join("|")))(
        new Set()
      )
  );

  return filtered;
}


var res = readPro(
  a.children,
  "com.shazam.android:id/",
  ["title", "subtitle", "datetime"],
  ["title", "subtitle"],
  "datetime"
);

console.log(res);

Json result I would like to get:

[
   {
        "title": "Believe",
        "subtitle": "Chuther",
        "datetime": "12 giu, 16:42"
   },
   {
        "title": "시작 (Inst.)"
        "subtitle": "Gaho"
        "datetime": "12 giu, 16:42"
   },
   {
        "title": "Give It Up"
        "subtitle": "KC and the Sunshine Band"
        "datetime": "12 giu, 16:41"
   },
   {
        "title": "GRAVITY"
        "subtitle": "Jong Ho"
        "datetime": "12 giu, 16:41"
    }
]

Can you give me a hand?

2

Answers


  1. This function recursively calls itself, but ignores the result of that call:

    const getChildren = (el, replaceText, props, resorceIdFinal) => {
      var arr = [];
    
      el.forEach((el) => {
        let resorceId = getResorceId(el.attributes, replaceText, props, resorceIdFinal);
        if(!isObjectEmpty(resorceId)) arr.push(resorceId);   
        //console.log(arr);
        getChildren(el.children, replaceText, props, resorceIdFinal);
      });
      
      return arr;
    };
    

    Nothing is ever done with the return value of getChildren(el.children, replaceText, props, resorceIdFinal). Perhaps the intent was to add that result to the arr array?:

    const getChildren = (el, replaceText, props, resorceIdFinal) => {
      var arr = [];
    
      el.forEach((el) => {
        let resorceId = getResorceId(el.attributes, replaceText, props, resorceIdFinal);
        if(!isObjectEmpty(resorceId)) arr.push(resorceId);   
        //console.log(arr);
        arr.push(...getChildren(el.children, replaceText, props, resorceIdFinal));
      });
      
      return arr;
    };
    

    Note that there is a lot of code here, and this answer makes no guarantee that this is the only problem in the code shown. But it at least appears to be the most relevant to the problem described. You are encouraged to use a debugger to narrow down any additional problems, asking a separate question if any are found.

    Login or Signup to reply.
  2. In continuation from comments, provided your requirements haven’t changed significantly, the data to be extracted can be done so by treating the json string on a "line by line" basis.

    From looking at the desired output and the source json, it is the text attributes in the json structure that then become title, subtitle, and datetime. As such, when any attribute named text comes along (linearly), then the algorithm saves it as being the last encountered. Next when a desired sentinel is encountered, each of these:

    com.shazam.android:id/title
    com.shazam.android:id/subtitle
    com.shazam.android:id/datetime
    

    … The previously (or last) saved text attribute is assigned to be that property.
    When datetime is recorded, this is assumed to be the end of the record and it is pushed onto an array so that the next record if any can then be progressively built in the same fashion.

    The rest is explained by way the code itself and code comments.

    This is a working example. Copy and paste the json into the textarea then click the button. The json must be complete because it tries to parse it first, then stringifies so that its properly separated into lines. These steps can be commented out when you already know the input is separated into lines.

    function extract( input_json ) {    
            
        // In the event that the source json is not already broken into lines:
        try {
            let _parsed = JSON.parse( input_json );
            input_json = JSON.stringify(_parsed , null, 1);
        }
        catch (e) {
            console.log("Error parsing json");
            return false;
        }
        
        
        // Turn into an array, one line per index
        
        let json_lines = input_json.split("n");
        
        
        // Now onto a linear scan/search to gather the pertinent data
        
        let output = []; // this will hold the end result
        
        let last_passed_text = ''; // this will be the last "text" attribute's value encountered
        
        
        let current_record = {}; // this will hold the current record being built progressively
        
        
        for (let i = 0; i < json_lines.length; i++) {
            
            let textPos = json_lines[i].indexOf('"text"');
            
            if ( textPos > -1) {
            
                let _firstColon = json_lines[i].indexOf(":");
                
                if (_firstColon > textPos) {
            
                // split by colon since `"text" :`  indicates definition desired
                
                    let relevantPart = json_lines[i].substring( _firstColon + 1).trim(); // characters after colon to end of string, trimmed
                
                    relevantPart = decodeURIComponent( relevantPart ); // just in case
                    
                    // assume that characters after colon behave as `"abcdef",` and if not it doesn't matter, e.g. boolean values
                    
                    let token = relevantPart.substring(1,relevantPart.length-2); // get value between the assumed double quotes, -2 because of trailing comma, if any
                    
                    last_passed_text = token.toString(); // de-ref
                    
                    // console.log("last_passed_text",last_passed_text);
                    
                }
            }
            
            if ( json_lines[i].indexOf("resource-id") > -1) {
                
                
                
                if ( json_lines[i].indexOf("com.shazam.android:id/title") > -1) {
                    
                    // presence of title always indicates where to start collecting
                    
                    current_record = {};
                    current_record.title = last_passed_text.toString(); // de-ref
                    
                    //console.log("title", current_record.title);
                    
                    // console.log( json_lines[i], last_passed_text );
                    
                    
                }
                else if ( json_lines[i].indexOf("com.shazam.android:id/subtitle") > -1) {
                    current_record.subtitle = last_passed_text.toString(); // de-ref
                    
                    // console.log( json_lines[i], last_passed_text );
                }
                else if ( json_lines[i].indexOf("com.shazam.android:id/datetime") > -1) {
                    current_record.datetime = last_passed_text.toString(); // de-ref
                    
                    // presence of datetime always indicates where to end collecting
                    
                    output.push( JSON.parse(JSON.stringify( current_record )) );
                    
                    // console.log( json_lines[i], last_passed_text );
                    
                    
                }
            }
        
        }
        
        
        // console.log(output);
        
        return output;
    }
    Paste json here: (input)<br>
    <textarea id="input-json" rows=15 cols=50></textarea>
    <hr>
    <button type="button" onclick="
        console.log( 
            extract( 
                document.getElementById('input-json').value
            ) 
        );">Extract To Console</button>
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search