
I want to remove parts of each line in a file and keep only the parts I need, using EmEditor.

The lines in the file look like this:

{"message":"{"_":"user","pFlags":{"contact":true},"user_flags":2143,"id":702212125,"access_hash":"914250561826","first_name":"david","last_name":"jones","username":"david_d192","phone":"051863329875","status":{"_":"userStatusRecently"}}","phone":"051863329875","version":"3","type":"unknown","token":"1556189892619764206","p_id":702212125,"username":"david_d192","type":"redis","user_flags":2143,"host":"win",from":"contacts"}
{"index": {"_type": "_doc", "_id": "36GG54F"}}

{"message":"{"_":"user","pFlags":{"contact":true},"user_flags":2143,"id":702212125,"access_hash":"914250561826","first_name":"david","last_name":"jones","username":"david_d192","phone":"051863329875","status":{"_":"userStatusRecently"}}","phone":"051863329875","version":"3","type":"unknown","token":"1556189892619764206","p_id":702212125,"username":"david_d192","type":"redis","user_flags":2143,"host":"win",from":"contacts"}
{"index": {"_type": "_doc", "_id": "36GG54F"}}

{"message":"{"_":"user","pFlags":{"contact":true},"user_flags":2143,"id":702212125,"access_hash":"914250561826","first_name":"david","last_name":"jones","phone":"051863329875","status":{"_":"userStatusRecently"}}","phone":"051863329875","version":"3","type":"unknown","token":"1556189892619764206","p_id":702212125,"type":"redis","user_flags":2143,"host":"win",from":"contacts"}
{"index": {"_type": "_doc", "_id": "36GG54F"}}

I want to keep id, first_name, last_name, phone, and username (if it exists) from every line, like this:

id:702212125 first_name:david last_name:jones phone:051863329875 username:david_d192,
id:702212125 first_name:david last_name:jones phone:051863329875 username:david_d192,
id:702212125 first_name:david last_name:jones phone:051863329875,

How can I do this?

Thanks.

2 Answers


  1. JSON parsing is the best way to do this (see https://linuxconfig.org/how-to-parse-data-from-json-into-python). If you would rather do it the hard way with regex, here are patterns in the PCRE (PHP) flavor:

    Get all ids:

    (?<=\"id\":)(\w+)
    

    See example:
    https://regex101.com/r/g5vfEd/1

    Get all first names:

    (?<=first_name\":\")(\w+)(?=\")
    

    See example:
    https://regex101.com/r/g5vfEd/2

    Get all last names:

    (?<=last_name\":\")(\w+)(?=\")
    

    See example:
    https://regex101.com/r/g5vfEd/3

    Get all phone numbers:

    (?<=phone\":\")(\w+)(?=\")
    

    See example:
    https://regex101.com/r/g5vfEd/4

    Get all user names if they exist:

    (?<=username\":\")(\w+)(?=\")
    

    See example:
    https://regex101.com/r/g5vfEd/5

    Complete pattern to match everything:

    id\":\s?\"?(\w+),?[\"].*first_name\":\"(\w+).*last_name\":\"(\w+).*phone\":\"(\d+).*(?=username)?\":\"(\w+).*
    

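    This returns 3 matches, each with the following 5 groups (match 1 is shown here). Note that group 5 captures the last quoted value on the line (contacts) rather than the username: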

    Group 1.    85-94   702212125
    Group 2.    145-150 david
    Group 3.    169-174 jones
    Group 4.    285-297 051863329875
    Group 5.    454-462 contacts
    

    See link: https://regex101.com/r/g5vfEd/6
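
A minimal Python sketch of the JSON-parsing route recommended at the top of this answer. It assumes each record line in the real file is valid JSON whose "message" value is an escaped JSON string (the sample as displayed in the question would not parse as-is). SAMPLE, FIELDS, and extract_fields are illustrative names, not part of the original post.

```python
import json

# Hypothetical sample line: assumes the inner "message" object is stored as an
# escaped JSON string and that the line as a whole is valid JSON.
SAMPLE = (
    '{"message":"{\\"_\\":\\"user\\",\\"id\\":702212125,'
    '\\"first_name\\":\\"david\\",\\"last_name\\":\\"jones\\",'
    '\\"username\\":\\"david_d192\\",\\"phone\\":\\"051863329875\\"}",'
    '"phone":"051863329875","type":"redis","from":"contacts"}'
)

# Fields to keep, in output order.
FIELDS = ("id", "first_name", "last_name", "phone", "username")


def extract_fields(line):
    """Return 'key:value' pairs for one record line, or None for other lines."""
    try:
        outer = json.loads(line)
    except json.JSONDecodeError:
        return None  # blank or malformed line
    if not isinstance(outer, dict) or not isinstance(outer.get("message"), str):
        return None  # e.g. the {"index": ...} lines, which carry no message
    try:
        user = json.loads(outer["message"])
    except json.JSONDecodeError:
        return None
    if not isinstance(user, dict):
        return None
    # Fields that are absent (such as a missing username) are simply skipped.
    return " ".join(f"{key}:{user[key]}" for key in FIELDS if key in user)


print(extract_fields(SAMPLE))
```

Parsing the line first and then parsing the "message" string again avoids guessing at quoting rules with lookarounds, at the cost of requiring well-formed input.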

  2. Since you've tagged regex and EmEditor, you can try this.

    EmEditor version 19.1 and later supports named groups in regular expressions, written like this:

    (?<id>expression) 
    

    and named backreferences, written in this form:

    \k<id>
    

    So the steps are:

    Find and Replace (Ctrl-H). Tick “Match Case” and select “Regular Expressions”.

    Find:

    \"id\"[\":]*(?<id>[^\":,]*).*?\"first_name\"[\":]*(?<first_name>[^\":,]*).*?\"last_name\"[\":]*(?<last_name>[^\":,]*).*?\"phone\"[\":]*(?<phone>[^\":,]*)(.*?"username"[\":]*(?<username>[^\":,]*))?
    

    Replace with:

    id:\k<id>\tfirst_name:\k<first_name>\tlast_name:\k<last_name>\tphone:\k<phone>\tusername:\k<username>
    

    Click the down arrow next to the Extract button and select “To New Document”.
    Click the Extract button to output the results to a new tab-delimited file.
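
For comparison, the same named-group idea can be sketched outside EmEditor in Python. This is only an illustration under assumptions: the pattern mirrors the Find expression above but uses Python's (?P<name>...) spelling, the two sample lines are shortened hypothetical versions of the question's data, and unlike the Replace string above it leaves out username when that group did not match.

```python
import re

# Python spells named groups (?P<name>...) rather than (?<name>...).
PATTERN = re.compile(
    r'"id"[":]*(?P<id>[^":,]*)'
    r'.*?"first_name"[":]*(?P<first_name>[^":,]*)'
    r'.*?"last_name"[":]*(?P<last_name>[^":,]*)'
    r'.*?"phone"[":]*(?P<phone>[^":,]*)'
    r'(?:.*?"username"[":]*(?P<username>[^":,]*))?'
)

FIELDS = ("id", "first_name", "last_name", "phone", "username")

# Shortened, hypothetical versions of the lines shown in the question.
LINE_WITH_USERNAME = (
    '{"message":"{"_":"user","id":702212125,"first_name":"david",'
    '"last_name":"jones","username":"david_d192","phone":"051863329875"}",'
    '"phone":"051863329875","p_id":702212125,"username":"david_d192"}'
)
LINE_WITHOUT_USERNAME = (
    '{"message":"{"_":"user","id":702212125,"first_name":"david",'
    '"last_name":"jones","phone":"051863329875"}",'
    '"phone":"051863329875","p_id":702212125}'
)


def to_tab_delimited(line):
    """Return tab-separated 'name:value' pairs, or None if the line has no record."""
    match = PATTERN.search(line)
    if match is None:
        return None
    # Groups that did not participate (or matched nothing) are left out.
    parts = [f"{name}:{match.group(name)}" for name in FIELDS if match.group(name)]
    return "\t".join(parts)


for sample in (LINE_WITH_USERNAME, LINE_WITHOUT_USERNAME):
    print(to_tab_delimited(sample))
```

Because the pattern anchors on the quoted key "id", keys such as p_id and _id are not mistaken for it, so the {"index": ...} lines produce no match.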
