
I’m trying to figure out how to take a broken XML file and turn it into JSON every time it updates. Because the XML is malformed, a standard converter won’t work: I tried jc --xml, but it fails because the file is not well-formed. I’m new to coding and building my first application, and it’s been a crazy experience so far, haha. There are two main things in the file: nodes and stations.
The XML file is at /var/log/xlxd.xml, and I want to generate the JSON file at /var/http/www/log/dashboard.json

The server runs Debian.

I have learned so much from this site over months and months of working on my first application. One lesson worth noting: don’t start a project without looking at the source of the data first, haha.

?xml version="1.0" encoding="UTF-8"?>
<Version>2.5.3</Version>
<XLX207  linked peers>
</XLX207  linked peers>
<XLX207  linked nodes>
<NODE>
    <Callsign>K9TON   B</Callsign>
    <IP>144.</IP>
    <LinkedModule>A</LinkedModule>
    <Protocol>YSF</Protocol>
    <ConnectTime>Thursday Thu Dec  7 00:33:05 2023</ConnectTime>
    <LastHeardTime>Saturday Sat Dec  9 22:39:02 2023</LastHeardTime>
</NODE>
<NODE>
    <Callsign>AG7AZ   B</Callsign>
    <IP>184.</IP>
    <LinkedModule>A</LinkedModule>
    <Protocol>YSF</Protocol>
    <ConnectTime>Saturday Sat Dec  9 02:06:34 2023</ConnectTime>
    <LastHeardTime>Saturday Sat Dec  9 02:06:34 2023</LastHeardTime>
</NODE>
<NODE>
    <Callsign>K9TON   D</Callsign>
    <IP>98.</IP>
    <LinkedModule>C</LinkedModule>
    <Protocol>DCS</Protocol>
    <ConnectTime>Saturday Sat Dec  9 04:07:43 2023</ConnectTime>
    <LastHeardTime>Saturday Sat Dec  9 04:07:43 2023</LastHeardTime>
</NODE>
<NODE>
    <Callsign>K9TON   B</Callsign>
    <IP>98.</IP>
    <LinkedModule>A</LinkedModule>
    <Protocol>YSF</Protocol>
    <ConnectTime>Saturday Sat Dec  9 04:43:07 2023</ConnectTime>
    <LastHeardTime>Saturday Sat Dec  9 17:52:30 2023</LastHeardTime>
</NODE>
<NODE>
    <Callsign>NC1D    B</Callsign>
    <IP>68.</IP>
    <LinkedModule>C</LinkedModule>
    <Protocol>DCS</Protocol>
    <ConnectTime>Saturday Sat Dec  9 11:05:38 2023</ConnectTime>
    <LastHeardTime>Saturday Sat Dec  9 22:21:38 2023</LastHeardTime>
</NODE>
<NODE>
    <Callsign>AG7AZ   B</Callsign>
    <IP>184.</IP>
    <LinkedModule>C</LinkedModule>
    <Protocol>DCS</Protocol>
    <ConnectTime>Saturday Sat Dec  9 17:38:07 2023</ConnectTime>
    <LastHeardTime>Saturday Sat Dec  9 18:27:56 2023</LastHeardTime>
</NODE>
<NODE>
    <Callsign>N9MAS   B</Callsign>
    <IP>47.</IP>
    <LinkedModule>A</LinkedModule>
    <Protocol>YSF</Protocol>
    <ConnectTime>Saturday Sat Dec  9 18:40:11 2023</ConnectTime>
    <LastHeardTime>Saturday Sat Dec  9 18:42:34 2023</LastHeardTime>
</NODE>
</XLX207  linked nodes>
<XLX207  heard users>
<STATION>
    <Callsign>NC1D    </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 22:39:02 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>AG7AZ   </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 22:37:33 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>K9TON   </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 22:31:53 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>NC1D     / JOEL</Callsign>
    <Via node>NC1D    B</Via node>
    <On module>C</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 22:21:40 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KA5JKI  </Callsign>
    <Via node>KA5JKI  B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 19:55:27 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KB9MQD  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 19:29:03 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KG4ISH  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 19:28:29 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KK7NLO  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 19:07:40 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KA5JKI  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 18:47:19 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>N9MAS   </Callsign>
    <Via node>N9MAS   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 18:42:34 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>AG7AZ    / ID52</Callsign>
    <Via node>AG7AZ   B</Via node>
    <On module>C</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 18:28:05 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KQ4DZY  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 14:02:10 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KO4ZUL  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 13:49:44 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>WH6GVN  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  9 00:54:04 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KG4ISH  </Callsign>
    <Via node>KG4ISH  B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Friday Fri Dec  8 23:57:25 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>K7TTM   </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Friday Fri Dec  8 21:50:40 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KK7OVZ  </Callsign>
    <Via node>KK7OVZ  B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Friday Fri Dec  8 18:44:11 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>AG7AZ   </Callsign>
    <Via node>AG7AZ   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Friday Fri Dec  8 18:44:00 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KI7ZNH  </Callsign>
    <Via node>KI7ZNH  B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Friday Fri Dec  8 17:45:03 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KO4ZUL  </Callsign>
    <Via node>KO4ZUL  B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Friday Fri Dec  8 16:50:46 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>DH1EM    / 4946</Callsign>
    <Via node>DH1EM   B</Via node>
    <On module>C</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Thursday Thu Dec  7 01:23:17 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KD0TGF  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Wednesday Wed Dec  6 22:03:31 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>NC1D    </Callsign>
    <Via node>NC1D    B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Wednesday Wed Dec  6 11:57:57 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>NC1D     / JOEL</Callsign>
    <Via node>NC1D    D</Via node>
    <On module>C</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Wednesday Wed Dec  6 11:12:21 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KF0MCM  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Monday Mon Dec  4 15:12:34 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>KB9MQD   / JACK</Callsign>
    <Via node>KB9MQD  Y</Via node>
    <On module>C</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Monday Mon Dec  4 02:18:01 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>K9TON    / TONY</Callsign>
    <Via node>K9TON   D</Via node>
    <On module>C</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Monday Mon Dec  4 02:05:48 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>IK2WBK  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Sunday Sun Dec  3 04:36:40 2023</LastHeardTime>
</STATION>
<STATION>
    <Callsign>PU2NMN  </Callsign>
    <Via node>K9TON   B</Via node>
    <On module>A</On module>
    <Via peer>XLX207  </Via peer>
    <LastHeardTime>Saturday Sat Dec  2 23:41:32 2023</LastHeardTime>
</STATION>
</XLX207  heard users>

I researched XML and understand that the generated file is not well-formed.

2 Answers


  1. The right solution is to fix the XML text and then you can parse it with a standard XML parser.

    So, let's see what's wrong.

    First off, I am assuming that the missing initial < character is just a cut-and-paste error.

    Next is the fact that, as you indicate in a comment, the file does not have a "root" tag. Inserting a <root> after the header and a </root> at the end of the entire body is pretty straightforward.

    Next are the improper attributes. XML requires attributes to be key/value pairs and does not permit attributes in closing tags. Since all of the attributes appear to be types for the tag, and no proper key/value pairs exist in the file, it would be simplest to take each of these attributes, such as linked peers, convert it to type="linked peers" in the opening tag, and remove all attributes from the closing tags.

    The following code gets your file to pass xmllint for me.

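    import re

    # Read the malformed XML as plain text.
    with open('input.xml', 'r') as fh:
        xml = fh.read()

    # Wrap everything after the XML declaration in a single root element.
    xml = xml.replace('?>', '?><root>') + "</root>"

    # <Name extra words> becomes <Name type="extra words">.
    bad_opening_tag_re = r'<([^?][^ >]*) +([^>]+)>'
    bad_opening_replacement = r'<\1 type="\2">'
    # </Name extra words> (or </Name type="..."> left by the first pass) becomes </Name>.
    bad_closing_tag_re = r'</([^ >]+) +([^>]+)>'
    bad_closing_replacement = r'</\1>'

    xml = re.sub(bad_opening_tag_re, bad_opening_replacement, xml)
    xml = re.sub(bad_closing_tag_re, bad_closing_replacement, xml)

    print(xml)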
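
Since the goal in the question is a JSON file rather than repaired XML, here is a minimal sketch of how the repaired text could be parsed and dumped as JSON using only the Python standard library. The function names (repair, to_dashboard, write_dashboard) and the JSON layout (the "version", "nodes" and "stations" keys, and names such as "Via_node" built from the tag plus its type attribute) are assumptions made for illustration, not anything xlxd defines. The sketch also restores the leading < if it really is missing from the file.

```python
import json
import os
import re
import xml.etree.ElementTree as ET


def repair(text):
    """Apply the fixes described above so a standard XML parser accepts the text."""
    text = text.strip()
    if text.startswith("?xml"):
        # Restore the leading '<' in case it is really missing from the file.
        text = "<" + text
    # Wrap the body in a single root element.
    text = text.replace("?>", "?><root>", 1) + "</root>"
    # Closing tags: drop everything after the tag name.
    text = re.sub(r"</([^ >]+) +[^>]+>", r"</\1>", text)
    # Opening tags: turn the trailing words into type="...".
    text = re.sub(r"<([^?/][^ >]*) +([^>]+)>", r'<\1 type="\2">', text)
    return text


def fields(element):
    """Map each child to a key made of its tag plus its type attribute, if any."""
    return {
        "_".join(filter(None, [child.tag, child.get("type")])): (child.text or "").strip()
        for child in element
    }


def to_dashboard(text):
    """Return a plain dict (assumed layout) holding the version, nodes and stations."""
    root = ET.fromstring(repair(text).encode("utf-8"))
    return {
        "version": root.findtext("Version"),
        "nodes": [fields(node) for node in root.iter("NODE")],
        "stations": [fields(station) for station in root.iter("STATION")],
    }


def write_dashboard(src="/var/log/xlxd.xml", dst="/var/http/www/log/dashboard.json"):
    """Convert src to JSON and swap it into place so readers never see a partial file."""
    with open(src, encoding="utf-8") as fh:
        data = to_dashboard(fh.read())
    tmp = dst + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=2)
    os.replace(tmp, dst)
```

Calling write_dashboard() on a schedule (for example from cron) would be one way to regenerate the JSON whenever the XML changes. Note that surrounding whitespace is stripped from every value, so a callsign such as "NC1D    " is stored as "NC1D".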
    
  2. A single sed command to fix the sample AS IS

    sed -re '1 s/.*/<&\n<root>/; :a; s/([^<]*<)([/]?[[:alnum:]]+) +([[:alnum:]]+.*>)/\1\2\3/; t a; $ s/.*/&\n<\/root>/' tmp.txt
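
For readers without GNU sed, the same idea (add the missing <, wrap the body in <root>, and squeeze the spaces out of tag names so that <Via node> becomes <Vianode>) can be sketched in Python. This is a rough equivalent under the assumption that the file always starts with the XML declaration; the function name squeeze_tag_names is made up for this example.

```python
import re
import xml.etree.ElementTree as ET


def squeeze_tag_names(text):
    """Make the sample well-formed by removing spaces inside tag names."""
    text = text.strip()
    if not text.startswith("<"):
        # Restore the leading '<' of the XML declaration.
        text = "<" + text
    # Split off the declaration so its spaces are left untouched.
    head, _, body = text.partition("?>")
    # Remove every space between '<' (or '</') and the closing '>'.
    body = re.sub(
        r"<(/?)([^<>]+)>",
        lambda m: "<" + m.group(1) + m.group(2).replace(" ", "") + ">",
        body,
    )
    return head + "?>\n<root>" + body + "\n</root>"


def parse_squeezed(text):
    """Parse the repaired text; raises xml.etree.ElementTree.ParseError if it is still broken."""
    return ET.fromstring(squeeze_tag_names(text).encode("utf-8"))
```

With this approach the section and field names change (for example XLX207heardusers and Vianode), so any later lookup has to use the squeezed names.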
    