I’m trying to extract some data from the HTML response I get after calling an API in Python. Here is the response:

<?xml version="1.0" ?>
 <mgmtResponse responseType="operation" requestUrl="https://6.7.7.7/motion/api/v1/op/enablement/ethernet/summary?deviceIpAddress=10.3.4.3" rootUrl="https://6.7.7.7/webacs/api/v1/op">
   <ethernetSummaryDTO>
     <CoredIdentityCapable>false</CoredIdentityCapable>
     <currentIcmpLatency>0</currentIcmpLatency>
     <deviceAvailability>100</deviceAvailability>
     <deviceName>TRP5504.130.Cored.com</deviceName>
     <deviceRole>Unknown</deviceRole>
     <deviceType>Cored TRP 5504</deviceType>
     <ipAddress>10.3.4.3</ipAddress>
     <locationCapable>false</locationCapable>
     <nrPortsDown>49</nrPortsDown>
     <nrPortsUp>16</nrPortsUp>
     <reachability>Reachable</reachability>
     <softwareVersion>7.8.1</softwareVersion>
     <stackCount>0</stackCount>
     <systemTime>2023-Apr-16, 12:47:51 IST</systemTime>
     <udiDetails>
       <description>TRP5500 4 Slot Single Chassis</description>
       <modelNr>TRP-5504</modelNr>
       <name>Rack 0</name>
       <productId>TRP-5504</productId>
       <udiSerialNr>FOX2304P14Z</udiSerialNr>
       <vendor>Cored Systems, Inc.</vendor>
       <versionId>V01</versionId>
     </udiDetails>
     <upTime>87 days 20 hrs 40 mins 27 secs</upTime>
   </ethernetSummaryDTO>
 </mgmtResponse>

Basically, I want to extract values such as deviceName, softwareVersion, and udiSerialNr from the response. I tried the following code:

    if response.status_code == 200:
        #resp = response.text
        resp = response.json()
        api_resp = resp["ethernetSummaryDTO"]
        print(api_resp)

So I tried to convert it to JSON, but I end up with the error below:

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

How can I parse this to extract the required data?

2 Answers


  1. The response is XML, not JSON, so parse the response text as follows:

    import xml.etree.ElementTree as ET
    
    def parse_response(xml_string):
        # The root element is <mgmtResponse>; the data sits in its child
        root = ET.fromstring(xml_string)
    
        ethernet_summary = root.find('ethernetSummaryDTO')
        device_name = ethernet_summary.find('deviceName').text
        device_type = ethernet_summary.find('deviceType').text
        ip_address = ethernet_summary.find('ipAddress').text
        # Port counts arrive as text, so convert them to integers
        nr_ports_down = int(ethernet_summary.find('nrPortsDown').text)
        nr_ports_up = int(ethernet_summary.find('nrPortsUp').text)
        software_version = ethernet_summary.find('softwareVersion').text
        up_time = ethernet_summary.find('upTime').text
    
        data = {
            'device_name': device_name,
            'device_type': device_type,
            'ip_address': ip_address,
            'nr_ports_down': nr_ports_down,
            'nr_ports_up': nr_ports_up,
            'software_version': software_version,
            'up_time': up_time
        }
    
        return data
    
    # Call the function only after it has been defined, and keep its result
    if response.status_code == 200:
        # The body is XML, so use response.text instead of response.json()
        data = parse_response(response.text)
        print(data)
    
  2. Given your response (I will assign it to a variable, as if I’ve got it from an API call):

    xml_data = '''<?xml version="1.0" ?>
     <mgmtResponse responseType="operation" requestUrl="https://6.7.7.7/motion/api/v1/op/enablement/ethernet/summary?deviceIpAddress=10.3.4.3" rootUrl="https://6.7.7.7/webacs/api/v1/op">
       <ethernetSummaryDTO>
         <CoredIdentityCapable>false</CoredIdentityCapable>
         <currentIcmpLatency>0</currentIcmpLatency>
         <deviceAvailability>100</deviceAvailability>
         <deviceName>TRP5504.130.Cored.com</deviceName>
         <deviceRole>Unknown</deviceRole>
         <deviceType>Cored TRP 5504</deviceType>
         <ipAddress>10.3.4.3</ipAddress>
         <locationCapable>false</locationCapable>
         <nrPortsDown>49</nrPortsDown>
         <nrPortsUp>16</nrPortsUp>
         <reachability>Reachable</reachability>
         <softwareVersion>7.8.1</softwareVersion>
         <stackCount>0</stackCount>
         <systemTime>2023-Apr-16, 12:47:51 IST</systemTime>
         <udiDetails>
           <description>TRP5500 4 Slot Single Chassis</description>
           <modelNr>TRP-5504</modelNr>
           <name>Rack 0</name>
           <productId>TRP-5504</productId>
           <udiSerialNr>FOX2304P14Z</udiSerialNr>
           <vendor>Cored Systems, Inc.</vendor>
           <versionId>V01</versionId>
         </udiDetails>
         <upTime>87 days 20 hrs 40 mins 27 secs</upTime>
       </ethernetSummaryDTO>
     </mgmtResponse>'''
    

    You can use the xml.etree.ElementTree module to parse it.

    For example:

    import xml.etree.ElementTree as ET
    
    # The root element is mgmtResponse; [0] selects its first child, ethernetSummaryDTO
    root = ET.fromstring(xml_data)[0]
    softwareVersion = root.find("softwareVersion").text
    deviceName = root.find("deviceName").text
    
    # For the udiDetails attributes
    udiDetails = root.find("udiDetails")
    udiSerialNr = [det for det in udiDetails if det.tag == "udiSerialNr"][0].text
    # and so on..
    

    The last line, which gets udiSerialNr, uses a list comprehension: a for loop written in one line that keeps the matching children and takes the first one. It is roughly equivalent to:

    udiDetails = root.find("udiDetails")
    udiSerialNr = ""
    for det in udiDetails:
        if det.tag == "udiSerialNr":
            udiSerialNr = det.text
    

    Basically, ElementTree lets you treat every element as a list of its child elements. mgmtResponse is the root and has just one child, ethernetSummaryDTO, which is why ET.fromstring(xml_data)[0] fetches it directly.

    ethernetSummaryDTO is the second list, but we don’t iterate through it; we use the .find() method to get a child element by its tag (e.g. softwareVersion).

    udiDetails is just another list. I used a for loop to get its child elements, but I’ve just tried it and .find() works there too, which is simpler and avoids unnecessary code:

    udiDetails = root.find("udiDetails")
    udiSerialNr = udiDetails.find("udiSerialNr").text
    

    Way better!
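
For reference, the lookups from both answers can be combined into a single helper. This is only a sketch: the names extract_summary and SAMPLE_XML are made up for illustration, the sample is a trimmed copy of the response in the question, and it assumes the mgmtResponse/ethernetSummaryDTO layout shown there. It uses findtext() with a path, which returns None when an element is missing instead of raising AttributeError the way .find(...).text does.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response from the question, kept short for illustration.
SAMPLE_XML = """<?xml version="1.0" ?>
<mgmtResponse responseType="operation">
  <ethernetSummaryDTO>
    <deviceName>TRP5504.130.Cored.com</deviceName>
    <softwareVersion>7.8.1</softwareVersion>
    <udiDetails>
      <udiSerialNr>FOX2304P14Z</udiSerialNr>
    </udiDetails>
  </ethernetSummaryDTO>
</mgmtResponse>"""


def extract_summary(xml_text):
    """Hypothetical helper: return the three requested fields as a dict."""
    root = ET.fromstring(xml_text)  # root is the <mgmtResponse> element
    base = "ethernetSummaryDTO"
    return {
        # findtext() follows the path from root and returns None if any step is absent
        "device_name": root.findtext(f"{base}/deviceName"),
        "software_version": root.findtext(f"{base}/softwareVersion"),
        "udi_serial_nr": root.findtext(f"{base}/udiDetails/udiSerialNr"),
    }


if __name__ == "__main__":
    print(extract_summary(SAMPLE_XML))
```

With a real API call you would pass response.text in place of SAMPLE_XML, after checking response.status_code as in the question.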
