skip to Main Content

I create table in PostgreSQL and insert to table nested XML(xml_part column):

CREATE TABLE text_xml (
    id numeric,
    xml_part XML
);


insert into text_xml
values (1, '<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Dont forget me this weekend!</body></note>');

insert into text_xml
values (2, '<note><to>Tove</to><from>Alex</from><heading>Reminder</heading><body>Dont forget me this weekend!</body></note>');

Further, I try to read the table in Python with sqlalchemy and pandas:

from sqlalchemy import create_engine
import pandas as pds

# Create an engine instance
alchemyEngine = create_engine('postgresql+psycopg2://<credentials>', pool_recycle=3600)

# Connect to PostgreSQL server
dbConnection = alchemyEngine.connect()

# Read data from PostgreSQL database table and load into a DataFrame instance
dataFrame = pds.read_sql("select * from text_xml", dbConnection)

print(dataFrame.to_xml(index=False))

and I got in print:

<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <id>2.0</id>
    <xml_part>&lt;note&gt;&lt;to&gt;Tove&lt;/to&gt;&lt;from&gt;Alex&lt;/from&gt;&lt;heading&gt;Reminder&lt;/heading&gt;&lt;body&gt;Dont forget me this weekend!&lt;/body&gt;&lt;/note&gt;</xml_part>
  </row>
  <row>
    <id>1.0</id>
    <xml_part>&lt;note&gt;&lt;to&gt;Tove&lt;/to&gt;&lt;from&gt;Jani&lt;/from&gt;&lt;heading&gt;Reminder&lt;/heading&gt;&lt;body&gt;Dont forget me this weekend!&lt;/body&gt;&lt;/note&gt;</xml_part>
  </row>
</data>

HTML symbols &lt; and &gt; ruin my result. I would like to receive without HTML characters like this:

<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <id>2.0</id>
    <xml_part><note><to>Tove</to><from>Alex</from><heading>Reminder</heading><body>Dont forget me this weekend!</body></note></xml_part>
  </row>
  <row>
    <id>1.0</id>
    <xml_part><note><to>Tove</to><from>Alex</from><heading>Reminder</heading><body>Dont forget me this weekend!</body></note></xml_part>
  </row>
</data>

What am I doing wrong?

2

Answers


  1. You can create elements from string and insert this into the tree again:

    import xml.etree.ElementTree as ET
    
    xml_s = """<?xml version='1.0' encoding='utf-8'?>
    <data>
      <row>
        <id>2.0</id>
        <xml_part>&lt;note&gt;&lt;to&gt;Tove&lt;/to&gt;&lt;from&gt;Alex&lt;/from&gt;&lt;heading&gt;Reminder&lt;/heading&gt;&lt;body&gt;Dont forget me this weekend!&lt;/body&gt;&lt;/note&gt;</xml_part>
      </row>
      <row>
        <id>1.0</id>
        <xml_part>&lt;note&gt;&lt;to&gt;Tove&lt;/to&gt;&lt;from&gt;Jani&lt;/from&gt;&lt;heading&gt;Reminder&lt;/heading&gt;&lt;body&gt;Dont forget me this weekend!&lt;/body&gt;&lt;/note&gt;</xml_part>
      </row>
    </data>"""
    
    root = ET.fromstring(xml_s)
    
    el_l = []
    html = root.findall('.//xml_part')
    for elem in html:
        el = ET.fromstring(elem.text)
        el_l.append(el)
        elem.text = ''
        
    for xml_el in root.findall('.//xml_part'):
        for e in el_l:
            xml_el.insert(0, el_l[0])
            del el_l[0]
            
    
    ET.indent(root, space="  ")
    ET.dump(root)
    

    Output:

    <data>
      <row>
        <id>2.0</id>
        <xml_part>
          <note>
            <to>Tove</to>
            <from>Alex</from>
            <heading>Reminder</heading>
            <body>Dont forget me this weekend!</body>
          </note>
        </xml_part>
      </row>
      <row>
        <id>1.0</id>
        <xml_part>
          <note>
            <to>Tove</to>
            <from>Jani</from>
            <heading>Reminder</heading>
            <body>Dont forget me this weekend!</body>
          </note>
        </xml_part>
      </row>
    </data>
    
    Login or Signup to reply.
  2. I can’t reproduce this behavier with postgres V16 on win 10. Maybe a sqlalchemy issue?

    I created a table:

    create_table_query = '''CREATE TABLE xmldata
              (ID INT,
              xml_part XML); '''
    

    I insert your xml string:

    postgres_insert_query = """ INSERT INTO xmldata (ID, xml_part) VALUES (%s,%s)"""
    record_to_insert = (1, '<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Dont forget me this weekend!</body></note>')
    cursor.execute(postgres_insert_query, record_to_insert)
    

    My query is like this:

        postgreSQL_select_Query = "select * from xmldata"
    
        cursor.execute(postgreSQL_select_Query)
        print("Selecting rows from xmldata table using cursor.fetchall")
        xml_data_row = cursor.fetchall()
    
        print("Print each row and it's columns values")
        for row in xml_data_row:
            print("Id = ", row[0], )
            print("xml_part = ", row[1], "n")
    

    And I get this output:

    Selecting rows from xmldata table using cursor.fetchall
    Print each row and it's columns values
    Id =  1
    xml_part =  <note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Dont forget me this weekend!</body></note> 
    
    PostgreSQL connection is closed
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search