Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

I have a messy XML document that I process with the Python lxml module. I want to replace every \n in the content with a space before any other processing. How can I do this for the text of all elements?

Edit: my XML example:

<root>
    <a> dsdfs
 dsf
 sdf
</a>
    <bds> 
        <d>sdf





</d>
        <d>sdf


sdf
sdf

</d>
    </bds>
    ....
    ....
    ....
    ....
</root>

and I want to get this output when I print itertext():

root = ...  # get the root element
for i in root.itertext():
    print(i)

dsdfs  dsf  sdf
dsdfs  dsf  sdf
sdf  nsdf sdf  

1 Answer

The code below serializes the XML to a string, replaces every \n with a space, and then writes the result to a new XML file. You can do other processing in between, depending on what exactly you want to do.

from lxml import etree

tree = etree.parse('some.xml')
root = tree.getroot()
# Get the whole XML content as a string (encoding='unicode' returns str, not bytes)
xml_in_str = etree.tostring(root, encoding='unicode')

# Replace all \n with space
new_xml_data = xml_in_str.replace('\n', ' ')

# Do the processing with the new_xml_data string, which is now formatted

# Maybe also write to a new XML file, without the \n

with open('newxml.xml', 'w') as f:
    f.write(new_xml_data)

some.xml looks like:

<root>
    <a> dsdfs
 dsf
 sdf
</a>
    <bds> 
        <d>sdf





</d>
        <d>sdf


sdf
sdf

</d>
    </bds>
    <bds> 
        <d>sdf





</d>
        <d>sdf


sdf
sdf

</d>
    </bds>
    <bds> 
        <d>sdf





</d>
        <d>sdf


sdf
sdf

</d>
    </bds>
</root>

newxml.xml looks like:

<root>
    <a> dsdfs  dsf  sdf </a>
    <bds> 
        <d>sdf      </d>
        <d>sdf   sdf sdf  </d>
    </bds>
    <bds> 
        <d>sdf      </d>
        <d>sdf   sdf sdf  </d>
    </bds>
    <bds> 
        <d>sdf      </d>
        <d>sdf   sdf sdf  </d>
    </bds>
</root>
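
Since the question asks about the text of all elements, an alternative is to leave the serialized string alone and rewrite each element's text in place. Replacing on the whole string also touches the whitespace between tags, whereas this variant only changes text that contains something other than whitespace. The following is a minimal sketch, not a definitive implementation: the helper name flatten_newlines and the inline sample document are made up for illustration, and tail text (text that follows a child element) is deliberately left unchanged.

```python
try:
    from lxml import etree
except ImportError:
    # The standard library offers a compatible API for this example.
    import xml.etree.ElementTree as etree


def flatten_newlines(root):
    """Replace newlines with spaces in the text of every element, in place."""
    for el in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if not isinstance(el.tag, str):
            continue
        # Leave whitespace-only text (indentation between tags) untouched.
        if el.text and el.text.strip():
            el.text = el.text.replace('\n', ' ')
    return root


# Hypothetical sample, shortened from the XML in the question.
sample = '<root><a> dsdfs\n dsf\n sdf\n</a><bds>\n  <d>sdf\n\nsdf\n</d>\n</bds></root>'
root = flatten_newlines(etree.fromstring(sample))

# Collect only the text chunks that carry real content.
texts = [t for t in root.itertext() if t.strip()]
for t in texts:
    print(repr(t))
```

If tail text should be flattened as well, the same replace call can be applied to el.tail inside the loop.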
