Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I am looking for fast way to deserialize xml, that has special characters in it like ?.

I was using XMLReader and it fails to deserialze such characters.

Any suggestion?

EDIT: I am using C#. Code is as follows:

XElement element =.. //has the xml
XmlSerializer serializer =   new XmlSerializer(typeof(MyType));
XmlReader reader = element.CreateReader();
Object o= serializer.Deserialize(reader);
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
184 views
Welcome To Ask or Share your Answers For Others

1 Answer

I'd guess you're having an encoding issue, not in the XMLReader but with the XmlSerializer.

You could use the XmlTextWriter and UTF8 encoding with the XmlSerializer like in the following snippet (see the generic methods below for a way nicer implementation of it). Works just fine with umlauts (??ü) and other special characters.

class Program
{
    static void Main(string[] args)
    {
        SpecialCharacters specialCharacters = new SpecialCharacters { Umlaute = "?ü?" };

        // serialize object to xml

        MemoryStream memoryStreamSerialize = new MemoryStream();
        XmlSerializer xmlSerializerSerialize = new XmlSerializer(typeof(SpecialCharacters));
        XmlTextWriter xmlTextWriterSerialize = new XmlTextWriter(memoryStreamSerialize, Encoding.UTF8);

        xmlSerializerSerialize.Serialize(xmlTextWriterSerialize, specialCharacters);
        memoryStreamSerialize = (MemoryStream)xmlTextWriterSerialize.BaseStream;

        // converts a byte array of unicode values (UTF-8 enabled) to a string
        UTF8Encoding encodingSerialize = new UTF8Encoding();
        string serializedXml = encodingSerialize.GetString(memoryStreamSerialize.ToArray());

        xmlTextWriterSerialize.Close();
        memoryStreamSerialize.Close();
        memoryStreamSerialize.Dispose();

        // deserialize xml to object

        // converts a string to a UTF-8 byte array.
        UTF8Encoding encodingDeserialize = new UTF8Encoding();
        byte[] byteArray = encodingDeserialize.GetBytes(serializedXml);

        using (MemoryStream memoryStreamDeserialize = new MemoryStream(byteArray))
        {
            XmlSerializer xmlSerializerDeserialize = new XmlSerializer(typeof(SpecialCharacters));
            XmlTextWriter xmlTextWriterDeserialize = new XmlTextWriter(memoryStreamDeserialize, Encoding.UTF8);

            SpecialCharacters deserializedObject = (SpecialCharacters)xmlSerializerDeserialize.Deserialize(xmlTextWriterDeserialize.BaseStream);
        }
    }
}

[Serializable]
public class SpecialCharacters
{
    public string Umlaute { get; set; }
}

I personally use the follwing generic methods to serialize and deserialize XML and objects and haven't had any performance or encoding issues yet.

public static string SerializeObjectToXml<T>(T obj)
{
    MemoryStream memoryStream = new MemoryStream();
    XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));
    XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.UTF8);

    xmlSerializer.Serialize(xmlTextWriter, obj);
    memoryStream = (MemoryStream)xmlTextWriter.BaseStream;

    string xmlString = ByteArrayToStringUtf8(memoryStream.ToArray());

    xmlTextWriter.Close();
    memoryStream.Close();
    memoryStream.Dispose();

    return xmlString;
}

public static T DeserializeXmlToObject<T>(string xml)
{
    using (MemoryStream memoryStream = new MemoryStream(StringToByteArrayUtf8(xml)))
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));

        using (StreamReader xmlStreamReader = new StreamReader(memoryStream, Encoding.UTF8))
        {
            return (T)xmlSerializer.Deserialize(xmlStreamReader);
        }
    }
}

public static string ByteArrayToStringUtf8(byte[] value)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetString(value);
}

public static byte[] StringToByteArrayUtf8(string value)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetBytes(value);
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

548k questions

547k answers

4 comments

86.3k users

...