Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I'm writing a crawler which uses Hpricot. It downloads a list of strings from some webpage, then I try to write it to the file. Something is wrong with the encoding:

"xC3" from ASCII-8BIT to UTF-8

I have items which are rendered on a webpage and printed this way:

D??veloppement

the str.encoding returns UTF-8, so force_encoding('UTF-8') doesn't help. How may I convert this to readable UTF-8?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
676 views
Welcome To Ask or Share your Answers For Others

1 Answer

Your string seems to have been encoded the wrong way round:

"D??veloppement".encode("iso-8859-1").force_encoding("utf-8")
#=> "Développement"

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...