python3: different charset support

I am using Python 3.3 on Windows 7.

    if "iso-8859-1" in str(source):
        source = source.decode('iso-8859-1')
    if "utf-8" in str(source):
        source = source.decode('utf-8')

So currently my application handles only these two charsets, but I want to cover every possible charset. At the moment I find the charset manually by reading each website's source, and I have found that not every website uses one of these two. Some websites do not declare a charset in their HTML source at all, and my application fails on them. How can I detect a charset automatically and decode accordingly? Please explain in depth, with examples if possible. Links to important references are welcome too.

The chardet module tries to divine the encoding of its input, but it obviously gets it wrong sometimes.
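
Since a guess can be wrong, one possible approach is to trust declared charsets first and fall back to guessing only when nothing is declared. The sketch below is a rough illustration, not a definitive implementation: the function name decode_html, its header_charset parameter, and the regular expression are made up for this example, and chardet is treated as an optional third-party package that is skipped if it is not installed.

```python
import re

try:
    import chardet  # third-party package; optional in this sketch
except ImportError:
    chardet = None

# Matches <meta charset="..."> as well as
# <meta http-equiv="Content-Type" content="text/html; charset=...">
_META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def decode_html(raw, header_charset=None):
    """Decode raw page bytes and return (text, encoding_used).

    Hypothetical helper: candidates are tried in order of trust,
    first the HTTP header, then the <meta> tag, then chardet's guess,
    then UTF-8.
    """
    candidates = []
    if header_charset:
        candidates.append(header_charset)

    # Charset declarations normally sit near the top of the document.
    match = _META_CHARSET.search(raw[:4096])
    if match:
        candidates.append(match.group(1).decode('ascii'))

    if chardet is not None:
        guess = chardet.detect(raw)['encoding']  # may be None
        if guess:
            candidates.append(guess)

    candidates.append('utf-8')

    for name in candidates:
        try:
            return raw.decode(name), name
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes invalid for it: try the next one.
            continue

    # iso-8859-1 maps every byte to a character, so this cannot fail,
    # although the resulting text may be wrong.
    return raw.decode('iso-8859-1'), 'iso-8859-1'
```

With urllib.request, the header charset can be read from the response via response.headers.get_content_charset(), which returns None when the server does not send one, and then passed in as header_charset. A successful decode only proves the bytes are valid in that encoding, not that the encoding is the right one, so the result is still a best effort.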
