Monday, 15 June 2015

Encoding error using Python




I wrote code to connect to IMAP, parse the message body, and insert it into a database, but I'm having problems with accents.

From the email header I got this information:

content-type: text/html; charset=iso-8859-1

But I'm not sure whether I can trust this information...

The email was written in Portuguese, so it has a lot of words with accents. For example, I extracted the following phrase from the email source (using a browser):

"...instalação de eletrônicos..."

So I connected to IMAP and fetched the emails:

... typ, data = m.fetch(num, '(RFC822)') ...

When I print the content, I get this for the same phrase:

print data[0][1]

instala+º+úo de eletr+¦nicos

I tried to use .decode('utf-8') but had no success.

instalação de eletrônicos

How can I make it human readable? My database is in UTF-8.

The header says the message is using the "iso-8859-1" charset, so you need to decode the string with that encoding.

Try this:

data[0][1].decode('iso-8859-1')
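
If you would rather not hard-code the charset, one option is to let the standard email module read it from the Content-Type header and decode the body for you. The following is only a sketch, assuming Python 3 (on Python 2 you would use email.message_from_string instead of email.message_from_bytes). The function name decode_body, the fallback charset, and the sample message are made up for illustration and are not from the question.

```python
import email


def decode_body(raw_bytes, fallback="iso-8859-1"):
    """Return the first text part of a raw RFC 822 message as a str.

    Hypothetical helper: the name and the fallback are assumptions.
    """
    msg = email.message_from_bytes(raw_bytes)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        # Charset declared in the Content-Type header; None if absent.
        charset = part.get_content_charset() or fallback
        try:
            return payload.decode(charset)
        except (LookupError, UnicodeDecodeError):
            # Unknown or wrong charset label: fall back, replacing bad bytes.
            return payload.decode(fallback, errors="replace")
    return ""


# Made-up sample standing in for data[0][1]; \xe7, \xe3 and \xf4 are the
# ISO-8859-1 bytes for the accented letters in the phrase from the question.
raw = (
    b"Content-Type: text/html; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"...instala\xe7\xe3o de eletr\xf4nicos..."
)

text = decode_body(raw)           # decoded text (str)
utf8_bytes = text.encode("utf-8") # bytes suitable for a UTF-8 database
print(utf8_bytes)
```

Since the database is in UTF-8, the idea is to decode once with the charset the message declares and encode to UTF-8 only at the point of writing. Many database drivers accept the decoded text directly and do that encoding themselves.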

python encoding character-encoding non-ascii-characters
