Saturday, 15 January 2011

python - Why is the same utf-8 string fine in print and failing in logging? -




Is there something I need to hand to logging that print does for me under the covers in order to log UTF-8 strings?

    for line in unicodecsv.reader(cfile, encoding="utf-8"):
        for i in line:
            print "process_clusters: csv: %s" % i
            print "repr: %s" % repr(i)
            log.debug("process_clusters: csv: %s", i)

My print statement works fine whether the string is Latin-based or Russian Cyrillic.

    process_clusters: csv: escuchan
    repr: u'escuchan'
    process_clusters: csv: говоритъ
    repr: u'\u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044a'

However, log.debug will not allow me to pass in the same variable. I get this error:

    Traceback (most recent call last):
      File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/logging/__init__.py", line 765, in emit
        self.stream.write(fs % msg.encode("utf-8"))
      File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/codecs.py", line 686, in write
        return self.writer.write(data)
      File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/codecs.py", line 351, in write
        data, consumed = self.encode(object, self.errors)
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 28: ordinal not in range(128)

My log, formatter, and handler are:

    log = logging.getLogger(__name__)
    loglvl = getattr(logging, loglevel.upper())  # convert text log level to numeric
    log.setLevel(loglvl)  # set log level
    handler = logging.FileHandler('inflection_finder.log', 'w', 'utf-8')
    handler.setFormatter(logging.Formatter('[%(levelname)s] %(message)s'))
    log.addHandler(handler)

I'm using Python 2.6.7.

Reading through the traceback, it appears that the logging module is attempting to encode the message before it writes it. The message is presumed to be an ASCII string, which it can't be because it contains UTF-8 characters. If you convert the message to unicode before passing it to the logger, it might work.

log.debug(u"process_clusters: csv: %s", i)
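The kind of failure shown in the traceback can be reproduced in isolation. The sketch below is an assumption-laden illustration rather than the logging module's actual code: it is written so that it runs on Python 3, where the implicit ASCII decode that Python 2 performs on byte strings has to be spelled out by hand. The helper name python2_style_encode is hypothetical and exists only for this example.

```python
# Emulates what Python 2 does when .encode() is called on a byte string:
# the bytes are first decoded as ASCII, and only then encoded again.
# `python2_style_encode` is a hypothetical helper, not a library function.

def python2_style_encode(data, encoding="utf-8"):
    if isinstance(data, bytes):
        # Python 2 performs this step implicitly; non-ASCII bytes fail here.
        data = data.decode("ascii")
    return data.encode(encoding)


word = u"\u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044a"  # the Cyrillic word from the question
utf8_bytes = word.encode("utf-8")

# A unicode (text) value encodes cleanly.
encoded_ok = python2_style_encode(word)

# Bytes that are already UTF-8 encoded raise UnicodeDecodeError,
# the same exception class reported in the traceback.
try:
    python2_style_encode(utf8_bytes)
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)
```

The point of the sketch is that the error comes from mixing already-encoded bytes into a path that expects text, not from the Cyrillic characters themselves.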

Edit: I noticed that your parameter string was already decoded to unicode, so I updated the example accordingly.

Also, based on your latest edit, you may want to use a unicode string in your setup:

    handler.setFormatter(logging.Formatter(u'[%(levelname)s] %(message)s'))
                                         --^--
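Putting the pieces together, here is a minimal end-to-end sketch of the setup with unicode strings used throughout. It is not the asker's exact program: the logger name, the temporary directory, and the hard-coded DEBUG level are assumptions made so the example is self-contained, and the u'' prefixes only matter on Python 2 (on Python 3 every string literal is already unicode).

```python
import io
import logging
import os
import tempfile

# Assumption: write to a throwaway directory instead of the working directory.
log_path = os.path.join(tempfile.mkdtemp(), "inflection_finder.log")

log = logging.getLogger("inflection_finder_sketch")  # hypothetical logger name
log.setLevel(logging.DEBUG)

# The third positional argument of FileHandler is the file encoding.
handler = logging.FileHandler(log_path, "w", "utf-8")
handler.setFormatter(logging.Formatter(u"[%(levelname)s] %(message)s"))
log.addHandler(handler)

# Both the format string and the argument are unicode text, not encoded bytes.
word = u"\u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044a"
log.debug(u"process_clusters: csv: %s", word)

# Flush and release the file before reading it back.
handler.close()
log.removeHandler(handler)

with io.open(log_path, encoding="utf-8") as f:
    logged = f.read()
```

Reading the file back with an explicit encoding is a quick way to confirm that the handler wrote the Cyrillic text intact.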

python unicode utf-8
