Saturday, 15 September 2012

html5 - Python CGI - UTF-8 doesn't work




For HTML5 and Python CGI:

If I write the UTF-8 meta tag, my code doesn't work. If I don't write it, it works.

The page encoding is UTF-8.

    print("content-type:text/html")
    print()
    print("""
    <!doctype html>
    <html>
    <head>
        <meta charset="utf-8">
    </head>
    <body>
        şöğıçü
    </body>
    </html>
    """)

This code doesn't work.

    print("content-type:text/html")
    print()
    print("""
    <!doctype html>
    <html>
    <head></head>
    <body>
        şöğıçü
    </body>
    </html>
    """)

But this code works.

For CGI, using print() requires that the right codec has been set up for output. print() writes to sys.stdout, and sys.stdout has been opened with a specific encoding. How that encoding is determined is platform dependent and can differ based on how the script is run. Running your script as a CGI script means you pretty much do not know which encoding will be used.
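One way to find out which encoding a given server actually picks is to log it from inside the script. The following is a minimal sketch, not part of the original answer; the helper name describe_stdout_encoding is made up for illustration, and it assumes the server forwards the script's stderr to its error log, which is common but not guaranteed.

```python
import locale
import sys


def describe_stdout_encoding():
    """Return the encodings Python would use for text output (hypothetical helper)."""
    return {
        # Encoding that print() will use when writing to sys.stdout.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding derived from the process locale, without changing the locale.
        "locale": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    # Write to stderr so the diagnostic does not end up in the HTTP response.
    print(describe_stdout_encoding(), file=sys.stderr)
```

If the reported value is anything other than UTF-8, the output of print() will not match a page that declares itself as UTF-8.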

In your case, the web server has set the locale for text output to a fixed encoding other than UTF-8. Python uses that locale setting to produce output in that encoding. Without the <meta> header, your browser correctly guesses the encoding (or the server has communicated it in the Content-Type header). With the <meta> header, you are telling the browser to use a different encoding, one that is wrong for the data produced.
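The mismatch can be reproduced without a web server. The sketch below assumes, purely for illustration, that the server locale uses ISO-8859-9 (the Turkish Latin-5 encoding); the question does not say which encoding was actually in effect.

```python
text = "şöğıçü"

# Assumption: the server's locale encoding is ISO-8859-9. This stands in for
# whatever non-UTF-8 encoding print() used in the original setup.
server_bytes = text.encode("iso-8859-9")

# What a browser sees when <meta charset="utf-8"> tells it to decode as UTF-8:
# the bytes are not valid UTF-8, so they turn into replacement characters.
as_utf8 = server_bytes.decode("utf-8", errors="replace")

# What a browser sees when it guesses (or is told) the encoding that was used.
as_latin5 = server_bytes.decode("iso-8859-9")

print(as_utf8 == text)    # False: the declared encoding does not match the bytes
print(as_latin5 == text)  # True: the matching encoding round-trips the text
```

The bytes themselves are fine in both cases; only the label the browser is given differs.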

You can write directly to sys.stdout.buffer, after explicitly encoding to UTF-8. Create a helper function to make this easier:

    import sys

    def enc_print(string='', encoding='utf8'):
        sys.stdout.buffer.write(string.encode(encoding) + b'\n')

    enc_print("content-type:text/html")
    enc_print()
    enc_print("""
    <!doctype html>
    <html>
    <head>
        <meta charset="utf-8">
    </head>
    <body>
        şöğıçü
    </body>
    </html>
    """)

Another approach is to replace sys.stdout with a new io.TextIOWrapper() object that uses the codec you need:

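    import sys
    import io

    def set_output_encoding(codec, errors='strict'):
        # Read the flag before detach(), which leaves the old wrapper unusable.
        line_buffering = sys.stdout.line_buffering
        # Pass the codec explicitly; otherwise the locale default is used again.
        sys.stdout = io.TextIOWrapper(
            sys.stdout.detach(), encoding=codec, errors=errors,
            line_buffering=line_buffering)

    set_output_encoding('utf8')

    print("content-type:text/html")
    print()
    print("""
    <!doctype html>
    <html>
    <head></head>
    <body>
        şöğıçü
    </body>
    </html>
    """)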
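On Python 3.7 and later, a third option that the original answer does not mention is to change the encoding of the existing sys.stdout in place with its reconfigure() method. The sketch below is an alternative under that version assumption; the helper name force_utf8_stdout is made up for illustration, and it also declares the charset in the Content-Type header so the browser does not have to guess.

```python
import sys


def force_utf8_stdout():
    """Switch sys.stdout to UTF-8 in place; return True on success (hypothetical helper)."""
    # reconfigure() exists on io.TextIOWrapper since Python 3.7. A replaced
    # stdout (for example a StringIO in a test harness) may not have it.
    reconfigure = getattr(sys.stdout, "reconfigure", None)
    if reconfigure is None:
        return False
    reconfigure(encoding="utf-8")
    return True


if __name__ == "__main__":
    if force_utf8_stdout():
        # Declaring the charset here keeps the header and the bytes in agreement.
        print("Content-Type: text/html; charset=utf-8")
        print()
        print("<!doctype html><html><body>şöğıçü</body></html>")
```

If the helper returns False, writing encoded bytes to sys.stdout.buffer remains the fallback.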

python html5 utf-8 python-3.x cgi
