Tuesday, June 06, 2006

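Python note: unicode example (switch gb2312 to utf8)

This is a script to convert a text file from GB2312 to UTF-8. It takes the input and output file names as command-line arguments, and skips any line that cannot be decoded as GB2312.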

# gb2utf - convert a text file from GB2312 to UTF-8
# Yue Zhang 2006
import sys
# Open both files in binary mode so every line is raw bytes.
iFile = open(sys.argv[1], "rb")
oFile = open(sys.argv[2], "wb")
for sLine in iFile:
    try:
        uLine = sLine.decode("gb2312") # bytes -> unicode
    except UnicodeDecodeError:
        continue # skip lines that are not valid GB2312
    oFile.write(uLine.encode("utf8")) # note this: encode before writing.
iFile.close()
oFile.close()
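
The same idea can be wrapped in a function so that it works in either direction. This is only a sketch: the name convert_file and its parameters are made up for illustration, and it assumes both encodings are ASCII-compatible (such as GB2312 and UTF-8), so that splitting the file on newline bytes is safe.

```python
def convert_file(src_path, dst_path, src_enc="gb2312", dst_enc="utf-8"):
    """Re-encode src_path into dst_path line by line.

    Hypothetical helper, not part of the original script.
    Returns the number of lines that were skipped.
    """
    skipped = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for raw in src:
            try:
                # bytes -> unicode -> bytes in the target encoding
                out = raw.decode(src_enc).encode(dst_enc)
            except UnicodeError:
                # Covers both decode and encode failures.
                skipped += 1
                continue
            dst.write(out)
    return skipped
```

Calling convert_file("in.txt", "out.txt") does what the script above does, and convert_file("in.txt", "out.txt", "utf-8", "gb2312") goes the other way. In that direction a line is also skipped if it contains a character that GB2312 cannot represent.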
