Friday, August 26, 2005

Python note: unicode

When you get error output like this:

UnicodeEncodeError: 'ascii' codec can't encode character ...

the first thing to check is whether a Python unicode object is involved.

Python uses a separate object type to support Unicode, in order to stay compatible with existing string code. Thus Python has two kinds of strings: str and unicode. A str object is essentially the standard C string, an array of chars (bytes), and it is what most function calls use.

Each char is one byte and can hold 256 different values; ASCII uses only the first 128 of them. For a language like English the alphabet has fewer than 30 letters, so every letter, digit and punctuation mark can easily be given its own char value. Such a mapping is called an encoding. ASCII is the most common encoding for mapping char values to letters.
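A small sketch of what "mapping" means here, using only the built-ins ord() and chr(). The variable names are mine and only for illustration; the u'' literals let the same lines run on newer Python versions too.

```python
# ASCII assigns each letter a number between 0 and 127.
letter_to_value = {letter: ord(letter) for letter in u'abc'}

# ord() gives the value assigned to a character, chr() goes the other way.
value_of_A = ord(u'A')
char_for_97 = chr(97)

# Encoding a text with ASCII yields one byte per letter.
ascii_bytes = u'Hello'.encode('ascii')
byte_values = list(bytearray(ascii_bytes))
```

Here byte_values holds the five numbers that ASCII assigns to the letters H, e, l, l, o.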

For languages with far more than 128 characters, such as Chinese and Japanese, several chars have to be combined to represent one character. This causes a problem: because different languages interpret char values differently, the same string can be mapped to different letters or characters by different encodings. For example, when viewing a web page you can switch your browser to a different encoding, and the page will be displayed differently (of course, only one encoding is 'correct'). Unicode was proposed to resolve this encoding clash; it aims to cover all the characters and letters of all languages. Interestingly, Unicode itself has several encodings, including utf-8, utf-16, etc.
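To make the clash concrete, here is a minimal sketch that reads the same three bytes with two different encodings, and then writes the same character with two different UTF encodings. The sample character (U+4E2D) and the variable names are my own choice.

```python
# The UTF-8 bytes for the single Chinese character U+4E2D.
raw = b'\xe4\xb8\xad'

# Read with the 'correct' encoding: one character.
as_utf8 = raw.decode('utf-8')

# Read as latin-1: every byte becomes its own, unrelated character.
as_latin1 = raw.decode('latin-1')

# The same character written out with two different UTF encodings.
utf8_form = as_utf8.encode('utf-8')        # three bytes
utf16_form = as_utf8.encode('utf-16-le')   # two bytes
```

The bytes never change; only the interpretation does, which is exactly what happens when you switch encodings in the browser.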

Unicode objects in Python are not strings in any particular encoding such as utf-8; they are sequences of Unicode characters (code points). A unicode object can be seen as the abstract representation of the real characters or letters, which can be encoded into different byte strings by different encodings. In other words, if a str is the outside form, a unicode object is the inside meaning.

A unicode object can be converted to a str object with the 'encode' method, which translates the meaning into a raw string in the given encoding.

Conversely, a raw string can be converted to unicode with the 'decode' method. When you know the 'correct' encoding of a raw string, you can tell the system and get a unicode object back.
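A short round trip through encode and decode, as a sketch. The sample text and names are mine; on newer Python versions the two types are called str (text) and bytes (raw), but the methods are the same.

```python
# A unicode object: four characters, the last one is e-acute (U+00E9).
text = u'caf\xe9'

# encode: from meaning to raw bytes, one result per encoding.
utf8_bytes = text.encode('utf-8')       # e-acute takes two bytes here
latin1_bytes = text.encode('latin-1')   # and one byte here

# decode: from raw bytes back to meaning, given the right encoding.
round_trip = utf8_bytes.decode('utf-8')
also_same = latin1_bytes.decode('latin-1')
```

Both raw strings differ, yet both decode back to the same unicode object, as long as the encoding named in decode matches the one used in encode.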

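Two functions help you find out which encodings are in effect: sys.getdefaultencoding() returns the encoding Python uses for implicit conversions between str and unicode, and sys.getfilesystemencoding() returns the encoding used for file names on the operating system.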
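Calling the two functions is a one-liner each. The output depends on the Python version and the platform, so I do not show any here; the variable names are only for illustration.

```python
import sys

# Encoding used when Python converts between str and unicode implicitly.
default_encoding = sys.getdefaultencoding()

# Encoding used to convert unicode file names for the operating system.
filesystem_encoding = sys.getfilesystemencoding()

print(default_encoding)
print(filesystem_encoding)
```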

Some methods in Python work with str while others work with unicode. Methods that accept both types cause no difficulty, but you need to be careful when calling a method that takes only str or only unicode params. The return type of a method is also often overlooked. For example, file.readline() returns a str. Even if the file is a 'unicode file', what you get is still a str holding the encoded bytes, for example in utf-8, and you have to decode it yourself.
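A sketch of the readline() case. The file name is hypothetical and the file is created in a temporary directory just for the demonstration; io.open is my choice here because it can either hand back raw bytes or decode for you when given an encoding.

```python
import io
import os
import tempfile

# Hypothetical sample file containing one utf-8 encoded line.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with io.open(path, 'wb') as f:
    f.write(u'caf\xe9\n'.encode('utf-8'))

# Reading in binary mode returns the raw, still encoded, string.
with io.open(path, 'rb') as f:
    raw_line = f.readline()

# Decoding by hand turns it into a unicode object.
decoded_line = raw_line.decode('utf-8')

# Alternatively, name the encoding when opening and let the file decode.
with io.open(path, 'r', encoding='utf-8') as f:
    text_line = f.readline()
```

Either way the encoding has to be stated somewhere; the file itself does not tell you.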

When a unicode object is passed to a method taking str params, or vice versa, the system tries to convert between them automatically. However, because no encoding was specified, it uses ascii by default. When the data contains characters outside the ascii char set, the exception at the beginning of this article occurs. The steps to fix the problem are: first check the type of the string with type(), then convert it to the correct type with encode() or decode(), specifying the encoding explicitly.
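The steps above can be sketched as two small helpers. to_text and to_bytes are hypothetical names of my own, not library functions, and the utf-8 default is an assumption you should replace with the real encoding of your data. The first part reproduces the error by forcing the ascii encoding explicitly.

```python
# Reproduce the error: e-acute has no ascii representation.
try:
    u'caf\xe9'.encode('ascii')
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)


def to_text(value, encoding='utf-8'):
    """Return value as a unicode object, decoding raw bytes if needed."""
    if isinstance(value, bytes):   # step 1: check the type
        return value.decode(encoding)   # step 2: convert explicitly
    return value


def to_bytes(value, encoding='utf-8'):
    """Return value as a raw string, encoding unicode if needed."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


fixed_text = to_text(b'caf\xc3\xa9')
fixed_bytes = to_bytes(u'caf\xe9')
```

Converting at the boundaries (when data comes in and when it goes out) and naming the encoding each time keeps the automatic ascii conversion from ever kicking in.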

Monday, August 22, 2005

wxPython note: process tab and enter key events for TextCtrl in dialogs

When you place a TextCtrl in a dialog and catch the key events for it, you will find that the enter and tab key events are not processed. When you press the tab key, the focus simply moves to the next widget.

The solution is to set the style of the TextCtrl. Two style flags, wx.TE_PROCESS_ENTER and wx.TE_PROCESS_TAB, are unset by default. Setting them lets the control process these keys itself.

Thursday, August 18, 2005

Haskell note: tutorials

I am completely new to Haskell and functional programming. I started to play with it simply because my MSc course requires it, but it seems more and more interesting now.

Functional programming is quite different from "common", i.e. imperative, programming, mainly in that a program is not executed statement by statement from beginning to end. A functional program can be seen as a set of equations which, evaluated together, yield the output.

I find this introduction succinct and helpful.

Here are some good summaries of the language.

About the grammar

About the operators

wxPython note: how to select many rows in Grid?

This problem puzzled me some time ago, and I forgot the solution again today, so I feel it necessary to write it down here.

There is absolutely no way of setting a style like wx.CB_MULTIPLE to specify multiple selection here. wxGrid supports multiple row selection by itself, see a reference

The only thing you need to do is specify the second (hidden!) param of SelectRow, which I remember as bAppend (the wxWidgets documentation calls it addToSelected). When it is true, the row is selected without clearing the rows already selected.

Of course, this implies that selecting many rows can only be done programmatically. Thus, to respond to mouse and keyboard actions, you need to catch and process the events yourself. Anyway, it's not uncommon to process events for a grid.

Sunday, August 14, 2005

Introduction to MVC

Found a good introduction to the MVC pattern.

Also see a general explanation at