Thursday, July 10, 2008

Python note: swapping objects

The best way to swap two objects is:

a, b = b, a

It swaps the names bound to the two objects by packing and unpacking a tuple, without altering the objects themselves.
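A small check of that claim, as a sketch: the variable names and values here are made up for illustration, and id() is used only to show that the objects themselves are not copied or changed.

```python
# Two distinct objects bound to the names a and b.
a = [1, 2, 3]
b = {"key": "value"}
id_a, id_b = id(a), id(b)

# Conceptually, the right-hand side packs (b, a) and the left-hand side unpacks it.
a, b = b, a

# The names now refer to each other's objects; no object was copied.
swapped_without_copy = (id(a) == id_b) and (id(b) == id_a)
print(swapped_without_copy)
```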

In many languages, swapping two objects involves copying one of them into a temporary. That approach can be tricky in Python, because assignment in Python does not copy objects. It simply binds a variable name to the existing object. Suppose

class X(object):
    def __init__(self, x):
        self.x = x

a = X([1, 2, 3])
b = X([])

Then we want b to take over a's value and a to be reset, so we try this:

b = a
a.x = []

The above will not work: after b = a, both names refer to the same object, so b.x becomes [] too. To get the intended result without copying, use

a, b = b, a
a.x = []
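Both attempts can be compared side by side. This is only a sketch using the class X from above; the names aliased, lost_value and kept_value are introduced here for illustration.

```python
class X(object):
    def __init__(self, x):
        self.x = x

# Attempt 1: plain assignment only creates a second name for the same object.
a = X([1, 2, 3])
b = X([])
b = a
a.x = []
aliased = b is a      # both names refer to one object
lost_value = b.x      # so b no longer sees [1, 2, 3]

# Attempt 2: swap the names first, then reset a.
a = X([1, 2, 3])
b = X([])
a, b = b, a
a.x = []
kept_value = b.x      # b still holds the original list

print(aliased, lost_value, kept_value)
```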

In other cases, where a must be kept and b should become a copy of a's value, define a copying function or use the copy module.
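A minimal sketch of the copy module route, again with the class X from above. copy.copy makes a shallow copy (the new object shares the inner list), while copy.deepcopy also copies the list; which one is wanted depends on the case.

```python
import copy

class X(object):
    def __init__(self, x):
        self.x = x

a = X([1, 2, 3])

shallow = copy.copy(a)     # new X object, but shallow.x is the same list as a.x
deep = copy.deepcopy(a)    # new X object with its own copy of the list

# Mutating a's list shows which copy is independent.
a.x.append(4)

print(shallow.x)
print(deep.x)
```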

With new-style classes, everything is an object, so the rule for object assignment also applies to lists, dicts, etc. To make a (shallow) copy of a list, use l2 = l1[:].
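The difference between a second name and a slice copy can be seen in a few lines. This is a sketch with made-up lists; note that the slice copy is shallow, so nested lists are still shared.

```python
l1 = [1, 2, 3]
alias = l1       # a second name for the same list
l2 = l1[:]       # a new list with the same elements

l1.append(4)
# alias follows the change, l2 does not.

# The slice copy is shallow: inner lists are shared, not copied.
nested = [[1], [2]]
nested_copy = nested[:]
nested[0].append(99)

print(alias, l2, nested_copy)
```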

Monday, July 07, 2008

Python note: functional programming saves code when dealing with lists

Another note on functional programming: it sometimes makes code that works on lists shorter and clearer.

For example, given a list l, we want to add one to each element:

l = map(lambda x: x+1, l)
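One caveat, as a sketch: in Python 3, map returns a lazy iterator rather than a list, so wrapping it in list() gives a list on either version. A list comprehension is an equivalent spelling. The names via_map and via_comprehension are only for illustration.

```python
l = [1, 2, 3]

# list() makes the result a real list on both Python 2 and Python 3.
via_map = list(map(lambda x: x + 1, l))

# Equivalent list comprehension.
via_comprehension = [x + 1 for x in l]

print(via_map, via_comprehension)
```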

LIBSVM note: problems

1. Why does svm-scale run forever while it keeps writing to the output file (.scale)?

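It might be because the original training file contains features in the [0, 1] range, while svm-scale scales to the [-1, 1] range by default. Zeros then become nonzero values, so a sparse file turns into a much larger dense one. Add the option -l 0 to keep the lower bound at 0.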
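The effect on sparsity can be sketched in plain Python. This does not call svm-scale; the scale function and the sample row are hypothetical, and the sketch assumes a sparse format where only nonzero values are stored.

```python
def scale(value, lo_in, hi_in, lo_out, hi_out):
    """Linearly map value from [lo_in, hi_in] to [lo_out, hi_out]."""
    return lo_out + (hi_out - lo_out) * (value - lo_in) / float(hi_in - lo_in)

# A mostly-zero feature row with values already in [0, 1].
row = [0, 0, 1, 0, 0.25, 0, 0, 0]

to_minus1_1 = [scale(v, 0, 1, -1, 1) for v in row]   # like the default range
to_0_1 = [scale(v, 0, 1, 0, 1) for v in row]         # like a lower bound of 0

# Count the entries a sparse format would have to store.
nonzeros_default = sum(1 for v in to_minus1_1 if v != 0)
nonzeros_l0 = sum(1 for v in to_0_1 if v != 0)

print(nonzeros_default, nonzeros_l0)
```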

2. Why does svm-train run forever?

It might be because of the epsilon value, the tolerance of the termination criterion, which is set with -e (default 0.001). The smaller the value, the more accurate the trained model, but the more iterations training takes. Try setting epsilon to 1.

If there are a lot of features, consider trying LIBLINEAR instead. It does not use a kernel, but it trains a linear model faster than LIBSVM does.

Another parameter, -m, sets the cache memory size in MB. Make it as large as the available RAM allows.
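To keep the two options together, here is a sketch that only builds the svm-train command line and does not run it. The helper svm_train_command, its default values and the file name train.scale are hypothetical; only the -e and -m options come from the notes above.

```python
def svm_train_command(training_file, epsilon=0.001, cache_mb=100):
    """Build an argument list for svm-train (hypothetical helper, not part of LIBSVM)."""
    return ["svm-train", "-e", str(epsilon), "-m", str(cache_mb), training_file]

# Looser stopping tolerance and a larger cache, as suggested above.
cmd = svm_train_command("train.scale", epsilon=1, cache_mb=1000)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess once the values suit the machine.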