Thursday, January 20, 2011

Python note: set equality test

Given a class with an __eq__ function, would a python set contain only members that are equal? The answer is not necessarily.

Consider the following class:

class C(object):
def __init__(self, x):
self.a = x
def __eq__(self, o):
return self.a == o.a
def __str__(self):
return str(self.a)

It defines a member which controls its equality. However, the following code will add four members to the set

s = set()
a = C(1)
b = C(2)
c = C(3)
s.add(a)
s.add(b)
s.add(c)
s.add(C(1))

The main reason is that set is implemented as a hash map, and without a hash function defined in the class C, python will use the object identity itself for hashing, and members of the class will be hashed into different places. The following code will fix the problem.

class C(object):
def __init__(self, x):
self.a = x
def __eq__(self, o):
return self.a == o.a
def __str__(self):
return str(self.a)
def __hash__(self):
return self.a

No comments: