3 Essential Questions About Hashable in Python

Original article was published on Artificial Intelligence on Medium

How Can We Customize Hashability?

Much of Python’s flexibility as a general-purpose programming language comes from its support for custom classes. With your own classes, related data and operations can be grouped in a more meaningful and readable way.

Importantly, Python makes instances of our custom classes hashable by default: unless a class overrides __eq__() or __hash__(), each instance gets a hash value derived from its identity.

Consider the following example. We create a custom class, Person, whose instances are built from a person’s name and social security number.

Notably, we override the default __repr__() method and build its return value with an f-string, so that the object is displayed with more readable information, as shown in the last line of the code snippet.
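A minimal sketch of such a class is given below. The attribute names (name, ssn) and the sample values are assumptions chosen for illustration, not necessarily those of the original snippet.

```python
class Person:
    """A person identified by a name and a social security number."""

    def __init__(self, name, ssn):
        self.name = name
        self.ssn = ssn

    def __repr__(self):
        # The f-string gives a readable display instead of the default
        # <__main__.Person object at 0x...> form.
        return f"***Name: {self.name}; SSN: {self.ssn}***"


person0 = Person("John Smith", "012345678")  # illustrative values

# Instances of a plain class are hashable by default. The exact value
# differs between runs because it is derived from the object's identity.
print(hash(person0))

# A hashable object can be stored in a set.
persons = {person0}
print(persons)  # {***Name: John Smith; SSN: 012345678***}
```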

As shown in the above code, we can get the hash value of the created object person0 by using the built-in hash() function. Importantly, we’re able to include person0 as an element in a set object, which confirms that it is hashable.

However, what will happen if we want to add more Person instances to the set? A more complicated, but probable scenario is that we construct multiple Person objects of the same person and try to add them to the set object.

See the following code. I created another Person instance, person1, which has the same name and social security number, so it represents essentially the same natural person.
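This scenario can be sketched as follows, again assuming a Person class with name and ssn attributes and illustrative sample values.

```python
class Person:
    def __init__(self, name, ssn):
        self.name = name
        self.ssn = ssn

    def __repr__(self):
        return f"***Name: {self.name}; SSN: {self.ssn}***"


person0 = Person("John Smith", "012345678")
person1 = Person("John Smith", "012345678")  # same data, separate object

persons = {person0}
persons.add(person1)

# Without custom __hash__ and __eq__, the set treats them as two elements.
print(len(persons))        # 2
print(person0 == person1)  # False
```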

However, when we add this person to the set object, persons, both Person objects end up in the set. We don’t want that to happen because, by design, the set should store unique natural persons.

Consistent with both persons being included in the set object, an equality check confirms that Python treats these two Person instances as different objects.

I’ll show you how we can make the custom class Person smarter, so that it knows which persons are the same and which are different.
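One possible way to write such a class is sketched below. The print() calls are there only to make the method calls visible, and hashing the (name, ssn) tuple is an assumption about which fields define a person’s identity.

```python
class Person:
    def __init__(self, name, ssn):
        self.name = name
        self.ssn = ssn

    def __repr__(self):
        return f"***Name: {self.name}; SSN: {self.ssn}***"

    def __hash__(self):
        print("__hash__ is called")
        # Hash exactly the fields that __eq__ compares, so that objects
        # that compare equal always share a hash value.
        return hash((self.name, self.ssn))

    def __eq__(self, other):
        print("__eq__ is called")
        if not isinstance(other, Person):
            return NotImplemented
        return (self.name, self.ssn) == (other.name, other.ssn)


person0 = Person("John Smith", "012345678")
person1 = Person("John Smith", "012345678")

persons = {person0, person1}
print(persons)  # {***Name: John Smith; SSN: 012345678***}
```

Run on CPython, this should report two calls to __hash__() and one call to __eq__() before the set is displayed with a single element.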

In the above code, we updated the custom class Person by overriding the __hash__() and __eq__() methods.

We have previously mentioned that the __hash__() function is used to calculate an object’s hash value. The __eq__() function is used to compare the object with another object for equality and it’s also required that objects that compare equal should have the same hash value.

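By default, custom class instances are compared by identity: two references compare equal only if they point to the same object, which is equivalent to comparing the values returned by the built-in id() function (learn more about the id() function by referring to this article).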
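The default identity-based comparison can be seen in a short sketch. The class below defines neither __eq__() nor __hash__(), and the variable name alias is hypothetical.

```python
class Person:
    def __init__(self, name, ssn):
        self.name = name
        self.ssn = ssn


person0 = Person("John Smith", "012345678")
person1 = Person("John Smith", "012345678")
alias = person0  # a second reference to the same object

print(person0 == alias)           # True: same object
print(id(person0) == id(alias))   # True
print(person0 == person1)         # False: same data, different objects
```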

With the updated implementation, we can see that when we create a set object from the two Person objects, the __hash__() method is called on each of them, and the set keeps only one of the two because they have the same hash value and compare equal.

Another thing to note is that a matching hash value alone isn’t enough for Python to treat two elements as duplicates: when the hash values match, Python also calls the __eq__() method to check whether the objects are actually equal.
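To illustrate the role of __eq__(), here is a contrived sketch in which every instance deliberately returns the same hash value. The constant hash and the names john, john_again, and jane are assumptions made for demonstration only; a constant hash is poor practice in real code.

```python
class Person:
    def __init__(self, name, ssn):
        self.name = name
        self.ssn = ssn

    def __hash__(self):
        # Deliberately poor hash: every instance collides.
        return 1

    def __eq__(self, other):
        if not isinstance(other, Person):
            return NotImplemented
        return (self.name, self.ssn) == (other.name, other.ssn)


john = Person("John Smith", "012345678")
john_again = Person("John Smith", "012345678")
jane = Person("Jane Doe", "876543210")

persons = {john, john_again, jane}

# All three hashes collide, yet only the two equal objects are merged.
print(len(persons))  # 2
```

Even though all hash values are identical, jane stays in the set because __eq__() reports that she differs from john.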

Answer to the section’s question

Customization: To customize hashability and equality, we need to implement both the __hash__() and __eq__() methods in our custom classes, and keep them consistent so that objects that compare equal have the same hash value.
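One reason to implement both methods together: in Python 3, a class that defines __eq__() without defining __hash__() has its __hash__ set to None, so its instances are no longer hashable. A short sketch, using the same illustrative Person attributes:

```python
class Person:
    def __init__(self, name, ssn):
        self.name = name
        self.ssn = ssn

    # Only __eq__ is defined; Python then sets __hash__ to None.
    def __eq__(self, other):
        if not isinstance(other, Person):
            return NotImplemented
        return (self.name, self.ssn) == (other.name, other.ssn)


person = Person("John Smith", "012345678")

print(Person.__hash__ is None)  # True

try:
    persons = {person}
    added = True
except TypeError as exc:
    # The instance cannot be used as a set element.
    added = False
    print(f"TypeError: {exc}")
```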