# Machine Learning on Graphs: Why Should You Care?

## A basic overview of graphs and their intersection with machine learning.

A few years ago, “Balboa Creole French” was considered a language on the verge of disappearing [1]. Balboa Island is located in Newport Beach, California. People there speak their own modified version of French because many French families moved there after the First World War and started to learn English, German, and Spanish until the language was formed. Around 20 people still speak it.

Of course, everything I just said was a complete hoax, but people did not realize that until someone actually went to the island to learn the language and found that it did not exist in the first place (at least that’s what the rumors say).

Now, you might ask: what does this have to do with machine learning on graphs? Well, around 4 years ago, researchers at Stanford University [2] built classifiers that detected such hoaxes on Wikipedia with an accuracy of 86%, compared to a human-level accuracy of 66%!

The classifier they used was a random forest, which is an ensemble of decision trees. The interesting part was how they crafted the features.

One of the key ideas in the paper is that real articles link more coherently than false ones. A Wikipedia article contains markup that points to other Wikipedia articles. For a real article, the articles it links to tend to be linked to one another more than those referenced by a hoax, and this turned out to be a key factor in detecting Wikipedia hoaxes.
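To make the idea of link coherence a bit more concrete, here is a minimal sketch of one way such a feature could be computed. It is not taken from the paper: the function name `link_coherence`, the scoring rule (the fraction of pairs of linked articles that also link to each other), and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def link_coherence(article, links):
    """Fraction of pairs of articles linked from `article` that are also
    linked to each other (in either direction).

    `links` maps an article title to the set of titles it links to.
    This scoring rule is an illustrative assumption, not the paper's feature.
    """
    neighbors = sorted(links.get(article, set()) - {article})
    pairs = list(combinations(neighbors, 2))
    if not pairs:
        # Fewer than two outgoing links: no pair to check.
        return 0.0
    connected = sum(
        1
        for a, b in pairs
        if b in links.get(a, set()) or a in links.get(b, set())
    )
    return connected / len(pairs)


# Hypothetical toy link graph: article -> articles it links to.
toy_links = {
    "Real article": {"A", "B", "C"},
    "A": {"B", "C"},
    "B": {"C"},
    "C": set(),
    "Hoax article": {"X", "Y", "Z"},
    "X": set(),
    "Y": {"X"},
    "Z": set(),
}

real_score = link_coherence("Real article", toy_links)
hoax_score = link_coherence("Hoax article", toy_links)
print(real_score, hoax_score)
```

In this toy graph, every pair of articles referenced by the "real" article is connected, while only one of the three pairs referenced by the "hoax" is. A score like this could then be fed to a classifier as one feature among many.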

Now, go to Google and type a question like “When did Leonardo Da Vinci die?”. You will get a lot of results for your search, but at the top, you will see a small box with the answer inside. How did Google know what we wanted?

Back in 2012, Google released its Knowledge Graph, which models entities in the world and the relationships between them as a graph. So the name you type is not treated as just a string, but as a node in a huge graph. Leonardo Da Vinci is one node of this graph. Another node is May 2, 1519, which is his death date. There is a link connecting these two nodes, and the link’s name, or relation, is Date of Death.
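The node-link-node structure described above can be sketched as a tiny store of (subject, relation, object) facts. This is only a toy illustration, not how Google's Knowledge Graph is implemented: the `KnowledgeGraph` class, its method names, and the relation label `date_of_death` are hypothetical.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy graph of facts stored as subject -> relation -> object.

    A hypothetical illustration; each (subject, relation) pair holds one object.
    """

    def __init__(self):
        self._edges = defaultdict(dict)

    def add_fact(self, subject, relation, obj):
        # Add a labeled link from the subject node to the object node.
        self._edges[subject][relation] = obj

    def lookup(self, subject, relation):
        # Follow the labeled link; return None if the fact is unknown.
        return self._edges.get(subject, {}).get(relation)


kg = KnowledgeGraph()
kg.add_fact("Leonardo da Vinci", "date_of_death", "May 2, 1519")

# Answering "When did Leonardo Da Vinci die?" becomes following one link.
answer = kg.lookup("Leonardo da Vinci", "date_of_death")
print(answer)
```

The hard part in practice is mapping a free-form question to the right node and relation. Once that is done, the answer is a single hop in the graph.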