Original article was published on Artificial Intelligence on Medium
How does it work?
Currently, the code is not available on GitHub, as it is still quite chaotic and only sparsely commented. I will clean it up and publish it there.
The project consists of 3 parts:
- Data basis
- Twitter data
- Artificial neural network
The data basis
The Rijksmuseum offers a public API that can be used after registration. With this interface to the art collection's metadata, I was able to download the titles and descriptions of all paintings with a Python script. Unfortunately, there were too few English descriptions to build a large data basis: only about 4,000 words in total. So I was not sure whether the artificial neural network could recognize enough structure in them. For a beta version it was enough, but I plan to extend the data basis with titles and descriptions from other museums.
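A minimal sketch of such a download script, assuming the Requests library and the field names I recall from the Rijksmuseum collection API (`plaqueDescriptionEnglish`, `principalOrFirstMaker`); the API key and page counts are placeholders, not my actual values:

```python
API_KEY = "YOUR_API_KEY"  # hypothetical placeholder, issued after registering
BASE_URL = "https://www.rijksmuseum.nl/api/en/collection"

def format_entry(title, artist, description):
    """Render one painting in the training-file layout described below."""
    return f"=== {title} === by {artist}\n{description}\n"

def fetch_entries(pages=2, page_size=100):
    """Page through the collection and keep paintings with an English plaque text."""
    import requests  # imported lazily so format_entry works without the dependency
    entries = []
    for page in range(1, pages + 1):
        resp = requests.get(BASE_URL, params={
            "key": API_KEY, "type": "painting", "ps": page_size, "p": page})
        resp.raise_for_status()
        for obj in resp.json().get("artObjects", []):
            detail = requests.get(f"{BASE_URL}/{obj['objectNumber']}",
                                  params={"key": API_KEY}).json().get("artObject", {})
            desc = detail.get("plaqueDescriptionEnglish")
            if desc:
                entries.append(format_entry(detail.get("title", ""),
                                            detail.get("principalOrFirstMaker", ""),
                                            desc))
    return entries
```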
The finished data basis is now a text file with such contents:
=== Woman Reading a Letter === by Johannes Vermeer
In a quiet, private moment, a young woman stands, engrossed in reading a letter. It is morning, and she is still wearing her blue nightrobe. All the other colours are subordinate to its radiant lapis lazuli; yellow and red hardly make an appearance. Vermeer rendered the different effects of the cool light precisely. For example, he was the first artist to try soft grey for flesh, and to paint the shadows on the wall light blue.
The title is delimited with ===, and the artist with "by", to create a structure that differs from the description. You can already see the later problem here: the descriptions are longer than 280 characters, so I have to shorten the artificially generated results. I do this by concatenating sentences until the character count would exceed 280, so usually only the beginning survives.
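That shortening step can be sketched as follows (a simple sentence split on end punctuation; the exact splitting in my script may differ):

```python
import re

def shorten(description, limit=280):
    """Concatenate whole sentences until adding the next one would
    exceed the character limit, so only the beginning survives."""
    sentences = re.split(r"(?<=[.!?])\s+", description.strip())
    result = ""
    for sentence in sentences:
        candidate = (result + " " + sentence).strip()
        if len(candidate) > limit:
            break
        result = candidate
    return result
```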
The Twitter data
To feed the network the artwork titles from your tweets as input, I use the Twitter API (via Tweepy). With it I can read the tweets for #DescribeMyArt, separate the Twitter handle from the tweet text, and delete the hashtag, because it is not part of the title. This happens every 15 minutes: in a tiny database, I compare the latest tweet with the second-to-last one to know whether there is a new tweet from you. If there is, the neural net starts; if not, the script simply aborts and waits 15 minutes for another attempt.
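A sketch of that polling logic, assuming Tweepy 4's `search_tweets` on the v1.1 API; the state file stands in for the tiny database, and all names are placeholders rather than my actual script:

```python
HASHTAG = "#DescribeMyArt"

def extract_title(tweet_text, hashtag=HASHTAG):
    """The hashtag is not part of the title, so strip it out."""
    return tweet_text.replace(hashtag, "").strip()

def is_new(tweet_id, state_file="last_tweet_id.txt"):
    """Compare the latest tweet ID with the previously seen one."""
    try:
        with open(state_file) as f:
            last_seen = f.read().strip()
    except FileNotFoundError:
        last_seen = ""
    with open(state_file, "w") as f:
        f.write(str(tweet_id))
    return str(tweet_id) != last_seen

def poll_once(api):
    """Fetch the newest #DescribeMyArt tweet; return (handle, title) if unseen."""
    results = api.search_tweets(q=HASHTAG, count=1, result_type="recent")
    if results and is_new(results[0].id):
        return results[0].user.screen_name, extract_title(results[0].text)
    return None  # nothing new: abort and try again in 15 minutes
```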
The artificial neural network
OpenAI's GPT-2 network is one of the best neural networks for natural language processing, i.e. the processing of unstructured data such as words. I work with a small version and with gpt-2-simple, a module by BuzzFeed's chief data scientist Max Woolf that makes it "very easy" to start a TensorFlow session and fine-tune the neural network on the new data basis. Since there are so few words, I first tried it with 8 sessions (essentially full runs through the data), but the generated texts were not good enough. Now I'm using 50 sessions and I think the results are quite good, but I'd like to extend the data basis and the number of sessions later.
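Fine-tuning with gpt-2-simple looks roughly like this (a sketch, not my exact script; the file and run names are placeholders, and `steps=50` stands in for the 50 sessions mentioned above; requires TensorFlow and a large model download):

```python
import gpt_2_simple as gpt2

MODEL = "124M"  # the small GPT-2 model

gpt2.download_gpt2(model_name=MODEL)

# Fine-tune on the training file built from the museum metadata.
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="rijksmuseum_descriptions.txt",
              model_name=MODEL,
              steps=50,
              run_name="describe_my_art")

# Later, seed generation with a tweeted title in the training-file format.
texts = gpt2.generate(sess,
                      run_name="describe_my_art",
                      prefix="=== Woman Reading a Letter ===",
                      length=150,
                      return_as_list=True)
```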
I let the training run directly on a virtual machine in the Google Cloud, because the finished script must also run there. This worked well, since I had already set up similar projects there.
The fully trained network is now stored in the cloud and starts every 15 minutes to generate new descriptions matching your titles, of which I can only tweet a fraction (because of Twitter's 280-character limit).
The descriptions therefore often seem half-finished, but that can't be changed on Twitter, and many of the "short" results work quite well. Of course, you should not expect perfect descriptions! One time a description even came out in Korean: I HAVE NO IDEA WHY. But without such coincidences it would be only half as exciting.