Original article can be found here (source): Artificial Intelligence on Medium
Can You Teach a Computer to Write Like Stephen King?
With the help of Deep Learning for NLP
Stephen King has been one of the most influential writers for the last four decades. As of writing this, he is the only author in history to have more than 30 books become Number 1 best-sellers. He is particularly popular for writing horror, supernatural fiction, suspense, and fantasy stories.
In this post, I’ll show you how Deep Learning can be used to teach a computer to write stories like Stephen King. The dataset consists of five of Stephen King’s short stories:
- Strawberry Spring
- Graveyard Shift
- The Woman in the Room
- I am the Doorway
I’ll be using Recurrent Neural Networks (RNNs), which are widely used for Natural Language Processing, with Python as the programming language.
1. Importing all the necessary libraries
2. Importing data
The text file contains the five stories mentioned above. Note that I have lowercased all the text to reduce the vocabulary size.
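As a minimal sketch of this step, the loader below reads a text file and lowercases it. The function name `load_text` and the file name in the usage comment are hypothetical; the original post does not name them here.

```python
from pathlib import Path


def load_text(path):
    """Read a UTF-8 text file and lowercase it to shrink the vocabulary."""
    return Path(path).read_text(encoding="utf-8").lower()


# Hypothetical usage; replace the file name with your own corpus:
# text = load_text("stories.txt")
```

Lowercasing means "The" and "the" share the same characters, so the model has fewer symbols to learn.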
3. Creating Vocabulary
Here the first line creates a vocabulary of all the unique characters. Since neural networks understand only numeric values, we have to map each character to an integer. The second and third lines create the dictionaries used for this mapping.
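One way this step might look is sketched below. The helper name `build_vocab` and the dictionary names `char_to_int` and `int_to_char` are assumptions for illustration, and the short string is a stand-in for the real corpus.

```python
def build_vocab(text):
    """Return the sorted unique characters plus forward and reverse lookups."""
    chars = sorted(set(text))                          # vocabulary of unique characters
    char_to_int = {c: i for i, c in enumerate(chars)}  # character -> integer
    int_to_char = {i: c for i, c in enumerate(chars)}  # integer -> character
    return chars, char_to_int, int_to_char


# Stand-in text; in the post this would be the lowercased stories.
chars, char_to_int, int_to_char = build_vocab("the fog rolled in.")
print(len(chars), "unique characters")
```

The reverse dictionary is not needed for training, but it is handy later for turning predicted integers back into characters.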
4. Creating Sequences & Targets
Here we create sequences of 100 characters each. Each sequence is used to predict the character that follows it (the 101st).
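The sliding-window idea can be sketched as follows. The function name `make_sequences` and the `step` parameter are assumptions; a step of 1 moves the window one character at a time.

```python
def make_sequences(text, seq_length=100, step=1):
    """Slide a window over the text, pairing each window with the next character."""
    sequences, targets = [], []
    for i in range(0, len(text) - seq_length, step):
        sequences.append(text[i:i + seq_length])  # input: seq_length characters
        targets.append(text[i + seq_length])      # target: the character that follows
    return sequences, targets


# Tiny demonstration with a window of 3 instead of 100.
seqs, tgts = make_sequences("abcdefghij", seq_length=3)
print(seqs[0], "->", tgts[0])
```

A larger `step` produces fewer, less overlapping training examples, which trades some data for faster training.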
5. Encoding Sequences & Targets
As mentioned earlier, neural networks understand only numeric values, so we have to encode the sequences and targets.
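A minimal sketch of the encoding, assuming integer-encoded inputs (suitable for an Embedding layer) and integer targets. The post does not show whether its targets were one-hot encoded, so treat the target format here as an assumption; the function name `encode` is also hypothetical.

```python
import numpy as np


def encode(sequences, targets, char_to_int):
    """Map every character to its integer index."""
    X = np.array([[char_to_int[c] for c in seq] for seq in sequences], dtype=np.int32)
    y = np.array([char_to_int[c] for c in targets], dtype=np.int32)
    return X, y


# Tiny demonstration with a three-character vocabulary.
X, y = encode(["ab", "bc"], ["c", "a"], {"a": 0, "b": 1, "c": 2})
print(X.shape, y.shape)
```

Integer targets pair with a sparse categorical cross-entropy loss; one-hot targets would pair with the regular categorical version.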
6. Building the Model
One might think that implementing an RNN is quite complex, but with the help of Keras, it’s pretty simple. The model consists of the following four layers:
1. Embedding: An Embedding layer maps each integer index to a dense vector, which is usually much smaller than a one-hot encoding of the same input. The vectors are learned during training, so inputs that appear in similar contexts (characters, in this model) tend to end up with similar values.
2. LSTM: Long short-term memory (LSTM) networks are a type of Recurrent Neural Network. They are capable of learning order dependencies in sequence prediction problems.
3. Dropout: Dropout helps prevent overfitting by randomly zeroing a fraction of its inputs during training.
4. Dense: This is the output layer where predictions happen. The softmax function is used to output a probability for each character.
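The four layers above could be stacked in Keras roughly as follows. This is a sketch, not the post's exact model: the embedding size, number of LSTM units, dropout rate, optimizer, and loss are all assumed values, and `build_model` is a hypothetical helper name.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense


def build_model(vocab_size, embedding_dim=50, lstm_units=128, dropout_rate=0.2):
    """Embedding -> LSTM -> Dropout -> Dense(softmax) character model."""
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=embedding_dim),  # index -> dense vector
        LSTM(lstm_units),                                           # reads the sequence
        Dropout(dropout_rate),                                      # regularization
        Dense(vocab_size, activation="softmax"),                    # one probability per character
    ])
    # Assumed loss: sparse categorical cross-entropy expects integer targets.
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
    return model
```

The output layer has one unit per character in the vocabulary, so each prediction is a probability distribution over the next character.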
7. Training the network
Using the fit function, we can train our network. I used a batch size of 64 and trained for 40 epochs.
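sample() samples an index from the output probability array. The temperature parameter controls how much freedom the function has when generating text: lower values make it stick to the most likely characters, while higher values produce more varied output.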
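A common way to write such a function is sketched below. It is not necessarily the post's exact code: the `rng` argument and the small floor used to avoid `log(0)` are additions made here for reproducibility and numerical safety.

```python
import numpy as np


def sample(probs, temperature=1.0, rng=None):
    """Draw one index from `probs` after rescaling by `temperature`."""
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=np.float64)
    logits = np.log(np.maximum(probs, 1e-8)) / temperature  # sharpen or flatten
    logits -= logits.max()                                  # numerical stability
    scaled = np.exp(logits)
    scaled /= scaled.sum()                                  # renormalize to sum to 1
    return int(rng.choice(len(scaled), p=scaled))


# Low temperature: almost always the most likely index.
print(sample([0.1, 0.7, 0.2], temperature=0.2))
```

Dividing the log-probabilities by a temperature below 1 exaggerates the gap between likely and unlikely characters; a temperature above 1 narrows it.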
A seed sentence is used to predict the next character. We then append the predicted character to the seed and drop its first character to keep the sequence length constant.
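The generation loop described above might be sketched like this. To keep the example independent of any trained model, it takes a hypothetical `predict_probs` callable that returns the next-character probabilities for an encoded window; `generate_text` and its parameter names are also assumptions.

```python
import numpy as np


def sample(probs, temperature=1.0, rng=None):
    """Draw one index from `probs` after rescaling by `temperature`."""
    rng = rng or np.random.default_rng()
    logits = np.log(np.maximum(np.asarray(probs, dtype=np.float64), 1e-8)) / temperature
    scaled = np.exp(logits - logits.max())
    return int(rng.choice(len(scaled), p=scaled / scaled.sum()))


def generate_text(predict_probs, seed, char_to_int, int_to_char,
                  n_chars=400, temperature=0.5, rng=None):
    """Extend `seed` one character at a time using a sliding window."""
    window, generated = seed, seed
    for _ in range(n_chars):
        encoded = np.array([[char_to_int[c] for c in window]])  # shape (1, len(seed))
        probs = predict_probs(encoded)                          # next-character probabilities
        next_char = int_to_char[sample(probs, temperature, rng)]
        generated += next_char
        window = window[1:] + next_char  # append the prediction, drop the first character
    return generated
```

With a trained Keras model, `predict_probs` could be something like `lambda x: model.predict(x, verbose=0)[0]`, and the seed should have the same length as the training sequences.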
Input: people clustered in small groups that had a tendency to break up and re-form with surprising speed.
Output: people clustered in small groups that had a tendency to break up and re-form with surprising speed. looking out to be an undergrad named donald morris work and i passed all the look and looked at him. he was supposed to stay wiping across the water,
but they were working and shittered back of its an
appla night. the corpse in the grinder, expecting the crap outs.the counted in the second. jagged sallow sound over the way back on the way — at least aloud.'i suspect there,’ hall said softly. he was supposed to be our life counting fingers.
The generated text is fairly readable. It doesn’t make much sense, some sentences are not grammatically correct, and some words are misspelled, but it uses punctuation correctly, such as a full stop at the end of each sentence and quotation marks when quoting a person.
I admit the generated text is nowhere near how Stephen King writes. But given the recent advances in Deep Learning and Natural Language Processing, I have my hopes up. Maybe in a couple of years, computers will become better writers than humans.
Andrej Karpathy’s blog is also a great read if you want to learn more about Recurrent Neural Networks.
The performance of our model could be improved by using a much larger training dataset and by tuning the hyperparameters.
Feel free to play with the code. The complete project can be found on GitHub.
Thanks for reading and have a great day. 🙂