(This article is not for promotional purposes)
If you are familiar with deep learning, you know that models need to be trained on really big datasets to perform well on test data (high accuracy, neither over-fit nor under-fit). Well, recently I was trying to train a deep CNN on the NIST handwritten dataset to predict handwritten characters in government/official documents.
Guess what? My laptop just hung up on me multiple times (Core i7, Nvidia GeForce 940M — 2GB). Although the dataset is not that big (around 800,000 images), the biggest problem was manipulating the data and then feeding it to the model. What do I do? It hit me: use AWS. I had heard about EC2, provided by AWS, but it always scared me — it felt like a black box I wouldn't know how to handle!
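In hindsight, part of the memory problem was trying to hold all 800,000 images at once. One common workaround is to stream batches with a generator so only one batch lives in memory at a time. A minimal NumPy-only sketch (the function and variable names here are hypothetical, not from any particular library):

```python
import numpy as np

def batch_generator(x, y, batch_size=32):
    """Yield shuffled (batch_x, batch_y) pairs forever, so the model
    only ever sees batch_size examples in memory at a time."""
    n = len(x)
    while True:
        order = np.random.permutation(n)  # reshuffle every epoch
        for start in range(0, n, batch_size):
            sel = order[start:start + batch_size]
            yield x[sel], y[sel]

# dummy stand-ins for the real image arrays
images = np.zeros((100, 28, 28, 1), dtype="float32")
labels = np.zeros(100, dtype="int64")

gen = batch_generator(images, labels, batch_size=32)
batch_x, batch_y = next(gen)  # batch_x.shape == (32, 28, 28, 1)
```

Keras models can consume a generator like this (e.g. via `fit_generator` in the Keras versions of that era), so the full dataset never has to sit in RAM.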
After some Google searching, I came across the AWS DLAMI (Deep Learning AMI). Basically, it's a machine image for cloud servers, built specifically for deep learning. It comes with CUDA, TensorFlow, Keras, MXNet, or any other deep learning framework you could ask for, all pre-configured.
So I created an EC2 instance in AWS with some really good computing power. But now, how do I log into this cloud server? After reading a lot of tutorials and such, I found out it's really easy. The connection command is a one-liner (literally). Type this in your local terminal (Ubuntu):
$ ssh -i your-key-pair.pem ubuntu@your-public-dns
You will get the .pem file when you create your EC2 instance. You will also find your public DNS in your EC2 instance's details (in the AWS console). If all the details are correct, you will get this screen:
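If you connect often, a host entry in `~/.ssh/config` on your local machine saves retyping the key path and public DNS every time. A sketch of such an entry (the alias `dlami` and the key path are hypothetical — substitute your own values):

```
Host dlami
    HostName your-public-dns
    User ubuntu
    IdentityFile ~/.ssh/your-key-pair.pem
```

After that, `ssh dlami` is all you need to type.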
It was so exciting to see this screen. I will not post any tutorial-style material here, but I will mention the most important commands that I found very useful when training models on the AWS DLAMI:
- You will always need the .pem file, so keep it handy and remember its name. You also need to change the file's permissions to 400, so that no one else can access it, by doing:
chmod 400 key-pair.pem
Without this, you won't be able to log in to the server.
- To copy files/folders from the cloud server to your local PC:
scp -i key-pair.pem ubuntu@your-public-dns:/remote/path/filename /local/path (single file)
scp -i key-pair.pem -r ubuntu@your-public-dns:/remote/folder_name/ /local/path (copy a folder)
To copy files/folders from your local PC to the cloud server, just reverse the order. Example:
scp -i key-pair.pem /local/path/filename ubuntu@your-public-dns:/remote/path (single file)
- Training your model will take a lot of time, so it's better to make sure your process keeps running even if you log out of your cloud server from your PC. To achieve this, start tmux in the cloud terminal:
tmux
Then work normally inside tmux. When you want to leave, press Ctrl + B and then D. You will come back to your regular terminal, but tmux will keep running on the server. To go back to the tmux session, type:
tmux attach
And you'll go back to your screen.
I hope all these shortcuts and methods help you successfully train a deep learning model that will change the world. 🙂
This is my first post. Hit the clap button if you like it, and comment your thoughts.
I will post a tutorial on deep learning and AWS in the future.
Source: Deep Learning on Medium