Self-driving Grand Theft Auto car in 100 lines of code

Source: Deep Learning on Medium

Udacity's Term 1 covers many of the basics of self-driving car technology. The course has five important projects:

1. Lane Detection

2. Traffic sign detection

3. Behavioral Cloning

4. Advanced lane detection

5. Vehicle detection

All of these projects come with sample pictures and sample videos for training and testing the implementation, but there was no single environment in which they could all work together. Udacity is building its own world that combines self-driving cars and flying cars, but I did not have the patience to wait for it. I looked for an environment where I could run all the projects simultaneously and get the feel of a self-driving car scenario. I found one in Grand Theft Auto: San Andreas.

GTA San Andreas had been sleeping on my laptop for quite some time. While doing the third Udacity project, on behavioral cloning, I was working in the Udacity simulator, and an idea struck me: why not do the same thing in GTA? I took my time, finished Term 1, and then started integrating all the Udacity Term 1 projects into GTA San Andreas.

Project 1: Behavioral cloning:

I took up behavioral cloning first because it was an interesting place to start. I wanted the implementation to be modular: not specific to GTA, and usable as a sandbox for trying different architectures on top of it. For the Udacity project I used the NVIDIA end-to-end learning architecture, which I also use here.


Requirements:

1. GTA San Andreas (put d3d9.dll in the installation path so that GTA starts in a small window instead of full screen)

2. Anaconda

Getting the inputs and outputs:

In the Udacity simulator, it was quite easy to give input and get output. You drive the car around a test track, the simulator saves pictures from three cameras, and it writes a .csv file with the throttle, brake, and steering angle. This communication happens over Socket.IO. To keep my implementation modular, I instead used keyboard key logging and screenshots taken directly from the game.

Key logging
Images of the front

Along with the key logging, images of the windowed game view are also captured.
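As a rough illustration of how a frame and the keys held at that moment could be paired, here is a minimal sketch. It is not the code from the repository: the choice of W, A, and D as control keys, the CSV layout, and the function names are all assumptions, and the screenshot and key-polling calls are passed in as placeholders so that any capture library can be plugged in.

```python
import csv
import time

# Assumption: W = accelerate, A = steer left, D = steer right.
CONTROL_KEYS = ("W", "A", "D")


def keys_to_label(pressed):
    """Encode the pressed control keys as a 0/1 vector in CONTROL_KEYS order."""
    return [1 if key in pressed else 0 for key in CONTROL_KEYS]


def record(grab_frame, get_pressed_keys, log_path, n_frames, interval=0.1):
    """Write one CSV row per frame: the image file name and the key vector.

    grab_frame(index)   -> saves a screenshot and returns its file name
    get_pressed_keys()  -> returns the set of keys held down right now
    Both callables are hypothetical stand-ins for a real capture library.
    """
    with open(log_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image", *CONTROL_KEYS])
        for i in range(n_frames):
            image_name = grab_frame(i)
            label = keys_to_label(get_pressed_keys())
            writer.writerow([image_name, *label])
            time.sleep(interval)  # pause between samples
```

Keeping the capture and key-polling functions as parameters is what makes the recorder independent of GTA: the same loop works with any game window as long as the two callables are swapped.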

Training the model:

Now that the pictures have been taken and the corresponding keystrokes are known, we compile them as the inputs and outputs of a model. I have used the NVIDIA end-to-end learning model, but it can be replaced by any other model of choice.
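To get a feel for the size of the NVIDIA end-to-end network, the sketch below walks an image through its five convolutional layers and computes the feature-map shape after each one. The layer settings (filters, kernel size, stride) and the 66x200 input follow the NVIDIA paper; whether the GTA screenshots are resized to that shape, and how many output units are used for the keystrokes, are not stated here and would be assumptions.

```python
# (filters, kernel size, stride) of the five convolutional layers in the NVIDIA network
CONV_LAYERS = [(24, 5, 2), (36, 5, 2), (48, 5, 2), (64, 3, 1), (64, 3, 1)]
# Fully connected layers that follow the flattened features
DENSE_LAYERS = [100, 50, 10]


def conv_out(size, kernel, stride):
    """Output length of an unpadded ("valid") convolution along one dimension."""
    return (size - kernel) // stride + 1


def feature_map_shapes(height=66, width=200):
    """Return the (height, width, channels) shape after each convolutional layer."""
    shapes = []
    for filters, kernel, stride in CONV_LAYERS:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
        shapes.append((height, width, filters))
    return shapes


shapes = feature_map_shapes()
last_h, last_w, last_c = shapes[-1]
flattened = last_h * last_w * last_c  # number of inputs to the first dense layer

for shape in shapes:
    print(shape)
print("flattened features:", flattened)
```

A different input size only changes the numbers, not the code, so this is a quick way to check whether a chosen screenshot resolution still leaves a non-empty feature map after the last convolution.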

Driving the model:

A screenshot of the game is sent to the previously trained model. The output of the model is converted to the corresponding keystrokes, which are then sent to the game.
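The conversion from model output to keystrokes could look roughly like the sketch below. It assumes the model returns one score per control key (W, A, D) and that a key is pressed when its score reaches 0.5; both the key set and the threshold are illustrative choices, and `predict`, `press`, and `release` are hypothetical callables standing in for the trained model and a key-sending library.

```python
# Assumption: the model outputs one score per key, in this order.
CONTROL_KEYS = ("W", "A", "D")


def prediction_to_keys(scores, threshold=0.5):
    """Return the set of keys whose score reaches the threshold."""
    return {key for key, score in zip(CONTROL_KEYS, scores) if score >= threshold}


def drive_step(frame, predict, press, release, held):
    """Run one control step and return the set of keys now held down.

    Only keys whose state changed are pressed or released, so a key that
    stays active across frames is not tapped repeatedly.
    """
    wanted = prediction_to_keys(predict(frame))
    for key in sorted(held - wanted):
        release(key)
    for key in sorted(wanted - held):
        press(key)
    return wanted
```

In a real loop, `drive_step` would be called once per captured screenshot, passing the returned set back in as `held` on the next call.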

Steps to follow:

1. Run “” and drive the car in a zoomed mode for 100 seconds

2. Run “” to train the model

3. Run “” and be ready in the car

Next to come: lane finding in GTA.

GitHub link:

YouTube link: