Behavioural Cloning

A challenging project in hindsight

In this project, we were given a simulator: a racing game in which you had to drive a car around a track, much like Need for Speed in the old days. It looked like this:

You could drive around, record your laps and use this data to train a convolutional neural network, which was then used to drive the car autonomously, with the hope that the network learned the patterns of good driving behaviour - keeping the car on the road…


The first task was to gather data: while you recorded your laps, the simulator saved images of what the car saw at each moment, as if a camera were mounted on the hood of the car. Each of these images was paired with a steering angle in a .csv file. So if the car saw something like this:

Then the corresponding steering angle was likely a large positive number (a clockwise turn is considered positive), since I wanted to get back to the center of the road from the left side. And if it was something like the following:

Then it was a left turn, so the steering angle was a large negative number.
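To make the pairing of images and steering angles concrete, here is a minimal sketch of reading such a log. The two-column layout (image path, then steering angle), the file names, and the `load_samples` helper are assumptions for illustration; the real log written by the simulator may contain more columns.

```python
import csv
import io

# Hypothetical excerpt of a driving log. The layout (path, angle) is an
# assumption; the actual file may have additional columns.
SAMPLE_LOG = """\
IMG/center_001.jpg,0.00
IMG/center_002.jpg,0.85
IMG/center_003.jpg,-0.90
"""


def load_samples(log_file):
    """Return a list of (image_path, steering_angle) pairs from a CSV file object."""
    samples = []
    for row in csv.reader(log_file):
        if not row:  # skip blank lines
            continue
        image_path = row[0].strip()
        angle = float(row[1])
        samples.append((image_path, angle))
    return samples


samples = load_samples(io.StringIO(SAMPLE_LOG))
for path, angle in samples:
    # Positive angles are clockwise (right) turns, negative ones are left turns.
    direction = "right" if angle > 0 else "left" if angle < 0 else "straight"
    print(f"{path}: {angle:+.2f} ({direction})")
```

With a real recording, you would pass an opened file instead of the `io.StringIO` stand-in.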

The steering angles in a normal run are plotted below. They are normalized between -1 and 1:

It can be seen that very sharp turns are infrequent: most of the data is gathered around 0 (nearly straight stretches of road).
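A distribution like this can be checked without any plotting library by counting angles in equal-width bins over [-1, 1]. The sketch below uses made-up angles and a hypothetical `steering_histogram` helper; it is not the code that produced the plot in this post.

```python
from collections import Counter


def steering_histogram(angles, n_bins=10):
    """Count angles in n_bins equal-width bins covering [-1, 1]."""
    width = 2.0 / n_bins
    counts = Counter()
    for angle in angles:
        index = int((angle + 1.0) / width)
        # Clamp so that values on or beyond the edges land in the outer bins.
        index = max(0, min(index, n_bins - 1))
        counts[index] += 1
    return [counts[i] for i in range(n_bins)]


# Made-up angles for illustration: mostly near 0, with a few sharp turns.
angles = [0.0, 0.02, -0.03, 0.05, -0.01, 0.0, 0.1, -0.15, 0.6, -0.9]
histogram = steering_histogram(angles, n_bins=4)

for i, count in enumerate(histogram):
    low = -1.0 + i * 0.5
    print(f"[{low:+.1f}, {low + 0.5:+.1f}): {'#' * count}")
```

The two middle bins collect most of the samples, which is the same imbalance the plot shows.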

Then, I recorded some situations where I tried to prepare the model for more extreme conditions. This involved starting the recording with the car facing the edge of the road at a large angle, and then steering hard toward the center of the road. This recovery driving style was suggested by Udacity and by people on the course’s Slack channel as well. I also had success with it - the model drove much better after having learned these large turns.

The distribution of steering angles in this driving style can be seen below - notice how the large steering angles dominate it:

I then cropped the images to filter out some noise: trees, hills, the sky, and the hood of the car. A cropped image example can be seen below:
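As a rough sketch of this cropping step: removing a band of rows from the top (sky, trees, hills) and from the bottom (hood) is a single slice over the image rows. The `crop_rows` helper and the margins used here are hypothetical, not the values I used in the project.

```python
def crop_rows(image, top, bottom):
    """Drop `top` rows from the top and `bottom` rows from the bottom.

    `image` is any row-major sequence of rows, e.g. a nested list or a
    NumPy array of shape (height, width, channels).
    """
    height = len(image)
    if top < 0 or bottom < 0 or top + bottom >= height:
        raise ValueError("crop margins must be non-negative and leave at least one row")
    return image[top:height - bottom]


# Tiny 6x4 stand-in "image": every pixel stores its own row index.
image = [[row] * 4 for row in range(6)]
cropped = crop_rows(image, top=2, bottom=1)
print(len(cropped), "rows kept, starting at original row", cropped[0][0])
```

Because the function only slices along the first axis, the same call works unchanged on an image loaded as an array.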


After gathering the data, the network had to be trained: this was extremely time-consuming, as I didn’t have access to a high-end GPU. I trained the model overnight, usually on a few thousand images at once.

The architecture of the neural network I used was a simplified NVIDIA model.

In hindsight, I could have (should have) used a more precise region mask on the images to filter out additional noise by the side of the road (signs, poles, etc.), which would probably have improved the accuracy of the model.

All in all, there is room for improvement, but I will take a step back before proceeding, if I ever do. It was both frustrating and fun working on this project.

Special thanks to c0derabbit for their constant support!


Written on December 17, 2017

If you notice anything wrong with this post (factual error, rude tone, bad grammar, typo, etc.), and you feel like giving feedback, please do so by contacting me at samubalogh@gmail.com. Thank you!