Behavioral Cloning

Summary

This is the result of a Udacity project in which a deep learning model is trained to drive a vehicle autonomously. Udacity provides the simulation environment used to train and test the model. The video of the result can be viewed here; the numbers scrolling on the right represent the steering wheel angle.
In this project, I used the Keras Deep Learning library to train the car to follow the track.

Data Collection

The training data was collected using the training mode in the simulator developed by Udacity. The simulator was set to the smallest screen size and fastest graphics. While manually driving the car around the track, 3 images were saved every 0.1 seconds: one from the center camera, one from the left camera and one from the right camera. A driving_log.csv file was generated during this process that contains the paths to the image files as well as the steering wheel angle, which serves as the data label. The throttle was kept constant. My approach was to first drive 4 laps, trying my best to keep the car in the middle of the road. I then did 2 laps in which I purposely drove the car toward the left side of the track with recording turned off; as soon as I reached the edge of the road, I turned recording on and recovered back to the center. I did 2 more laps training the car to recover from the right side of the road. This allowed the model to learn not only to drive in the center of the road, but also to recover from both the left and right sides of the track. A sketch of the log file layout follows below.
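
Each row of driving_log.csv therefore pairs three image paths with one steering angle. The snippet below is a minimal sketch of how to confirm the column layout assumed by the loading code further down (columns 0-2 are image paths, column 3 is the steering angle); the relative file path is an assumption, and any remaining columns such as throttle, brake and speed are not used here.

# Peek at the first row of the driving log to confirm the column layout
# assumed below. The relative path is an assumption; adjust to your setup.
import csv

with open("driving_log.csv") as f:
    first_row = next(csv.reader(f))

print("center image:", first_row[0])
print("left image  :", first_row[1])
print("right image :", first_row[2])
print("steering    :", float(first_row[3]))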

Image Preprocessing

The following steps were performed to preprocess each image and reduce noise and dimensionality. First, each image was cropped, downsampled and reduced to a single colour channel, then flattened. Each flattened image was then normalized with min-max (greyscale) scaling. The resulting vectors were saved as features and the steering angles as labels. The data was then split into training, validation and test sets.

# Suppress deprecation warnings
import warnings
warnings.filterwarnings('ignore', category=DeprecationWarning)

# Load libraries
import argparse
import os
import sys
import csv
import base64
import json
import pickle
import math
import numpy as np
import h5py
import tensorflow as tf
import matplotlib.pyplot as plt
from tqdm import tqdm
from sklearn.model_selection import train_test_split

# Keras libraries
from keras.models import Sequential, model_from_json
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
from keras import backend as K

# Preprocess Data

# Define paths
path_to_folder = "/Users/Aman/Documents/SelfDrivingCarCourse"
path_to_files = "{}/driving_log.csv".format(path_to_folder)

# Import the data
raw_data = []
with open(path_to_files) as F:
    reader = csv.reader(F)
    for i in reader:
        raw_data.append(i) 

# Raw data file stats
data_len = len(raw_data)
print("Length of data file:", data_len)

# Initialize feature and label containers
features = []
labels = []

# Crop to the road region (rows 65-134), keep every 4th pixel in each
# direction and a single colour channel, then flatten into a 1-D vector
def flatten_image(data, j):
    img = plt.imread(data[j].strip())[65:135:4, 0:-1:4, 0]
    return img.flatten().tolist()

# Store features: each log row contributes the center, left and right images
for i in tqdm(range(len(raw_data)), unit='it'):
    for j in range(3):
        features.append(flatten_image(raw_data[i], j))

# Get number of feature vectors
item_len = len(features)

# Reshape each image to 18 x 80 x 1
features = np.array(features).reshape(item_len, 18, 80, 1)


# Store labels: all three camera images share the row's steering angle
for i in tqdm(range(len(raw_data)), unit='it'):
    for j in range(3):
        labels.append(float(raw_data[i][3]))
labels = np.array(labels)

# Get randomized datasets for training and test
X_train, X_test, y_train, y_test = train_test_split(
    features,
    labels,
    test_size=0.10,
    random_state=832289)

# Get randomized datasets for training and validation
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train,
    y_train,
    test_size=0.30,
    random_state=832289)

# Implement Min-Max scaling for greyscale image data
def normalize_greyscale(image_data):
    """
    Normalize the image data with Min-Max scaling to a range of [0.1, 0.9]
    :param image_data: The image data to be normalized
    :return: Normalized image data
    """
    a = 0.1
    b = 0.9
    greyscale_min = 0
    greyscale_max = 255
    return a + (((image_data - greyscale_min)*(b - a))/(greyscale_max - greyscale_min))

# Normalize data
X_train = X_train.astype('float32')
X_valid = X_valid.astype('float32')
X_test  = X_test.astype('float32')
X_train = normalize_greyscale(X_train)
X_valid = normalize_greyscale(X_valid)
X_test = normalize_greyscale(X_test)

Length of data file: 9739


100%|██████████| 9739/9739 [01:36<00:00, 101.15it/s]
100%|██████████| 9739/9739 [00:01<00:00, 6358.32it/s]
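
As a quick sanity check on the scaling above, min-max normalization should map raw pixel values of 0, 128 and 255 to roughly 0.1, 0.5 and 0.9. A minimal sketch, reusing the normalize_greyscale function defined above (the sample values are arbitrary):

# Sanity check: the scaling maps the greyscale range [0, 255] onto [0.1, 0.9]
sample_pixels = np.array([0, 128, 255], dtype='float32')
print(normalize_greyscale(sample_pixels))  # approximately [0.1, 0.5016, 0.9]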

Model

The model consists of 4 convolutional layers, 1 max pooling layer and 3 dense layers of 16 units each, followed by a single-unit output layer. With each successive convolutional layer, the number of filters was halved (i.e. 16, 8, 4, 2). After the fourth convolutional layer, max pooling was applied, followed by a dropout of 25%. After flattening the resulting matrix, the dense layers were applied with ReLU activation. I trained the model for 20 epochs. The pool size was (2, 2) and the convolution kernel was (3, 3). A Sequential model was used with the Adam optimizer and the mean_squared_error loss function.

How I decided the number and type of layers:
I tested the model with 2, 4 and 6 layers and found the performance to be best with 4.

How I tuned the hyperparameters:
For the first dropout (25%, applied after max pooling), I felt this retained most of the data for training while dropping enough to avoid overfitting. A second dropout of 50% was applied before the output layer to further prevent overfitting; I noticed that a higher dropout rate at this stage was more effective at reducing overfitting. I tried 5, 10, 20, 25 and 30 epochs and ended up using 20 epochs with a batch size of 128.

How I trained my model:
I trained my model by first driving the car 4 times around the course, trying to remain in the middle of the track as much as possible. I then did 2 laps for each side of the road, training the model to recover by turning the recording on and off. Using this technique, the model learned to keep the car in the middle of the track as much as possible, but also to recover when it drifted toward the edge. During training, the best model parameters were determined using the validation set.

How I evaluated my model:
I evaluated the model by checking its performance in autonomous mode. Wherever the car went off the track, I recorded additional training data for that portion of the track and retrained. During training, the test set was used to determine the final accuracy of the model.
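
In autonomous mode the simulator streams camera frames to the driving script (drive.py in the Udacity setup), which has to apply the same preprocessing before asking the model for a steering angle. The following is a minimal sketch of that per-frame prediction step, assuming the crop used in flatten_image and the normalize_greyscale function defined above; predict_steering is a hypothetical helper, not part of the provided starter code.

# Sketch of the per-frame prediction step used in autonomous mode.
# 'model' is the trained Keras model from this notebook and 'image'
# a 160 x 320 x 3 numpy array received from the simulator.
def predict_steering(model, image):
    # Same crop, downsampling and single-channel selection as flatten_image()
    img = image[65:135:4, 0:-1:4, 0].astype('float32')
    # Same min-max scaling as the training data
    img = normalize_greyscale(img)
    # The network expects a batch of shape (1, 18, 80, 1)
    return float(model.predict(img.reshape(1, 18, 80, 1), batch_size=1)[0][0])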

# Set seed for reproducibility
np.random.seed(1337) 

# Initialize model parameters
batch_size = 128
nb_classes = 1   # single continuous output: the steering angle
nb_epoch = 20

# Define convolutional filters
nb_filters1 = 16
nb_filters2 = 8
nb_filters3 = 4
nb_filters4 = 2

# Define size of pooling area for max pooling
pool_size = (2, 2)

# Define convolution kernel size
kernel_size = (3, 3)

# Construct model with convolutional layers
model = Sequential()

input_shape = X_train.shape[1:]

# Convolutional layers with ReLU activation
model.add(Convolution2D(nb_filters1, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters3, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters4, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))

# Max Pooling 
model.add(MaxPooling2D(pool_size=pool_size))

# Apply dropout of 25%
model.add(Dropout(0.25))

# Flatten matrix
model.add(Flatten())

# Dense layers with ReLU activation
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(16))
model.add(Activation('relu'))

# Apply dropout of 50%
model.add(Dropout(0.5))
model.add(Dense(nb_classes))

# Print summary of the model
model.summary()

# Compile model using Adam optimizer 
# Get loss using mean squared error
model.compile(loss='mean_squared_error',
              optimizer=Adam(),
              metrics=['accuracy'])

# Model training
history = model.fit(X_train, y_train,
                    batch_size=batch_size, nb_epoch=nb_epoch,
                    verbose=1, validation_data=(X_valid, y_valid))
score = model.evaluate(X_test, y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
convolution2d_1 (Convolution2D)  (None, 16, 78, 16)    160         convolution2d_input_1[0][0]      
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 16, 78, 16)    0           convolution2d_1[0][0]            
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 14, 76, 8)     1160        activation_1[0][0]               
____________________________________________________________________________________________________
activation_2 (Activation)        (None, 14, 76, 8)     0           convolution2d_2[0][0]            
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D)  (None, 12, 74, 4)     292         activation_2[0][0]               
____________________________________________________________________________________________________
activation_3 (Activation)        (None, 12, 74, 4)     0           convolution2d_3[0][0]            
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D)  (None, 10, 72, 2)     74          activation_3[0][0]               
____________________________________________________________________________________________________
activation_4 (Activation)        (None, 10, 72, 2)     0           convolution2d_4[0][0]            
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 5, 36, 2)      0           activation_4[0][0]               
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 5, 36, 2)      0           maxpooling2d_1[0][0]             
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 360)           0           dropout_1[0][0]                  
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 16)            5776        flatten_1[0][0]                  
____________________________________________________________________________________________________
activation_5 (Activation)        (None, 16)            0           dense_1[0][0]                    
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 16)            272         activation_5[0][0]               
____________________________________________________________________________________________________
activation_6 (Activation)        (None, 16)            0           dense_2[0][0]                    
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 16)            272         activation_6[0][0]               
____________________________________________________________________________________________________
activation_7 (Activation)        (None, 16)            0           dense_3[0][0]                    
____________________________________________________________________________________________________
dropout_2 (Dropout)              (None, 16)            0           activation_7[0][0]               
____________________________________________________________________________________________________
dense_4 (Dense)                  (None, 1)             17          dropout_2[0][0]                  
====================================================================================================
Total params: 8023
____________________________________________________________________________________________________
Train on 18406 samples, validate on 7889 samples
Epoch 1/20
18406/18406 [==============================] - 31s - loss: 0.2418 - acc: 0.4669 - val_loss: 0.1472 - val_acc: 0.5031
Epoch 2/20
18406/18406 [==============================] - 30s - loss: 0.1629 - acc: 0.5104 - val_loss: 0.1295 - val_acc: 0.5221
Epoch 3/20
18406/18406 [==============================] - 31s - loss: 0.1495 - acc: 0.5198 - val_loss: 0.1203 - val_acc: 0.5409
Epoch 4/20
18406/18406 [==============================] - 30s - loss: 0.1436 - acc: 0.5244 - val_loss: 0.1245 - val_acc: 0.5368
Epoch 5/20
18406/18406 [==============================] - 31s - loss: 0.1420 - acc: 0.5296 - val_loss: 0.1162 - val_acc: 0.5463
Epoch 6/20
18406/18406 [==============================] - 32s - loss: 0.1380 - acc: 0.5331 - val_loss: 0.1136 - val_acc: 0.5562
Epoch 7/20
18406/18406 [==============================] - 31s - loss: 0.1361 - acc: 0.5332 - val_loss: 0.1127 - val_acc: 0.5584
Epoch 8/20
18406/18406 [==============================] - 31s - loss: 0.1341 - acc: 0.5361 - val_loss: 0.1095 - val_acc: 0.5595
Epoch 9/20
18406/18406 [==============================] - 31s - loss: 0.1354 - acc: 0.5360 - val_loss: 0.1094 - val_acc: 0.5567
Epoch 10/20
18406/18406 [==============================] - 31s - loss: 0.1324 - acc: 0.5394 - val_loss: 0.1126 - val_acc: 0.5530
Epoch 11/20
18406/18406 [==============================] - 31s - loss: 0.1326 - acc: 0.5362 - val_loss: 0.1158 - val_acc: 0.5522
Epoch 12/20
18406/18406 [==============================] - 32s - loss: 0.1311 - acc: 0.5413 - val_loss: 0.1078 - val_acc: 0.5674
Epoch 13/20
18406/18406 [==============================] - 31s - loss: 0.1308 - acc: 0.5415 - val_loss: 0.1035 - val_acc: 0.5733
Epoch 14/20
18406/18406 [==============================] - 31s - loss: 0.1270 - acc: 0.5462 - val_loss: 0.1080 - val_acc: 0.5615
Epoch 15/20
18406/18406 [==============================] - 31s - loss: 0.1260 - acc: 0.5469 - val_loss: 0.1173 - val_acc: 0.5466
Epoch 16/20
18406/18406 [==============================] - 31s - loss: 0.1257 - acc: 0.5457 - val_loss: 0.1065 - val_acc: 0.5646
Epoch 17/20
18406/18406 [==============================] - 31s - loss: 0.1244 - acc: 0.5487 - val_loss: 0.1099 - val_acc: 0.5562
Epoch 18/20
18406/18406 [==============================] - 31s - loss: 0.1258 - acc: 0.5461 - val_loss: 0.1101 - val_acc: 0.5593
Epoch 19/20
18406/18406 [==============================] - 31s - loss: 0.1250 - acc: 0.5486 - val_loss: 0.1028 - val_acc: 0.5736
Epoch 20/20
18406/18406 [==============================] - 31s - loss: 0.1232 - acc: 0.5486 - val_loss: 0.1057 - val_acc: 0.5651
Test score: 0.103015983609
Test accuracy: 0.581793292266
# Save model architecture as a json file
json_string = model.to_json()

with open('model.json', 'w') as outfile:
    json.dump(json_string, outfile)

# Save weights
model.save_weights('./model.h5')
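
The saved architecture and weights can later be restored for inference with the model_from_json helper imported earlier. A minimal sketch, assuming the file names written above:

# Reload the saved model for inference
with open('model.json', 'r') as infile:
    loaded_model = model_from_json(json.load(infile))
loaded_model.compile(loss='mean_squared_error', optimizer=Adam())
loaded_model.load_weights('./model.h5')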

Conclusion

The final accuracy on the test set is 0.58. In practical terms, although the car does a good job in autonomous mode, there is still much room for improvement: the car swerves around the road quite a bit and has a tendency to make sharp, sudden movements. This car is not yet ready to hit the real streets.
