In this example we demonstrate (deep) convolutional neural networks and what they can do. We will first start with a pre-processing script that resizes our images.

Pre-processing: Resizing of images

First, we install and load the EBImage library.

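# Note: biocLite() is the legacy Bioconductor installer. On recent versions of R,
# Bioconductor packages are installed with BiocManager::install("EBImage") instead.
source("http://bioconductor.org/biocLite.R")
biocLite("EBImage")
require(EBImage)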
path="~/Dropbox/MST_COURSES/SS16/practicals/"
setwd(path)

Now let’s load the data in its raw form. Neural networks perform complex operations, so they need the data in specific structures, usually called tensors.

X <- read.csv(paste0(path, "olivetti_X.csv"), header = FALSE)
labels <- read.csv(paste0(path, "olivetti_y.csv"), header = FALSE)

The X array in the original dataset is a tensor (tensor is a fancy name for a multidimensional array) of shape (400, 64, 64): this means that it contains 400 samples, each a 64 x 64 matrix (read: an image). When in doubt about these things, just print out the first elements of the tensor and try to figure out the structure of the data from what you know. For instance, we know from the dataset description that we have 400 samples and that each image is 64 x 64 pixels.

Since a CSV file can only hold a two-dimensional table, the X we load here is this tensor flattened into a matrix of shape 400 x 4096. That is, each 64 x 64 matrix (image) has been converted (flattened) to a row vector of length 4096.

As for y (loaded here as labels), it is already a simple column vector of length 400, so there is no need to reshape it.
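
To make the flattening concrete, here is a minimal base-R sketch on a small made-up tensor (3 "images" of 4 x 4 instead of 400 of 64 x 64). The names x, flat and restored are hypothetical and not part of the practical. Note that R stores arrays column by column, so the order of the values within a flattened row may differ from the order used in the CSV file.

```r
# Toy tensor: 3 samples, each a 4 x 4 "image" (made-up values)
x <- array(seq_len(3 * 4 * 4), dim = c(3, 4, 4))

# Flatten: one row per sample, one column per pixel
flat <- matrix(x, nrow = dim(x)[1])
print(dim(flat))

# Reshape the first row back into a 4 x 4 matrix and compare with the original slice
restored <- matrix(flat[1, ], nrow = 4, ncol = 4)
print(identical(restored, x[1, , ]))
```

The same idea applies to the real data: each row of X holds 4096 values that can be reshaped back into a 64 x 64 image.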

Next we prepare a data frame to hold the images. Then, in a for loop, we resize each image to 28 x 28 pixels and store it as greyscale. We also set the column names: the first column is the label and the remaining columns are the pixels.

rs_df <- data.frame()

# Main loop: for each image, resize and set it to greyscale
for (i in 1:nrow(X)) {
    # Try-catch
    result <- tryCatch({
        # Image (as 1d vector)
        img <- as.numeric(X[i, ])
        # Reshape as a 64x64 image (EBImage object)
        img <- Image(img, dim = c(64, 64), colormode = "Grayscale")
        # Resize image to 28x28 pixels
        img_resized <- resize(img, w = 28, h = 28)
        # Get image matrix
        img_matrix <- img_resized@.Data
        # Coerce to a vector
        img_vector <- as.vector(t(img_matrix))
        # Add label
        label <- labels[i, ]
        vec <- c(label, img_vector)
        # Stack in rs_df using rbind
        rs_df <- rbind(rs_df, vec)
        # Print status
        print(paste("Done", i, sep = " "))
    },
    # Error function (just prints the error). Btw you should get no errors :)
    error = function(e) {
        print(e)
    })
}

# Set names. The first column is the label, the other columns are the pixels.
names(rs_df) <- c("label", paste("pixel", c(1:784)))
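
Growing a data frame with rbind inside a loop gets slow as the number of images increases. As an alternative, the sketch below pre-allocates a matrix and fills it row by row. It is only an illustration under stated assumptions: it uses random data instead of the Olivetti files, all names ending in _demo are made up, and shrink() is a hypothetical nearest-neighbour stand-in for EBImage's resize(), so that the example runs without extra packages.

```r
# Hypothetical stand-in for EBImage::resize(): nearest-neighbour subsampling
shrink <- function(m, w, h) {
  rows <- round(seq(1, nrow(m), length.out = w))
  cols <- round(seq(1, ncol(m), length.out = h))
  m[rows, cols]
}

set.seed(1)
n <- 5                                            # made-up number of images
X_demo <- matrix(runif(n * 64 * 64), nrow = n)    # one flattened image per row
labels_demo <- 0:(n - 1)

# Pre-allocate the result: 1 label column + 28 * 28 pixel columns
out <- matrix(NA_real_, nrow = n, ncol = 1 + 28 * 28)
for (i in seq_len(n)) {
  img <- matrix(X_demo[i, ], nrow = 64, ncol = 64)
  small <- shrink(img, 28, 28)
  out[i, ] <- c(labels_demo[i], as.vector(t(small)))
}

rs_demo <- as.data.frame(out)
names(rs_demo) <- c("label", paste0("pixel", 1:784))
print(dim(rs_demo))
```

In the practical itself, the call to shrink() would be replaced by the EBImage steps from the main loop.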

Then, as usual for any classification system, we need to split our dataset into training and test sets. The following code does that.

# Set seed for reproducibility purposes
set.seed(100)

# Shuffled df
shuffled <- rs_df[sample(1:400),]

# Train-test split
train_28 <- shuffled[1:360, ]
test_28 <- shuffled[361:400, ]

# Save train-test datasets
write.csv(train_28, paste0(path,"train_28.csv"), row.names = FALSE)
write.csv(test_28, paste0(path,"test_28.csv"), row.names = FALSE)

# Done!
print("Done!")
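
As a quick sanity check of the shuffle-and-split idea, here is a minimal sketch on a made-up data frame with 400 rows. The names demo_df, shuffled_demo, train_demo and test_demo are hypothetical stand-ins for the objects used in the practical.

```r
set.seed(100)
# Made-up data frame with 400 rows: 40 labels, 10 samples each
demo_df <- data.frame(label = rep(0:39, each = 10), pixel1 = runif(400))

# Shuffle the rows, then take the first 360 for training and the last 40 for testing
shuffled_demo <- demo_df[sample(nrow(demo_df)), ]
train_demo <- shuffled_demo[1:360, ]
test_demo <- shuffled_demo[361:400, ]

# The two sets should have the expected sizes and share no rows
print(c(nrow(train_demo), nrow(test_demo)))
print(length(intersect(rownames(train_demo), rownames(test_demo))))
```

Because the split is purely random, it does not guarantee that every label appears in the test set.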

Create, train and test a convolutional neural network

Now the fun starts! Let’s load some necessary packages first.

install.packages("drat", repos="https://cran.rstudio.com")
drat:::addRepo("dmlc")
# Install and load MXNet from the repository added above
install.packages("mxnet")
require(mxnet)

The first step is to load the datasets (in the format we already transformed them to).

train <- read.csv("train_28.csv")
test <- read.csv("test_28.csv")

The next step is to transform the datasets into the necessary format.

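# Convert the training data frame to a numeric matrix
train <- data.matrix(train)
# Pixels: drop the label column and transpose, so that each column is one image
train_x <- t(train[, -1])
# Labels: first column
train_y <- train[, 1]
# Reshape to a 4-d array: 28 x 28 pixels x 1 channel x number of samples
train_array <- train_x
dim(train_array) <- c(28, 28, 1, ncol(train_x))

# Same steps for the test set
test <- data.matrix(test)
test_x <- t(test[, -1])
test_y <- test[, 1]
test_array <- test_x
dim(test_array) <- c(28, 28, 1, ncol(test_x))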

Check the dimensions of train_array and test_array.
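
The reshaping step can be tried on made-up data first. The sketch below assumes a matrix with one label column followed by 784 pixel columns, like the one read from train_28.csv. The names demo, demo_x and demo_array are hypothetical.

```r
n <- 6  # made-up number of samples
# Stand-in for the training data: 1 label column + 784 pixel columns
demo <- matrix(runif(n * 785), nrow = n)

# Drop the label column and transpose: 784 x n, one column per image
demo_x <- t(demo[, -1])

# Reshape to 28 x 28 x 1 x n
demo_array <- demo_x
dim(demo_array) <- c(28, 28, 1, ncol(demo_x))
print(dim(demo_array))

# The first image in the array holds the same values as the first column of demo_x
print(all(demo_array[, , 1, 1] == demo_x[, 1]))
```

With the 360/40 split used in the pre-processing script, dim(train_array) should be 28 28 1 360 and dim(test_array) should be 28 28 1 40.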

The next steps show how to build a convolutional neural network layer by layer. Notice that the layer names are descriptive and that each layer has specific parameters (already discussed during the lecture). Try to identify the structure of the network and link it with the models we discussed. Also notice how each layer takes the previous layer as its input, so that in the end the NN_model variable describes the whole neural network.

data <- mx.symbol.Variable('data')
# 1st convolutional layer
conv_1 <- mx.symbol.Convolution(data = data, kernel = c(4, 4), num_filter = 20)
tanh_1 <- mx.symbol.Activation(data = conv_1, act_type = "tanh")
pool_1 <- mx.symbol.Pooling(data = tanh_1, pool_type = "max", kernel = c(2, 2), stride = c(2, 2))
# 2nd convolutional layer
conv_2 <- mx.symbol.Convolution(data = pool_1, kernel = c(4, 4), num_filter = 50)
tanh_2 <- mx.symbol.Activation(data = conv_2, act_type = "tanh")
pool_2 <- mx.symbol.Pooling(data=tanh_2, pool_type = "max", kernel = c(2, 2), stride = c(2, 2))
# 1st fully connected layer
flatten <- mx.symbol.Flatten(data = pool_2)
fc_1 <- mx.symbol.FullyConnected(data = flatten, num_hidden = 500)
tanh_3 <- mx.symbol.Activation(data = fc_1, act_type = "tanh")
# 2nd fully connected layer
fc_2 <- mx.symbol.FullyConnected(data = tanh_3, num_hidden = 40)
# Output. Softmax output since we'd like to get some probabilities.
NN_model <- mx.symbol.SoftmaxOutput(data = fc_2)
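
To help identify the structure, the sizes of the feature maps can be worked out by hand. The sketch below does the arithmetic in base R. It assumes that the convolutions use stride 1 and no padding and that pooling drops partial windows (which, to my understanding, are the MXNet defaults). The helper functions conv_out and pool_out are hypothetical and not part of MXNet.

```r
# Output size of a convolution with stride 1 and no padding (assumption)
conv_out <- function(size, kernel) size - kernel + 1
# Output size of a pooling layer without padding; partial windows are dropped (assumption)
pool_out <- function(size, kernel, stride) floor((size - kernel) / stride) + 1

s0 <- 28                   # input images are 28 x 28
s1 <- conv_out(s0, 4)      # after conv_1
s2 <- pool_out(s1, 2, 2)   # after pool_1
s3 <- conv_out(s2, 4)      # after conv_2
s4 <- pool_out(s3, 2, 2)   # after pool_2
print(c(s1, s2, s3, s4))   # 25 12 9 4

# Number of values fed into fc_1: 50 filters, each of size s4 x s4
print(50 * s4 * s4)        # 800
```

Under these assumptions, the flatten layer passes 800 values per image to the first fully connected layer.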

Next, we are ready to train such a model (be prepared that this might take some time). Let’s first declare some parameters.

# Set seed for reproducibility
mx.set.seed(100)

# Device used. CPU in this case.
devices <- mx.cpu()

# Train the model
model <- mx.model.FeedForward.create(NN_model,
                                     X = train_array,
                                     y = train_y,
                                     ctx = devices,
                                     num.round = 500,
                                     array.batch.size = 40,
                                     learning.rate = 0.01,
                                     momentum = 0.9,
                                     eval.metric = mx.metric.accuracy,
                                     epoch.end.callback = mx.callback.log.train.metric(100))

Finally, as with every model, we can test its accuracy on our test set.

# Predict labels
predicted <- predict(model, test_array)
# Assign labels
predicted_labels <- max.col(t(predicted)) - 1
# Get accuracy: fraction of test images whose predicted label matches the true label
mean(predicted_labels == test_y)
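
The label assignment can be illustrated on a small made-up example. The sketch assumes, as the t(predicted) call above suggests, that the prediction matrix has one row per class and one column per sample. The names probs, true_labels and pred are hypothetical.

```r
# Made-up output for 3 classes (rows) and 4 test samples (columns)
probs <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.8, 0.1,
                  0.2, 0.3, 0.5,
                  0.6, 0.3, 0.1), nrow = 3)
true_labels <- c(0, 1, 2, 1)

# Index of the largest probability per sample, shifted to 0-based labels
pred <- max.col(t(probs), ties.method = "first") - 1
print(pred)                        # 0 1 2 0

# Fraction of samples whose predicted label matches the true label
print(mean(pred == true_labels))   # 0.75
```

The subtraction of 1 is needed because R indices start at 1 while the labels start at 0.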

Your task

Let’s try to use a deep ConvNet to train on the MNIST dataset. In the code for the previous network, you need to adjust the output layer so that it gives 10 classes. Also use 5 by 5 kernels (not 4 by 4). The result resembles the LeNet neural network (as proposed by LeCun).

Remember that, as with the Olivetti faces, you need to pre-process your data so that each image is in tensor format (28, 28, 1).