AlexNet - Proving CNNs Can Do It (2012)

David Landup

AlexNet, developed by Alex Krizhevsky, Ilya Sutskever and Geoffrey Hinton, was released in 2012. At the time of writing, it's been a full decade since its release! It was a successor to LeNet-5 and competed in the 2012 ILSVRC challenge, beating the rest of the competitors by more than 10 percentage points in top-5 error rate! While LeNet-5 used a single convolutional block followed by average pooling, AlexNet used multiple stacked convolutional layers. The authors also highlighted how the non-saturating ReLU activation trains faster and produces more accurate networks than the saturating tanh, and ReLU has been used extensively ever since.

The depth of the network was essential to its performance, at the cost of longer training and more parameters. It starts out with a fairly large (11, 11) kernel and a (4, 4) stride, and ends with the much more common (3, 3) kernel and a much smaller stride. The second convolutional block takes a normalized and pooled representation of the first, so we'll add a MaxPooling2D and a BatchNormalization layer between them.
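A minimal Keras sketch of these first two blocks could look like the following. The 227x227 input shape and the filter counts (96 and 256) follow the original AlexNet paper, but they're assumptions here rather than something fixed by our dataset:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Assumed AlexNet-style input: 227x227 RGB images
inputs = keras.Input(shape=(227, 227, 3))

# First block: large (11, 11) kernel with a (4, 4) stride
x = layers.Conv2D(96, kernel_size=(11, 11), strides=(4, 4), activation='relu')(inputs)
x = layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(x)
x = layers.BatchNormalization()(x)

# Second block: smaller (5, 5) kernel over the pooled, normalized maps
x = layers.Conv2D(256, kernel_size=(5, 5), padding='same', activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(x)
x = layers.BatchNormalization()(x)
```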

The third, fourth and fifth convolutional layers are stacked on top of each other without any normalization or pooling in between. Finally, the feature maps are flattened and fed into a dense classifier, with large dropouts (0.5) sprinkled in. Since AlexNet was written for ImageNet, it has 1000 output classes, but for our dataset, we'll use an output of 10 classes.
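Continuing the sketch from above (it reuses the `x` tensor defined there), the remaining convolutional layers and the classifier head might look like this. The filter counts (384, 384, 256) and the 4096-unit dense layers follow the original paper, while the 10-class softmax output matches our dataset:

```python
# Blocks three to five: stacked (3, 3) convolutions, no pooling or
# normalization in between, followed by one final max pooling
x = layers.Conv2D(384, kernel_size=(3, 3), padding='same', activation='relu')(x)
x = layers.Conv2D(384, kernel_size=(3, 3), padding='same', activation='relu')(x)
x = layers.Conv2D(256, kernel_size=(3, 3), padding='same', activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(x)

# Dense classifier on top of the flattened maps, with 0.5 dropout
x = layers.Flatten()(x)
x = layers.Dense(4096, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(4096, activation='relu')(x)
x = layers.Dropout(0.5)(x)

# 10 classes for our dataset; the original used 1000 for ImageNet
outputs = layers.Dense(10, activation='softmax')(x)

model = keras.Model(inputs, outputs)
model.summary()
```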
