LeNet5 - A Blast from the Past (1998)
LeNet5 was created for recognizing hand-written digits from 28x28 images. It used tanh activations, and a radial basis function (RBF) layer as the final output instead of the more modern softmax, which outputs a probability for each class. Distinctly, the network was originally trained with Mean-Squared Error as its loss function!
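For the curious, here's a minimal, hedged sketch of what an RBF-style output layer could look like in Keras. The layer name (RBFOutput) and the trainable prototypes are illustrative assumptions; in the original paper, each class's prototype was a fixed, stylized bitmap of the digit rather than a learned weight:

import tensorflow as tf
from tensorflow import keras

class RBFOutput(keras.layers.Layer):
    # Illustrative sketch, not LeCun's exact formulation
    def __init__(self, num_classes, **kwargs):
        super().__init__(**kwargs)
        self.num_classes = num_classes

    def build(self, input_shape):
        # One prototype vector per class, matching the incoming feature width
        self.prototypes = self.add_weight(
            name='prototypes',
            shape=(self.num_classes, int(input_shape[-1])),
            initializer='random_normal',
            trainable=True)

    def call(self, inputs):
        # Squared Euclidean distance from the features to each prototype;
        # a *lower* output means a better match, which is why this pairs
        # with an MSE-style loss rather than cross-entropy
        diff = tf.expand_dims(inputs, 1) - self.prototypes
        return tf.reduce_sum(tf.square(diff), axis=-1)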
Other than these choices, the paper reads like a fairly standard modern deep learning paper. It discusses the data, the preprocessing steps taken, the architecture of the network, the hyperparameters, and the results. Now, the network itself was made for digit recognition, not for classifying churches from cassette players, and it was tuned to work well on that dataset. At the time, classifying digits was a daunting task, and we shouldn't expect the architecture to perform too well on something like Imagenette:
from tensorflow import keras

lenet5 = keras.Sequential([
    keras.layers.InputLayer(input_shape=(None, None, 3)),
    # Preprocessing made part of the model itself
    # 'tanh' didn't do well with data augmentation
    # Resizing images down to 28x28, since our input is 224x224
    # (keras.layers.experimental.preprocessing.Resizing on older TF versions)
    keras.layers.Resizing(28, 28),
    keras.layers.Conv2D(6, (5, 5), padding='same', activation='tanh'),
    keras.layers.AveragePooling2D((2, 2)),
    keras.layers.Conv2D(16, (5, 5), padding='same', activation='tanh'),
    keras.layers.AveragePooling2D((2, 2)),
    keras.layers.Conv2D(120, (5, 5), padding='same', activation='tanh'),
    keras.layers.Flatten(),
    keras.layers.Dense(84, activation='tanh'),
    # LeCun used a Radial Basis Function output, which isn't built into Keras
    # Modern networks use 'softmax', so we'll use that instead to
    # avoid having to define a custom output layer for now
    keras.layers.Dense(10, activation='softmax')
])

lenet5.compile(loss='sparse_categorical_crossentropy',
               optimizer=keras.optimizers.SGD(),
               metrics=['accuracy'])
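As a quick usage sketch, one way to feed Imagenette into this model is through TensorFlow Datasets. The 'imagenette/320px-v2' config name, the 224x224 resize, and the batch size are assumptions here; adjust them to your tfds version and hardware:

import tensorflow as tf
import tensorflow_datasets as tfds

def preprocess(image, label):
    # Resize to the 224x224 inputs mentioned above and scale to [0, 1];
    # the model's own Resizing layer then shrinks images to 28x28
    image = tf.image.resize(image, (224, 224)) / 255.0
    return image, label

train_ds = (tfds.load('imagenette/320px-v2', split='train', as_supervised=True)
            .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(64)
            .prefetch(tf.data.AUTOTUNE))

lenet5.fit(train_ds, epochs=10)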