Convolutional Neural Networks - Beyond Basic Architectures
So far, we've been working with a very distinctive, exemplary architecture. I've noted that it's fairly similar to the VGG architecture, which reigned supreme for a while but is slowly being phased out.
This sort of network is easy to understand because it's practically a 1-to-1 mapping of the most intuitive explanation of how CNNs work - through convolutional layers, pooling layers, a flattening layer and fully-connected layers. It's also the easiest to grasp with a limited understanding of how the visual cortex works. If you're a neuroscientist - you've likely cringed at the simplification of the inner workings of the visual cortex in earlier lessons. The concept of hierarchical representations is there - but that's where our implementation and the cortex part ways.
The architecture used so far is, in a sense, the most natural and gentle introduction to CNNs - conceptually and implementation-wise. It provides fairly decent performance (in terms of accuracy) but bottlenecks on the number of parameters. At this point, you've built multiple classifiers, all of them very capable. You've been introduced to the inner workings of classifiers, explored latent space visualization and biases, challenged the notion that overfitting is bad, explored the implications of data augmentation and context, implemented a custom loss function and metric, dealt with class imbalance, and even written a research-grade classifier for Invasive Ductal Carcinoma!
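To see where that parameter bottleneck comes from, we can count parameters by hand. Below is a rough sketch with illustrative layer shapes (a 224x224x3 input and a three-block conv stack - not the exact network from earlier lessons): the flatten-into-dense step dwarfs all the convolutional layers combined, which is exactly the weakness of this VGG-style design.

```python
# Parameter counts for a hypothetical VGG-style stack.
# Conv layer: (k*k*c_in + 1) * c_out  (one bias per filter)
# Dense layer: (n_in + 1) * n_out     (weight matrix plus biases)

def conv_params(k, c_in, c_out):
    # k x k kernel sliding over c_in channels, producing c_out filters
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    # fully-connected: every input connects to every output, plus biases
    return (n_in + 1) * n_out

# Three 3x3 conv layers, each followed by a 2x2 pooling that halves
# the spatial dimensions: 224 -> 112 -> 56 -> 28
convs = [
    conv_params(3, 3, 64),
    conv_params(3, 64, 128),
    conv_params(3, 128, 256),
]

# Flattening the final 28x28x256 feature map into a dense head
flat = 28 * 28 * 256
dense = dense_params(flat, 512) + dense_params(512, 10)

print(f"conv layers total: {sum(convs):,}")
print(f"dense head total:  {dense:,}")
```

The convolutional layers together hold well under a million parameters, while the single dense layer sitting on the flattened feature map holds over a hundred million - the fully-connected head dominates the model by two orders of magnitude.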