What is AlexNet? A Beginner's Guide to AlexNet
An introduction to the architecture of AlexNet: the 101 guide to the AlexNet neural network.
AlexNet is a deep convolutional neural network that has had a large impact on machine learning, specifically on the application of deep learning to computer vision.
It was the research work of Alex Krizhevsky, Geoffrey Hinton and Ilya Sutskever at the University of Toronto.
Its architecture is very similar to LeNet, proposed by Yann LeCun in 1998.
What makes it different?
Before the 2000s, mainly two activation functions were used: sigmoid and tanh. AlexNet used the ReLU activation function instead, which eases the vanishing gradient problem.
It used techniques like Dropout to reduce overfitting.
It was trained in parallel on multiple GPUs (two GTX 580 GPUs with 3 GB of memory each).
It used data augmentation.
It used LRN (local response normalization), although this did not make much difference.
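To make the vanishing gradient point concrete, here is a minimal sketch that compares the derivatives of sigmoid, tanh and ReLU at a few input values. The function names are hypothetical helpers written for this illustration, not part of any library. The idea is that sigmoid and tanh saturate for large inputs, so their gradients shrink toward zero, while the ReLU gradient stays at 1 for any positive input.

```python
import math


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid function: s(x) * (1 - s(x))."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def tanh_grad(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


# Print the three gradients side by side for growing inputs.
for x in (0.5, 2.0, 5.0, 10.0):
    print(
        f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  "
        f"tanh'={tanh_grad(x):.6f}  relu'={relu_grad(x):.1f}"
    )
```

Since backpropagation multiplies these derivatives layer by layer, many saturated sigmoid or tanh units in a row can shrink the gradient quickly, which is much less of an issue with ReLU on its active side.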
How did it gain popularity?
It became famous after winning the ImageNet ILSVRC-2012 competition by a very large margin.
Dataset
ImageNet (http://www.image-net.org/) is a dataset of over 15 million labelled high-resolution images belonging to roughly 22,000 categories.
The images were collected from the web and labelled by human labelers using Amazon’s Mechanical Turk crowd-sourcing tool.
Architecture
Number of convolutional layers - 5
Number of max-pool layers - 3
Number of fully connected (dense) layers - 2
Output layer with softmax activation function.
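The layer counts above can be checked with a little arithmetic. The sketch below traces the feature-map size through the 5 convolutional and 3 max-pool layers. The filter counts, kernel sizes, strides and padding are assumptions taken from the original paper rather than from this article, and the 227x227 input size is also an assumption (it is the size for which the arithmetic works out cleanly). The names `conv_out`, `LAYERS` and `trace_shapes` are hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kind, output channels, kernel, stride, padding)
# Values assumed from the original paper, not stated in this article.
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),
    ("conv2", "conv", 256, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 384, 3, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1),
    ("conv5", "conv", 256, 3, 1, 1),
    ("pool3", "pool", None, 3, 2, 0),
]


def trace_shapes(input_size: int = 227, input_channels: int = 3):
    """Return (name, height, width, channels) after each layer."""
    size, channels = input_size, input_channels
    shapes = []
    for name, kind, out_channels, kernel, stride, padding in LAYERS:
        size = conv_out(size, kernel, stride, padding)
        if kind == "conv":
            channels = out_channels  # pooling keeps the channel count
        shapes.append((name, size, size, channels))
    return shapes


shapes = trace_shapes()
for name, h, w, c in shapes:
    print(f"{name}: {h} x {w} x {c}")

# Size of the vector fed into the first fully connected layer.
_, h, w, c = shapes[-1]
flattened = h * w * c
print("flattened:", flattened)
```

Under these assumptions the last pooling layer produces a 6 x 6 x 256 volume, which is flattened before the two dense layers and the softmax output layer.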
You can use any weight initializer, but it is advisable to use:
glorot_normal or glorot_uniform with the tanh and sigmoid functions.
he_normal or he_uniform with the ReLU function.
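As a rough illustration of how these initializers differ, the sketch below computes the scale each one uses from the layer's fan-in and fan-out, following the formulas documented for the Keras initializers of the same names. The helper names are hypothetical, and `sample_he_normal` draws from a plain Gaussian for simplicity, whereas Keras uses a truncated normal.

```python
import math
import random


def glorot_uniform_limit(fan_in: int, fan_out: int) -> float:
    """Weights are drawn from U(-limit, limit)."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def glorot_normal_std(fan_in: int, fan_out: int) -> float:
    """Standard deviation of the Glorot normal initializer."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def he_uniform_limit(fan_in: int) -> float:
    """Weights are drawn from U(-limit, limit)."""
    return math.sqrt(6.0 / fan_in)


def he_normal_std(fan_in: int) -> float:
    """Standard deviation of the He normal initializer."""
    return math.sqrt(2.0 / fan_in)


def sample_he_normal(fan_in: int, fan_out: int, seed: int = 0):
    """Sample a fan_in x fan_out weight matrix (plain Gaussian, not truncated)."""
    rng = random.Random(seed)
    std = he_normal_std(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


# Example: a dense layer with 4096 inputs and 4096 outputs.
print("glorot_normal std:", glorot_normal_std(4096, 4096))
print("he_normal std:    ", he_normal_std(4096))
```

The He variants use a larger scale because ReLU zeroes out roughly half of its inputs, so the weights need more variance to keep the signal from shrinking layer after layer.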
Training
Epochs = 90
SGD with momentum was used, with the following hyperparameter values.
Learning rate = 0.01
Momentum = 0.9
It was trained for five to six days on two NVIDIA GeForce GTX 580 GPUs.
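The SGD-with-momentum update with the values above (learning rate 0.01, momentum 0.9) can be sketched on a toy problem. This is only an illustration of the classic momentum update rule on a one-dimensional quadratic, not the training code of the paper: it leaves out weight decay and any learning-rate schedule, and the function names are hypothetical.

```python
def sgd_momentum_step(w: float, v: float, grad: float,
                      lr: float = 0.01, momentum: float = 0.9):
    """One classic momentum update: v <- momentum * v - lr * grad; w <- w + v."""
    v_new = momentum * v - lr * grad
    return w + v_new, v_new


def minimize_quadratic(w0: float = 5.0, steps: int = 500) -> float:
    """Minimize the toy objective f(w) = w^2, whose gradient is 2 * w."""
    w, v = w0, 0.0
    for _ in range(steps):
        w, v = sgd_momentum_step(w, v, grad=2.0 * w)
    return w


w_final = minimize_quadratic()
print("final w:", w_final)
```

The velocity term `v` accumulates past gradients, so consistent gradient directions build up speed while oscillating directions partly cancel out.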
Results
On the ILSVRC-2010 test set, AlexNet achieved top-1 and top-5 error rates of 37.5% and 17.0%. The best performance achieved during the ILSVRC-2010 competition was 47.1% and 28.2%.
This shows how large the winning margin was.
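For readers new to these metrics, here is a small sketch of how a top-k error rate can be computed: a prediction counts as correct if the true label is among the k highest-scoring classes. The function name and the toy scores are made up for illustration, and the example uses k = 2 on four classes instead of k = 5 on 1000 classes.

```python
def top_k_error(scores, labels, k: int) -> float:
    """Fraction of samples whose true label is NOT among the k top-scoring classes."""
    errors = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            errors += 1
    return errors / len(labels)


# Toy data: 3 samples, 4 classes (hypothetical scores).
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.2, 0.5, 0.1],
]
labels = [1, 1, 3]

print("top-1 error:", top_k_error(scores, labels, k=1))
print("top-2 error:", top_k_error(scores, labels, k=2))
```

The top-k error can never be higher than the top-1 error, which is why the top-5 figures reported above are lower than the top-1 figures.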
References
- Original research paper - (https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
- Architecture image - (https://www.shutterstock.com/image-vector/vector-tech-icon-alexnet-convolutional-neural-1330331141)
- Code in Keras - (http://euler.stat.yale.edu/~tba3/stat665/lectures/lec18/notebook18.html)
About the author
Atharv Shah is an undergraduate pursuing a B.Tech in computer science. He is fascinated by artificial intelligence and its applications to real-world problems.
Reviews
If you found this interesting, we would really like to hear from you.
Ping us on Instagram/@the.blur.code
If you want articles on any other topics, DM us on Instagram.
Thanks for reading!!
Happy Coding