What is AlexNet? A Beginner's Guide to AlexNet

An introduction to the architecture of AlexNet: the 101 guide to the AlexNet neural network.

June 4, 2020 Atharv Shah


AlexNet is a deep convolutional neural network that has had a large impact on machine learning, and specifically on the application of deep learning to computer vision.

It came out of research by Alex Krizhevsky, Ilya Sutskever and Geoffrey Hinton at the University of Toronto.

Its architecture is very similar to that of LeNet, proposed by Yann LeCun in 1998, but it is considerably deeper and larger.

What makes it different?

Before AlexNet, most networks used the sigmoid or tanh activation functions. AlexNet used the ReLU activation function instead, which does not saturate for positive inputs; this mitigates the vanishing gradient problem and speeds up training.
It used techniques like dropout to reduce overfitting.
It was trained in parallel on multiple GPUs (two GTX 580 GPUs with 3 GB of memory each).
It used data augmentation.
It used LRN (local response normalization), although this did not make much difference.
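
To see why ReLU helps with vanishing gradients, it can be useful to compare the derivatives of the three activation functions. The following is a minimal sketch in plain Python, not code from the paper. The helper names are hypothetical, and chained_gradient is a deliberate simplification: it ignores the weights and assumes every layer sees the same input value.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # Derivative of sigmoid; its maximum value is 0.25, reached at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    # Derivative of tanh; at most 1.0, and close to 0 for large |x|.
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x):
    # Derivative of ReLU: exactly 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, x, depth):
    # Simplified backpropagation: multiply `depth` identical local derivatives.
    return grad_fn(x) ** depth


if __name__ == "__main__":
    for name, fn in [("sigmoid", sigmoid_grad), ("tanh", tanh_grad), ("relu", relu_grad)]:
        print(name, chained_gradient(fn, 2.0, 8))
```

Even in its best case the sigmoid contributes a factor of 0.25 per layer, so across eight layers the gradient shrinks by 0.25 to the power 8 (about 0.000015), while ReLU passes a factor of 1 for every positive input.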

How did it gain popularity?

It became famous after winning the ImageNet ILSVRC-2012 competition by a very large margin.

Dataset

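ImageNet (http://www.image-net.org/) is a dataset of over 15 million labelled high-resolution images belonging to roughly 22,000 categories. The ILSVRC competition uses a subset of it with about 1.2 million training images in 1,000 categories.

The images were collected from the web and labelled by human labelers using Amazon’s Mechanical Turk crowd-sourcing tool.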

Architecture

Figure: AlexNet architecture (source: Shutterstock)

Number of convolutional layers: 5

Number of max-pooling layers: 3

Number of fully connected (dense) layers: 2

Output layer: a fully connected layer with a softmax activation function over the 1,000 ILSVRC classes.
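
The layer counts above can be checked by tracing the feature-map size through the convolutional part of the network. This is a rough sketch rather than a full implementation: the filter counts, kernel sizes and strides follow the original paper, while the padding values and the 227 x 227 input size are assumptions (the paper states 224 x 224, but 227 is the size commonly used because it makes the arithmetic work out). The two-GPU split is ignored, and the names LAYERS, output_size and trace_shapes are hypothetical.

```python
# (name, kind, filters, kernel, stride, padding)
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0),
    ("pool1", "pool", None, 3, 2, 0),
    ("conv2", "conv", 256, 5, 1, 2),
    ("pool2", "pool", None, 3, 2, 0),
    ("conv3", "conv", 384, 3, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1),
    ("conv5", "conv", 256, 3, 1, 1),
    ("pool3", "pool", None, 3, 2, 0),
]


def output_size(size, kernel, stride, padding):
    # Standard formula for the spatial size after a convolution or pooling layer.
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size=227, input_channels=3):
    # Returns (name, height, width, channels) after each layer.
    size, channels = input_size, input_channels
    shapes = []
    for name, kind, filters, kernel, stride, padding in LAYERS:
        size = output_size(size, kernel, stride, padding)
        if kind == "conv":
            channels = filters  # pooling keeps the channel count unchanged
        shapes.append((name, size, size, channels))
    return shapes


if __name__ == "__main__":
    for name, height, width, channels in trace_shapes():
        print(f"{name}: {height} x {width} x {channels}")
```

Under these assumptions the last pooling layer produces a 6 x 6 x 256 volume, which is flattened into 9,216 values before it reaches the dense layers.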

You can use any weight initializer, but it is advisable to use:

glorot_normal or glorot_uniform with the tanh and sigmoid functions.
he_normal or he_uniform with the ReLU function.
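
To make the difference between these initializers concrete, here is a small sketch of the scale each one uses. The formulas match the ones given in the Keras documentation for the initializers of the same name; the function names below are hypothetical, and sample_he_normal draws from a plain Gaussian, whereas Keras uses a truncated normal.

```python
import math
import random


def glorot_normal_std(fan_in, fan_out):
    # Glorot (Xavier) normal: std depends on both input and output width.
    return math.sqrt(2.0 / (fan_in + fan_out))


def glorot_uniform_limit(fan_in, fan_out):
    # Glorot uniform: weights are drawn from [-limit, limit].
    return math.sqrt(6.0 / (fan_in + fan_out))


def he_normal_std(fan_in):
    # He normal: std depends only on the input width.
    return math.sqrt(2.0 / fan_in)


def he_uniform_limit(fan_in):
    # He uniform: weights are drawn from [-limit, limit].
    return math.sqrt(6.0 / fan_in)


def sample_he_normal(fan_in, fan_out, seed=0):
    # Illustrative only: a fan_in x fan_out weight matrix as nested lists.
    rng = random.Random(seed)
    std = he_normal_std(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


if __name__ == "__main__":
    print("he_normal std for a 9216-input dense layer:", he_normal_std(9216))
    print("glorot_uniform limit for 9216 -> 4096:", glorot_uniform_limit(9216, 4096))
```

The He variants use a larger scale than the Glorot variants for the same layer, which compensates for ReLU zeroing out the negative inputs.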

Training

Epochs = 90

It was trained using SGD with momentum, with the following hyperparameter values.

Learning rate = 0.01 (initial value)

Momentum = 0.9
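
The update rule behind these two values can be sketched in a few lines. This is the classical momentum update applied to a toy one-dimensional problem, as an illustration only: the original paper also applies weight decay and lowers the learning rate during training, both of which are omitted here, and the function names are hypothetical.

```python
def sgd_momentum_step(w, v, grad, learning_rate=0.01, momentum=0.9):
    # The velocity keeps a decaying sum of past gradients,
    # and the weight moves along the velocity.
    v = momentum * v - learning_rate * grad
    w = w + v
    return w, v


def minimize_quadratic(w0=1.0, steps=500):
    # Toy objective f(w) = w**2, whose gradient is 2 * w.
    w, v = w0, 0.0
    for _ in range(steps):
        w, v = sgd_momentum_step(w, v, 2.0 * w)
    return w


if __name__ == "__main__":
    print("w after 500 steps:", minimize_quadratic())
```

With a momentum of 0.9, each step reuses 90% of the previous step's velocity, so consistent gradient directions build up speed while oscillating ones partly cancel out.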

Training took five to six days on two NVIDIA GeForce GTX 580 GPUs.

Results

On the ILSVRC-2010 test set, AlexNet achieved top-1 and top-5 error rates of 37.5% and 17.0%. The best performance achieved during the ILSVRC-2010 competition was 47.1% and 28.2%.

That is an improvement of 9.6 and 11.2 percentage points respectively, which shows how large the winning margin was.
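
For readers new to these metrics: a prediction counts as a top-5 error when the correct label is not among the five classes the model scores highest. The sketch below shows how top-k error could be computed in plain Python. The scores and labels are made-up toy values for illustration, not ImageNet results, and the function name is hypothetical.

```python
def top_k_error(scores, labels, k):
    # Fraction of samples whose true label is not among the k highest scores.
    errors = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label not in ranked[:k]:
            errors += 1
    return errors / len(labels)


# Toy example: 3 samples, 4 classes (made-up numbers).
scores = [
    [0.10, 0.60, 0.20, 0.10],  # true class 1: ranked first
    [0.50, 0.10, 0.30, 0.10],  # true class 2: ranked second
    [0.25, 0.15, 0.50, 0.10],  # true class 3: ranked last
]
labels = [1, 2, 3]

if __name__ == "__main__":
    print("top-1 error:", top_k_error(scores, labels, 1))
    print("top-2 error:", top_k_error(scores, labels, 2))
```

Top-5 error is always less than or equal to top-1 error, because a prediction only has to place the correct label somewhere in its five best guesses.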

References

Original research paper - (https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)
Architecture image - (https://www.shutterstock.com/image-vector/vector-tech-icon-alexnet-convolutional-neural-1330331141)
Code in Keras - (http://euler.stat.yale.edu/~tba3/stat665/lectures/lec18/notebook18.html)

About the author

Atharv Shah is an undergraduate pursuing a B.Tech in computer science. He is fascinated by artificial intelligence and its applications to real-world problems.


Reviews

If you found this interesting, we would really like to hear from you.

Ping us at Instagram/@the.blur.code

If you want articles on any other topic, DM us on Instagram.

Thanks for reading! Happy coding!

Blur Code