ML | Getting Started With AlexNet

AlexNet is a deep convolutional neural network used for image classification. It consists of multiple convolutional and fully connected layers designed to extract features and perform classification efficiently. Its key features are:

ReLU activation enables faster training and better gradient flow.
Dropout reduces overfitting in fully connected layers.
Data augmentation helps in improving model generalization on image data.
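
To make the "better gradient flow" point concrete, here is a minimal pure-Python sketch comparing the derivative of the sigmoid with the derivative of ReLU at a few positive inputs. The helper names are illustrative and not part of any library.

```python
import math

def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)); never exceeds 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of max(0, x); the value at exactly 0 is taken as 0 here."""
    return 1.0 if x > 0 else 0.0

# The sigmoid gradient shrinks as the input grows, while the ReLU gradient
# stays at 1 for every positive input.
for x in (0.5, 2.0, 5.0):
    print(f"x={x}: sigmoid'={sigmoid_grad(x):.4f}, relu'={relu_grad(x):.1f}")
```

Because backpropagation multiplies these derivatives layer by layer, many small sigmoid gradients shrink the signal, whereas ReLU passes it through unchanged for active units.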

Architecture

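5 convolutional layers, with max pooling after the 1st, 2nd, and 5th.
Overlapping max pooling (3×3 window, stride 2), which the original paper reports as slightly reducing error compared with non-overlapping pooling.
3 fully connected layers; the first two use dropout for regularization.
The last fully connected layer feeds a softmax that produces the class probabilities.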
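
The layer list above can be traced numerically. The sketch below assumes the kernel sizes, strides, and padding commonly quoted for the original network (11×11 stride 4, 5×5 with padding 2, then 3×3 with padding 1) and a 227×227 input. None of these values are stated in this article, so treat them as assumptions. The paper itself quotes 224×224, but 227 is the size for which the arithmetic works out.

```python
def out_size(size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding) -- assumed values for the original AlexNet
layers = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),   # overlapping pooling: 3x3 window, stride 2
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]

size = 227  # assumed input height and width
trace = []
for name, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    trace.append((name, size))
    print(f"{name}: {size}x{size}")
```

Under these assumptions the final feature map is 6×6, which is what the first fully connected layer receives after flattening.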

Implementation

1. Importing Libraries

Import the required libraries:

tensorflow for building and training neural networks
matplotlib for visualizing results

Python

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation, Dropout, BatchNormalization
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt

2. Loading and Preprocessing CIFAR-10 Dataset

CIFAR-10 contains 60,000 32×32 RGB images across 10 classes (50,000 for training and 10,000 for testing).
Pixel values are scaled to [0, 1].
Labels are one-hot encoded for softmax classification.

Python

# Load CIFAR-10 data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Normalize pixel values
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
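
To show what the two preprocessing steps do to a single value, here is a small pure-Python sketch. The helpers `scale_pixel` and `one_hot` are hypothetical stand-ins written for illustration; they mirror the effect of dividing by 255 and of `to_categorical`, but are not the library implementations.

```python
def scale_pixel(value):
    """Map an 8-bit pixel intensity (0-255) into the range [0, 1]."""
    return value / 255.0

def one_hot(label, num_classes=10):
    """Return a list with 1.0 at the label's index and 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

print(scale_pixel(0), scale_pixel(255))  # lower and upper ends of the range
print(one_hot(3))                        # class index 3 as a 10-element vector
```

The one-hot vector has the same length as the softmax output, so the loss can compare the two element by element.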

3. Defining the AlexNet Model (Adjusted for CIFAR-10)

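Adapted for CIFAR-10: takes 32×32 RGB inputs and produces 10 output classes, using 3×3 kernels with stride 1 in every convolutional layer.
Smaller FC layers (1024 and 512 units): fewer parameters reduce the risk of overfitting on a small dataset.
Uses ReLU activations, dropout, batch normalization after the first two pooling layers, and softmax in the final layer. Pooling here is non-overlapping 2×2 with stride 2, unlike the overlapping pooling described under Architecture.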

Python

model = Sequential()

# Layer 1
model.add(Conv2D(96, kernel_size=(3,3), strides=(1,1), input_shape=(32,32,3), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(BatchNormalization())

# Layer 2
model.add(Conv2D(256, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(BatchNormalization())

# Layer 3
model.add(Conv2D(384, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(Activation('relu'))

# Layer 4
model.add(Conv2D(384, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(Activation('relu'))

# Layer 5
model.add(Conv2D(256, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))

# Flatten
model.add(Flatten())

# Fully Connected Layer 1
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))

# Fully Connected Layer 2
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))

# Output Layer
model.add(Dense(10))
model.add(Activation('softmax'))
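
As a sanity check on the model defined above, its size can be estimated by hand. The sketch below is a pure-Python tally that assumes 'same' padding keeps the spatial size in each convolution, that each 2×2 pooling halves it, and that batch normalization stores four values per channel (scale, offset, moving mean, moving variance). The helper names are illustrative.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a square-kernel convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weights plus biases of a fully connected layer."""
    return in_units * out_units + out_units

def bn_params(channels):
    """Scale, offset, moving mean and moving variance per channel."""
    return 4 * channels

size, channels, total = 32, 3, 0

# (output channels, pooling after?, batch norm after?) for the five conv layers
conv_spec = [(96, True, True), (256, True, True),
             (384, False, False), (384, False, False), (256, True, False)]

for out_ch, pool, bn in conv_spec:
    total += conv_params(3, channels, out_ch)
    channels = out_ch
    if pool:
        size //= 2          # 2x2 pooling with stride 2 halves height and width
    if bn:
        total += bn_params(channels)

flat_units = size * size * channels   # input width of the first Dense layer

units = flat_units
for out_units in (1024, 512, 10):
    total += dense_params(units, out_units)
    units = out_units

print("Flattened units:", flat_units)
print("Total parameters:", total)
```

Calling `model.summary()` on the Keras model is the authoritative way to confirm these figures; the tally is only meant to show where the parameters come from.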

4. Compiling the Model

Use the Adam optimizer with categorical_crossentropy loss, which suits multi-class classification with one-hot encoded labels.

Python

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
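
For intuition about the loss named in the compile step, the following pure-Python sketch computes softmax and categorical cross-entropy for one sample. It is a simplified illustration, not Keras's implementation; the `eps` clipping value is an assumption added here to avoid taking the log of zero.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true class (one-hot y_true)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# An untrained 10-class model that predicts every class equally
uniform = softmax([0.0] * 10)
y_true = [0.0] * 10
y_true[3] = 1.0

loss_uniform = categorical_crossentropy(y_true, uniform)
print(f"Loss for a uniform prediction: {loss_uniform:.4f}")
```

A uniform prediction over 10 classes gives a loss of ln(10), so a first-epoch loss near that value suggests the model has not yet learned anything useful.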

5. Training the Model

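Train for 15 epochs with a batch size of 128, holding out 20% of the training data for validation.
Training for more epochs may improve accuracy; watch the validation accuracy for signs of overfitting.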

Python

history = model.fit(x_train, y_train,
                    batch_size=128,
                    epochs=15,
                    validation_split=0.2,
                    verbose=1)
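
To see what these settings mean in terms of sample counts, here is a short sketch of the arithmetic. It assumes the standard 50,000-image CIFAR-10 training set. Keras takes the validation samples from the end of the arrays, before shuffling.

```python
import math

num_samples = 50_000        # CIFAR-10 training images
validation_split = 0.2
batch_size = 128

# Index at which the arrays are split into training and validation parts
split_at = int(num_samples * (1.0 - validation_split))

train_samples = split_at
val_samples = num_samples - split_at
steps_per_epoch = math.ceil(train_samples / batch_size)

print("Training samples:  ", train_samples)
print("Validation samples:", val_samples)
print("Batches per epoch: ", steps_per_epoch)
```

The last batch of each epoch is smaller than 128 whenever the training sample count is not a multiple of the batch size.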

6. Evaluating the Model

Evaluate the trained model on the held-out test set to measure its loss and accuracy.

Python

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f'Test Accuracy: {test_acc:.4f}')

Output:

Test Accuracy: 0.7387
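
The accuracy metric reported above is the fraction of samples whose highest-probability class matches the true class. The sketch below illustrates that calculation in pure Python on a tiny made-up 3-class example; the values are placeholders for illustration, not model outputs.

```python
def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of samples where the predicted class equals the true class."""
    correct = sum(argmax(t) == argmax(p)
                  for t, p in zip(y_true_one_hot, y_pred_probs))
    return correct / len(y_true_one_hot)

# Placeholder labels and predicted probabilities (illustrative only)
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.5, 0.3, 0.2], [0.2, 0.5, 0.3]]

acc = accuracy(y_true, y_pred)
print(f"Accuracy: {acc:.2f}")  # 3 of the 4 samples are classified correctly
```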

7. Plotting Training & Validation Accuracy

Plots training and validation accuracy to visualize model performance over epochs.

Python

plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('AlexNet on CIFAR-10 (GPU)')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True)
plt.show()

Output: a line plot of training and validation accuracy per epoch, titled "AlexNet on CIFAR-10 (GPU)".
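
Besides plotting, the same history data can be inspected numerically. The sketch below uses a hypothetical helper, `best_epoch`, to find the epoch with the highest validation accuracy and the gap to training accuracy at that point. The dictionary holds made-up placeholder values with the same keys as `history.history`; with a real run you would pass `history.history` instead.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return the (1-based epoch, value) at which the metric peaks."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key=lambda i: values[i])
    return best_index + 1, values[best_index]

# Placeholder values for illustration only, not results from this model
example_history = {
    "accuracy":     [0.45, 0.62, 0.71, 0.78, 0.84],
    "val_accuracy": [0.50, 0.63, 0.68, 0.70, 0.69],
}

epoch, value = best_epoch(example_history)
gap = example_history["accuracy"][epoch - 1] - value

print(f"Best validation accuracy {value:.2f} at epoch {epoch}")
print(f"Train/validation gap at that epoch: {gap:.2f}")
```

A training accuracy that keeps rising while validation accuracy flattens or falls is the usual sign that further epochs are overfitting.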

Advantages

Uses ReLU activation for faster training compared to traditional tanh/sigmoid.
Applies dropout to reduce overfitting during training.
Uses GPU-based parallel computation for faster processing; the original network was trained across two GPUs.
Uses overlapping max pooling to improve generalization and performance.

Disadvantages

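Has a large number of parameters (about 60 million in the original network), making it memory-intensive.
Requires high computational resources for training.
Has a hand-designed, non-modular architecture, unlike later networks built from repeated blocks or found by automated architecture search.
Tends to overfit on small datasets.
The original design predates later architectural improvements such as batch normalization and residual connections.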
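
The parameter count behind the first point can be tallied by hand. The sketch below assumes the layer sizes usually quoted for the original ImageNet network (1000 output classes, 4096-unit fully connected layers, and the two-group split of the 2nd, 4th, and 5th convolutions that came from training on two GPUs). These values are not given in this article, so treat them as assumptions.

```python
def conv_params(kernel, in_ch, out_ch, groups=1):
    """Weights plus biases; each filter sees only in_ch / groups input channels."""
    return kernel * kernel * (in_ch // groups) * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weights plus biases of a fully connected layer."""
    return in_units * out_units + out_units

# Assumed layer configuration of the original AlexNet
layers = {
    "conv1": conv_params(11, 3, 96),
    "conv2": conv_params(5, 96, 256, groups=2),
    "conv3": conv_params(3, 256, 384),
    "conv4": conv_params(3, 384, 384, groups=2),
    "conv5": conv_params(3, 384, 256, groups=2),
    "fc6": dense_params(6 * 6 * 256, 4096),
    "fc7": dense_params(4096, 4096),
    "fc8": dense_params(4096, 1000),
}

total = sum(layers.values())
fc_share = sum(v for name, v in layers.items() if name.startswith("fc")) / total

for name, count in layers.items():
    print(f"{name}: {count:,}")
print(f"Total: {total:,} ({fc_share:.1%} in fully connected layers)")
```

Under these assumptions almost all of the parameters sit in the fully connected layers, which is why shrinking them is the usual first step when adapting the network to a small dataset.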

Applications

Used for image classification of objects in images.
Acts as a feature extractor for transfer learning tasks.
Serves as a backbone for object detection models.
Applied in medical imaging for detecting abnormalities.
Used in facial recognition and emotion detection systems.
Helps in identifying objects in autonomous driving systems.
