CASE STUDY 09

Fashion-MNIST Classification

Python, PyTorch, Deep Learning, CNN, Vision Transformer

Role

ML Engineer

Year

2024

Context

Academic Project

01.
The Problem

Choosing a model for an image task means trading accuracy against complexity. This project aimed to rigorously benchmark simple MLPs, CNNs, and Vision Transformers on a standard dataset, Fashion-MNIST, to show how performance changes with model complexity.

02.
The Approach

I implemented a modular PyTorch pipeline to train and evaluate multiple architectures: PCA + MLP, standard CNNs, and patch-based Vision Transformers. I used consistent validation/test splits and metrics (Accuracy, Macro F1) to ensure fair comparison.
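The two metrics named above, Accuracy and Macro F1, can be sketched in a few lines. This is a minimal illustration in plain Python, not the project's actual evaluation code; the function names `accuracy` and `macro_f1` and the `num_classes` parameter are assumptions made for the example. With PyTorch tensors, labels and predictions could be converted with `.tolist()` before calling these helpers.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int = 10) -> float:
    """Unweighted mean of per-class F1 scores.

    num_classes defaults to 10 because Fashion-MNIST has 10 classes
    (hypothetical parameter, chosen for this sketch).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); a class that never appears in the
        # labels or the predictions contributes 0 to the average.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example with 3 classes (made-up labels, not project results).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, num_classes=3)
```

Macro F1 weights every class equally regardless of how many samples it has, so it exposes weak per-class performance that overall accuracy can hide.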

03.
The Result

The study highlighted that while Transformers are powerful, CNNs remain highly efficient at small image scales such as Fashion-MNIST's 28×28 grayscale inputs. The project also serves as a solid, flexible codebase for future computer vision experiments.

Source Code
