
MNIST CNN Accelerator Design

Overview

This project focuses on designing a low-power CNN accelerator tailored for the MNIST dataset. By implementing efficient memory access and resource management techniques, the design minimizes power consumption while achieving high inference performance.

System Block Diagram

Below is a simplified version of the overall system block diagram:

[Figure: Overall Block Diagram]

Key Design Features

  1. Memory Access Minimization in PE Array
    To reduce power consumption, the design minimizes external memory access by keeping data in on-chip buffers and reusing it across the PE array (a rough read-count model follows this list).

    [Figure: PE Array Memory Access Minimization]

  2. FIFO, MaxPooling, and ReLU Integration
    A tightly coupled FIFO, MaxPooling, and ReLU module keeps data streaming through pooling and activation while maintaining flexibility for hardware optimization (a streaming sketch follows this list).

    [Figure: FIFO & MaxPooling, ReLU]

  3. Shift Buffer Utilization
    Shift buffers manage the input data for convolution operations, reducing redundant memory reads and improving computational efficiency (a line-buffer sketch follows this list).

    [Figure: Shift Buffer Utilization]

  4. Fully Connected (FC) Layer Implementation
    The FC layer is implemented as a dedicated computation module that allocates resources efficiently and exploits parallelism in its multiply-accumulate datapath (a MAC-lane sketch follows this list).

    [Figure: FC Layer]
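
To make the memory-access argument concrete, here is a minimal Python sketch, a back-of-the-envelope model rather than the project's RTL. The function name `conv_read_counts` and the 28x28 input / 3x3 kernel sizes are illustrative assumptions; the point is that buffering inputs on chip turns one read per overlapping window into one read per pixel.

```python
def conv_read_counts(h, w, k):
    """Count external-memory input reads for one stride-1, valid convolution.

    naive:    every k*k window is fetched straight from external memory,
              so overlapping windows re-read the same pixels.
    buffered: each pixel is fetched once into on-chip buffers and then
              reused by the PE array for every window that needs it.
    """
    out_h, out_w = h - k + 1, w - k + 1
    naive = out_h * out_w * k * k
    buffered = h * w
    return naive, buffered

naive, buffered = conv_read_counts(28, 28, 3)  # MNIST-sized input, 3x3 kernel
print(f"naive reads:    {naive}")     # 6084
print(f"buffered reads: {buffered}")  # 784 -> roughly 7.8x fewer external reads
```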
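
The FIFO/MaxPooling/ReLU coupling can be pictured with a small behavioral model. The sketch below is a software analogue of the streaming dataflow, not the actual module: a one-row FIFO holds each even row while the following odd row streams in, so each ReLU'd 2x2 maximum is emitted the moment its window completes. The even-`width` requirement and all names are illustrative.

```python
from collections import deque

def relu(x):
    return x if x > 0 else 0

def stream_maxpool_relu(pixels, width):
    """2x2/stride-2 MaxPool + ReLU over a row-major stream (width must be even).

    A one-row FIFO buffers each even row while the next odd row streams in,
    so every pooled output is produced as soon as its 2x2 window completes --
    no full feature map is ever stored.
    """
    fifo = deque()   # line FIFO: pixels of the current even row
    out = []
    left_max = 0     # running max of the left column pair of the window
    for i, px in enumerate(pixels):
        row, col = divmod(i, width)
        if row % 2 == 0:
            fifo.append(px)                    # buffer the even row
            continue
        top = fifo.popleft()                   # pixel directly above px
        if col % 2 == 0:
            left_max = max(top, px)            # left half of the 2x2 window
        else:
            out.append(relu(max(left_max, top, px)))  # window complete
    return out

# 4x4 ramp 0..15 -> pooled output [5, 7, 13, 15]
print(stream_maxpool_relu(list(range(16)), width=4))
```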
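
The shift-buffer idea is, in effect, a line buffer plus a shift-register window. The following Python sketch is a behavioral model under the assumption of a 3x3 kernel (names are illustrative): each input pixel is read exactly once, yet a full 3x3 window is available at every position.

```python
from collections import deque

def sliding_windows_3x3(image):
    """Yield every 3x3 window of `image` while reading each pixel once.

    Two line buffers hold the previous two rows; a 3-deep shift register of
    column triples slides across the row, mirroring how a hardware shift
    buffer feeds the convolution PEs one window per cycle.
    """
    h, w = len(image), len(image[0])
    line0 = [0] * w                       # row r-2
    line1 = [0] * w                       # row r-1
    for r in range(h):
        window = deque(maxlen=3)          # last 3 column triples
        for c in range(w):
            px = image[r][c]              # the only memory read for this pixel
            window.append((line0[c], line1[c], px))
            line0[c], line1[c] = line1[c], px   # shift the line buffers down
            if r >= 2 and c >= 2:
                yield [list(t) for t in zip(*window)]   # 3 rows x 3 cols

img = [[r * 4 + c for c in range(4)] for r in range(4)]
print(next(sliding_windows_3x3(img)))   # [[0, 1, 2], [4, 5, 6], [8, 9, 10]]
```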
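
For the FC layer, the parallelism claim can be pictured as several MAC lanes sharing one dot product. This sketch models that partitioning; the lane count, names, and bias term are illustrative assumptions, not the project's actual module interface.

```python
def fc_layer(x, weights, biases, lanes=4):
    """Fully connected layer as `lanes` parallel MAC units per output neuron.

    Each lane accumulates a strided slice of the dot product; in hardware the
    lanes run concurrently, and a small adder tree combines their partial sums.
    """
    out = []
    for w_row, bias in zip(weights, biases):
        partial = [0] * lanes                     # one accumulator per MAC lane
        for i, (xi, wi) in enumerate(zip(x, w_row)):
            partial[i % lanes] += xi * wi         # concurrent MACs in hardware
        out.append(sum(partial) + bias)           # adder tree + bias
    return out

# Tiny demo: 2 output neurons over a 4-element input vector.
print(fc_layer([1, 2, 3, 4], [[1, 0, 1, 0], [0, 1, 0, 1]], [0, 1]))  # [4, 7]
```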


Performance Results

Inference on 1 Image

The accelerator completes inference on a single MNIST image with low latency, as captured in the result below.

[Figure: 1 Image Inference]

Inference on 1000 Images

The system sustains consistent per-image performance across a run of 1000 images, demonstrating the scalability and robustness of the design.

[Figure: 1000 Images Inference]


Key Design Differentiators

  1. Low-Power Design

    • Efficient memory access techniques (PE Array + Shift Buffers).
    • Optimized control logic that reduces idle cycles in the processing elements.
  2. Resource Utilization

    • Reuse of FIFO buffers and PE arrays across multiple operations.
    • Minimal external memory bandwidth usage by exploiting data locality.
  3. Scalable Architecture

    • Modular design supports easy extension to larger datasets or different model architectures.
    • Lightweight implementation suitable for resource-constrained environments.
  4. Hardware-Software Co-Design

    • Integration of software control logic for flexible CNN model configuration.
    • Custom AXI4 interface for seamless communication between hardware and software.

Conclusion

This project demonstrates a hardware accelerator for MNIST CNN inference that is optimized for low power consumption and high efficiency. The techniques implemented here extend naturally to more complex deep learning models, making the design a useful reference for future hardware projects.