Developed a CNN architecture from scratch to classify cats and dogs from spectrogram images of their audio. Used the Adam optimizer and data augmentation to improve generalization. The dataset of 277 images was divided into training, validation, and test sets using an 80/10/10 split. With data augmentation and hyperparameter tuning, the network achieved approximately 98.19% training accuracy and 96.30% validation accuracy. Figures 1 and 2 below show plots of training and validation accuracy and loss, respectively.
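
As a rough illustration of the 80/10/10 division of the 277 images, the following minimal sketch shuffles image indices and cuts them into three groups. The function name `split_indices`, the fixed seed, and the choice to round the training and validation counts down (leaving the remainder for the test set) are assumptions made for this example, not details taken from the original project.

```python
import random


def split_indices(n_items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle indices 0..n_items-1 and split them into train/val/test lists.

    Assumption: train and validation counts are rounded down, and every
    remaining index goes to the test set.
    """
    indices = list(range(n_items))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)

    n_train = int(n_items * train_frac)
    n_val = int(n_items * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(277)
print(len(train_idx), len(val_idx), len(test_idx))
```

Under these rounding assumptions the split yields 221 training, 27 validation, and 29 test images. Those counts are at least consistent with the reported accuracies (217/221 is about 98.19% and 26/27 is about 96.30%), although the exact split used in the project is not stated.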