A bridge between sign language users and the world.
HandTalk is a machine learning-based solution that bridges the communication gap between sign language users and non-users. It uses the Sign Language MNIST dataset to classify static hand gestures into their corresponding letters. The project compares several models and concludes that a Convolutional Neural Network (CNN) performs best, with an accuracy of 93.52%.
- Dataset Used: Sign Language MNIST
- The dataset comprises grayscale 28x28 images representing 24 static hand gestures (excluding 'J' and 'Z', which require motion).
Data Preprocessing:
- Normalization: Pixel values are normalized to the range [0, 1].
- Dimensionality Reduction: Principal Component Analysis (PCA) reduces features from 784 to 50 for efficient training.
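The two preprocessing steps above can be sketched as follows; the array shapes and random stand-in data are illustrative, not the actual dataset loading code:

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for the flattened 28x28 Sign Language MNIST images (784 features).
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(500, 784)).astype("float32")

# Normalization: scale pixel values from [0, 255] to [0, 1].
X_norm = X / 255.0

# Dimensionality reduction: PCA from 784 features down to 50.
pca = PCA(n_components=50)
X_reduced = pca.fit_transform(X_norm)

print(X_reduced.shape)  # (500, 50)
```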
Model Training:
- Each model is trained on the dataset, followed by GridSearchCV for hyperparameter optimization.
- Key hyperparameters include penalty, max iterations, learning rate, and activation functions, tuned to improve performance.
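A minimal sketch of the GridSearchCV step for one model (logistic regression), assuming a small stand-in dataset and an illustrative grid; the actual grid values used in the project may differ:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Small stand-in dataset so the example is self-contained.
X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixels to [0, 1]

# Illustrative grid over penalty, C, and max iterations.
param_grid = {
    "penalty": ["l2"],
    "C": [0.01, 0.1, 1.0],
    "max_iter": [200],
}
search = GridSearchCV(LogisticRegression(solver="saga"), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```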
Evaluation Metrics:
- Models are evaluated using accuracy, precision, recall, and F1-score metrics on the test dataset.
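The four metrics can be computed with scikit-learn as below; weighted averaging across the 24 classes is an assumption consistent with multi-class reporting:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy labels standing in for real test-set predictions.
y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]

print(accuracy_score(y_true, y_pred))                         # 0.75
print(precision_score(y_true, y_pred, average="weighted"))
print(recall_score(y_true, y_pred, average="weighted"))
print(f1_score(y_true, y_pred, average="weighted"))
```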
CNN Model Architecture: The CNN consists of convolutional layers for feature extraction, pooling layers for dimensionality reduction, and fully connected layers for classification.
| Layer | Parameters |
|---|---|
| Conv2D | Filters: 32, Kernel: (3,3), Activation: ReLU |
| MaxPooling2D | Pool size: (2,2), Strides: 2 |
| Conv2D | Filters: 128, Kernel: (3,3), Activation: ReLU |
| MaxPooling2D | Pool size: (2,2), Strides: 2 |
| Dense | Units: 128, Activation: Tanh |
| Output Layer | Units: 25, Activation: Softmax |
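The architecture in the table can be expressed as a Keras model; the `Flatten` layer between pooling and dense layers and the optimizer settings (Adam, learning rate 0.0005, sparse categorical crossentropy, matching the hyperparameter table) are carried over from the project description, while the exact layer ordering is an assumption:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2), strides=2),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2), strides=2),
    layers.Flatten(),
    layers.Dense(128, activation="tanh"),
    layers.Dense(25, activation="softmax"),  # 25 label slots (label 9 unused)
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.0005),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.summary()
```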
Model Selection:
- Based on performance metrics, CNN is identified as the best model with 93.52% accuracy, showcasing its robustness in image-based classification.
Prediction:
- The trained CNN model is used to predict unseen gestures.
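A sketch of mapping a prediction back to a letter; the model loading path is hypothetical, and the probability vector below stands in for a real network output. Sign Language MNIST labels 0-24 map directly to A-Y (label 9, 'J', is unused because J requires motion):

```python
import numpy as np

# A trained model would be loaded here, e.g.:
# model = keras.models.load_model("handtalk_cnn.h5")  # hypothetical path
# probs = model.predict(image[np.newaxis, ..., np.newaxis])[0]

# Illustrative probability vector in place of a real prediction:
probs = np.zeros(25)
probs[3] = 1.0  # pretend the network is confident about class 3

label = int(np.argmax(probs))
letter = chr(ord("A") + label)
print(letter)  # D
```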
Best Hyperparameters (via GridSearchCV):

| Model | Hyperparameter | Best Value |
|---|---|---|
| Logistic Regression | Penalty | L2 |
| | Solver | Saga |
| | C | 0.01 |
| | Max Iterations | 200 |
| Decision Tree | Criterion | Entropy |
| | Max Depth | None |
| | Min Samples Leaf | 1 |
| | Min Samples Split | 2 |
| | Splitter | Best |
| Random Forest | Criterion | Gini |
| | Max Depth | None |
| | Min Samples Leaf | 1 |
| | Min Samples Split | 2 |
| | Number of Estimators | 100 |
| Perceptron | Penalty | L1 |
| | Alpha | 0.001 |
| | Max Iterations | 1000 |
| | Tolerance | 0.001 |
| Multi-Layer Perceptron | Activation | ReLU |
| | Alpha | 0.001 |
| | Hidden Layer Sizes | (128,) |
| | Learning Rate | Constant |
| | Max Iterations | 500 |
| | Solver | SGD |
| Support Vector Machine | Kernel | Linear |
| | C | 0.1 |
| | Gamma | Scale |
| CNN | Optimizer | Adam |
| | Learning Rate | 0.0005 |
| | Loss Function | Sparse Categorical Crossentropy |
Performance Comparison:

| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Logistic Regression | 69.38% | 72.46% | 69.38% | 69.88% |
| Decision Tree | 47.87% | 49.81% | 47.87% | 48.35% |
| Random Forest | 81.19% | 83.53% | 81.19% | 81.59% |
| Perceptron | 26.55% | 65.13% | 26.55% | 29.85% |
| Multi-Layer Perceptron | 76.14% | 76.61% | 76.14% | 75.83% |
| Support Vector Machine | 84.19% | 85.68% | 84.19% | 84.44% |
| Convolutional Neural Network | 93.52% | 93.76% | 93.52% | 93.47% |
Contributors:

| Name | GitHub | Email |
|---|---|---|
| Aakash | GitHub | aakash21002@iiitd.ac.in |
| Parveen | GitHub | parveen21079@iiitd.ac.in |
| Shubham Sharma | GitHub | shubham21099@iiitd.ac.in |
| Pourav Surya | GitHub | pourav21271@iiitd.ac.in |