Quantization and Octization Methodologies

This repository contains implementations and comparisons of two methodologies for quantizing neural network weights: Standard Quantization and a self-made approach called "Octization". The aim is to provide insights into their effectiveness in terms of accuracy, model size, inference time, and loss metrics.

Methodologies

Standard Quantization

Standard quantization converts floating-point weights into lower-bit representations, typically 8-bit. The conversion function scales the weights to a discrete integer range, reducing precision but potentially saving memory and speeding up computation.

Code Snippet for Quantization:

import numpy as np

def float_to_quantized(value, num_bits):
    # Scale a single float (assumed to lie in [0, 1]) to the integer range [0, 2**num_bits - 1].
    max_val = 2**num_bits - 1
    quantized_val = int(value * max_val)
    return quantized_val

def quantize_weights(weights, num_bits=8):
    # Scale a weight array (assumed to lie in [0, 1]) and round to the nearest integer.
    max_val = 2**num_bits - 1
    quantized_weights = np.round(weights * max_val).astype(np.int32)
    return quantized_weights
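
As a quick sanity check, here is a minimal usage sketch of the functions above; the inputs and expected outputs are taken from the Results section below, and it assumes the weights already lie in [0, 1].

import numpy as np

# Single value, as reported under "Quantization Results".
print(float_to_quantized(0.75, 8))            # 191

# First few original weights from the Results section.
weights = np.array([[0.03279644, 0.37225065, 0.03883055]])
print(quantize_weights(weights, num_bits=8))  # [[ 8 95 10]]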

Octization

Octization is a self-made approach that represents quantized weights in the octal (base-8) number system. It aims to offer a different granularity of quantization which, in the experiments below, often provided better performance due to this representation scheme.

Code Snippet for Octization:

import numpy as np

def float_to_octal(value, num_bits):
    # Scale a single float (assumed to lie in [0, 1]) to [0, 2**num_bits - 1],
    # then format the integer as an octal string (e.g. '0o277').
    max_val = 2**num_bits - 1
    octal_val = int(value * max_val)
    octal_str = oct(octal_val)
    return octal_str

def quantize_to_octal(weights, num_bits=8):
    # Apply the conversion element-wise; the result is an array of octal strings.
    quantized_weights = np.vectorize(lambda x: float_to_octal(x, num_bits))(weights)
    return quantized_weights
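
A matching usage sketch for the octal variant, again using values from the Results section (weights assumed to lie in [0, 1]):

import numpy as np

# Single value, as reported under "Octization Results".
print(float_to_octal(0.75, 8))                 # 0o277

# Element-wise conversion of the same example weights.
weights = np.array([[0.03279644, 0.37225065, 0.03883055]])
print(quantize_to_octal(weights, num_bits=8))  # [['0o10' '0o136' '0o11']]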

Results

Quantization Results

Quantized representation of 0.75 with 8 bits:

191

Original Weights:

[[0.03279644, 0.37225065, 0.03883055, 0.32002673, 0.37683514, ...]]

Quantized Weights:

[[ 8,  95,  10,  82,  97, ...]]

Training Metrics:

Epoch 1/50: Loss: 382.4389
Epoch 50/50: Loss: 156.6920

Predictions:

[[ -1.8169721 ], [ 14.134164  ], [-10.419181  ], [ -9.413474  ], [ -7.104429  ], ...]

Octization Results

Octal representation of 0.75 with 8 bits:

0o277

Original Weights:

[[0.03279644, 0.37225065, 0.03883055, 0.32002673, 0.37683514, ...]]

Quantized Weights:

[['0o10', '0o136', '0o11', '0o121', '0o140', ...]]

Training Metrics:

Epoch 1/50: Loss: 98.0795
Epoch 50/50: Loss: 42.9256

Predictions:

[[ 26.467916 ], [  1.4521854], [ 14.643928 ], [ -3.8951225], [-12.918487 ], ...]

MNIST Dataset Results

Standard Quantization

Training:

Epoch 1/20: Accuracy: 81.56%, Loss: 1.9228
Epoch 20/20: Accuracy: 99.36%, Loss: 0.0242

Evaluation:

Test Accuracy: 96.73%, Test Loss: 0.2253, Inference Time: 27.97s, Model Size: 101870

Octization

Training:

Epoch 1/20: Accuracy: 80.83%, Loss: 2.0764
Epoch 20/20: Accuracy: 99.19%, Loss: 0.0288

Evaluation:

Test Accuracy: 96.93%, Test Loss: 0.1858, Inference Time: 27.99s, Model Size: 101870

CIFAR-10 Dataset Results

5 Epochs

Standard Quantization

Training:

Epoch [1/5]: Loss: 2.2742
Epoch [5/5]: Loss: 2.1364

Evaluation:

Accuracy: 29.14%, Inference Time: 53.67s, Model Size: 282250

Octization

Training:

Epoch [1/5]: Loss: 2.1202
Epoch [5/5]: Loss: 1.7040

Evaluation:

Accuracy: 40.40%, Inference Time: 54.19s, Model Size: 282250

20 Epochs

Standard Quantization

Training:

Epoch [1/20]: Loss: 2.2671
Epoch [20/20]: Loss: 1.9121

Evaluation:

Accuracy: 33.32%, Inference Time: 214.58s, Model Size: 282250

Octization

Training:

Epoch [1/20]: Loss: 2.1327
Epoch [20/20]: Loss: 1.3204

Evaluation:

Accuracy: 51.89%, Inference Time: 217.92s, Model Size: 282250

Conclusion

Overall, the Octization method achieved substantially higher accuracy on CIFAR-10, an improvement of 18.57 percentage points (51.89% vs 33.32%) after 20 epochs. This is a very significant difference.

Inference Time:

Standard Quantization: 214.58s
Octization: 217.92s

The inference times are still very close, with Octization being slightly slower (by about 3.34 seconds over 20 epochs, or about 0.167 seconds per epoch).

Model Size:

Both models still have the same size (282,250 parameters).

Analysis:

Learning Dynamics: The Octization method shows much better learning dynamics. Its loss decreases more rapidly and continues to decline steadily throughout all 20 epochs. In contrast, the standard quantization method's loss decreases more slowly and seems to be plateauing towards the end.

Final Performance: The Octization method achieves a remarkably higher accuracy (51.89% vs 33.32%). This is a substantial improvement, especially considering that this is for the CIFAR-10 dataset, which is a challenging multi-class classification problem.

Computational Efficiency: Despite the significantly better performance, the Octization method only incurs a minimal computational overhead (about 1.5% increase in inference time).

Scalability: The performance gap between the two methods widened with more epochs (from about 11 percentage points at 5 epochs to 18.57 at 20 epochs), suggesting that the Octization method may have better scalability and could potentially achieve even higher performance with further training.
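
For reference, the short script below recomputes the derived figures quoted in this analysis directly from the reported CIFAR-10 numbers; all inputs come from the result tables above.

# Reported CIFAR-10 results (taken from the sections above).
acc_quant_5, acc_oct_5 = 29.14, 40.40      # accuracy (%) after 5 epochs
acc_quant_20, acc_oct_20 = 33.32, 51.89    # accuracy (%) after 20 epochs
time_quant, time_oct = 214.58, 217.92      # inference time (s) in the 20-epoch run

print(f"Gap at 5 epochs:  {acc_oct_5 - acc_quant_5:.2f} pp")    # 11.26 pp
print(f"Gap at 20 epochs: {acc_oct_20 - acc_quant_20:.2f} pp")  # 18.57 pp
delta = time_oct - time_quant
print(f"Time overhead:    {delta:.2f} s ({delta / time_quant * 100:.2f}%)")  # 3.34 s (1.56%)
print(f"Per epoch:        {delta / 20:.3f} s")                  # 0.167 s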

Conclusions:

Based on these extended results, the Octization method is clearly superior to the standard quantization method for the CIFAR-10 task:

- It achieves much higher accuracy (51.89% vs 33.32%), which is a very significant improvement for CIFAR-10.
- It shows consistently better training dynamics, with lower loss values throughout the training process.
- The computational overhead remains minimal, even with extended training.
- The widening performance gap suggests that Octization might have a higher performance ceiling than standard quantization.

The Octization method appears to provide a more effective way of representing and processing information within the network. This may be due to:

- Better preservation of important feature information during the quantization process.
- More effective gradient flow during backpropagation, leading to better weight updates.
- A quantization scheme that may be particularly well suited to the types of features and patterns present in natural images.

In conclusion, the Octization method demonstrates remarkably better performance than standard quantization in this extended experiment. It offers substantially improved accuracy with minimal computational overhead, and shows better learning dynamics and scalability. This is a very promising result as far as I can see, and it suggests that the Octization approach could be a significant contribution to the field of quantized neural networks.

Acknowledgments

  • TensorFlow and PyTorch for providing the frameworks used in this project.

!! Feel free to clone the repository and run the provided scripts to further explore and analyze the quantization and octization methodologies. !!
