|
| 1 | +# Quantization Python Programming API Examples |
| 2 | + |
| 3 | +Contents: |
| 4 | +* [Goal](#goal) |
| 5 | +* [Prerequisites](#prerequisites) |
| 6 | +* [Step-by-step Procedure for ResNet50 Quantization](#step-by-step-procedure-for-resnet50-quantization) |
| 7 | +* [More verified models](#more-verified-models) |
| 8 | +* [Docker support](#docker-support) |
| 9 | + |
| 10 | + |
| 11 | + |
| 12 | +## Goal |
| 13 | + |
| 14 | +The Quantization Python Programming API aims to: |
| 15 | +* Unify the entry point for calling the quantization tools, |
| 16 | +* Make the model quantization process transparent, |
| 17 | +* Reduce the number of quantization steps, |
| 18 | +* Adapt seamlessly to inference from a Python script. |
| 19 | + |
| 20 | +This feature is under active development, and more intelligent features will come in the next release. |
| 21 | + |
| 22 | + |
| 23 | + |
| 24 | +## Prerequisites |
| 25 | + |
| 26 | +* Knowledge of building and installing TensorFlow from source is required, because the Quantization Python Programming API extends the transform functions of the [Graph Transform Tool](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md) in TensorFlow. |
| 27 | +* A local clone of the source release repo of Intel® AI Quantization Tools for TensorFlow, added to `PYTHONPATH`: |
| 28 | +```bash |
| 29 | +$ cd ~ |
| 30 | +$ git clone https://github.com/IntelAI/tools.git quantization && cd quantization |
| 31 | +$ export PYTHONPATH=${PYTHONPATH}:${PWD} |
| 32 | +``` |
| 33 | + |
| 34 | + |
| 35 | + |
| 36 | +## Step-by-step Procedure for ResNet50 Quantization |
| 37 | + |
| 38 | +This section requires the frozen pre-trained model and the ImageNet dataset for fully automatic quantization. |
| 39 | + |
| 40 | +```bash |
| 41 | +$ cd ~/quantization/api/models/resnet50 |
| 42 | +$ wget https://storage.googleapis.com/intel-optimized-tensorflow/models/resnet50_fp32_pretrained_model.pb |
| 43 | +``` |
| 44 | + |
| 45 | +To run the ResNet50 v1.5 example instead, download the frozen pre-trained model from the link below. |
| 46 | + |
| 47 | +```bash |
| 48 | +$ cd ~/quantization/api/models/resnet50v1_5 |
| 49 | +$ wget https://zenodo.org/record/2535873/files/resnet50_v1.pb |
| 50 | +``` |
| 51 | + |
| 52 | +The TensorFlow models repo provides [scripts and instructions](https://github.com/tensorflow/models/tree/master/research/slim#an-automated-script-for-processing-imagenet-data) to download, process, and convert the ImageNet dataset to the TFRecord format. |
| 53 | + |
| 54 | +1. Download the TensorFlow source, patch the Graph Transform Tool, and install TensorFlow. |
| 55 | +```bash |
| 56 | +$ cd ~/ |
| 57 | +$ git clone https://github.com/tensorflow/tensorflow.git |
| 58 | +$ cd tensorflow |
| 59 | +$ git checkout v1.14.0 |
| 60 | +$ cp ../quantization/tensorflow_quantization/graph_transforms/* tensorflow/tools/graph_transforms/ |
| 61 | +``` |
| 62 | +Then [build and install TensorFlow from source with Intel® MKL](https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide). |
| 63 | + |
| 64 | + |
| 65 | + |
| 66 | +2. Run the demo script from the root of the quantization repo (`~/quantization`). |
| 67 | +```bash |
| 68 | +$ python api/quantize_model.py \ |
| 69 | +--model=resnet50 \ |
| 70 | +--model_location=path/to/resnet50_fp32_pretrained_model.pb \ |
| 71 | +--data_location=path/to/imagenet |
| 72 | +``` |
| 73 | + |
| 74 | +Make sure the pre-trained model path and the dataset path match your local environment. Running the Python script then performs the fully automatic quantization from FP32 to INT8. |
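
Before launching the demo, it can help to confirm that both paths exist. The following is a minimal sketch and not part of the quantization tools: `check_quantization_inputs` is a hypothetical helper, and the paths are the placeholders from the command above.

```python
import os


def check_quantization_inputs(model_location, data_location):
    """Return a list of problems found with the demo's input paths.

    Hypothetical pre-flight helper; it is not part of the quantization tools.
    """
    problems = []
    if not os.path.isfile(model_location):
        problems.append("model file not found: " + model_location)
    elif not model_location.endswith(".pb"):
        problems.append("expected a frozen .pb graph: " + model_location)
    if not os.path.isdir(data_location):
        problems.append("dataset directory not found: " + data_location)
    return problems


if __name__ == "__main__":
    # Placeholder paths; replace them with your local model and dataset.
    for problem in check_quantization_inputs(
            "path/to/resnet50_fp32_pretrained_model.pb", "path/to/imagenet"):
        print(problem)
```

An empty list means both inputs look usable. The check only covers the file system and does not validate the graph or the dataset contents.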
| 75 | + |
| 76 | + |
| 77 | + |
| 78 | +3. Performance Evaluation |
| 79 | + |
| 80 | +Finally, verify the quantized model performance: |
| 81 | + * Run inference using the final quantized graph and calculate the model accuracy. |
| 82 | + * Typically, the accuracy target is the accuracy of the optimized FP32 model. |
| 83 | + * The accuracy of the quantized `INT8` graph should not drop by more than ~0.5-1%. |
| 84 | + |
| 85 | + See the [IntelAI/models](https://github.com/IntelAI/models) repository and the [ResNet50](https://github.com/IntelAI/models/tree/master/benchmarks/image_recognition/tensorflow/resnet50) README for TensorFlow model inference benchmarks at different precisions. |
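
The accuracy criterion above can be expressed as a simple check. This is a minimal sketch that assumes the "~0.5-1%" budget means an absolute drop in accuracy (percentage points). `accuracy_drop_ok` is a hypothetical helper, and the numbers in the usage lines are purely illustrative, not measured results.

```python
def accuracy_drop_ok(fp32_accuracy, int8_accuracy, max_drop=0.01):
    """Return True if the INT8 accuracy is within `max_drop` of FP32.

    Accuracies are fractions in [0, 1]. `max_drop` is an absolute drop,
    so 0.01 corresponds to one percentage point (an assumed reading of
    the "~0.5-1%" guideline).
    """
    return (fp32_accuracy - int8_accuracy) <= max_drop


if __name__ == "__main__":
    # Illustrative values only.
    print(accuracy_drop_ok(0.760, 0.755))  # small drop, within budget
    print(accuracy_drop_ok(0.760, 0.740))  # two-point drop, over budget
```

Pass `max_drop=0.005` to enforce the stricter end of the range.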
| 86 | + |
| 87 | + |
| 88 | + |
| 89 | +## More verified models |
| 90 | + |
| 91 | +The following models are also verified: |
| 92 | + |
| 93 | +- [SSD-MobileNet](#ssd-mobilenet) |
| 94 | +- [SSD-ResNet34](#ssd-resnet34) |
| 95 | + |
| 96 | + |
| 97 | + |
| 98 | +### SSD-MobileNet |
| 99 | + |
| 100 | +Download and extract the pre-trained SSD-MobileNet model from the [TensorFlow detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models). The extracted archive includes `frozen_inference_graph.pb`, which is used as the input graph for quantization. |
| 101 | + |
| 102 | +```bash |
| 103 | +$ wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2018_01_28.tar.gz |
| 104 | +$ tar -xvf ssd_mobilenet_v1_coco_2018_01_28.tar.gz |
| 105 | +``` |
| 106 | + |
| 107 | + |
| 108 | + |
| 109 | +Follow the [instructions](https://github.com/IntelAI/models/blob/master/benchmarks/object_detection/tensorflow/ssd-mobilenet/README.md#int8-inference-instructions) to prepare your local environment, then implement `ssd_mobilenet_callback_cmds()` so that it builds the command that generates the min and max ranges for model calibration. |
| 110 | + |
| 111 | +```python |
| 112 | +_INPUTS = ['image_tensor'] |
| 113 | +_OUTPUTS = ['detection_boxes', 'detection_scores', 'num_detections', 'detection_classes'] |
| 114 | + |
| 115 | + |
| 116 | +def ssd_mobilenet_callback_cmds(): |
| 117 | +    # Return the command that runs inference on a small subset of the training dataset and logs the min and max output. |
| 118 | +    raise NotImplementedError('Build the calibration command for your environment.') |
| 119 | +if __name__ == '__main__': |
| 120 | +    c = convert.GraphConverter('path/to/ssd_mobilenet_v1_coco_2018_01_28/frozen_inference_graph.pb', None, _INPUTS, _OUTPUTS, excluded_ops=['ConcatV2'], per_channel=True) |
| 121 | +    c.gen_calib_data_cmds = ssd_mobilenet_callback_cmds() |
| 122 | +    c.convert() |
| 123 | +``` |
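
The body of `ssd_mobilenet_callback_cmds()` is left to the reader. As a rough sketch, and assuming the callback is expected to return a single shell command string (its return value is assigned to `gen_calib_data_cmds`, but the expected type is not stated here), the command could be assembled as follows. The script path, the dataset path, and every flag name are hypothetical placeholders; take the real command from the linked INT8 inference instructions.

```python
import shlex


def build_calibration_cmd(script, data_location, batch_size=1):
    """Assemble a shell command string for a calibration inference run.

    `script` and the flag names are placeholders, not the real benchmark
    CLI; substitute the command from the linked instructions.
    """
    args = [
        "python", script,
        "--data-location", data_location,
        "--batch-size", str(batch_size),
    ]
    # Quote each argument so that paths containing spaces survive the shell.
    return " ".join(shlex.quote(a) for a in args)


def ssd_mobilenet_callback_cmds():
    # Hypothetical paths; point these at your inference script and dataset.
    return build_calibration_cmd("path/to/inference_script.py", "path/to/coco")
```

Building the command from a list and quoting each element avoids breakage when a local path contains spaces or shell metacharacters.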
| 124 | + |
| 125 | + |
| 126 | + |
| 127 | + |
| 128 | + |
| 129 | +### SSD-ResNet34 |
| 130 | + |
| 131 | +Download the pre-trained model: |
| 132 | + |
| 133 | +```bash |
| 134 | +$ wget https://storage.googleapis.com/intel-optimized-tensorflow/models/ssd_resnet34_fp32_bs1_pretrained_model.pb |
| 135 | +``` |
| 136 | + |
| 137 | + |
| 138 | + |
| 139 | +Follow the [instructions](https://github.com/IntelAI/models/blob/master/benchmarks/object_detection/tensorflow/ssd-resnet34/README.md#int8-inference-instructions) to prepare the INT8 accuracy command, then implement `ssd_resnet34_callback_cmds()` so that it builds the command that generates the min and max ranges for model calibration. |
| 140 | + |
| 141 | +```python |
| 142 | +_INPUTS = ['input'] |
| 143 | +_OUTPUTS = ['v/stack', 'v/Softmax'] |
| 144 | + |
| 145 | + |
| 146 | +def ssd_resnet34_callback_cmds(): |
| 147 | +    # Return the command that runs inference on a small subset of the training dataset and logs the min and max output. |
| 148 | +    raise NotImplementedError('Build the calibration command for your environment.') |
| 149 | + |
| 150 | +if __name__ == '__main__': |
| 151 | +    c = convert.GraphConverter('path/to/ssd_resnet34_fp32_bs1_pretrained_model.pb', None, _INPUTS, _OUTPUTS, excluded_ops=['ConcatV2']) |
| 152 | +    c.gen_calib_data_cmds = ssd_resnet34_callback_cmds() |
| 153 | +    c.convert() |
| 154 | +``` |
| 155 | + |
| 156 | + |
| 157 | + |
| 158 | + |
| 159 | + |
| 160 | +## Docker support |
| 161 | + |
| 162 | +* In a Docker environment, the procedure is the same as above. |
| 163 | + |