You need to update to TensorFlow 2.4 first.
Almost every TF version contains a number of critical model quantization issues. Since the TF / TFLite teams have struggled to settle on a stable strategy here, the quantization logic gets reworked in almost every TF release. As a result, the same conversion code can produce a model with floating-point input / output nodes, with inserted quantize / dequantize ops, or even a completely corrupted network, depending on which TF build you use. The provided code was developed and tested with TF 2.4, as this is the latest official build supporting the majority of key TFLite ops, and it can produce an almost fully quantized model. If updating your main environment to this version is a problem, just create a separate Python virtual environment with tensorflow-cpu 2.4 and use it only for model conversion.
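For reference, here is a minimal sketch of the full-integer post-training quantization flow as it works under TF 2.4. The toy Keras model and the random representative dataset are hypothetical stand-ins for your own network and calibration data:

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy model standing in for your own network.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1),
])

# Calibration data generator; replace with samples from your real dataset.
def representative_dataset():
    for _ in range(10):
        yield [np.random.rand(1, 8).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Request full-integer quantization, including the input / output tensors,
# so no floating-point nodes are left at the model boundaries.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
```

After conversion you can check in Netron (or via the TFLite interpreter) that the input and output tensors are actually int8 rather than float32.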
As mentioned in the tutorial, you should try converting your model with both the new MLIR converter and the old TOCO converter, since the resulting models differ between the two for the majority of architectures. Switching to the old TOCO converter (`experimental_new_converter = False`) might also be a solution to this problem.
One additional tip: you can use Netron to visualize your TFLite model and check the types of all its nodes. It's really convenient.