TF Lite runtime

qiuzhangTiTi

New member
According to my current knowledge, a float32 or float16 TF Lite model should run faster than a uint8 model on a GPU device. So there is no need to perform int8 quantization for the image denoising task running on the Exynos Mali GPU. Is this correct? How about the image super-resolution task (NPU)? Thanks.

Andrey Ignatov

Administrator
Staff member
a float32 or float16 TF Lite model should run faster than a uint8 model on a GPU device

The inference times of float16 and uint8 models will be almost identical, as the latter are automatically dequantized to the fp16 format when using the TFLite GPU delegate.
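For reference, a float16 TFLite model is typically produced with post-training float16 quantization. Below is a minimal sketch, assuming a trained model has been exported as a SavedModel to "saved_model_dir" (a hypothetical path):

```python
import tensorflow as tf

# Load the trained model (hypothetical path).
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")

# Post-training float16 quantization: weights are stored as fp16,
# roughly halving model size; the GPU delegate executes fp16 natively.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()
with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_model)
```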

So there is no need to perform int8 quantization for the image denoising task running on the Exynos Mali GPU.

No, in this challenge it is explicitly stated that you need to submit the standard floating-point TFLite models.
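For completeness, a standard floating-point model needs no quantization settings at all; a minimal sketch under the same SavedModel assumption:

```python
import tensorflow as tf

# Plain float32 conversion: no optimizations, no quantization flags.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
tflite_model = converter.convert()

with open("model_fp32.tflite", "wb") as f:
    f.write(tflite_model)
```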

How about the image super-resolution task (NPU)?

In the image super-resolution track, one should submit uint8 TFLite models as specified in the challenge submission rules.
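A uint8 model for an NPU is usually obtained via full-integer post-training quantization, which requires a small calibration set. A minimal sketch follows, where the input shape, sample count, and "saved_model_dir" path are assumptions, and the random data is only a placeholder (real training images should be used for calibration):

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Calibration samples used to estimate activation ranges.
    # Placeholder random data; substitute real preprocessed images.
    for _ in range(100):
        yield [np.random.rand(1, 128, 128, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset

# Force full integer quantization with uint8 input/output tensors.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("model_uint8.tflite", "wb") as f:
    f.write(tflite_model)
```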