According to my current knowledge, a float32 or float16 TF Lite model should run faster than a uint8 model on a GPU. So there should be no need to perform int8 quantization for an image denoising task running on an Exynos Mali GPU. Is this correct? How about the image super-resolution task?
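For reference, a float16 post-training quantization pass (the variant a mobile GPU delegate can typically run natively) can be configured roughly as below. This is a sketch: `saved_model_dir` is a placeholder path, not from this thread.

```python
# Sketch: post-training float16 quantization with the TF Lite converter.
# "saved_model_dir" is a placeholder; substitute your own model path.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Restrict weight storage to float16; compute still happens in float.
converter.target_spec.supported_types = [tf.float16]

tflite_fp16_model = converter.convert()
with open("model_fp16.tflite", "wb") as f:
    f.write(tflite_fp16_model)
```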
1. TF2.3
2. Yes.
I found the problem: it seems the `depth_to_space` op doesn't support uint8 input quantization. I can feed float32 (0~255) input to the quantized model instead, but then inference is extremely slow on PC.
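As background, a fully quantized model expects its input already mapped into the uint8 domain by an affine transform, `real = scale * (quantized - zero_point)`, which is the general TF Lite quantization scheme. A minimal pure-Python sketch (the scale/zero-point values are illustrative, not taken from any specific model):

```python
# Sketch of the affine uint8 quantization scheme used by TF Lite:
#   real_value = scale * (quantized_value - zero_point)

def quantize(x, scale, zero_point):
    """Map a float value into the uint8 domain, clamped to [0, 255]."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))

def dequantize(q, scale, zero_point):
    """Map a uint8 value back into the float domain."""
    return scale * (q - zero_point)

# Illustrative parameters for an input normalized to 0.0..1.0.
scale, zero_point = 1.0 / 255.0, 0
q = quantize(1.0, scale, zero_point)   # top of the input range
x = dequantize(q, scale, zero_point)   # back to float
```

The interpreter reports each tensor's actual `scale` and `zero_point` in its quantization parameters, so the real values should be read from the model rather than assumed.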
I used this example to quantize my model. However, after printing `input_details[0]['dtype']`, the input type is still float32. How do I get a uint8 input type? Thanks.
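One likely cause: in full-integer post-training quantization, the converter keeps float32 input/output unless the inference types are set explicitly. A sketch of the relevant TF 2.3-era converter settings; `saved_model_dir` and `representative_data` are placeholders you would replace with your own model and calibration samples:

```python
# Sketch: full-integer quantization with uint8 model input/output.
# "saved_model_dir" and "representative_data" are placeholders.
import tensorflow as tf

def representative_dataset():
    # Yield a handful of real input samples for calibration.
    for sample in representative_data:
        yield [sample]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Without the next two lines, the converted model keeps float32 I/O,
# which matches the behavior described above.
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
```

After conversion, `interpreter.get_input_details()[0]['dtype']` should then report `numpy.uint8`. Note that every op on the path (including `depth_to_space`, per the earlier comment) must support int8/uint8 for this conversion to succeed.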