Can't run a very basic model with GPU delegate. Problem with my conversion?

operator45

New member
Phone Model: Samsung Galaxy S21
APP version: 5.0.3. I also tried the nightly version (24.12.2022).
TF version: 2.10, tried 2.12 and 2.5 as well

I can run the model on CPU without any errors. However, once I choose the TFLite GPU Delegate, I am getting this error:
screenshot_simple.jpg
The model consists of just conv2d, split, add and floor ops.
1679052410315.png
I also used the TFLite Analyzer to check and it says that the model is compatible with the GPU delegate.
Here is the link for the tflite model and the code to reproduce: https://drive.google.com/drive/folders/10DyxR6JfBn2PM8-S6JIT_MidOCQxoxzw?usp=share_link

I can run other, more complex models that I downloaded (not converted by me) on the GPU without any problems, so it seems the issue is with the conversion on my end. Do you have any suggestions for which TF version I should use? What about the conversion parameters?

When I follow the official MNIST tutorial, it fails to run on GPU as well: https://www.tensorflow.org/lite/performance/post_training_float16_quant
screenshot_mnist.jpg
1679054992025.png
 

operator45

New member
I then removed the split op as well, but the model still gives the following error. Does this mean the GPU delegate has no broadcast support?

simple_model_nosplit.jpg
1679054717982.png
I tried using tile, repeat, and tf.broadcast_to to make the shapes of the tensors equal, but it didn't help. The converted TFLite model ignores the tile op. For the repeat op, it creates an unnecessary expand_dims op, and the app still can't run the model.
1679054771059.png
 

operator45

New member
If I make the input tensors the same size, the app runs with the GPU delegate without errors. But this is not suitable for my model, since I will always have some kind of broadcasting or tiling somewhere.
1679054905647.png
 

Andrey Ignatov

Administrator
Staff member
If I remove the initial reshape layer in the MNIST tutorial model

Yes, your model should have only one input layer in order to be executed successfully. The easiest workaround here would be to stack two input tensors together into a single input layer and then unstack them during inference.
 

operator45

New member
I have solved most of the issues described in the original post. Mostly, I got rid of the Split ops and forced manual broadcasting by applying tile and reshape. Surprisingly, I can even use models with multiple inputs. I am still following your advice of having only one input and slicing it during inference, just to be safe. Now I can run simple models on the GPU successfully.
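For anyone wanting to reproduce the manual broadcast, here is a minimal sketch of how the reshape target and the tile multiples can be derived from two shapes. It is plain Python and only works on shapes; the helper name `manual_broadcast_plan` and the example shapes are made up for illustration, and the TensorFlow calls in the trailing comment are an assumption about how the result would be used, not the exact code from the model.

```python
def manual_broadcast_plan(small_shape, target_shape):
    """Return (reshape_shape, multiples) that expand small_shape to target_shape.

    Shapes are right-aligned, as in NumPy/TensorFlow broadcasting. A dimension
    can be expanded only if it is 1; otherwise it must already match.
    """
    if len(small_shape) > len(target_shape):
        raise ValueError("small_shape has more dimensions than target_shape")
    # Step 1 (reshape): pad with leading 1s so both shapes have the same rank.
    pad = len(target_shape) - len(small_shape)
    reshape_shape = [1] * pad + list(small_shape)
    # Step 2 (tile): for every axis, work out how many copies are needed.
    multiples = []
    for have, want in zip(reshape_shape, target_shape):
        if have == want:
            multiples.append(1)
        elif have == 1:
            multiples.append(want)
        else:
            raise ValueError(f"cannot broadcast {have} to {want}")
    return reshape_shape, multiples


# Hypothetical example: a per-channel tensor of shape (8,) that has to be
# added to an NHWC tensor of shape (1, 32, 32, 8).
reshape_shape, multiples = manual_broadcast_plan((8,), (1, 32, 32, 8))
print(reshape_shape)  # [1, 1, 1, 8]
print(multiples)      # [1, 32, 32, 1]

# In the model this would be applied roughly as follows (not executed here):
#   small = tf.tile(tf.reshape(small, reshape_shape), multiples)
#   out = features + small   # both operands now have identical shapes
```

Whether the converter keeps the resulting tile op in the TFLite graph seems to depend on the model (it was dropped in my earlier attempt), so it is worth checking the converted graph afterwards.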

However, I have another error when trying to run the model with Floor and GatherND OPs. The model below without Floor OP runs just fine on the GPU. The gather OP doesn't cause any problems, since it falls back to CPU during runtime.
1679952404111.png

However, once I add the Floor OP, I run into errors. There are a bunch of messages, and it is unclear which one is actually causing the failure.
1679952584116.png
1679952852189.png

I also found a workaround by substituting the Floor OP with casting to int and exploiting the fact that I clip negative values to zero. This model gives the same output and runs on the GPU.
1679953362690.png
Even though I have solved my problems for now, it would be nice to find the root cause. Moreover, the clip-and-cast workaround for the Floor OP only works if there are no negative values in the tensor, so it's not ideal.
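To make the limitation concrete, here is a small plain-Python sketch of the arithmetic. A float-to-int cast truncates toward zero, which equals floor only for non-negative values; subtracting one wherever the input is below its truncated value gives floor for negatives too. The function name `floor_via_cast` is made up for illustration, and I have not tested whether the equivalent graph (cast, compare, subtract) is accepted by the GPU delegate, so treat that part as an assumption.

```python
import math


def floor_via_cast(x):
    """Floor built from a truncating int cast plus a correction for negatives."""
    t = int(x)  # truncates toward zero, like a float -> int32 cast
    # Truncation rounds negative non-integers up, so step down by one there.
    return t - 1 if x < t else t


# The corrected version agrees with math.floor on both signs.
for value in (2.7, 0.0, -0.0, -2.0, -2.3):
    assert floor_via_cast(value) == math.floor(value)

# A plain cast is off by one for negative non-integers.
print(int(-2.3), floor_via_cast(-2.3))  # -2 -3

# A possible TensorFlow equivalent (not executed here, untested on the GPU delegate):
#   t = tf.cast(tf.cast(x, tf.int32), tf.float32)
#   y = t - tf.cast(x < t, tf.float32)
```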
 