Transformer-based architectures not working with GPU delegate

MFP32

New member
------------------------------
Phone Model: Pixel 3a
Version of app: 5.0.3
Link to tflite model: https://tfhub.dev/tensorflow/lite-model/albert_lite_base/squadv1/metadata/1
------------------------------


Hey everyone,

So this has been causing me headaches for about two months now. I just want to get a BERT-encoder model running with GPU acceleration. Built an app, tried it, failed. Tried the official TF benchmarking app, ran the model, failed. Finally discovered this awesome forum, tried running it here, failed. Edit: of course I tried running several different transformer models and even tested transformer blocks in isolation; everything fails with some variation of the error in the screenshot.

Is it just generally not supported to run transformer architectures (2017!!!!) with the GPU delegate in 2023??

Here is a screenshot of an ALBERT model I tried running in PRO mode. CPU and NNAPI (although slower than CPU, lol) work; only GPU crashes.

1675079490479.png

Edit: here is an issue I created on the TensorFlow GitHub describing the problem in depth: https://github.com/tensorflow/tensorflow/issues/59232
 

Andrey Ignatov

Administrator
Staff member
Is it just generally not supported to run transformer architectures (2017!!!!) with the GPU delegate in 2023??

The TFLite GPU delegate supports only a subset of TFLite / TF ops (mainly those used in computer vision models), so it is very common that NLP models cannot be executed with it. In your case, the problem is with the following three ops: GATHER, RESHAPE and SLICE. In principle, it should be possible to replace them with functionally equivalent ops that are supported by the GPU delegate, but this would require a slight model re-design.
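As a side note, recent TensorFlow releases (2.9+) ship an experimental model analyzer that can list a `.tflite` model's ops and flag the ones the GPU delegate cannot run, which is a quick way to confirm diagnoses like the one above before attempting a re-design. A minimal sketch, assuming the ALBERT model from the link has been downloaded locally (the filename `albert_lite_base.tflite` is mine):

```python
import tensorflow as tf

# Print the model's op-level structure. With gpu_compatibility=True the
# analyzer additionally reports, per subgraph, which ops are NOT
# compatible with the TFLite GPU delegate, so you can see exactly
# which layers would force a fallback or a crash.
tf.lite.experimental.Analyzer.analyze(
    model_path="albert_lite_base.tflite",  # hypothetical local path
    gpu_compatibility=True,
)
```

In an Android app, the corresponding defensive pattern is to wrap delegate creation in a try/catch (or consult `CompatibilityList` from the `tensorflow-lite-gpu` package) and fall back to the plain CPU interpreter when the GPU delegate rejects the graph.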
 