I was looking at the rankings available on your website
(https://ai-benchmark.com/ranking_detailed.html) and observed that on the
Snapdragon 888, for multiple models, NNAPI shows better latency numbers
than the CPU for both int and float variants.
Recently, I tested some DL models on a Snapdragon 888 device using the
TFLite framework, on the CPU and with the NNAPI delegate, but observed
higher latencies with NNAPI than with the CPU for FP16 and A16W8 models.
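For context, the comparison above was made with invocations along the lines of the official TFLite benchmark_model tool (a sketch; the binary is assumed to be already built and pushed to the device, and the model file name is a placeholder):

```shell
# Push a model to the device (file name is an example)
adb push model_fp16.tflite /data/local/tmp/

# CPU baseline run
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model_fp16.tflite \
  --num_runs=50

# Same model through the NNAPI delegate
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/model_fp16.tflite \
  --use_nnapi=true \
  --num_runs=50
```

In my runs, the average inference time reported for the NNAPI invocation was higher than for the CPU one.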
I read in the TFLite documentation that fallback to the CPU is disabled by
default starting with Android 10 (API level 29). The Snapdragon 888 device
I use runs a newer Android version (11, API level 30).
Still, given the latency numbers, I am doubtful whether NNAPI is actually
being utilized.
I have two questions:
1. Does NNAPI require any drivers to work correctly? If yes, could you
please specify them?
2. In TFLite, how can I verify that the delegates are being invoked
correctly?