General Questions about the AI Benchmark implementation

shi_ath

New member
I have been reading some papers on ML benchmarking. I went through your papers on AI Benchmark and have a general idea of the benchmark and of how the summary AI score is calculated.
1. A higher K score (AI summary) is a better score, is that correct? I would like to know more about how to interpret all the results reported in the details of the summary AI score.
2. How should I interpret the terms 'Target', 'Per-label error', and 'Accuracy, digits'?
3. Also, while setting up the tests, I was curious about the meaning of 'Limit Max Initialization Time'.
4. Furthermore, could you point me to any resources about the different acceleration methods in the settings and how they are related? Can they work together or not, and why?
5. Is there any way to run the benchmark on an SoC through the ADB interface (if the APK cannot be installed and there is no Python interface)?
 

Andrey Ignatov

Administrator
Staff member
A higher K score (AI summary) is a better score, is that correct?
Yes, a higher score is better. K here stands for thousands (10K = 10000).

How to interpret the terms 'Target', 'Per-label error', 'Accuracy, digits'?
Target: the maximum error that does not affect the results (accuracy / visual quality)
Per-label error: the average per-label L1 loss (= mean absolute error) between the produced outputs and the goldens
Accuracy, digits: the number of correctly predicted digits after the decimal point (compared to the goldens)
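A minimal sketch of how these two metrics could be computed (the helper names and the digit-counting rule below are my own; the exact AI Benchmark implementation may differ): the per-label error is a plain mean absolute error against the goldens, and 'Accuracy, digits' counts how many decimal places of the outputs agree with the goldens.

```python
# Hypothetical sketch of the two accuracy metrics; the actual
# AI Benchmark implementation may compute them differently.

def per_label_error(outputs, goldens):
    """Mean absolute error (L1 loss) between produced outputs and goldens."""
    assert len(outputs) == len(goldens)
    return sum(abs(o - g) for o, g in zip(outputs, goldens)) / len(outputs)

def accuracy_digits(outputs, goldens, max_digits=8):
    """Number of decimal digits up to which every output matches its golden."""
    digits = max_digits
    for o, g in zip(outputs, goldens):
        err = abs(o - g)
        # the d-th decimal place counts as accurate while the error
        # stays below half a unit in that place
        d = 0
        while d < max_digits and err < 0.5 * 10 ** -(d + 1):
            d += 1
        digits = min(digits, d)
    return digits

outputs = [0.1234, 0.5678]
goldens = [0.1230, 0.5680]
print(per_label_error(outputs, goldens))   # mean absolute difference
print(accuracy_digits(outputs, goldens))   # -> 3
```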

meaning of 'Limit Max Initialization Time'.
On some devices, several models may fail to initialize due to an outdated NN HAL. Thus, after waiting 30-120 s (depending on the model), the test is automatically terminated if the network has not been initialized. The above option disables this timeout.

resources about the different acceleration methods in the settings and how they are related
NNAPI: https://www.tensorflow.org/lite/performance/nnapi
GPU delegate: https://www.tensorflow.org/lite/performance/gpu
Hexagon NN delegate: https://www.tensorflow.org/lite/performance/hexagon_delegate
Neuron delegate:
(video is in Chinese, but the slides are in English).

Whether they can work together or not and why?
Only one particular option can be used at a time (a TFLite restriction).
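A small sketch of what this restriction looks like in practice (the settings dict and selector function are hypothetical; the commented-out lines use the standard `tf.lite` Python API with placeholder library and model paths):

```python
# Sketch: at most one acceleration backend can be active at a time
# (a TFLite restriction). The selector below enforces that; the
# accelerator names mirror the options in the benchmark settings.

ACCELERATORS = ("nnapi", "gpu", "hexagon", "neuron")

def select_delegate(settings):
    """Return the single enabled accelerator, or None for CPU fallback.

    Raises ValueError if more than one accelerator is enabled,
    mirroring the 'only one option can be used' restriction.
    """
    enabled = [name for name in ACCELERATORS if settings.get(name)]
    if len(enabled) > 1:
        raise ValueError(f"only one accelerator may be enabled, got {enabled}")
    return enabled[0] if enabled else None

# With TensorFlow installed, the chosen backend would be passed to the
# interpreter roughly like this (library/model paths are placeholders):
#   import tensorflow as tf
#   delegate = tf.lite.experimental.load_delegate("libtensorflowlite_gpu_delegate.so")
#   interpreter = tf.lite.Interpreter(model_path="model.tflite",
#                                     experimental_delegates=[delegate])

print(select_delegate({"gpu": True}))   # -> gpu
print(select_delegate({}))              # -> None (CPU)
```

Note that `experimental_delegates` takes a list, but enabling several hardware delegates for the same model is not how the benchmark settings work: you pick one backend per run.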

If apk can not be installed and there is no python interface
No, in all cases one needs to install the APK first.
 

shi_ath

New member
Hello Andrey,

I was trying to run this benchmark on some Android devices and had a couple more questions about it.

1. For which Qualcomm chipsets can this benchmark use Qualcomm Hexagon NN? I tried a couple of chipsets and it did not seem to work on them.

2. For custom TFLite models used as a benchmark, how are the test cases generated? If I use an audio detection model, for example, will the benchmark have audio samples to test it with?

3. In the tutorial on the AI Benchmark website, is the Python interface to the OS through ADB or the CLI?

4. I was reading about score calculation in AI Benchmark, and it says 'The result of the memory test introduces a multiplicative contribution to the final score'. Does that mean the final K score is multiplied by a coefficient proportional to the memory score?

5. It also says 'The normalization coefficients for each test are computed based on the best results of the current SoC generation'. What are normalization coefficients, and what does normalization mean in this context?
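For what it's worth, one plausible reading of the two quoted sentences can be sketched as follows (purely illustrative: the function names, base points, and numbers are made up, and the actual formula is the one defined in the AI Benchmark paper):

```python
# Illustrative sketch of the quoted scoring description; the real
# AI Benchmark formula is defined in the paper and may differ.

def normalized_score(results, best_of_generation):
    """Divide each test result by the best result of the current SoC
    generation, so every test contributes on a comparable 0..1 scale."""
    return sum(results[test] / best_of_generation[test] for test in results)

def final_score(results, best_of_generation, memory_factor, base_points=10000):
    """The memory test enters multiplicatively: it scales the whole
    normalized sum rather than adding a fixed number of points to it."""
    return base_points * normalized_score(results, best_of_generation) * memory_factor

results = {"classification": 500.0, "segmentation": 250.0}
best = {"classification": 1000.0, "segmentation": 500.0}
print(final_score(results, best, memory_factor=1.0))  # -> 10000.0
```

Under this reading, a memory result below the reference would give a factor under 1.0 and shrink the whole score, which is what "multiplicative contribution" suggests.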

6. In the information on the tests, for Object Recognition / Classification (MobileNet-V2) (INT8 + FP32), does this mean the test was carried out using INT8 precision and FP32 precision separately?

7. Is the AI Benchmark apk open source?

I would be glad if you or your team could guide me on this.
 