General Questions about the AI Benchmark implementation

shi_ath

New member
I have been reading some papers on ML benchmarking. I went through your papers on AI Benchmark and have a general idea of the benchmark and of how the summary AI score is calculated.
1. A higher K score (AI summary) is a better score, is that correct? I would like to know more about how to interpret all the results reported in the details of the summary AI score.
2. How should I interpret the terms 'Target', 'Per-label error', and 'Accuracy, digits'?
3. Also, while setting up the tests, I was curious about the meaning of 'Limit Max Initialization Time'.
4. Furthermore, could you point me to any resources about the different acceleration methods in the settings and how they are related? Can they work together or not, and why?
5. Is there any way to run the benchmark on an SoC through the ADB interface (if the APK cannot be installed and there is no Python interface)?
 

Andrey Ignatov

Administrator
Staff member
A higher K score (AI summary) is a better score, is that correct?
Yes, a higher score is better. K here stands for thousands (10K = 10000).

How to interpret the terms 'Target', 'Per-label error', 'Accuracy, digits'?
Target: the maximum error that does not affect the results (accuracy / visual quality)
Per-label error: the average per-label L1 loss (= mean absolute error) between the produced outputs and the goldens
Accuracy, digits: the number of correctly predicted digits after the decimal point (compared to the goldens)
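A minimal sketch of how these two metrics could be computed (the helper names and the digit-counting rule below are my own; the exact AI Benchmark implementation may differ): the per-label error is a plain mean absolute error against the goldens, and 'Accuracy, digits' counts how many decimal places of the outputs agree with the goldens.

```python
# Hypothetical sketch of the two accuracy metrics; the actual
# AI Benchmark implementation may compute them differently.

def per_label_error(outputs, goldens):
    """Mean absolute error (L1 loss) between produced outputs and goldens."""
    assert len(outputs) == len(goldens)
    return sum(abs(o - g) for o, g in zip(outputs, goldens)) / len(outputs)

def accuracy_digits(outputs, goldens, max_digits=8):
    """Number of decimal digits up to which every output matches its golden."""
    digits = max_digits
    for o, g in zip(outputs, goldens):
        err = abs(o - g)
        # the d-th decimal place counts as accurate while the error
        # stays below half a unit in that place
        d = 0
        while d < max_digits and err < 0.5 * 10 ** -(d + 1):
            d += 1
        digits = min(digits, d)
    return digits

outputs = [0.1234, 0.5678]
goldens = [0.1230, 0.5680]
print(per_label_error(outputs, goldens))   # mean absolute difference
print(accuracy_digits(outputs, goldens))   # -> 3
```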

meaning of 'Limit Max Initialization Time'.
On some devices, several models may fail to initialize due to an outdated NN HAL. Thus, after waiting 30-120 s (depending on the model), the test is automatically terminated if the network has not been initialized. The above option disables this timeout.

resources about the different acceleration methods in the settings and how they are related
NNAPI: https://www.tensorflow.org/lite/performance/nnapi
GPU delegate: https://www.tensorflow.org/lite/performance/gpu
Hexagon NN delegate: https://www.tensorflow.org/lite/performance/hexagon_delegate
Neuron delegate:
(video is in Chinese, but the slides are in English).

Whether they can work together or not and why?
Only one particular option can be used at a time (a TFLite restriction).
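A small sketch of what this restriction looks like in practice (the settings dict and selector function are hypothetical; the commented-out lines use the standard `tf.lite` Python API with placeholder library and model paths):

```python
# Sketch: at most one acceleration backend can be active at a time
# (a TFLite restriction). The selector below enforces that; the
# accelerator names mirror the options in the benchmark settings.

ACCELERATORS = ("nnapi", "gpu", "hexagon", "neuron")

def select_delegate(settings):
    """Return the single enabled accelerator, or None for CPU fallback.

    Raises ValueError if more than one accelerator is enabled,
    mirroring the 'only one option can be used' restriction.
    """
    enabled = [name for name in ACCELERATORS if settings.get(name)]
    if len(enabled) > 1:
        raise ValueError(f"only one accelerator may be enabled, got {enabled}")
    return enabled[0] if enabled else None

# With TensorFlow installed, the chosen backend would be passed to the
# interpreter roughly like this (library/model paths are placeholders):
#   import tensorflow as tf
#   delegate = tf.lite.experimental.load_delegate("libtensorflowlite_gpu_delegate.so")
#   interpreter = tf.lite.Interpreter(model_path="model.tflite",
#                                     experimental_delegates=[delegate])

print(select_delegate({"gpu": True}))   # -> gpu
print(select_delegate({}))              # -> None (CPU)
```

Note that `experimental_delegates` takes a list, but enabling several hardware delegates for the same model is not how the benchmark settings work: you pick one backend per run.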

If apk can not be installed and there is no python interface
No, in all cases one needs to install the APK first.
 

shi_ath

New member
Hello Andrey,

I was trying to run this benchmark on some Android devices and had a couple more questions about it.

1. For which Qualcomm chipsets can this benchmark use Qualcomm Hexagon NN? I tried a couple of chipsets and it did not seem to work on them.

2. For custom TFLite models used as a benchmark, how are the test cases generated? If I use an audio detection model, for example, will the benchmark have audio samples to test it with?

3. In the tutorial on the AI Benchmark website, is the Python interface to the OS through ADB or the CLI?

4. I was reading about score calculation in AI Benchmark, and it says 'The result of the memory test introduces a multiplicative contribution to the final score'. Does that mean the final K score is multiplied by a coefficient proportional to the memory score?

5. It also says 'The normalization coefficients for each test are computed based on the best results of the current SoC generation'. What are normalization coefficients, and what does normalization mean in this context?
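For what it's worth, one plausible reading of the two quoted sentences can be sketched as follows (purely illustrative: the function names, base points, and numbers are made up, and the actual formula is the one defined in the AI Benchmark paper):

```python
# Illustrative sketch of the quoted scoring description; the real
# AI Benchmark formula is defined in the paper and may differ.

def normalized_score(results, best_of_generation):
    """Divide each test result by the best result of the current SoC
    generation, so every test contributes on a comparable 0..1 scale."""
    return sum(results[test] / best_of_generation[test] for test in results)

def final_score(results, best_of_generation, memory_factor, base_points=10000):
    """The memory test enters multiplicatively: it scales the whole
    normalized sum rather than adding a fixed number of points to it."""
    return base_points * normalized_score(results, best_of_generation) * memory_factor

results = {"classification": 500.0, "segmentation": 250.0}
best = {"classification": 1000.0, "segmentation": 500.0}
print(final_score(results, best, memory_factor=1.0))  # -> 10000.0
```

Under this reading, a memory result below the reference would give a factor under 1.0 and shrink the whole score, which is what "multiplicative contribution" suggests.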

6. In the information on the tests, for Object Recognition / Classification (MobileNet-V2) (INT8 + FP32), does this mean the test was carried out using INT8 precision and FP32 precision separately?

7. Is the AI Benchmark apk open source?

I would be glad if you or your team could guide me on this.
 