Dear AI Benchmark team,
The app was super helpful for benchmarking our NN on Android, with Float16 inference on the HTP showing the best performance.
Our question, however, is: how are you running Float16 models on the HTP?
The "old" Qualcomm SNPE SDK does not offer the HTP runtime at all while the newer QNN SDK only seems to be able to run Int8 models on HTP (Float16 models get stuck.)
Any pointers on how to reproduce this runtime setup would be appreciated!
Thank you and best regards,
Manuel