Difference between the HTP and DSP delegate?

mr-robot · Jul 10, 2023

Hi,

In the footnote of the rankings the following abbreviations are provided:

qh - Qualcomm QNN HTP Delegate
qd - Qualcomm QNN DSP Delegate

What is the difference between the HTP and DSP delegates? In Qualcomm jargon, I thought that HTP = DSP = HMX + HVX. I'm therefore confused about whether there are any performance differences between the HTP and DSP delegates, or whether this is just based on versioning. I see that recent Qualcomm devices exclusively use the qh.qh combination.

Thanks

Andrey Ignatov · Jul 10, 2023

mr-robot said:
or whether this is just based on versioning

Yes, partly: the Hexagon 6xx family is denoted as DSPs in QNN, while the Hexagon 7xx family - as HTPs. Here you can find the full list of Hexagon processors.

There are also large architectural differences between these two families - the latest HTPs, for instance, are able to accelerate both FP16 and INT8 models, while all older Hexagon DSPs can run only quantized inference.
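As a rough illustration of the naming rule above, here is a minimal Python sketch. The function name `qnn_backend_label` is hypothetical and not part of QNN or any Qualcomm SDK; it only encodes the "6xx = DSP, 7xx = HTP" convention described in this post and assumes the Hexagon model is given as a three-digit number.

```python
def qnn_backend_label(hexagon_model: int) -> str:
    """Map a Hexagon model number to the label used in this thread.

    Hypothetical helper: encodes only the rule stated above
    (Hexagon 6xx -> "DSP", Hexagon 7xx -> "HTP").
    """
    family = hexagon_model // 100  # e.g. 698 -> 6, 780 -> 7
    if family == 6:
        return "DSP"
    if family == 7:
        return "HTP"
    # Other families are not covered by the rule in this thread.
    raise ValueError(f"No label defined for Hexagon {hexagon_model}")


if __name__ == "__main__":
    for model in (698, 780):
        print(model, qnn_backend_label(model))
```

The sketch deliberately says nothing about supported precisions, since the post only states that the latest HTPs handle FP16 in addition to INT8.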

mr-robot · Jul 10, 2023

Andrey Ignatov said:
Yes, partly: the Hexagon 6xx family is denoted as DSPs in QNN, while the Hexagon 7xx family - as HTPs. Here you can find the full list of Hexagon processors.

There are also large architectural differences between these two families - the latest HTPs, for instance, are able to accelerate both FP16 and INT8 models, while all older Hexagon DSPs can run only quantized inference.

Thanks for the quick reply.

Are you familiar with the internal components of the HTP? E.g., the difference between the HTA, HVX and HMX: HVX being the "Hexagon Vector Extensions", HMX the "Hexagon Matrix Extension" and HTA the "Hexagon Tensor Acceleration" (which seems to be deprecated in later devices). Do you have any idea which underlying component in the HTP is executing during the benchmarks?

Andrey Ignatov · Jul 10, 2023

mr-robot said:
E.g., the difference between the HTA, HVX and HMX: HVX being the "Hexagon Vector Extensions", HMX the "Hexagon Matrix Extension" and HTA the "Hexagon Tensor Acceleration" (which seems to be deprecated in later devices)

This is a very brief answer, but the general idea is as follows:

HTP = rebranded compute DSP (since Snapdragon 888 / Hexagon v68): contains HVX and HMX co-processors / modules.

Note that both HVX and HMX modules are also present in other Hexagon DSPs without HTP.

HTA = additional co-processor dedicated to fixed-point NN inference in the Hexagon V66 (Snapdragon 855, 865, 870), not present in any other Hexagon DSPs.

mr-robot said:
Do you have any idea which underlying component in the HTP is executing during the benchmarks?

It's up to the Hexagon / QNN delegate to parse the model and run different layers on the most appropriate and performant HTP hardware.
