From the 2019 AI benchmark paper:
"For each corresponding test, the L1 loss is computed between the target and actual outputs produced by the deep learning models."
Ideally, when computing the error of a [rounded] result (by whatever metric you like), that result should be compared to an exact result, not to...
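To make the point concrete, here is a minimal sketch of an L1 comparison against an exact reference. It is not the benchmark's code. The helper names `to_half` and `l1_loss`, the sample values, and the use of half precision as the "rounded" format are all assumptions picked for illustration. The second comparison, against a target that was itself rounded, is an illustrative contrast chosen here and is not taken from the paper.

```python
import struct
from fractions import Fraction


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def l1_loss(actual, target) -> Fraction:
    """Mean absolute difference, computed in exact rational arithmetic."""
    if len(actual) != len(target):
        raise ValueError("sequences must have the same length")
    total = sum(abs(Fraction(a) - Fraction(t)) for a, t in zip(actual, target))
    return total / len(actual)


# Exact reference values (assumed sample data, stored as exact rationals).
exact = [Fraction(1, 3), Fraction(2, 3), Fraction(1, 10)]

# The "rounded" result: the same values squeezed into half precision.
rounded = [to_half(float(v)) for v in exact]

# Error measured against the exact reference: the rounding shows up.
err_vs_exact = l1_loss(rounded, exact)

# Error measured against a target that went through the same rounding:
# the rounding error cancels and the metric reports zero.
err_vs_rounded = l1_loss(rounded, [to_half(float(v)) for v in exact])

print(float(err_vs_exact), float(err_vs_rounded))
```

In this sketch the loss against the exact values is nonzero, while the loss against the rounded target is exactly zero, so the second number says nothing about how much precision was lost.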