Mobile AI Workshop General Organization Questions

Andrey Ignatov

Administrator
Staff member
I want to know why the website of the "Mobile AI & AIM 2025 4K Quantized Image Super-Resolution Challenge" is different from that of the "Mobile AI & AIM 2025 4K Image Super-Resolution Challenge".
Why is the equipment required for the "Mobile AI & AIM 2025 4K Image Super-Resolution Challenge" different from the equipment required for the "Mobile AI & AIM 2025 4K Quantized Image Super-Resolution Challenge"?
Because these are two different challenges (quantized vs. floating-point) with different requirements (target platforms, min. required accuracy).
 

Zachary

New member
Hello,

My submission for 4K Efficient Image Super-Resolution failed with the following error:

[two screenshots of the error attached]

The submission ID is #315531. Could you help me check if there are any issues with the submission file or the server?

Additionally, during the current development phase, are we required to submit only the image results, or should we also include a script or TFLite model?

Thank you for your support and clarification.
 

zhaoyu

New member
Dear Organizers,

I have some questions regarding the Efficient Stable Diffusion Challenge:

● The final competition score will take into account both the quality of the generated images and the runtime. Could you provide detailed image quality evaluation metrics so that we can self-evaluate during the development process?
● Since there is no automatic validation on Codalab, how can we evaluate the score of our algorithms during the competition? Will we have other submission opportunities to evaluate our algorithm's score during development, before the final submission?

Thank you for your efforts in the competition.
 

Aeri

New member
Hello,

I have a question regarding the Efficient LLM Challenge:

- When evaluating accuracy on 50 sentences, what are the max output length and max prompt length?
To determine the trade-off between runtime and accuracy when selecting a model,
the KV cache size, which depends on the number of tokens, cannot be ignored.
I'm wondering if there is any standard or guideline regarding this.
 

Re4Zen

New member
Dear Organizers,
I have a question regarding the "Mobile AI & AIM 2025 4K Quantized Image Super-Resolution Challenge",
  • The goal of this competition is 4K super-resolution, but the provided training and validation sets are 2K. Since the competition requirements state that the final test measures 4K runtime and PSNR, I'd like to ask how this 4K test is done: is 4K used only for speed, while PSNR is measured at 2K?
Thank you for your efforts in the competition.
 
Last edited:

Andrey Ignatov

Administrator
Staff member
Could you provide detailed image quality evaluation metrics so that we can self-evaluate during the development process?
As is stated in the challenge description:

Each image will be assessed based on the following three metrics:
  • Perceptual / artistic quality of the generated image
  • Accuracy / relevance with respect to the input prompt
  • Presence of various generation artifacts
The results for all 50 generated images will then be averaged, providing the model with a single "accuracy" score.

How can we evaluate the score of our algorithms during the competition? Will we have other submission opportunities to evaluate our algorithm's score during development, before the final submission?
Unfortunately, there are still no automatic tools that can provide a meaningful perceptual image score, so we have to rely on MOS results in this competition, which makes it impossible to perform any evaluation during the development phase. However, you can easily track regressions in the generated image quality by using the same input prompts and comparing model outputs. The same also applies to the model's runtime.
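As one possible way to do this, assuming a diffusers-based floating-point pipeline (the model name, prompt set, and file names below are placeholders, not part of the official setup):

import torch
from diffusers import StableDiffusionPipeline

# A fixed prompt set with fixed seeds makes outputs reproducible, so quality
# regressions show up as visible differences between runs.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cuda")

prompts = ["a photo of an astronaut riding a horse"]  # your fixed prompt set
for i, prompt in enumerate(prompts):
    generator = torch.Generator(device="cuda").manual_seed(42)  # fixed seed
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"check_{i:03d}.png")  # compare against outputs of earlier model versions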

When evaluating accuracy on 50 sentences, what are the max output length and max prompt length?
You can assume that all 50 questions will be independent; the length of each question will not exceed 500 words / 4000 characters.
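Regarding the KV-cache concern behind this question, a back-of-envelope estimate can guide the runtime/accuracy trade-off; every model parameter in this sketch is hypothetical:

# bytes = 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes_per_element
layers, kv_heads, head_dim = 32, 8, 128   # hypothetical model configuration
seq_len, fp16_bytes = 4096, 2
kv_bytes = 2 * layers * kv_heads * head_dim * seq_len * fp16_bytes
print(f"{kv_bytes / 2**20:.0f} MiB")      # 512 MiB for this configuration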

I'd like to ask how this 4K test is done, or is 4K just for speed and PSNR is 2k?
Yes: 2K is used for fidelity evaluation, and 4K for runtime measurements.
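For self-checking the fidelity part, a standard PSNR implementation looks like the sketch below (illustrative only; the official evaluation script may differ in details such as border cropping or color-space handling):

import numpy as np

def psnr(ref: np.ndarray, out: np.ndarray) -> float:
    # Standard PSNR for 8-bit images; assumes ref and out differ in at least one pixel.
    mse = np.mean((ref.astype(np.float64) - out.astype(np.float64)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)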
 

aimf_ys

New member
Dear organizers, I have a question regarding the "Mobile AI & AIM 2025 4K Quantized Image Super-Resolution Challenge"

After submitting, I got the following error log:

Traceback (most recent call last):
  File "/tmp/codalab/tmpAqpyAb/run/program/evaluation.py", line 95, in <module>
    raise Exception('Expected %d .png images'%len(ref_pngs))
Exception: Expected 100 .png images

- I've uploaded a zip file with 100 PNG images, but this error keeps occurring.
- Do you have any suggestions for solving this problem?
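For reference, a frequent cause of this Codalab error is an extra top-level folder inside the zip; below is a minimal packaging sketch that keeps the PNGs at the archive root (folder and file names are hypothetical):

import os
import zipfile

result_dir = "results"  # hypothetical local folder holding the 100 images
with zipfile.ZipFile("submission.zip", "w") as zf:
    for name in sorted(os.listdir(result_dir)):
        if name.endswith(".png"):
            # arcname=name places each image at the root of the archive,
            # with no enclosing folder.
            zf.write(os.path.join(result_dir, name), arcname=name)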
 

zhaoyu

New member
Dear Organizers,

I have some questions regarding the "Efficient Stable Diffusion Challenge":

● As described on the competition website, we need to submit a pre-trained model and an inference script for T2I testing. Could you provide more detailed test requirements, such as:
(1) The input and output specifications of the test script: is the input a list of prompts or just one prompt?
(2) Are the 50 prompts tested one by one (batch=1) or in batches (batch>1)?
(3) How is the model initialization time (weight loading) handled? Will it be counted in the final runtime or excluded?

More detailed testing requirements would be very helpful for our development. Thank you for your efforts in the competition.
 

Aeri

New member
Dear Organizers,

I have a question regarding the Efficient LLM Challenge:

- I want to know the exact evaluation metric of the challenge.
It says runtime is measured in tokens/s, and higher tokens/s is generally better.
But in the formula, runtime appears in the denominator.
To increase the score according to the formula, I'd need to reduce runtime, meaning tokens/s would decrease.
Does this imply that slower inference is advantageous?
The metric definition and formula don't seem to align clearly.
Please clarify the exact evaluation formula.
 

Andrey Ignatov

Administrator
Staff member
(1) The input and output specifications of the test script: is the input a list of prompts or just one prompt?
One prompt (string).

(2) Are the 50 prompts tested one by one (batch=1) or in batches (batch>1)?
With a batch size of 1.

(3) How is the model initialization time (weight loading) handled? Will it be counted in the final runtime or excluded?
No, weight initialization time will not be counted.
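Putting these three answers together, a test script could look like the hypothetical skeleton below (all file names and functions are illustrative, not an official template):

from PIL import Image

def load_model():
    # Weight loading happens once, outside the timed section, matching the
    # answer above that initialization time is excluded from the runtime.
    return None  # placeholder for the actual model object

def generate(model, prompt: str) -> Image.Image:
    # Single prompt (string) in, single image out, i.e. batch size 1; a blank
    # image stands in for real inference in this sketch.
    return Image.new("RGB", (512, 512))

if __name__ == "__main__":
    model = load_model()
    with open("prompts.txt") as f:        # hypothetical file, one prompt per line
        prompts = [line.strip() for line in f]
    for i, prompt in enumerate(prompts):  # timed section: 50 prompts, one by one
        generate(model, prompt).save(f"{i:02d}.png")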

But in the formula, runtime appears in the denominator.
[Tokens / s] ~ [1 / runtime], so there are no issues with the formula: a higher tokens/s rate leads to a higher score.
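As a toy numeric illustration (not the official scoring code; the score shape below is assumed only to show the direction of the relationship):

def score(accuracy: float, runtime: float) -> float:
    # Generic shape only: runtime appears in the denominator.
    return accuracy / runtime

# With a fixed token budget, tokens/s is proportional to 1 / runtime:
num_tokens = 1000
runtime_fast, runtime_slow = 5.0, 10.0        # seconds, hypothetical
tps_fast = num_tokens / runtime_fast          # 200 tokens/s
tps_slow = num_tokens / runtime_slow          # 100 tokens/s
print(score(1.0, runtime_fast) > score(1.0, runtime_slow))  # True: the faster model scores higher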
 

rlghksdbs

New member
Dear organizers,

We are currently participating in the MAI 2025 Image Denoising Challenge.

While reviewing the challenge guidelines, we noticed a potential contradiction and would like to kindly ask for clarification regarding the expected format of the submitted model:

- In the challenge summary (as shown on the MAI website), the "Runtime evaluation mode" is listed as "TFLite (LiteRT) FP32 models".
- However, in the runtime validation instructions, it is stated that we should evaluate our models on-device using "AI Benchmark (FP16 mode + TFLite GPU delegate)".

Given this, could you please clarify the following:
1. Are participants required to submit **FP32 TFLite models**, or is it acceptable (or even recommended) to submit **FP16-quantized TFLite models**?
2. Will **FP16 models** be evaluated properly on the final runtime evaluation devices (Snapdragon 8 Elite Adreno / Dimensity 9400 Mali GPU)?

We would greatly appreciate your guidance so we can ensure our model is submitted in the correct format.

3. Regarding the runtime evaluation and ranking:
➤ Which GPU is used for runtime scoring in the final ranking — **Snapdragon 8 Elite (Adreno GPU)** or **Dimensity 9400 (Mali GPU)**?
➤ Or is the slowest runtime among the two used as the official score?
 

Andrey Ignatov

Administrator
Staff member
Are participants required to submit **FP32 TFLite models**, or is it acceptable (or even recommended) to submit **FP16-quantized TFLite models**?
The TFLite GPU delegate automatically casts FP32 models to FP16, so no additional conversion is needed; it would not bring any latency improvements.
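In other words, a plain FP32 export is sufficient; a minimal conversion sketch (the tiny stand-in network is just a placeholder for an actual denoising model):

import tensorflow as tf

model = tf.keras.Sequential(
    [tf.keras.layers.Conv2D(3, 3, padding="same", input_shape=(None, None, 3))]
)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()  # plain FP32; the GPU delegate casts to FP16 at runtime
with open("model.tflite", "wb") as f:
    f.write(tflite_model)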

Which GPU is used for runtime scoring in the final ranking — **Snapdragon 8 Elite (Adreno GPU)** or **Dimensity 9400 (Mali GPU)**?
The average of the two runtimes is used.
 

szyang

New member
Question regarding the Mobile AI & AIM 2025 Real Image Denoising Challenge
Hello,

I am a participant in the Mobile AI & AIM 2025 Real Image Denoising Challenge. May I ask if it is possible to proceed to the final testing phase if I have not participated in the development phase? And how can I obtain the noisy test images?
When I submitted, I encountered the following problem:
[screenshot of the submission error attached]

Thank you!
 

zhaoyu

New member
Dear Organizers,

I have some questions regarding the "Efficient Stable Diffusion Challenge"

The submission file is too big to upload to Codalab; can I submit it by email?


Thank you for your efforts in the competition.
 

Zachary

New member
Dear Organizers,

I have a question regarding the Efficient Image Super-Resolution Challenge. Is it allowed to use datasets other than DIV2K for training purposes?

Thank you for your clarification.
 

Chanson

New member
Dear Organizers,

I have three questions regarding the AIM 2025 4K Image Super-Resolution Challenge.

The Get Data section on Codalab suggests the input shape is [1, 720, 1280, 3], which is NHWC. Does this mean a permute operation is applied to the original RGB image ([1, 3, 720, 1280])?
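For reference, the permute in question would look like this in NumPy (shapes taken from the question; illustrative only):

import numpy as np

img_nchw = np.zeros((1, 3, 720, 1280), dtype=np.float32)  # NCHW layout
img_nhwc = np.transpose(img_nchw, (0, 2, 3, 1))           # NHWC layout
print(img_nhwc.shape)  # (1, 720, 1280, 3)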

You say participants don't need to submit “model_none.tflite” files; does this mean that only model.tflite needs to be submitted?

In the final test phase, no test data is provided, but my submission status shows failed: "Expected 100 .png images but found 0 in the results directory". Can I ignore this error, and will you still receive the zip archive even though the status is failed?

Thank you!
 

Andrey Ignatov

Administrator
Staff member
The submission file is too big to upload to Codalab; can I submit it by email?
Please upload your executables to a separate shared storage platform and provide the corresponding link in the factsheet submitted to Codalab.

I have a question regarding the Efficient Image Super-Resolution Challenge. Is it allowed to use datasets other than DIV2K for training purposes?
Yes.
 

szyang

New member
1751768328067.png
Dear Organizers,
I have a question regarding the Mobile AI & AIM 2025 Real Image Denoising Challenge. Is it mandatory to submit papers due on July 9th?
Thank you for your support and clarification!
 

Andrey Ignatov

Administrator
Staff member
I have a question regarding the Mobile AI & AIM 2025 Real Image Denoising Challenge. Is it mandatory to submit papers due on July 9th?
No, this is just an opportunity for you to have a paper describing your solution published in the ICCV proceedings.
 