All questions related to the Real Image Denoising Challenge can be asked in this thread.
Is a TFLite model required to work on images with different shapes? Is it enough to work with 3000x4000 images like the ones in the given data?

No, the input size of the TFLite model should be constant. It will be equal to ~HD resolution; we will send the exact instructions soon, when we open the runtime validation submission.
How to balance the PSNR and runtime?
Which metric is the final ranking based on?
How to report the runtime values for the submission for now?
Is model quantization allowed?
"There was an error retrieving the file. Please try again later or report the issue."
What is the input shape for the TFLite model?
[1, 480, 720, 3]
The size of the input tensor should be [1, 480, 720, 3]. Just sent an email with clarifications to all challenge participants.

Thanks!
Does the evaluation system measure the latency if I submit a TFLite model?
I think the score is not reasonable. When using this score, everyone will tend to go faster instead of getting a reasonable PSNR. Take an extreme situation: if my model has just one identity block, will the score be very high?

The final submission score will be proportional to the fidelity scores (PSNR) and inversely proportional to the runtime. The exact scoring formula will be announced a bit later.

The score indicates that a PSNR 1dB higher is equivalent to half of the running time. Obviously, if the model size is reduced to approximately 0, the output PSNR will be equal to the input PSNR, and the score would be very high. A better solution would be one where halving the runtime is equivalent to an increase of >1dB PSNR.
Emmmmm... is this score right? I think it can't solve the problem I mentioned before.

@haha, @chengshen, you are right, that was actually our mistake (we were previously planning to have one additional loss component increasing the impact of fidelity scores). It's fixed now.
I think there should be a threshold for PSNR to make sure our model actually makes sense. For example, the score should be calculated if and only if PSNR > THRES. THRES can be dependent on some baseline such as a bilateral filter.

PSNR of the trivial identity mapping solution is ~33dB. Thus, at the same runtime, the score of this solution will be ~2^10 = 1024 times lower compared to the score of the solutions with 38dB PSNR. Since the runtime of the above trivial solution will anyway be around several milliseconds, there won't be any problems with the scores.
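For intuition, the 1024x factor above is consistent with a score that grows exponentially with PSNR. The formula below is an assumption inferred from the quoted numbers, not the official one:

    \text{score} \propto \frac{2^{\,2 \cdot \mathrm{PSNR}}}{\text{runtime}}, \qquad 2^{\,2 \cdot (38 - 33)} = 2^{10} = 1024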
Dear organizers, should our scripts run the TFLite model, so that we submit the output of the converted TFLite model instead of the original model? And should our scripts crop the HD images first, because the input size of the TFLite model is [480, 720]?
If it is released, some participants could produce denoised images with a very large network and use them for training small networks.
Could the organizers update the runtime evaluation results? The results after Mar. 8th have not been released so far. Or could we get the runtime some other way?

As was mentioned in the email sent yesterday, all runtime evaluation results will be published here:
https://docs.google.com/spreadsheet...fEmAbbZ2fWntgZS5uigJos9gawt2-0bq1pLRa/pubhtml
you will need to submit two TFLite models: processing the original resolution images and 480x720 crops

Andrey said the model with input size 480x720 is only used to calculate the runtime.

Hi! I thought the test dataset would be 480x720, as you mentioned above, but the test dataset shape is the same as the validation dataset (the first dataset we used). Could you please tell us what happened?
Yes, you can use this script to produce the results using the obtained TFLite model. Since you need to submit a normal floating-point TFLite network, its results should be identical to the ones obtained with the original TF model.

No, you should process the original full-resolution images without any cropping. The above resolution is used only to validate the runtime of your model. During the final phase, you will need to submit two TFLite models: one processing the original resolution images and one for 480x720 crops.
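For reference (this is not the organizers' script, which isn't reproduced here), producing outputs with a dynamic-shape TFLite model generally follows the pattern below; the file name and the 3000x4000 resolution are assumptions based on this thread:

    import numpy as np
    import tensorflow as tf

    # Load the dynamic-shape model; "model_none.tflite" is an assumed file name.
    interpreter = tf.lite.Interpreter(model_path="model_none.tflite")
    inp = interpreter.get_input_details()[0]

    # Resize the dynamic input to the full image resolution, then allocate buffers.
    interpreter.resize_tensor_input(inp["index"], [1, 3000, 4000, 3])
    interpreter.allocate_tensors()

    image = np.zeros((1, 3000, 4000, 3), dtype=np.float32)  # stand-in for a loaded noisy image
    interpreter.set_tensor(inp["index"], image)
    interpreter.invoke()

    out = interpreter.get_output_details()[0]
    denoised = interpreter.get_tensor(out["index"])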
1. Do TFLite models with a "static size input" work on mobile GPUs? 2. If so, how do you create such a TFLite model?

Just add a TF / Keras input placeholder with shape [1, None, None, 3] before converting your model. The TFLite model with dynamic input shape will only be used for checking fidelity scores.

Thank you for the response!
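For illustration, a minimal TF2 / Keras sketch of such an input placeholder; the one-layer body is a stand-in, not a suggested architecture:

    import tensorflow as tf

    # Batch dimension fixed to 1; height and width left dynamic (None).
    inputs = tf.keras.Input(shape=(None, None, 3), batch_size=1, name="input")
    # One conv layer as a stand-in for an actual denoising network.
    outputs = tf.keras.layers.Conv2D(3, 3, padding="same")(inputs)
    model = tf.keras.Model(inputs, outputs)

    print(model.input_shape)  # (1, None, None, 3)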
I tried

    x_ = tf.compat.v1.placeholder(tf.float32, [1, None, None, 3], name="input")

but conversion fails with:

    ValueError: None is only supported in the 1st dimension. Tensor 'input' has invalid shape '[1, None, None, 3]'.

Please give some advice, thanks a lot.
As was written in the email, set the experimental_new_converter option to True when converting a model with None dimensions.
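A minimal conversion sketch with that flag, assuming a TF2 / Keras model like the one above (the output file name is illustrative, following the "model_none" naming mentioned elsewhere in this thread):

    import tensorflow as tf

    # A stand-in model with a [1, None, None, 3] input, as in the earlier sketch.
    inputs = tf.keras.Input(shape=(None, None, 3), batch_size=1, name="input")
    outputs = tf.keras.layers.Conv2D(3, 3, padding="same")(inputs)
    model = tf.keras.Model(inputs, outputs)

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.experimental_new_converter = True  # needed for None dimensions
    tflite_model = converter.convert()

    with open("model_none.tflite", "wb") as f:
        f.write(tflite_model)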
Thank you very much!! It seems to be working.

How would you check that the model for runtime and the model for fidelity are exactly the same?

It's not a problem to check if two TFLite models are the same except for the input dimensions.

Is it okay to submit a model with a [1, 2432, 3200, 3] input instead of "model_none" for the fidelity verification?

No, the input size of the model should be None. You may assume that the resolution of the test images will be 3000x4000px, though we cannot guarantee that the model won't be tested on images of a different size.
When we convert the TFLite model with a [1, None, None, 3] input successfully, is it correct that the input and output of the .tflite file are both [1, 1, 1, 3] when checked with Netron?
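For what it's worth, TFLite typically stores dynamic dimensions as 1 in the static shape (which is what Netron displays) and as -1 in the shape_signature field. A quick sketch to check this, assuming the "model_none.tflite" name used above:

    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="model_none.tflite")
    detail = interpreter.get_input_details()[0]
    print(detail["shape"])            # static placeholder shape, e.g. [1 1 1 3]
    print(detail["shape_signature"])  # dynamic dimensions appear as -1, e.g. [ 1 -1 -1  3]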
I cannot find the runtime results of the updated submissions after Mar. 8th at the spreadsheet link above. Or have the organizers published these results by some other means?

Samsung is working on this, and will hopefully add new results shortly. For now, please test the models on your own smartphone's GPU with AI Benchmark, as the obtained runtime is usually proportional to the one on the Exynos Mali GPU.
When I submitted the final result on CodaLab, I encountered some problems; can you give me some suggestions? I submitted the same result three times, but all attempts were warned with 'Execution time limit exceeded' and marked as 'failed'. Today I found that the platform generated a submission marked as 'finished'. Does that mean I have successfully submitted one result and still have 2 chances remaining?

As far as I can see, all your submissions are successful now.
I guess the runtime measurement is extremely inaccurate: I got a 4x slower result even when I submitted the same model. The model was submitted before, and it ran in ~2.5s.

The reported speed of our model with submission ID 833911 is 0.06s; however, it is a very big model that runs in 6~7s on my own device. Please check this problem and the subsequent submissions following ID 833911. Thanks very much.