
Question on PSNR/SSIM Calculation for 3-task and 5-task Results #17

Open
Aitical opened this issue Mar 2, 2025 · 2 comments

Aitical commented Mar 2, 2025

Thank you for sharing your excellent work on "AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation". I have a question regarding the evaluation process described in the paper.

I tested the provided models (adair3d.ckpt and adair5d.ckpt) and noticed that the PSNR and SSIM results I obtained differ significantly from the values in the paper's tables. For example, when evaluating the pretrained models on the Rain100L/SOTS datasets, I found a notable gap between my results and the ones reported in the paper. (I used the metric calculation functions in BasicSR for evaluation.)
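For reference, this is roughly how I computed the scores with BasicSR (a sketch only; the file paths are placeholders and crop_border=0 is my own setting, not something specified in the repo):

```python
import cv2
from basicsr.metrics import calculate_psnr, calculate_ssim

# Load restored output and ground truth as uint8 BGR arrays in [0, 255].
# Paths are placeholders for illustration.
restored = cv2.imread('results/rain100l/rain-001.png')
gt = cv2.imread('datasets/Rain100L/target/rain-001.png')

# BasicSR expects HWC images in the [0, 255] range.
psnr = calculate_psnr(restored, gt, crop_border=0, test_y_channel=False)
ssim = calculate_ssim(restored, gt, crop_border=0, test_y_channel=False)
print(f'PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}')
```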

[Screenshot: 3-task evaluation results]

[Screenshot: 5-task evaluation results]

For the released test results on Rain100L, I observed that the images appear to have been cropped by one pixel along each edge: the original image size is (481, 321), but the released result is (480, 320). Even after aligning the cropped images to account for this discrepancy, the evaluation metrics are still approximately 0.14 lower than those reported.

[Screenshot: metrics after alignment]
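For the alignment step, I searched over every possible crop offset of the GT and kept the best score. A minimal sketch of the idea (paths are placeholders, and it reuses BasicSR's PSNR purely for illustration):

```python
import cv2
from basicsr.metrics import calculate_psnr

# Placeholder paths; the released result is one pixel smaller than the GT
# in each dimension.
restored = cv2.imread('released_results/rain-001.png')
gt = cv2.imread('datasets/Rain100L/target/rain-001.png')

dh = gt.shape[0] - restored.shape[0]
dw = gt.shape[1] - restored.shape[1]

# Since the crop side is unknown, try every alignment of the smaller image
# inside the GT and keep the best score.
best = max(
    calculate_psnr(
        restored,
        gt[y:y + restored.shape[0], x:x + restored.shape[1]],
        crop_border=0,
    )
    for y in range(dh + 1)
    for x in range(dw + 1)
)
print(f'Best aligned PSNR: {best:.2f} dB')
```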

How exactly are the PSNR and SSIM values computed for the 3-task and 5-task settings?
Are there any specific pre-processing or evaluation details (e.g., cropping strategy, alignment procedures) that might affect the final metrics?

Thank you very much for your time and assistance!

c-yn (Owner) commented Mar 2, 2025

No additional pre-processing or tricks are used to obtain the scores.
Can you obtain the reported scores using our code as is, instead of BasicSR?

Aitical (Author) commented Mar 3, 2025

Thank you for your prompt reply. Following your suggestion, I ran the tests using your test script instead of BasicSR, and the results were consistent with your reported scores.

[Screenshot: results from the authors' test script]

Upon comparing the testing pipelines, I suspect the discrepancy arises from image format handling, specifically whether the images are quantized to uint8 before the metrics are computed.

In your provided code, the restored results are obtained by clipping the model output to [0, 1] and then directly computing the metrics against the GT images. However, BasicSR converts the restored results to uint8 (essentially saving the images before computing metrics), which appears to lead to the differences observed.
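To illustrate the effect, here is a minimal, self-contained sketch (my own, not taken from either codebase) comparing PSNR computed on float predictions in [0, 1] against PSNR computed after a uint8 round-trip:

```python
import numpy as np
import torch

def psnr_float(pred: torch.Tensor, gt: torch.Tensor) -> float:
    # Metrics computed directly on float tensors clipped to [0, 1],
    # as the repo's test script appears to do.
    pred = pred.clamp(0, 1)
    mse = torch.mean((pred - gt) ** 2).item()
    return 10 * np.log10(1.0 / mse)

def psnr_uint8(pred: torch.Tensor, gt: torch.Tensor) -> float:
    # Metrics computed after quantizing to uint8, which is effectively what
    # happens when images are saved to disk and reloaded before evaluation.
    # (GT images from disk are already uint8, so in practice the prediction's
    # quantization is what changes.)
    pred8 = (pred.clamp(0, 1) * 255.0).round().byte().float()
    gt8 = (gt.clamp(0, 1) * 255.0).round().byte().float()
    mse = torch.mean((pred8 - gt8) ** 2).item()
    return 10 * np.log10(255.0 ** 2 / mse)

# Synthetic example: the two values already diverge slightly; on real
# restored images the gap can plausibly reach the ~0.1 dB level observed.
gt = torch.rand(3, 64, 64)
pred = (gt + 0.02 * torch.randn_like(gt)).clamp(0, 1)
print(psnr_float(pred, gt), psnr_uint8(pred, gt))
```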

This would also explain why the provided image results still show a discrepancy, especially with the Rain100L dataset, which seems particularly sensitive to the numerical differences induced by the format conversion.

Could you please confirm if this is the expected behavior?
