
Question on PSNR/SSIM Calculation for 3-task and 5-task Results #17

Open
Aitical opened this issue Mar 2, 2025 · 2 comments

Aitical commented Mar 2, 2025

Thank you for sharing your excellent work on "AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation". I have a question regarding the evaluation process described in the paper.

I tested the provided models (adair3d.ckpt and adair5d.ckpt) and noticed that the PSNR and SSIM results I obtained differ significantly from the values in the paper's tables. For example, when evaluating the pretrained models on the Rain100L/SOTS datasets, I found a notable gap between my results and the ones reported in the paper. (I used the metric calculation functions in BasicSR for evaluation.)
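For reference, this is roughly how I computed the scores with BasicSR (a sketch only; the file paths are placeholders and crop_border=0 is my own setting, not something specified in the repo):

```python
import cv2
from basicsr.metrics import calculate_psnr, calculate_ssim

# Load restored output and ground truth as uint8 BGR arrays in [0, 255].
# Paths are placeholders for illustration.
restored = cv2.imread('results/rain100l/rain-001.png')
gt = cv2.imread('datasets/Rain100L/target/rain-001.png')

# BasicSR expects HWC images in the [0, 255] range.
psnr = calculate_psnr(restored, gt, crop_border=0, test_y_channel=False)
ssim = calculate_ssim(restored, gt, crop_border=0, test_y_channel=False)
print(f'PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}')
```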

[Screenshot: 3-task evaluation results]

[Screenshot: 5-task evaluation results]

For the released test results on Rain100L, I observed that the images appear to have been cropped by one pixel along each edge: the original image size is (481, 321), but the released result is (480, 320). Even after aligning the cropped images to account for this discrepancy, the evaluation metrics are still approximately 0.14 lower than those reported.

[Screenshot: metrics after alignment]
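For the alignment step, I searched over every possible crop offset of the GT and kept the best score. A minimal sketch of the idea (paths are placeholders, and it reuses BasicSR's PSNR purely for illustration):

```python
import cv2
from basicsr.metrics import calculate_psnr

# Placeholder paths; the released result is one pixel smaller than the GT
# in each dimension.
restored = cv2.imread('released_results/rain-001.png')
gt = cv2.imread('datasets/Rain100L/target/rain-001.png')

dh = gt.shape[0] - restored.shape[0]
dw = gt.shape[1] - restored.shape[1]

# Since the crop side is unknown, try every alignment of the smaller image
# inside the GT and keep the best score.
best = max(
    calculate_psnr(
        restored,
        gt[y:y + restored.shape[0], x:x + restored.shape[1]],
        crop_border=0,
    )
    for y in range(dh + 1)
    for x in range(dw + 1)
)
print(f'Best aligned PSNR: {best:.2f} dB')
```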

How exactly are the PSNR and SSIM values computed for the 3-task and 5-task settings?
Are there any specific pre-processing or evaluation details (e.g., cropping strategy, alignment procedures) that might affect the final metrics?

Thank you very much for your time and assistance!

c-yn (Owner) commented Mar 2, 2025

No additional pre-processing or tricks are used to obtain the scores.
Can you obtain the reported scores using our code as is, instead of BasicSR?

Aitical (Author) commented Mar 3, 2025

Thank you for your prompt reply. Following your suggestion, I ran the tests using your test script instead of BasicSR, and the results were consistent with your reported scores.

[Screenshot: results from the authors' test script]

Upon comparing the testing pipelines, I suspect the discrepancy arises from image format handling, specifically whether the images are quantized to uint8 before the metrics are computed.

In your provided code, the restored results are obtained by clipping the model output to [0, 1] and then directly computing the metrics against the GT images. However, BasicSR converts the restored results to uint8 (essentially saving the images before computing metrics), which appears to lead to the differences observed.
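To illustrate the effect, here is a minimal, self-contained sketch (my own, not taken from either codebase) comparing PSNR computed on float predictions in [0, 1] against PSNR computed after a uint8 round-trip:

```python
import numpy as np
import torch

def psnr_float(pred: torch.Tensor, gt: torch.Tensor) -> float:
    # Metrics computed directly on float tensors clipped to [0, 1],
    # as the repo's test script appears to do.
    pred = pred.clamp(0, 1)
    mse = torch.mean((pred - gt) ** 2).item()
    return 10 * np.log10(1.0 / mse)

def psnr_uint8(pred: torch.Tensor, gt: torch.Tensor) -> float:
    # Metrics computed after quantizing to uint8, which is effectively what
    # happens when images are saved to disk and reloaded before evaluation.
    # (GT images from disk are already uint8, so in practice the prediction's
    # quantization is what changes.)
    pred8 = (pred.clamp(0, 1) * 255.0).round().byte().float()
    gt8 = (gt.clamp(0, 1) * 255.0).round().byte().float()
    mse = torch.mean((pred8 - gt8) ** 2).item()
    return 10 * np.log10(255.0 ** 2 / mse)

# Synthetic example: the two values already diverge slightly; on real
# restored images the gap can plausibly reach the ~0.1 dB level observed.
gt = torch.rand(3, 64, 64)
pred = (gt + 0.02 * torch.randn_like(gt)).clamp(0, 1)
print(psnr_float(pred, gt), psnr_uint8(pred, gt))
```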

This would also explain why the provided image results still show a discrepancy, especially with the Rain100L dataset, which seems particularly sensitive to the numerical differences induced by the format conversion.

Could you please confirm if this is the expected behavior?
