
Retrieval accuracy different from official JAX/FLAX implementation #11

Open
cwq159 opened this issue Aug 11, 2021 · 1 comment

cwq159 commented Aug 11, 2021

I wonder why the Retrieval accuracy is almost 20% higher than that of the official JAX/FLAX implementation.
As the paper says, "While we achieve consistent results reported in (Tay et al. 2020) for most tasks in our PyTorch reimplementation, the performance on Retrieval task is higher for all models following the hyperparameters in (Tay et al. 2020)."
Is there any difference aside from the hyperparameters?

mlpen (Owner) commented Aug 26, 2021

Hi, sorry for the late response.
We actually asked the LRA authors about this issue, but it has not been completely resolved:
google-research/long-range-arena#18
We suspect that differences in hyper-parameters might be one of the reasons.
However, even after checking the latest LRA repo just now, it is still unclear exactly which hyper-parameters were used and how the baselines were compared in the original paper.
We used the data processing code from the LRA repo and only rewrote the model and training implementation.
So the answer is still not clear.
