
A question about "context" #26

Open · 1185307269 opened this issue Mar 2, 2023 · 2 comments

Comments

@1185307269

What does the --context parameter in this command represent?
If I change --context "1" to --context "2", the generated files differ in the number at the beginning of each sequence.

```
python3 sample.py --model ${model} --t 0.8 --p 0.9 --max-length 1024 --num-samples 2 --context "1"
```
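
My guess is that --context is just the prompt string: it gets tokenized and the model continues it, which would explain why the "1" you pass shows up verbatim at the start of every generated sequence. A minimal sketch of that reading (the tokenizer/model objects and the Hugging Face-style generate() call are my assumptions, not this repo's actual code):

```python
import torch

@torch.no_grad()
def sample(model, tokenizer, context="1", t=0.8, p=0.9,
           max_length=1024, num_samples=2):
    # Tokenize the --context string and use it as the generation prompt.
    ids = tokenizer.encode(context, return_tensors="pt")
    out = model.generate(
        ids,
        do_sample=True,                    # sample instead of greedy decoding
        temperature=t,                     # --t
        top_p=p,                           # --p
        max_length=max_length,             # --max-length
        num_return_sequences=num_samples,  # --num-samples
    )
    return [tokenizer.decode(row) for row in out]
```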

@Sireesiru

I have the same question.

@aeolianine

I first thought it indicated whether the sequence should match sequences from RefSeq (1) or BFD (2). On second thought, after looking at some of their examples, I think it signals "forward" vs. "reverse": they fed both directions into training, and the "1" and "2" tokens preserve the knowledge of that direction.
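
If that's right, a sample drawn with --context "2" comes out C-terminus-first and has to be flipped to recover the conventional N-to-C sequence. A small sketch of that post-processing, assuming forward samples look like "1<seq>2" and reverse samples like "2<reversed seq>1" (my reading of the token scheme, not confirmed by the repo):

```python
def to_forward(sample: str) -> str:
    # Drop the leading/trailing direction tokens, then un-reverse
    # any sample that was generated C-terminus-first.
    body = sample.strip("12")
    return body if sample.startswith("1") else body[::-1]

print(to_forward("1MKVLA2"))  # -> 'MKVLA'
print(to_forward("2ALVKM1"))  # -> 'MKVLA'
```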

I find training on forward and reverse sequences at the same time a bit counter-intuitive. Only the "forward" protein is functional in nature. If we pretended everything ran in reverse and trained on that, it should be equivalent to training in the forward direction: the model can learn the relationships regardless of direction. But if you feed it both, the model learns that both directions are possible (likely). That is not true. Maybe the "1" and "2" tokens are a way of dealing with that, since they let the model condition on which direction it is reading.
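
To make that concrete, here is roughly how I picture the bidirectional training examples being built (again a sketch of my reading, not the actual data pipeline):

```python
def make_training_examples(seq: str) -> list[str]:
    # Serialize one protein in both reading directions, with "1" and "2"
    # acting as direction markers so the model always knows which way a
    # sequence runs and never has to treat both orders of the same
    # residues as unconditionally likely.
    forward = "1" + seq + "2"        # N-terminus -> C-terminus
    reverse = "2" + seq[::-1] + "1"  # same residues, C-terminus first
    return [forward, reverse]

print(make_training_examples("MKVLA"))
# ['1MKVLA2', '2ALVKM1']
```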

So yeah, why train in both directions?
