I think the "readable" files are not proper TSVs. The separator between the question and the answer is not a tab (and sometimes it's just a single space), which makes the files hard to read with a program or a TSV processor.
Thanks for the feedback—you're right, the .tsv extension is not technically right. But I did it that way so you can read them directly in a text editor or browser, e.g., in the link you gave above.
To read them in automatically, just chop off the first 50 characters for the question and trim whitespace. (50 characters was the length limit for the questions during annotation.)
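For anyone who wants to script this, here is a minimal Python sketch of the fixed-width split described above. It assumes each line holds only a question (padded or truncated to 50 characters) followed by the answer; the function names and the file path argument are hypothetical, and the real files may contain other lines (such as sentence text or IDs) that need separate handling.

```python
QUESTION_WIDTH = 50  # question length limit during annotation, per the comment above


def parse_readable_line(line, width=QUESTION_WIDTH):
    """Split one line into (question, answer) at a fixed column.

    Assumption: the question occupies the first `width` characters
    and everything after that is the answer.
    """
    line = line.rstrip("\n")
    question = line[:width].strip()
    answer = line[width:].strip()
    return question, answer


def read_readable_file(path):
    """Read all non-blank lines of a file as (question, answer) pairs."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                pairs.append(parse_readable_line(line))
    return pairs
```

Because the split is by column rather than by delimiter, this also covers the case where a full-length question is followed by just a single space.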
Yes, some questions are repeated. In dev and test, we had multiple annotators write QA pairs for each set of target words, so duplicates will be especially common in those partitions. The example you give is one of those.
Hope that resolves your questions. I'll leave this issue open until I get around to updating the readme with clarifications.
Hello there,
Nitpicky issues: