This repo contains the TextbookNER dataset.
The files are in CoNLL format: each line contains one token and its corresponding named-entity label, separated by a tab. Document boundaries are denoted by blank lines.
- Other: Other/Out (not part of a named entity)
- Person: Person named entity
- Location: Location named entity
- Organization: Organization named entity
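The file layout described above (one tab-separated token/label pair per line, blank lines between documents) can be read with a few lines of Python. This is a minimal sketch, not a loader shipped with the dataset: the function name `read_conll` and the sample tokens are hypothetical, and the label strings in the sample (`Person`, `Other`, `Location`) are assumed spellings that should be checked against the actual files.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str]  # (token, label)


def read_conll(lines: Iterable[str]) -> List[List[Token]]:
    """Parse tab-separated token/label lines into a list of documents."""
    documents: List[List[Token]] = []
    current: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current document (if it has any tokens).
            if current:
                documents.append(current)
                current = []
            continue
        # Split on the first tab only; a line without a tab raises ValueError.
        token, label = line.split("\t", 1)
        current.append((token, label))
    # Flush the last document when the input does not end with a blank line.
    if current:
        documents.append(current)
    return documents


# Made-up sample; label spellings are an assumption, not taken from the data.
sample = "Marie\tPerson\nlives\tOther\n\nParis\tLocation\n"
docs = read_conll(sample.splitlines())
```

To read a real file, pass an open file handle instead of the sample lines, for example `read_conll(open(path, encoding="utf-8"))`, where `path` points to one of the dataset files.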
The dataset is annotated using the IO tagging scheme.
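Because IO tagging has no explicit begin marker, one common way to recover entity mentions is to merge runs of consecutive tokens that share the same non-outside label. The sketch below illustrates that idea under stated assumptions: the helper name `io_spans` is hypothetical, the outside label is assumed to be spelled `Other`, and the example tokens are made up.

```python
from itertools import groupby
from typing import List, Tuple


def io_spans(tokens: List[Tuple[str, str]], outside: str = "Other") -> List[Tuple[str, str]]:
    """Merge consecutive tokens with the same non-outside label into spans."""
    spans: List[Tuple[str, str]] = []
    # groupby yields one group per run of adjacent tokens with an equal label.
    for label, group in groupby(tokens, key=lambda pair: pair[1]):
        if label != outside:
            text = " ".join(token for token, _ in group)
            spans.append((text, label))
    return spans


# Made-up document; label spellings are an assumption, not taken from the data.
doc = [
    ("Marie", "Person"),
    ("Curie", "Person"),
    ("visited", "Other"),
    ("New", "Location"),
    ("York", "Location"),
]
entities = io_spans(doc)
```

Note that under the IO scheme two adjacent entities of the same type cannot be told apart, so this grouping merges them into a single span.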
If you find this dataset useful, please cite:
@phdthesis{das2023genderbias,
title={Automated Gender Bias Identification in Textbooks},
author={Das, Sudeshna},
year={2023},
school={IIT Kharagpur}
}