Commit 62d88a6 (authored Aug 17, 2023): Update README.md

‎scripts/README.md

@@ -25,3 +25,27 @@ The steps are as follows:
where `ratio` specifies the fraction of the dataset to be shuffled, `tags_file` is the original annotation file created in Step 1, and `save_dir` specifies the directory where the shuffled file will be stored. After running the command, you'll find the shuffled file at `DirectProbe/data/final_dataset/gu_muril-large_SS-SR/100_perc/42/entities/train.txt`. For computing CRA scores, keep the ratio at 1. To replicate the trend plots in the paper, which consider different ratios of the dataset, change the `ratio` parameter; we used ratios of 0.05, 0.1, 0.25, 0.5, 0.75, and 1 in our plots. For each ratio, shuffles were created for each of the five seeds mentioned before.
4. After generating the random shuffle file(s), rerun DirectProbe with a modified config that points to the shuffled data file: the `common2` field should point to the directory containing the shuffled data, and `output_path` can also be updated so the results are stored in a separate folder (a sketch of these two edits is given after this list). For convenience, we provide the [updated config](https://github.com/utahnlp/weak-verifiers/blob/main/scripts/misc/config_shuff.ini) as well. Run DirectProbe with the updated config as shown in Step 2 and note the number of clusters. Repeat this for multiple random seeds and average the cluster counts. This average will be `C_rand`.
5. Once you have `C_org` and `C_rand`, the CRA score = 1 - (`C_org`/`C_rand`). Unfortunately, the averaging to get `C_rand` and the final CRA computation have to be done manually; a small sketch of this arithmetic is also given below.
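For Step 4, the two config edits can also be scripted. The snippet below is only a rough sketch under assumptions: the section name (`probe`), the original config path, and the `output_path` value are hypothetical placeholders, and the linked `config_shuff.ini` remains the authoritative reference for DirectProbe's actual config layout.

```python
# Rough sketch (not the repository's tooling): point a DirectProbe config at the
# shuffled data. The section/key layout is assumed; see the linked config_shuff.ini
# for the real structure.
import configparser

config = configparser.ConfigParser()
config.read("DirectProbe/config.ini")  # original config (assumed path)

section = "probe"  # hypothetical section name
# Directory containing the shuffled data (from Step 3 above).
config[section]["common2"] = (
    "DirectProbe/data/final_dataset/gu_muril-large_SS-SR/100_perc/42/entities"
)
# Keep results of the shuffled run in a separate folder (hypothetical path).
config[section]["output_path"] = "DirectProbe/results/gu_muril-large_SS-SR_shuffled"

with open("config_shuff.ini", "w") as f:
    config.write(f)
```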
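For Step 5, the manual arithmetic is just an average over seeds followed by the CRA formula. A minimal sketch, with hypothetical cluster counts standing in for the numbers DirectProbe reports:

```python
# Manual CRA computation from Step 5. The counts below are hypothetical
# placeholders; substitute the cluster counts DirectProbe reports for your runs.
c_org = 120                                   # clusters on the original data
c_rand_per_seed = [310, 295, 305, 300, 290]   # clusters for each shuffled-data seed

c_rand = sum(c_rand_per_seed) / len(c_rand_per_seed)  # average over the seeds
cra_score = 1 - (c_org / c_rand)

print(f"C_rand = {c_rand:.1f}, CRA = {cra_score:.3f}")
```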
# Running Baseline Models
We release scripts to train two kinds of baseline models: i) models where gold adpositions are provided, and ii) models where gold adpositions are not provided. To run the training for all the representations mentioned in the paper, run:
```console
(<venv_name>)foo@bar: weak-verifiers/scripts/$ sh run_snacs_classifier.sh
```
and
```console
(<venv_name>)foo@bar: weak-verifiers/scripts/$ sh run_adpplussnacs_classifier.sh
```
respectively. Note that each run uses a single seed. To train with other seeds, set the global variable `SEED` in `adp_classifier.py` or `seq_tagger.py` and re-run the corresponding shell script (one way to script this is sketched after this paragraph). <br>
The dev and test metrics for the best model, alongside the best model parameters, will be stored in the specified model directory, with sub-folders created for the representation, the label set, and the seed. <br><br>
More tweaks are possible for training. These can be passed as command line arguments. See the [`parse_args` method](https://github.com/utahnlp/weak-verifiers/blob/a177a05820d43d7fe496188cd5a12ca30a13c71a/scripts/seq_tagger.py#L681) in either file for more details.
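If editing `SEED` by hand for every run is tedious, a small driver script can do it. This is only a convenience sketch, not part of the repository: it assumes the training script contains a line of the form `SEED = <number>`, and the seed values and script/runner pairing below are illustrative placeholders.

```python
# Convenience sketch (not part of the repository): rewrite the SEED global in a
# training script and re-run the corresponding shell script for several seeds.
import re
import subprocess

SCRIPT = "seq_tagger.py"                    # or "adp_classifier.py"
RUNNER = "run_adpplussnacs_classifier.sh"   # or "run_snacs_classifier.sh"

for seed in [13, 27, 42, 86, 100]:          # hypothetical seed values
    src = open(SCRIPT).read()
    # Assumes the script defines its seed as a top-level "SEED = <number>" line.
    src = re.sub(r"^SEED\s*=\s*\d+", f"SEED = {seed}", src, flags=re.MULTILINE)
    with open(SCRIPT, "w") as f:
        f.write(src)
    subprocess.run(["sh", RUNNER], check=True)
```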
To evaluate a certain split (train/dev/test) using an existing model on disk, run:
```console
(<venv_name>)foo@bar: weak-verifiers/scripts/$ python adp_classifier.py --model_name <hf_model_name> --label_set <label_set> --mode eval --eval_model_path <path_to_model> --eval_split <split>
```
and
```console
(<venv_name>)foo@bar: weak-verifiers/scripts/$ python seq_tagger.py --model_name <hf_model_name> --label_set <label_set> --mode eval --eval_model_path <path_to_model> --eval_split <split>
```
