API reference: Models V0.2a
The Aligner ships with several models that can be used right away, but you can also write your own model to run with the Aligner.
Here are the descriptions of the models' API. If you write your own model and wish to use it with the Aligner, it must fit the API requirements described below.
The description here is of module version 0.2a.
The old model specs are now type 1 models, which are used for doing training without alignment types and POS tags.
In v0.2a we added a type 2 model, which uses POS tags and alignment types when doing training.
Models go in src/models/.
In your MyModel.py, the only thing required is a class called AlignmentModel, which contains the methods described below. Before loading the model, the Aligner calls the checkAlignmentModel function in src/models/modelChecker.py to check that the model meets the requirements. If you want your model's API to look a bit different (for example, to take more parameters), you'll need to modify checkAlignmentModel as well.
Parameters:
- bitext: Bitext (see the detail of this format).
- iterations: int, number of iterations to run.
This is the method that will be called to do the training. Currently the parameters have to be these two and these two only. If you wish otherwise, you'll need to modify checkAlignmentModel in src/models/modelChecker.py.
Parameters:
- bitext: Bitext (see the detail of this format).
Return:
- Alignment, the alignment of the bitext generated by the current model (see the detail of this format).
This is the method that will be called to do the decoding. Currently the parameter has to be this one and this one only. If you wish otherwise, you'll need to modify checkAlignmentModel in src/models/modelChecker.py.
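As a rough illustration of a type 1 model, here is a minimal sketch. The method names train and decode are assumptions (the names are not spelled out above), and so are the data layouts: the bitext is taken to be a list of (sourceWords, targetWords) pairs, and the alignment a list of 0-based (sourceIndex, targetIndex) links per sentence. The real Bitext and Alignment formats may differ, so treat this as a shape to adapt rather than a working aligner.

```python
class AlignmentModel:
    """Toy type 1 model: links word i of the source to word i of the target."""

    def __init__(self):
        # Hypothetical bookkeeping attribute, only here to show training ran.
        self.iterationsRun = 0

    def train(self, bitext, iterations):
        # Assumed method name. A real model would update its parameters
        # from the bitext on every iteration.
        for _ in range(iterations):
            self.iterationsRun += 1

    def decode(self, bitext):
        # Assumed method name. Assumed bitext layout: a list of
        # (sourceWords, targetWords) pairs, each a list of strings.
        alignment = []
        for sourceWords, targetWords in bitext:
            length = min(len(sourceWords), len(targetWords))
            alignment.append([(i, i) for i in range(length)])
        return alignment


bitext = [(["das", "haus"], ["the", "house"])]
model = AlignmentModel()
model.train(bitext, 3)
result = model.decode(bitext)
```

The diagonal alignment is a placeholder that keeps the example short; the point is only that training takes a bitext plus an iteration count and decoding takes a bitext and returns an alignment.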
Parameters:
- formTritext: Tritext (see the detail of this format). This is the tritext of 1) source language text; 2) target language text; 3) gold alignment with type.
- tagTritext: Tritext (see the detail of this format). This is the tritext of 1) source language text POS tags; 2) target language text POS tags; 3) gold alignment with type.
- iterations: int, number of iterations to run.
This is the method that will be called to do the training. Currently the parameters have to be these three and these three only. If you wish otherwise, you'll need to modify checkAlignmentModel in src/models/modelChecker.py.
Parameters:
- formBitext: Bitext (see the detail of this format). This is the Bitext of 1) source language text; 2) target language text.
- tagBitext: Bitext (see the detail of this format). This is the Bitext of 1) source language text POS tags; 2) target language text POS tags.
Return:
- Alignment, the alignment of the bitext generated by the current model (see the detail of this format).
This is the method that will be called to do the decoding. Currently the parameters have to be these two and these two only. If you wish otherwise, you'll need to modify checkAlignmentModel in src/models/modelChecker.py.
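A type 2 model could be sketched along the same lines. Again the method names train and decode are assumed, as are the layouts: each tritext entry is taken to be (sourceItems, targetItems, goldLinks) with goldLinks a list of 0-based (sourceIndex, targetIndex, linkType) triples, and the link type labels "FUN" and "SEM" are made up for the example. The toy model only counts which alignment type goes with which pair of POS tags.

```python
from collections import Counter, defaultdict


class AlignmentModel:
    """Toy type 2 model: learns the most frequent link type per POS tag pair."""

    def __init__(self):
        self.typeCounts = defaultdict(Counter)

    def train(self, formTritext, tagTritext, iterations):
        # Assumed method name. Counting needs a single pass, so iterations
        # is accepted only to match the required parameter list.
        self.typeCounts.clear()
        for sourceTags, targetTags, goldLinks in tagTritext:
            for i, j, linkType in goldLinks:
                self.typeCounts[(sourceTags[i], targetTags[j])][linkType] += 1

    def decode(self, formBitext, tagBitext):
        # Assumed method name. Links positions diagonally and labels each
        # link with the type seen most often for that tag pair (or None).
        alignment = []
        for sourceTags, targetTags in tagBitext:
            links = []
            for i in range(min(len(sourceTags), len(targetTags))):
                counts = self.typeCounts.get((sourceTags[i], targetTags[i]))
                linkType = counts.most_common(1)[0][0] if counts else None
                links.append((i, i, linkType))
            alignment.append(links)
        return alignment


gold = [(0, 0, "FUN"), (1, 1, "SEM")]  # made-up type labels
formTritext = [(["das", "haus"], ["the", "house"], gold)]
tagTritext = [(["DET", "NOUN"], ["DET", "NOUN"], gold)]

model = AlignmentModel()
model.train(formTritext, tagTritext, 1)
result = model.decode(
    [(["ein", "hund"], ["a", "dog"])],
    [(["DET", "NOUN"], ["DET", "NOUN"])],
)
```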
The checkAlignmentModel function exists to make sure the Aligner can at least call the model to run training and decoding without modification. It also gives hints about what doesn't fit in a model. If successful, it returns the type of the model; otherwise it returns -1.
If you have multiple models to add, you can add their names to the supportedModels list in modelChecker.py and run
python modelChecker.py
to check the APIs of all of the models at once.
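To give a feel for what such a check involves, the sketch below inspects a class's method signatures and returns 1, 2, or -1. It is not the actual checkAlignmentModel: the function name sketchCheckModel, the helper parameterNames, and the method names train and decode are all assumptions made for illustration, and the real checker may test more than parameter names.

```python
import inspect

# Assumed method names; the parameter lists follow the descriptions above.
EXPECTED = {
    1: {"train": ["bitext", "iterations"], "decode": ["bitext"]},
    2: {"train": ["formTritext", "tagTritext", "iterations"],
        "decode": ["formBitext", "tagBitext"]},
}


def parameterNames(cls, methodName):
    """Return the method's parameter names without self, or None if missing."""
    method = getattr(cls, methodName, None)
    if not callable(method):
        return None
    return list(inspect.signature(method).parameters)[1:]


def sketchCheckModel(cls):
    """Hypothetical stand-in for a checker: model type on success, else -1."""
    for modelType, methods in EXPECTED.items():
        if all(parameterNames(cls, name) == params
               for name, params in methods.items()):
            return modelType
    return -1


class TypeOne:
    def train(self, bitext, iterations): pass
    def decode(self, bitext): pass


class TypeTwo:
    def train(self, formTritext, tagTritext, iterations): pass
    def decode(self, formBitext, tagBitext): pass


class Broken:
    def train(self, bitext): pass


checks = [sketchCheckModel(c) for c in (TypeOne, TypeTwo, Broken)]
```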
In addition, you can add an evaluator of your choosing should you wish to evaluate your model.
The provided evaluators are under src/evaluators/. (You can also run them directly; see all options with python EVALUATOR.py -h.)
To do so, simply add the following line to your class:
class AlignmentModel():
    def __init__(self):
        # ... ...
        self.evaluate = myEvaluationFunction
        return
The requirements of myEvaluationFunction:
Parameters:
- bitext: Bitext (see the detail of this format).
- result: Alignment, the alignment of the bitext generated by the current model (see the detail of this format).
- reference: GoldAlignment, the gold alignment used for reference (see the detail of this format).
Return:
- dict: containing the results (scores etc.).
For example:
return {
"Precision": precision,
"Recall": recall,
"AER": aer,
"F-score": fScore
}
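A complete evaluation function might look roughly like the following. This is a sketch under assumptions, not one of the provided evaluators: result and reference are taken to be parallel lists with one list of (sourceIndex, targetIndex) links per sentence, and every reference link is treated as a sure link. Under that assumption AER reduces to 1 minus the F-score.

```python
def myEvaluationFunction(bitext, result, reference):
    # Assumed layout: result[k] and reference[k] are the link lists for
    # sentence k; only the first two fields of each link are compared,
    # so a trailing alignment type is ignored.
    correct = predicted = gold = 0
    for resultLinks, goldLinks in zip(result, reference):
        resultSet = {tuple(link[:2]) for link in resultLinks}
        goldSet = {tuple(link[:2]) for link in goldLinks}
        correct += len(resultSet & goldSet)
        predicted += len(resultSet)
        gold += len(goldSet)

    precision = correct / predicted if predicted else 0.0
    recall = correct / gold if gold else 0.0
    total = predicted + gold
    fScore = 2 * correct / total if total else 0.0
    # With sure links only, AER = 1 - 2|A & S| / (|A| + |S|) = 1 - F-score.
    aer = 1.0 - fScore
    return {
        "Precision": precision,
        "Recall": recall,
        "AER": aer,
        "F-score": fScore,
    }


bitext = [(["das", "haus"], ["the", "house"])]
scores = myEvaluationFunction(
    bitext,
    result=[[(0, 0), (1, 1)]],
    reference=[[(0, 0), (1, 2)]],
)
```

If the gold alignment distinguishes sure from possible links, the AER line would need both sets; the version above deliberately ignores that distinction.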