API reference: FileIO V0.2a
FileIO is located in src/fileIO.py. It contains functions used to handle file operations in the Aligner.
This page describes module version 0.2a.
Compared to v0.1a, the export function in v0.2a can export alignments together with their alignment types. The module also adds support for Alignment Data Format v0.2a.
Parameters:
- result: Alignment, detailed description of this format
- fileName: str, the file to export to
Parameters:
- file1: str, the first file to read
- file2: str, the second file to read
- linesToLoad: int, the number of lines to read
Return:
- Bitext, detail of this format.
Parameters:
- file1: str, the first file to read
- file2: str, the second file to read
- file3: str, the third file to read
- linesToLoad: int, the number of lines to read
Return:
- Tritext, detail of this format.
Parameters:
- fileName: str, the Alignment file to read
- linesToLoad: int, the number of lines to read
Return:
- GoldAlignment, detail of this format.
UTF-8 text files, one language per file. Each line contains one sentence. Sentences are segmented, with words separated by spaces.
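As an illustration of the text file format above, here is a minimal sketch of reading two parallel files into word lists. The function name `load_bitext_sketch` and the returned structure (a list of word-list pairs) are assumptions made for this example. They are not the module's actual function or its Bitext format.

```python
from itertools import islice


def load_bitext_sketch(file1, file2, lines_to_load=None):
    """Read two parallel UTF-8 files into (source_words, target_words) pairs.

    Hypothetical helper: the real fileIO function and Bitext type may differ.
    """
    with open(file1, encoding="utf-8") as f1, open(file2, encoding="utf-8") as f2:
        # zip stops at the shorter file; islice caps the number of sentence
        # pairs (None means "read everything").
        pairs = islice(zip(f1, f2), lines_to_load)
        # str.split() with no argument splits on runs of whitespace and
        # discards the trailing newline.
        return [(src.split(), tgt.split()) for src, tgt in pairs]
```

Passing `lines_to_load=None` reads both files to the end of the shorter one, which may or may not match how linesToLoad behaves in the module itself.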
UTF-8 text files. Each line contains the word alignments for one sentence, separated by spaces. Each alignment is represented in one of the following formats:
- "NN-MM", where NN and MM are integers, means that there is a certain alignment between the NNth word of the source sentence and the MMth word of the target sentence. In addition, MM can take the form "M1,M2,M3,...", which means that there are certain alignments between the NNth word of the source sentence and each of the Mi-th words of the target sentence.
- "NN?MM", where NN and MM are integers, means that there is a probable alignment between the NNth word of the source sentence and the MMth word of the target sentence. In addition, MM can take the form "M1,M2,M3,...", which means that there are probable alignments between the NNth word of the source sentence and each of the Mi-th words of the target sentence.
- "NN-MM-TT", where NN and MM are integers and TT is a str representing the type of the alignment, means that there is a certain alignment between the NNth word of the source sentence and the MMth word of the target sentence, both of which are of type TT.
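
The three token formats can be parsed with a single regular expression. The sketch below is one possible reading of the format, not the module's own parser. The name `parse_alignment_line`, the tuple layout, and the type label `SEM` in the usage note are hypothetical. The sketch also accepts a type suffix after "?", which the description above does not cover.

```python
import re

# source index, separator ("-" certain, "?" probable),
# one or more comma-separated target indices, optional "-TT" type suffix.
TOKEN = re.compile(r"(\d+)([-?])(\d+(?:,\d+)*)(?:-(\S+))?")


def parse_alignment_line(line):
    """Parse one line of alignment tokens into (source, target, kind, type) tuples.

    Hypothetical helper: indices are returned exactly as written in the file.
    """
    links = []
    for token in line.split():
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised alignment token: {token!r}")
        source, sep, targets, align_type = match.groups()
        kind = "certain" if sep == "-" else "probable"
        # "M1,M2,M3" expands to one link per target index.
        for target in targets.split(","):
            links.append((int(source), int(target), kind, align_type))
    return links
```

For example, `parse_alignment_line("1-1 2?3,4 5-6-SEM")` yields four links: one certain, two probable that share source word 2, and one certain link of type `SEM`.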