Python code for GMM-UBM and MAP adaptation based speaker verification
Citation:
[1] Z.-H. Tan, A. K. Sarkar and N. Dehak, "rVAD: an unsupervised segment-based robust voice activity detection method," Computer Speech and Language, 2019.
where speaker verification is used as one downstream application of VAD.
The code was tested on Python 2.7.
(0)/ Workflow of the code:
feature extraction -->> GMM-UBM-training -->> GMM-UBM+MAP (target model) -->> Scoring [log likelihood ratio]
(1)/ Feature extraction (MFCC + RASTA, VAD, CMN):
=========================================================================
1.1) First, create the list file for feature extraction, i.e. "feat.lst"
Contents:
[1st column] --> source wave file, [2nd column] --> destination feature file
e.g.
wav/reddots_r2015q4_v1/pcm/m0067/20150611185947809_m0067_840.wav,feat/reddots_r2015q4_v1/pcm/m0067/20150611185947809_m0067_840.htk
wav/reddots_r2015q4_v1/pcm/m0067/20150701164352946_m0067_39.wav,feat/reddots_r2015q4_v1/pcm/m0067/20150701164352946_m0067_39.htk
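A list in this format could be generated with a short script such as the sketch below. The helper name build_feat_list and the idea of mirroring the wave directory tree under a feature root are assumptions for illustration (they follow the example paths above); they are not part of the repository.

```python
import os

def build_feat_list(wav_root, feat_root, ext=".htk"):
    """Return 'source.wav,destination.htk' lines for every .wav under wav_root.

    Hypothetical helper: the sub-directory layout of wav_root is mirrored
    under feat_root, as in the example rows of "feat.lst".
    """
    lines = []
    for dirpath, _dirnames, filenames in os.walk(wav_root):
        for name in filenames:
            if not name.lower().endswith(".wav"):
                continue  # skip anything that is not a wave file
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, wav_root)
            dst = os.path.join(feat_root, os.path.splitext(rel)[0] + ext)
            lines.append(src + "," + dst)
    return sorted(lines)

# Example use (writes the list next to the scripts):
#   with open("feat.lst", "w") as fh:
#       fh.write("\n".join(build_feat_list("wav", "feat")) + "\n")
```

The sketch only writes the list; whether the destination directories must already exist before running "featureExtract.py" is not stated here, so check that in the script.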
1.2) Run the following command in a Bash shell:
>> OMP_NUM_THREADS=1 python featureExtract.py
[#] Change the following feature extraction parameters in "featureExtract.py" as required
e.g. (default)
winlen, ovrlen, pre_coef, nfilter, nftt = 0.025, 0.01, 0.97, 20, 512 #[window size (sec)], [frame shift (sec)], [pre-emphasis coeff],
#[no. of filters in MFCC], [N-point FFT]
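To see what the default window and shift mean in samples, the arithmetic can be sketched as below. The 16 kHz sampling rate in the example and the "no padding" framing rule are assumptions; the framing inside "featureExtract.py" may handle the last partial frame differently.

```python
def frame_layout(num_samples, fs, winlen=0.025, ovrlen=0.01):
    """Return (window length, frame shift, number of full frames) in samples.

    Assumes frames are taken without padding, so a trailing partial
    frame is dropped. This is an illustration, not the repository's code.
    """
    win = int(round(winlen * fs))
    shift = int(round(ovrlen * fs))
    if num_samples < win:
        return win, shift, 0
    return win, shift, 1 + (num_samples - win) // shift

# One second of audio at an assumed 16 kHz sampling rate:
print(frame_layout(16000, 16000))  # (400, 160, 98)
```

Under that assumption the 400-sample window fits inside the default 512-point FFT.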
[#] If you do not want to apply the default RASTA filtering to the MFCC
- please comment out the following line in "mfcc.py"
t=rastaFilter(t).T
and replace it with "t=t.T"
[#] Default VAD: energy threshold, i.e. "opts=1"
- To use rVAD labels generated by MATLAB,
- set "opts=0" and then follow the instructions inside "featureExtract.py" to plug in the VAD file
[#] To disable VAD
- set "opts" to any value other than 0 or 1, e.g. "opts=3"
[#] To disable CMN, comment out the following line in "featureExtract.py"
- f=cmvn(f) i.e. "#f=cmvn(f)"
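For readers unfamiliar with this normalization step, a minimal per-utterance sketch is given below. The function name cmvn_sketch is hypothetical, and whether the repository's cmvn() also normalizes the variance (as its name suggests) or only the mean is an assumption to verify in the code.

```python
import numpy as np

def cmvn_sketch(feats, eps=1e-10):
    """Normalize each feature dimension to zero mean and unit variance.

    feats: array of shape (num_frames, dim) for one utterance.
    Illustrative only; not the repository's cmvn().
    """
    feats = np.asarray(feats, dtype=float)
    mean = feats.mean(axis=0)
    std = feats.std(axis=0)
    return (feats - mean) / (std + eps)  # eps guards against constant dimensions
```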
(2)/ GMM-UBM training
================================================================================
2.1) First, create the list file for the GMM-UBM training data, i.e. "UBM.lst"
e.g. [each row contains one feature file]
feat/TIMIT/TEST/DR1/FAKS0/SA1.htk
feat/TIMIT/TEST/DR1/FAKS0/SA2.htk
feat/TIMIT/TEST/DR1/FAKS0/SI1573.htk
feat/TIMIT/TEST/DR1/FAKS0/SI2203.htk
**Important note: files that contain only a single frame (feature vector) are discarded before UBM training starts,
- because of the different way of indexing array/matrix elements in Python
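Single-frame files can also be spotted before training by reading the frame count from the file header. The sketch below assumes the ".htk" files follow the standard HTK layout (a 12-byte big-endian header whose first field is the number of frames); the helper names are hypothetical and the byte order should be checked against the files written by "featureExtract.py".

```python
import struct

def htk_num_frames(path):
    """Return the frame count stored in a standard HTK header.

    Header layout assumed: int32 nSamples, int32 sampPeriod,
    int16 sampSize, int16 parmKind, all big-endian.
    """
    with open(path, "rb") as fh:
        header = fh.read(12)
    if len(header) < 12:
        raise ValueError("truncated HTK header: " + path)
    n_frames, _period, _size, _kind = struct.unpack(">iihh", header)
    return n_frames

def keep_multi_frame(paths, min_frames=2):
    """Keep only the feature files that hold at least min_frames frames."""
    return [p for p in paths if htk_num_frames(p) >= min_frames]
```

The same check can be applied to the files listed in "target.ndx" and in the trial list.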
2.2) Run the following command in a Bash shell:
>> OMP_NUM_THREADS=1 python GMMtrn.py
[#] Default parameters in "GMMtrn.py" (edit them as required to train the GMM in a different way)
nmix, dsfactor, rmd, emIter =4, 10, 0, 5 #[no. of mixtures, a power of 2], dsfactor = decimation of frames during intermediate UBM training/file (speed-up), [EM iterations]
# rmd =1 ; 1) randomize frames --> 2) decimation [llh may not increase during EM for the intermediate models]
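One plausible reading of "dsfactor" and "rmd" is sketched below: keep every dsfactor-th frame, optionally after shuffling the frames first. The function name and the seed argument are hypothetical, and the actual implementation in "GMMtrn.py" may differ.

```python
import numpy as np

def decimate_frames(feats, dsfactor=10, rmd=0, seed=0):
    """Keep every dsfactor-th frame of a (num_frames, dim) array.

    If rmd == 1 the frame order is randomized before decimation.
    Illustrative sketch only; not the repository's code.
    """
    feats = np.asarray(feats)
    if rmd == 1:
        rng = np.random.RandomState(seed)
        feats = feats[rng.permutation(len(feats))]
    return feats[::dsfactor]
```

With the defaults, a file of 50 frames contributes 5 frames to the intermediate training passes.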
[#] Default directory for saving the GMM (change as required)
ubmDir= 'GMM' + str(nmix)
(3)/ GMM-UBM+MAP (target model)
=================================================================================
3.1) First, prepare the list file for the target models derived from the UBM, i.e. "target.ndx"
e.g. [1st column] --> target model id, [2nd column] --> feature file
m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084154554_m0001_31.htk
m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084155412_m0001_31.htk
m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150130084156114_m0001_31.htk
m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084156879_m0001_32.htk
m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084157752_m0001_32.htk
m0001_32,feat/reddots_r2015q4_v1/pcm/m0001/20150130084158439_m0001_32.htk
m0001_33,feat/reddots_r2015q4_v1/pcm/m0001/20150130084159156_m0001_33.htk
**Important note: make sure none of the files contains only a single frame (feature vector).
Please discard such files from the list or duplicate the frame at least twice,
- otherwise an error will occur because of the different way of indexing array/matrix elements in Python
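In the example rows above, several feature files share one model id, so the list presumably groups the enrollment files of each target model. A small sketch for reading the file into that grouping is given below; the function name group_by_model is hypothetical.

```python
from collections import OrderedDict

def group_by_model(lines):
    """Map each target model id to its list of feature files.

    lines: iterable of 'model_id,feature_file' rows as in "target.ndx".
    The order of first appearance of the model ids is preserved.
    """
    models = OrderedDict()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank rows
        model_id, feat_path = line.split(",", 1)
        models.setdefault(model_id.strip(), []).append(feat_path.strip())
    return models

# Example use:
#   with open("target.ndx") as fh:
#       for model_id, files in group_by_model(fh).items():
#           print(model_id, len(files))
```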
3.2) Run the following command in a Bash shell:
>> OMP_NUM_THREADS=1 python TargetTRN.py
[#] Default MAP parameters (change them as required in "TargetTRN.py")
MapItr, Tau =3, 10.0 #[no. of MAP iterations], [value of relevance factor]
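The role of the two parameters can be illustrated with the common mean-only form of MAP adaptation for a diagonal-covariance GMM. This is a sketch under stated assumptions: the function names are hypothetical, and whether "TargetTRN.py" adapts only the means (rather than also weights or variances) is not stated here.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log density of each frame under each diagonal Gaussian; shape (T, K)."""
    d = x.shape[1]
    diff = x[:, None, :] - means[None, :, :]
    expo = -0.5 * np.sum(diff * diff / variances[None, :, :], axis=2)
    norm = -0.5 * (d * np.log(2.0 * np.pi) + np.sum(np.log(variances), axis=1))
    return expo + norm[None, :]

def map_adapt_means(x, weights, means, variances, tau=10.0, n_iter=3):
    """Mean-only MAP adaptation of a UBM toward the frames x (T, D).

    tau is the relevance factor and n_iter the number of MAP iterations.
    The UBM means stay the prior in every iteration.
    """
    adapted = means.copy()
    for _ in range(n_iter):
        log_p = np.log(weights)[None, :] + log_gauss_diag(x, adapted, variances)
        log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
        post = np.exp(log_p)
        post /= post.sum(axis=1, keepdims=True)     # frame posteriors (T, K)
        n_k = post.sum(axis=0)                      # soft counts per component
        e_k = post.T.dot(x) / np.maximum(n_k, 1e-10)[:, None]
        alpha = n_k / (n_k + tau)                   # data-dependent mixing weight
        adapted = alpha[:, None] * e_k + (1.0 - alpha)[:, None] * means
    return adapted
```

A larger relevance factor keeps the target model closer to the UBM; components that see almost no enrollment frames keep their UBM means.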
[#] By default, the UBM is stored in a folder in the current directory, e.g. GMM512 (change as required)
ubmDir= 'GMM' + str(nmix)
(4)/ Scoring [log likelihood ratio]
=============================================================================
4.1) First, prepare the trial list file, i.e. "m_part_01.ndx"
e.g. [1st column] --> claimant model id, [2nd column] --> test trial feature file
m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150129213253016_m0001_36.htk
m0001_31,feat/reddots_r2015q4_v1/pcm/m0001/20150129213254935_m0001_32.htk
m0060_40,feat/reddots_r2015q4_v1/pcm/m0067/20150611185843833_m0067_36.htk
**Important note: make sure none of the files contains only a single frame (feature vector).
Please discard such files from the list or duplicate the frame at least twice,
- otherwise an error will occur because of the different way of indexing array/matrix elements in Python
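For each trial row, the log likelihood ratio compares how well the claimant model and the UBM explain the test frames. A minimal sketch for diagonal-covariance GMMs follows; the function names and the (weights, means, variances) tuple layout are hypothetical, and averaging over frames is an assumption about how "Scoring.py" normalizes the score.

```python
import numpy as np

def gmm_avg_loglik(x, weights, means, variances):
    """Average per-frame log likelihood of x (T, D) under a diagonal GMM."""
    d = x.shape[1]
    diff = x[:, None, :] - means[None, :, :]
    log_comp = (np.log(weights)[None, :]
                - 0.5 * (d * np.log(2.0 * np.pi)
                         + np.sum(np.log(variances), axis=1))[None, :]
                - 0.5 * np.sum(diff * diff / variances[None, :, :], axis=2))
    m = log_comp.max(axis=1, keepdims=True)          # log-sum-exp over components
    frame_ll = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return frame_ll.mean()

def llr_score(x, target, ubm):
    """Log likelihood ratio: target model versus UBM.

    target and ubm are (weights, means, variances) tuples.
    """
    return gmm_avg_loglik(x, *target) - gmm_avg_loglik(x, *ubm)
```

A positive score means the test frames fit the claimant model better than the UBM; the accept/reject threshold is chosen separately.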
4.2) Set the number of threads for parallel scoring (default)
CORES=2
4.3) Set the score file name and directory (default)
Scorefile='score.txt' #output file : scores
4.4) Run the following command in a Bash shell:
>> OMP_NUM_THREADS=1 python Scoring.py