Commit cb678ab

[DOCS] Add adaptation example (#927)
1 parent 04b7c78 commit cb678ab

3 files changed

+242
-1
lines changed

docs/source/first_steps/alignment_example.md

Lines changed: 1 addition & 1 deletion
@@ -311,7 +311,7 @@ mfa align ~/mfa_data/librispeech-demo-1.0.0 english_us_mfa english_mfa ~/mfa_dat
311311

312312

313313
:::{code-block} bash
314-
mfa align ~/mfa_data/japanese-jvs-demo-1.0.0 japanese_mfa japanese_mfa ~/mfa_data/aligned_jva_demo_no_oovs --g2p_model_path japanese_mfa --clean
314+
mfa align ~/mfa_data/japanese-jvs-demo-1.0.0 japanese_mfa japanese_mfa ~/mfa_data/aligned_jvs_demo_no_oovs --g2p_model_path japanese_mfa --clean
315315
:::
316316

317317
::::
Lines changed: 240 additions & 0 deletions
@@ -0,0 +1,240 @@
1+
2+
(remapping_example)=
3+
# Example: Adapting a model to a new language
4+
5+
6+
## Set up
7+
8+
```{important}
9+
Ensure you have installed MFA via {ref}`installation`. To compare alignments against reference alignments produced with native-language models, ensure you have completed the initial alignment of the demo corpus in {ref}`alignment_example`.
10+
11+
You can see a more fully worked example of this with scripts for analyzing German, Czech, and Mandarin applied to an English corpus in the [mfa-adaptation GitHub repository](https://github.com/mmcauliffe/mfa-adaptation).
12+
```
13+
14+
::::{tab-set}
15+
16+
:::{tab-item} English
17+
:sync: english
18+
19+
For English, we will align the demo English corpus with the Mandarin pretrained acoustic model, and remap the English dictionary into the phone set that the Mandarin acoustic model uses.
20+
21+
1. Ensure you have downloaded the pretrained Mandarin model via {code}`mfa model download acoustic mandarin_mfa`
22+
2. Ensure you have downloaded the pretrained US English dictionary via {code}`mfa model download dictionary english_us_mfa`
23+
3. Download the [English LibriSpeech demo corpus](https://github.com/MontrealCorpusTools/librispeech-demo/archive/refs/tags/v1.0.0.tar.gz) and extract it to somewhere on your computer
24+
25+
:::
26+
27+
:::{tab-item} Japanese
28+
:sync: japanese
29+
30+
For Japanese, we will align the demo Japanese corpus with the English pretrained acoustic model, and remap the Japanese dictionary into the phone set that the English acoustic model uses.
31+
32+
1. Ensure you have downloaded the pretrained English model via {code}`mfa model download acoustic english_mfa`
33+
2. Ensure you have downloaded the pretrained Japanese dictionary via {code}`mfa model download dictionary japanese_mfa`
34+
3. Download the [Japanese JVS demo corpus](https://github.com/MontrealCorpusTools/japanaese-jvs-demo/archive/refs/tags/v1.0.0.tar.gz) and extract it to somewhere on your computer
35+
4. Install Japanese-specific dependencies via {code}`conda install -c conda-forge spacy sudachipy sudachidict-core`
36+
37+
:::
38+
39+
40+
:::{tab-item} Mandarin
41+
:sync: mandarin
42+
43+
For Mandarin, we will align the demo Mandarin corpus with the English pretrained acoustic model, and remap the Mandarin dictionary into the phone set that the English acoustic model uses.
44+
45+
1. Ensure you have downloaded the pretrained English model via {code}`mfa model download acoustic english_mfa`
46+
2. Ensure you have downloaded the pretrained China Mandarin dictionary via {code}`mfa model download dictionary mandarin_china_mfa`
47+
3. Download the [Mandarin THCHS-30 demo corpus](https://github.com/MontrealCorpusTools/mandarin-thchs-30-demo/archive/refs/tags/v1.0.0.tar.gz) and extract it to somewhere on your computer
48+
4. Install Mandarin-specific dependencies via {code}`pip install spacy-pkuseg dragonmapper hanziconv`
49+
50+
:::
51+
52+
53+
::::
54+
55+
```{important}
56+
This example assumes you have a directory named ``mfa_data`` in your home directory in which the demo corpus was extracted.
57+
```
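
As a rough sketch of the layout this example assumes, the snippet below unpacks a downloaded demo archive into `~/mfa_data`. The helper name `extract_corpus` and the download location in the usage comment are hypothetical; only the `mfa_data` directory and the extracted folder name come from the commands in this example.

```shell
# Hypothetical helper: unpack a downloaded demo corpus archive into a data
# directory (defaults to ~/mfa_data, the location assumed in this example).
extract_corpus() {
    archive=$1
    data_dir=${2:-"$HOME/mfa_data"}
    mkdir -p "$data_dir"
    # GitHub release tarballs contain a single top-level folder
    tar -xzf "$archive" -C "$data_dir"
}

# Example usage (assumes the archive was saved to ~/Downloads):
# extract_corpus ~/Downloads/v1.0.0.tar.gz
# ls ~/mfa_data/librispeech-demo-1.0.0
```

If the corpus is extracted somewhere else, the paths in the commands below need to be adjusted to match.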
58+
59+
## Remapping the dictionary
60+
61+
62+
:::::{tab-set}
63+
64+
::::{tab-item} English
65+
:sync: english
66+
67+
First, download and save the contents of [english_to_mandarin_phone_mapping.yaml](https://raw.githubusercontent.com/mmcauliffe/mfa-adaptation/refs/heads/main/data/dictionary_mappings/english_to_mandarin_phone_mapping.yaml) to `~/mfa_data/english_to_mandarin_phone_mapping.yaml`. This is a file that maps phones in the English MFA phone set to phones in the Mandarin MFA phone set, which we can use to create a new dictionary of English words with Mandarin MFA pronunciations.
68+
69+
:::{code-block} bash
70+
mfa remap dictionary english_us_mfa mandarin_mfa ~/mfa_data/english_to_mandarin_phone_mapping.yaml ~/mfa_data/english_mandarin.dict
71+
:::
72+
73+
If you open up `~/mfa_data/english_mandarin.dict` in a text editor, you'll now see pronunciations for English forms using Mandarin MFA phones. For example, any {ipa_inline}`ʒ` phones now have {ipa_inline}`ʐ` instead, as that's the closest phone in the Mandarin MFA phone set.
74+
75+
::::
76+
77+
::::{tab-item} Japanese
78+
:sync: japanese
79+
80+
81+
First, download and save the contents of [japanese_to_english_phone_mapping.yaml](https://raw.githubusercontent.com/mmcauliffe/mfa-adaptation/refs/heads/main/data/dictionary_mappings/japanese_to_english_phone_mapping.yaml) to `~/mfa_data/japanese_to_english_phone_mapping.yaml`. This is a file that maps phones in the Japanese MFA phone set to phones in the English MFA phone set, which we can use to create a new dictionary of Japanese words with English MFA pronunciations.
82+
83+
:::{code-block} bash
84+
mfa remap dictionary japanese_mfa english_mfa ~/mfa_data/japanese_to_english_phone_mapping.yaml ~/mfa_data/japanese_english.dict
85+
:::
86+
87+
If you open up `~/mfa_data/japanese_english.dict` in a text editor, you'll now see pronunciations for Japanese forms using English MFA phones. For example, any {ipa_inline}`` phones now have {ipa_inline}`` instead, as that's the closest phone in the English MFA phone set.
88+
89+
::::
90+
91+
::::{tab-item} Mandarin
92+
:sync: mandarin
93+
94+
95+
First, download and save the contents of [mandarin_to_english_phone_mapping.yaml](https://raw.githubusercontent.com/mmcauliffe/mfa-adaptation/refs/heads/main/data/dictionary_mappings/mandarin_to_english_phone_mapping.yaml) to `~/mfa_data/mandarin_to_english_phone_mapping.yaml`. This is a file that maps phones in the Mandarin MFA phone set to phones in the English MFA phone set, which we can use to create a new dictionary of Mandarin words with English MFA pronunciations.
96+
97+
:::{code-block} bash
98+
mfa remap dictionary mandarin_china_mfa english_mfa ~/mfa_data/mandarin_to_english_phone_mapping.yaml ~/mfa_data/mandarin_english.dict
99+
:::
100+
101+
If you open up `~/mfa_data/mandarin_english.dict` in a text editor, you'll now see pronunciations for Mandarin forms using English MFA phones. For example, any {ipa_inline}`` phones now have {ipa_inline}`` instead, as that's the closest phone in the English MFA phone set.
102+
103+
::::
104+
105+
:::::
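
One way to sanity-check a remapped dictionary from the command line is to count how many entries still contain a source-language phone. The sketch below uses a hypothetical helper, `count_phone`, and assumes each dictionary line is a word followed by whitespace-separated fields; it is not part of MFA.

```shell
# Hypothetical helper: count dictionary entries whose pronunciation contains
# a given phone as a whole whitespace-separated token.
# Assumes the first field on each line is the word itself.
count_phone() {
    phone=$1
    dict_file=$2
    awk -v phone="$phone" '
        { for (i = 2; i <= NF; i++) if ($i == phone) { n++; break } }
        END { print n + 0 }
    ' "$dict_file"
}

# Example usage for the English tab: after remapping, the first count is
# expected to be zero and the second non-zero.
# count_phone "ʒ" ~/mfa_data/english_mandarin.dict
# count_phone "ʐ" ~/mfa_data/english_mandarin.dict
```

Matching whole tokens rather than substrings avoids counting phones that merely contain the same symbol.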
106+
107+
108+
109+
## Alignment
110+
111+
### Aligning using pre-trained models
112+
113+
114+
:::::{tab-set}
115+
116+
::::{tab-item} English
117+
:sync: english
118+
119+
120+
:::{code-block} bash
121+
mfa align ~/mfa_data/librispeech-demo-1.0.0 ~/mfa_data/english_mandarin.dict mandarin_mfa ~/mfa_data/aligned_librispeech_demo --clean
122+
:::
123+
124+
::::
125+
126+
::::{tab-item} Japanese
127+
:sync: japanese
128+
129+
130+
First, download and save the contents of [english_to_japanese_phone_mapping.yaml](https://raw.githubusercontent.com/mmcauliffe/mfa-adaptation/refs/heads/main/data/evaluation_mappings/english_to_japanese_phone_mapping.yaml) to `~/mfa_data/english_to_japanese_phone_mapping.yaml`. This file is similar to the previously downloaded `~/mfa_data/japanese_to_english_phone_mapping.yaml`, except that it maps phones in the opposite direction. For every Japanese phone, this mapping specifies which phones count as a "matching phone", so that the overlap scoring algorithm can penalize alignment issues more accurately.
131+
132+
If you have not aligned the Japanese demo corpus as the first step in {ref}`alignment_example`, omit the `--reference_directory` and `--custom_mapping_path` options from the following command.
133+
134+
:::{code-block} bash
135+
mfa align ~/mfa_data/japanese-jvs-demo-1.0.0 ~/mfa_data/japanese_english.dict english_mfa ~/mfa_data/english_adapted/english_japanese_remapped_aligned --clean --reference_directory ~/mfa_data/aligned_jvs_demo --custom_mapping_path ~/mfa_data/english_to_japanese_phone_mapping.yaml --language japanese
136+
:::
137+
138+
```{note}
139+
140+
The `--language japanese` flag must be included to ensure that the Japanese text is properly tokenized by the Japanese morphological parser. When aligning using the Japanese MFA model, the language is set to Japanese by default, but we must override it here when using the English MFA model.
141+
```
142+
143+
The end output will give:
144+
145+
```{code}
146+
147+
INFO Evaluating alignments...
148+
INFO Exporting evaluation...
149+
INFO Average overlap score: 0.010834011956534382
150+
INFO Average phone error rate: 0.02820097244732577
151+
```
152+
153+
This reports a mean phone boundary error of 10.8 ms (the average overlap score) and an average PER of 2.8% (the percentage of insertions, deletions, and substitutions).
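
The conversion from the logged values to milliseconds and percent can be sketched as a small filter. The function name `summarize_evaluation` is hypothetical, and the sketch assumes the overlap score is logged in seconds, as the figures here imply.

```shell
# Hypothetical helper: read MFA log lines on stdin and print the evaluation
# metrics in more readable units.
# Assumes the overlap score is in seconds and the PER is a proportion.
summarize_evaluation() {
    awk '
        /Average overlap score:/    { printf "boundary error: %.1f ms\n", $NF * 1000 }
        /Average phone error rate:/ { printf "phone error rate: %.1f%%\n", $NF * 100 }
    '
}

# Example usage with the two log lines shown in this section:
# printf '%s\n' \
#   "INFO Average overlap score: 0.010834011956534382" \
#   "INFO Average phone error rate: 0.02820097244732577" | summarize_evaluation
```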
154+
155+
::::
156+
157+
::::{tab-item} Mandarin
158+
:sync: mandarin
159+
160+
161+
First, download and save the contents of [english_to_mandarin_phone_mapping.yaml](https://raw.githubusercontent.com/mmcauliffe/mfa-adaptation/refs/heads/main/data/evaluation_mappings/english_to_mandarin_phone_mapping.yaml) to `~/mfa_data/english_to_mandarin_phone_mapping.yaml`. This file is similar to the previously downloaded `~/mfa_data/mandarin_to_english_phone_mapping.yaml`, except that it maps phones in the opposite direction. For every Mandarin phone, this mapping specifies which phones count as a "matching phone", so that the overlap scoring algorithm can penalize alignment issues more accurately.
162+
163+
If you have not aligned the Mandarin demo corpus as the first step in {ref}`alignment_example`, omit the `--reference_directory` and `--custom_mapping_path` options from the following command.
164+
165+
:::{code-block} bash
166+
mfa align ~/mfa_data/mandarin-thchs-30-demo-1.0.0 ~/mfa_data/mandarin_english.dict english_mfa ~/mfa_data/english_adapted/english_mandarin_remapped_aligned --clean --reference_directory ~/mfa_data/aligned_thchs_30_demo --custom_mapping_path ~/mfa_data/english_to_mandarin_phone_mapping.yaml --language chinese
167+
:::
168+
169+
::::
170+
171+
:::::
172+
173+
Once the files are aligned, we can look at the `alignment_analysis.csv` file in the output directory to see if there are any glaring alignment issues. This file is initially sorted by the `phone_duration_deviation` column, which is the maximum z-scored duration for phones in the utterance. High values indicate phones that are much longer or shorter than we would expect for that phone; e.g., a {ipa_inline}`[ɾ]` lasting 100 ms is very unlikely given that its usual duration is around 10-20 ms.
174+
175+
Additionally, there are two files from the alignment evaluation, which is triggered by specifying `--reference_directory`. As we're comparing alignments to reference alignments, we can look at a confusion matrix in `alignment_reference_confusions.csv` and find utterances with high errors by looking at `alignment_reference_evaluation.csv` and sorting on the `alignment_score` column.
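
For a quick look without a spreadsheet program, the sorting step can be sketched in the shell. The helper `sort_csv_by_column` is hypothetical and assumes a simple CSV with no quoted fields or embedded commas, which may not hold for every file MFA writes.

```shell
# Hypothetical helper: print a CSV sorted in descending numeric order on a
# named column, keeping the header row first.
# Assumes plain comma-separated fields with no quoting.
sort_csv_by_column() {
    csv_file=$1
    column=$2
    # Find the 1-based index of the column in the header row
    idx=$(head -n 1 "$csv_file" | tr ',' '\n' | grep -n -x -F -- "$column" | head -n 1 | cut -d: -f1)
    if [ -z "$idx" ]; then
        echo "column not found: $column" >&2
        return 1
    fi
    head -n 1 "$csv_file"
    # -g handles general numeric values, -r puts the largest values first
    tail -n +2 "$csv_file" | LC_ALL=C sort -t ',' -k "${idx},${idx}gr"
}

# Example usage (output directory taken from the Japanese align command above):
# sort_csv_by_column \
#   ~/mfa_data/english_adapted/english_japanese_remapped_aligned/alignment_reference_evaluation.csv \
#   alignment_score | head
```

Sorting in descending order puts the utterances with the largest scores, and so the largest boundary errors, at the top.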
176+
177+
### Adapting the acoustic model
178+
179+
In general, adapting a pretrained acoustic model to your specific data will improve alignments, and this is particularly true when using a pretrained model that was trained on a different language than the one you're aligning.
180+
181+
We can adapt our pretrained model via the {code}`mfa adapt` command:
182+
183+
:::::{tab-set}
184+
185+
::::{tab-item} English
186+
:sync: english
187+
188+
189+
```{warning}
190+
191+
Under construction
192+
```
193+
194+
::::
195+
196+
::::{tab-item} Japanese
197+
:sync: japanese
198+
199+
200+
:::{code-block} bash
201+
mfa adapt ~/mfa_data/japanese-jvs-demo-1.0.0 ~/mfa_data/japanese_english.dict english_mfa ~/mfa_data/english_adapted/english_adapted.zip --clean --language japanese
202+
:::
203+
204+
We can now use the adapted model to align the japanese-jvs-demo corpus. Note the change from ``english_mfa`` to ``~/mfa_data/english_adapted/english_adapted.zip`` below.
205+
206+
:::{code-block} bash
207+
mfa align ~/mfa_data/japanese-jvs-demo-1.0.0 ~/mfa_data/japanese_english.dict ~/mfa_data/english_adapted/english_adapted.zip ~/mfa_data/english_adapted/english_remapped_aligned_adapted --clean --reference_directory ~/mfa_data/aligned_jvs_demo --custom_mapping_path ~/mfa_data/english_to_japanese_phone_mapping.yaml --language japanese
208+
:::
209+
210+
The end output will give:
211+
212+
```{code}
213+
214+
INFO Evaluating alignments...
215+
INFO Exporting evaluation...
216+
INFO Average overlap score: 0.010524732208295882
217+
INFO Average phone error rate: 0.026904376012965966
218+
```
219+
220+
This reports a mean phone boundary error of 10.5 ms, improving on the previous 10.8 ms from aligning with the unadapted model, and an average PER of 2.7%, improving from 2.8%. So adaptation gives modest gains in making the alignments generated from English MFA more similar to those generated by the Japanese MFA model. The benefit of adaptation depends on the size of the dataset, and the demo corpus here is small, so only a small improvement is expected.
221+
222+
::::
223+
224+
::::{tab-item} Mandarin
225+
:sync: mandarin
226+
227+
228+
```{warning}
229+
230+
Under construction
231+
```
232+
233+
::::
234+
235+
:::::
236+
237+
238+
```{seealso}
239+
* {ref}`first_steps_adapt_pretrained`
240+
```

docs/source/getting_started.rst

Lines changed: 1 addition & 0 deletions
@@ -67,4 +67,5 @@ Installation
6767
installation
6868
first_steps/index
6969
first_steps/alignment_example
70+
first_steps/remapping_example
7071
first_steps/tutorials
