git clone -b project https://github.com/CSTR-Edinburgh/ophelia.git
git clone https://github.com/AvashnaGovender/Merlyn.git
git clone https://github.com/AvashnaGovender/Tacotron.git
See recipe here.
Step 5a - Run forced alignment, as described in:
https://github.com/AvashnaGovender/Tacotron/blob/master/PAG_recipe.md
Convert forced alignment labels to the forced alignment matrix by following Step 6 (Get durations and create guides) of the recipe:
https://github.com/AvashnaGovender/Tacotron/blob/master/PAG_recipe.md
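As a rough illustration of what a forced alignment matrix encodes, the sketch below builds a binary phone-by-frame matrix from phone-level alignment lines. It assumes HTS-style lines of the form `start end label` with times in units of 100 ns, one line per phone, and a 12.5 ms frame shift. The function name, the frame shift and the matrix orientation are assumptions for illustration and are not taken from the recipe.

```python
def alignment_to_matrix(label_lines, frame_shift_ms=12.5):
    """Build a binary phone-by-frame matrix from 'start end label' lines.

    Times are assumed to be in HTS units of 100 ns (hypothetical default
    frame shift of 12.5 ms; adjust to match the acoustic features).
    """
    units_per_frame = frame_shift_ms * 10000  # 1 ms = 10,000 units of 100 ns
    spans = []
    for line in label_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        start, end = int(fields[0]), int(fields[1])
        first = int(round(start / units_per_frame))
        last = int(round(end / units_per_frame))
        spans.append((first, last))

    # The last phone's end frame gives the total number of frames.
    n_frames = spans[-1][1] if spans else 0
    matrix = [[0] * n_frames for _ in spans]
    for phone, (first, last) in enumerate(spans):
        for frame in range(first, last):
            matrix[phone][frame] = 1
    return matrix


if __name__ == "__main__":
    example = ["0 250000 sil", "250000 750000 a", "750000 1000000 b"]
    for row in alignment_to_matrix(example):
        print(row)
```

Each row marks the frames aligned to one phone, so every frame column contains exactly one 1 when the label times are contiguous.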
To use FA as target in DCTTS see config file:
fa_as_target.cfg
Create attention guides from the forced alignment matrix:
cd ophelia/
python convert_alignment_to_guide.py fa_matrix.npy fa_guide.npy
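It can help to see what a guide derived from a forced alignment might look like. The sketch below is an illustration under stated assumptions, not the logic of `convert_alignment_to_guide.py`: it takes a binary phone-by-frame alignment and gives each cell a penalty that grows with its distance from the aligned phone, using the same Gaussian-shaped weight as the DCTTS guided attention loss, W = 1 - exp(-d^2 / (2 g^2)). The function name, the normalisation by the number of phones and g = 0.2 are hypothetical choices.

```python
import math


def alignment_to_guide(matrix, g=0.2):
    """Turn a binary phone-by-frame alignment into a soft penalty guide.

    Cells on the aligned path get 0; cells far from it approach 1.
    (Illustrative only; the repository script may compute guides differently.)
    """
    n_phones = len(matrix)
    n_frames = len(matrix[0]) if matrix else 0
    guide = [[0.0] * n_frames for _ in range(n_phones)]
    for t in range(n_frames):
        # Index of the phone aligned to frame t (first row holding a 1).
        aligned = next((n for n in range(n_phones) if matrix[n][t]), None)
        if aligned is None:
            continue  # no alignment for this frame: leave it unpenalised
        for n in range(n_phones):
            d = (n - aligned) / n_phones  # normalised distance in phones
            guide[n][t] = 1.0 - math.exp(-(d * d) / (2.0 * g * g))
    return guide


if __name__ == "__main__":
    for row in alignment_to_guide([[1, 1, 0], [0, 0, 1]]):
        print([round(w, 3) for w in row])
```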
To use FA as guide in DCTTS see config file:
fa_as_guide.cfg
Add phone-level durations to transcript.csv using the forced alignment matrices:
cd ophelia/
python add_duration_to_transcript.py fa_matrix_dir transcript_file new_transcript_file
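For orientation, phone-level durations can be read off a binary alignment matrix by counting the frames assigned to each phone. The sketch below assumes a phone-by-frame layout and a pipe-separated transcript line; both are assumptions, as is the choice to append the durations as a space-separated final field. `add_duration_to_transcript.py` may store them differently.

```python
def phone_durations(matrix):
    """Number of frames aligned to each phone (one row per phone)."""
    return [sum(row) for row in matrix]


def add_durations(transcript_line, durations, sep="|"):
    """Append durations to one transcript line as an extra field.

    The field separator and the position of the new field are hypothetical.
    """
    duration_field = " ".join(str(d) for d in durations)
    return transcript_line.rstrip("\n") + sep + duration_field


if __name__ == "__main__":
    durations = phone_durations([[1, 1, 0, 0, 0], [0, 0, 1, 1, 1]])
    print(add_durations("utt1|hello|h @", durations))
```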
To use FA as attention in DCTTS see config file:
fa_as_attention.cfg
Convert state labels to 416 normalised label features (needs state labels and question file)
cd Merlyn/
python scripts/prepare_inputs.py
To use Labels-TE in DCTTS see config file:
labels_minus_te.cfg
To use Labels+TE in DCTTS see config file:
labels_plus_te.cfg
To use C-Labels+TE in DCTTS see config file:
c-labels_plus_te.cfg
Create a new transcription file by replacing the phone sequence in the transcript file with the phone sequence from the HTS-style labels:
cd ophelia/
./labels2tra.sh labels_dir transcript_file new_transcript_file
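In HTS-style full-context labels the current phone conventionally sits between `-` and `+` in the leading quinphone field (`p1^p2-p3+p4=p5@...`). A minimal sketch of pulling the phone sequence out of such labels follows. It assumes phone-level labels with one line per phone, optionally preceded by start and end times; the function name is hypothetical, and `labels2tra.sh` may work differently, for example in how it treats silences.

```python
import re

# The current phone sits between '-' and '+' in the quinphone field.
PHONE_RE = re.compile(r"-([^-+]+)\+")


def phones_from_labels(label_lines):
    """Extract the current phone from each HTS-style full-context label."""
    phones = []
    for line in label_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        context = fields[-1]  # the label is last; times may precede it
        match = PHONE_RE.search(context)
        if match:
            phones.append(match.group(1))
    return phones


if __name__ == "__main__":
    labels = [
        "0 500000 x^x-sil+h=@@@1_0/A:0",
        "500000 900000 x^sil-h+@=l@1_2/A:0",
    ]
    print(" ".join(phones_from_labels(labels)))
```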
To use PE&Labels+TE, set MerlinTextEncWithPhoneEmbedding to True in the config file.
To calculate CDP, Ain and Aout:
cd ophelia/
python calculate_CDP_Ain_Aout.py attention_matrix.npy
To generate without FIA (forcibly incremental attention), set turn_off_monotonic_for_synthesis to True in the config file.
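For context, the DCTTS paper (Tachibana et al.) describes forcibly incremental attention as a synthesis-time rule: if the attention peak at a decoder step falls outside a small window around the previous peak, the attention for that step is overwritten with a one-hot vector one position after the previous peak. The sketch below illustrates that rule on plain Python lists. The window bounds (-1 to 3) are taken from the paper's description and should be checked against the code; the function name, the data layout and the clamping at the last input position are assumptions, not ophelia's implementation.

```python
def force_incremental(attention_step, prev_peak, window=(-1, 3)):
    """Apply the forcibly-incremental rule to one decoder step.

    attention_step: attention weights over input positions for this step.
    prev_peak: index of the attention peak at the previous step.
    Returns (possibly overwritten weights, new peak index).
    """
    peak = max(range(len(attention_step)), key=attention_step.__getitem__)
    low, high = window
    if low <= peak - prev_peak <= high:
        return list(attention_step), peak  # peak moved plausibly: keep as is

    # Otherwise force attention one position forward
    # (clamped to the last input position, an assumption of this sketch).
    forced = min(prev_peak + 1, len(attention_step) - 1)
    one_hot = [0.0] * len(attention_step)
    one_hot[forced] = 1.0
    return one_hot, forced


if __name__ == "__main__":
    print(force_incremental([0.1, 0.7, 0.2, 0.0, 0.0, 0.0], prev_peak=0))
    print(force_incremental([0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0], prev_peak=0))
```

With turn_off_monotonic_for_synthesis set to True, a correction step of this kind would presumably be skipped and the raw attention used instead.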
See readme in Tacotron repository: https://github.com/AvashnaGovender/Tacotron