Commit b279ec0

Update README-SVS-opencpop-pndm.md
1 parent 5be5212 commit b279ec0

1 file changed

+3
-2
lines changed

‎docs/README-SVS-opencpop-pndm.md‎

Lines changed: 3 additions & 2 deletions
@@ -74,6 +74,7 @@ CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/midi/e2e/opencpo
 ```sh
 CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/midi/e2e/opencpop/ds1000.yaml --exp_name $MY_DS_EXP_NAME --reset --infer
 ```
+Inference results will be saved in `./checkpoints/MY_DS_EXP_NAME/generated_` by default.
 
 We also provide:
 - the pre-trained model of DiffSinger;
@@ -104,8 +105,8 @@ inp = {
 'input_type': 'phoneme'
 } # input like Opencpop dataset.
 ```
-
+Here the inference results will be saved in `./infer_out` by default.
 ### 5. Some issues.
 a) the HifiGAN-Singing is trained on our [vocoder dataset](https://dl.acm.org/doi/abs/10.1145/3474085.3475437) and the training set of [PopCS](https://arxiv.org/abs/2105.02446). Opencpop is the out-of-domain dataset (unseen speaker). This may cause the deterioration of audio quality, and we are considering fine-tuning this vocoder on the training set of Opencpop.
 
-b) in this version of codes, we used the melody frontend ([lyric + MIDI]->[ph_dur]) to predict phoneme duration. F0 curve is implicitly predicted together with mel-spectrogram.
+b) in this version of codes, we used the melody frontend ([lyric + MIDI]->[ph_dur]) to predict phoneme duration. F0 curve is implicitly predicted together with mel-spectrogram.
