unable to generate anything #785

@gucio321

Command: tts --text "Czesc, jestem syntezator mowy" --model_name "tts_models/pl/mai_female/vits" --vocoder_name "vocoder_models/universal/libri-tts/wavegrad" --out_path /output/

Environment:
My Python version is too new, so I run TTS in Docker. Here is my Dockerfile:

Dockerfile
FROM python:3.7

RUN python3 -m pip install TTS

# run bash
CMD ["/bin/bash"]

output:

console output
 > tts_models/pl/mai_female/vits is already downloaded.
 > vocoder_models/universal/libri-tts/wavegrad is already downloaded.
 > Using model: vits
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log10
 | > min_level_db:0
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:None
 | > fft_size:1024
 | > power:None
 | > preemphasis:0.0
 | > griffin_lim_iters:None
 | > signal_norm:None
 | > symmetric_norm:None
 | > mel_fmin:0
 | > mel_fmax:None
 | > pitch_fmin:None
 | > pitch_fmax:None
 | > spec_gain:20.0
 | > stft_pad_mode:reflect
 | > max_norm:1.0
 | > clip_norm:True
 | > do_trim_silence:False
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:10
 | > hop_length:256
 | > win_length:1024
 > initialization of speaker-embedding layers.
 > initialization of language-embedding layers.
 > Vocoder Model: wavegrad
 > Text: Czesc, jestem syntezator mowy
 > Text splitted to sentences.
['Czesc, jestem syntezator mowy']
Traceback (most recent call last):
  File "/usr/local/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/TTS/bin/synthesize.py", line 439, in main
    reference_speaker_name=args.reference_speaker_idx,
  File "/usr/local/lib/python3.7/site-packages/TTS/utils/synthesizer.py", line 393, in tts
    vocoder_input = self.vocoder_ap.normalize(mel_postnet_spec.T)
  File "/usr/local/lib/python3.7/site-packages/TTS/utils/audio/processor.py", line 286, in normalize
    raise RuntimeError(" [!] Mean-Var stats does not match the given feature dimensions.")
RuntimeError:  [!] Mean-Var stats does not match the given feature dimensions.
root@1affd8f1442e:/# 

nvidia-smi output
Fri Jul 26 19:51:50 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2080 ...    Off |   00000000:01:00.0  On |                  N/A |
| N/A   51C    P8              2W /   80W |     718MiB /   8192MiB |     10%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2959      G   /usr/bin/gnome-shell                          372MiB |
|    0   N/A  N/A      3236    C+G   /usr/bin/xwaylandvideobridge                   13MiB |
|    0   N/A  N/A      3548      G   /usr/bin/Xwayland                              11MiB |
|    0   N/A  N/A      3787      G   /usr/libexec/xdg-desktop-portal-gnome          51MiB |
|    0   N/A  N/A      4010    C+G   /usr/libexec/mutter-x11-frames                 10MiB |
|    0   N/A  N/A     13250      G   /usr/bin/gnome-clocks                          35MiB |
|    0   N/A  N/A     19016      G   ...ures=SpareRendererForSitePerProcess        130MiB |
|    0   N/A  N/A     20426      G   /usr/bin/evolution                              2MiB |
|    0   N/A  N/A     24450      G   ...bin/plasma-browser-integration-host          2MiB |
|    0   N/A  N/A     26279      G   /usr/libexec/kactivitymanagerd                  2MiB |
+-----------------------------------------------------------------------------------------+

I followed the instructions from the README. Am I missing something?
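One thing that may be worth ruling out (an assumption, not something confirmed in this report): the traceback fails inside `self.vocoder_ap.normalize(...)`, which is the vocoder code path, while VITS is an end-to-end model that produces a waveform on its own. The sketch below only assembles the same command with `--vocoder_name` left out, so the two variants can be compared. The helper names `build_tts_command` and `run_if_available` are hypothetical, and the flags are exactly the ones used in the command above.

```python
import shutil
import subprocess
from typing import List, Optional


def build_tts_command(
    text: str,
    model_name: str,
    out_path: str,
    vocoder_name: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for the `tts` CLI (hypothetical helper)."""
    cmd = ["tts", "--text", text, "--model_name", model_name, "--out_path", out_path]
    # Only pass a vocoder when one is explicitly requested.
    if vocoder_name is not None:
        cmd += ["--vocoder_name", vocoder_name]
    return cmd


def run_if_available(cmd: List[str]) -> Optional[int]:
    """Run the command if `tts` is on PATH; return its exit code, else None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode


# Same text, model, and output path as in the report, but without a vocoder.
cmd = build_tts_command(
    text="Czesc, jestem syntezator mowy",
    model_name="tts_models/pl/mai_female/vits",
    out_path="/output/",
)
print(" ".join(cmd))
```

Calling `run_if_available(cmd)` inside the container would execute the variant. Whether this avoids the `RuntimeError` is untested here.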
