About
My goal is to create several voice models for use with the Piper text-to-speech software. I want there to be more voices available without restrictive licenses. Many of the widely available datasets, and thus the voices created from them, have "for research and learning purposes only"-style licenses.
Voice Models
Voice | Notes | Sample |
---|---|---|
LJSpeech (medium) | US English female voice. Single speaker. Trained from scratch for 1000 epochs on medium quality settings using the LJ Speech dataset. I re-encoded the recordings to a sample rate of 22050 Hz so they would match other voices released for Piper TTS. License: public domain. Downloads: .onnx file, config json file, .ckpt file (for training) | |
LJSpeech (high) | US English female voice. Single speaker. Trained from scratch for 2000 epochs on high quality settings using the LJ Speech dataset. I re-encoded the recordings to a sample rate of 22050 Hz so they would match other voices released for Piper TTS. License: public domain. Downloads: .onnx file, config json file, .ckpt file (for training) | |
Jenny (Dioco) | UK English female voice (Irish). Single speaker. Trained from scratch for 287 epochs on medium quality settings using the ~30 hour Jenny (Dioco) dataset. I re-encoded the recordings to a sample rate of 22050 Hz so they would match other voices released for Piper TTS. License: attribution required, similar to CC BY. See the link for license details. I impose no further license or restrictions. Downloads: .onnx file, config json file | |
Clean 100 | US English voice. Multi-speaker with 244 speakers. Trained from scratch on medium quality settings for 1000 epochs. This comes from the "Clean 100" subset of the LibriTTS-R dataset, a sound-improved version of the LibriTTS dataset, which is itself derived from the original materials of the LibriSpeech corpus. I removed the files listed there as "failed speech restoration." The '100' refers to the 100 hours of recordings in the original dataset. The sample is just the first speaker in the set. A zip containing a sample for each speaker in the set can be downloaded here. License: CC BY 4.0 (this applies to the original dataset). I impose no further license or restriction. Downloads: .onnx file, config json file | |
Cori (high) | UK English female voice. Single speaker. Trained from scratch on high quality settings for 500 epochs. I put together the dataset, which ended up with about 24 hours of recordings. All recordings came from LibriVox.org. License: public domain. Downloads: .onnx file, config json file, .ckpt file (for training) | |
Cori (medium) | UK English female voice. Single speaker. Trained from scratch on medium quality settings for 640 epochs. I put together the dataset, which ended up with about 24 hours of recordings. All recordings came from LibriVox.org. License: public domain. Downloads: .onnx file, config json file, .ckpt file (for training) | |
Kristin | US English female voice. Single speaker. Trained from scratch on medium quality settings for 2000 epochs. I put together the dataset, which ended up with about 11.5 hours of recordings. All recordings came from LibriVox.org. License: public domain. Downloads: .onnx file, config json file, .ckpt file (for training) | |
John | US English male voice. Single speaker. Fine-tuned from Kristin (above) on medium quality settings for an additional 600 epochs. I put together the dataset, which ended up with about 12.5 hours of recordings. All recordings came from LibriVox.org. License: public domain. Downloads: .onnx file, config json file, .ckpt file (for training) | |
Bryce | US English male voice. Single speaker. Fine-tuned for an additional 1000 epochs from an unreleased voice that had been trained for 2500 epochs. This is my own voice; I only recorded about 750 samples. License: public domain. Downloads: .onnx file, config json file, .ckpt file (for training) | |
Norman | US English male voice. Single speaker. Trained from scratch on medium quality settings for 1200 epochs. I put together the dataset, which ended up with about 15.5 hours of recordings. All recordings came from LibriVox.org. I forgot to save the ckpt file on this one, sorry. License: public domain. Downloads: .onnx file, config json file | |
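
Each voice in the table ships as an .onnx file plus a config json file, and Piper needs both. As a rough sketch of how a downloaded pair might be passed to the `piper` command-line tool, the Python helper below builds the argument list. The file names (`voices/kristin.onnx`, `hello.wav`) and the helper name `piper_command` are hypothetical, and the sketch assumes a `piper` executable on your PATH that accepts the `--model`, `--config`, and `--output_file` options. Check `piper --help` for your installed version before relying on it.

```python
import shlex
from pathlib import Path


def piper_command(model_path, output_wav, config_path=None):
    """Build an argument list for the `piper` command-line tool (hypothetical helper)."""
    model = Path(model_path)
    # Assumption: when no config is given, the config json sits next to the
    # model and is named "<model file name>.json" (for example voice.onnx.json).
    if config_path:
        config = Path(config_path)
    else:
        config = model.with_name(model.name + ".json")
    return [
        "piper",
        "--model", str(model),
        "--config", str(config),
        "--output_file", str(output_wav),
    ]


# Hypothetical paths; substitute the files you actually downloaded.
cmd = piper_command("voices/kristin.onnx", "hello.wav")
print(shlex.join(cmd))

# To synthesize, pass the text on standard input, for example:
#   import subprocess
#   subprocess.run(cmd, input="Hello from Piper.", text=True, check=True)
```

Building the command as a list, rather than as one shell string, avoids quoting problems when a path contains spaces.
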
Note: Feel free to use these for any legal and ethical purpose. If you want to upload these to Hugging Face or somewhere similar, you have my blessing.
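
Several of the voices above were trained on recordings re-encoded to 22050 Hz. The sketch below illustrates what changing a sample rate involves, using only the Python standard library and plain linear interpolation. It is not the tool used to prepare these datasets (this page does not name one), the function names are made up for illustration, and it only handles 16-bit mono WAV files. Linear interpolation does no low-pass filtering, so for real dataset preparation a dedicated audio tool with a proper resampling filter is the better choice.

```python
import math
import struct
import wave


def resample_linear(samples, src_rate, dst_rate):
    """Resample a sequence of mono samples by linear interpolation."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate            # fractional index into the input
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def resample_wav(src_path, dst_path, dst_rate=22050):
    """Read a 16-bit mono WAV file and write a copy at dst_rate."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 1 or src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        src_rate = src.getframerate()
        raw = src.readframes(src.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    resampled = resample_linear(samples, src_rate, dst_rate)
    # Round back to integers and clip to the signed 16-bit range.
    clipped = [max(-32768, min(32767, round(s))) for s in resampled]
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(dst_rate)
        dst.writeframes(struct.pack("<%dh" % len(clipped), *clipped))


# Demo on a synthetic one-second 440 Hz tone rather than a real recording.
tone = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
converted = resample_linear(tone, 44100, 22050)
print(len(converted))  # 22050 samples, i.e. still one second at the new rate
```
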