Bryce's Piper TTS Voices

About

My goal is to create several voice models for use with the Piper text-to-speech software. I want more voices to be available without restrictive licenses. Many of the widely available datasets, and thus the voices created from them, carry "for research and learning purposes only"-style licenses.

Voice Models

Voice Notes Sample
LJSpeech (medium)

US English female voice. Single speaker. Trained from scratch for 1000 epochs on medium quality settings using the LJ Speech dataset. I re-encoded the recordings at a sample rate of 22050 Hz to match the other voices released for Piper TTS.


License: public domain


Downloads: .onnx file | config json file | .ckpt file (for training)
LJSpeech (high)

US English female voice. Single speaker. Trained from scratch for 2000 epochs on high quality settings using the LJ Speech dataset. I re-encoded the recordings at a sample rate of 22050 Hz to match the other voices released for Piper TTS.


License: public domain


Downloads: .onnx file | config json file | .ckpt file (for training)
Jenny (Dioco)

UK English female voice (Irish). Single speaker. Trained from scratch for 287 epochs on medium quality settings using the ~30 hour Jenny (Dioco) dataset. I re-encoded the recordings at a sample rate of 22050 Hz to match the other voices released for Piper TTS.


License: Attribution required, similar to CC BY. See the link for license details. I impose no further license or restrictions.


Downloads: .onnx file | config json file
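
The notes above mention re-encoding recordings to 22050 Hz but do not say which tool was used. As a rough sketch only, here is one way that step could look, assuming ffmpeg is installed and using its `-ar` option to set the output sample rate. The file paths and the `build_resample_cmd` helper are hypothetical.

```python
import shlex

TARGET_RATE = 22050  # sample rate stated in the voice notes


def build_resample_cmd(src: str, dst: str, rate: int = TARGET_RATE) -> list[str]:
    """Build an ffmpeg command that re-encodes `src` to `dst` at `rate` Hz.

    Assumption: ffmpeg is the tool; the page does not say what was actually used.
    """
    return [
        "ffmpeg",
        "-y",              # overwrite dst if it already exists
        "-i", src,         # input recording
        "-ar", str(rate),  # output audio sample rate in Hz
        dst,
    ]


# Hypothetical paths, for illustration only.
cmd = build_resample_cmd("original/clip_0001.wav", "resampled/clip_0001.wav")
print(shlex.join(cmd))
# To run it for real (requires ffmpeg on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.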
Clean 100

US English voice. Multi-speaker, with 244 speakers. Trained from scratch on medium quality settings for 1000 epochs. The training data is the "Clean 100" subset of the LibriTTS-R dataset, a sound-improved version of the LibriTTS dataset, which is itself derived from the original materials of the LibriSpeech corpus. I removed the files the dataset lists as "failed speech restoration." The "100" refers to the 100 hours of recordings in the original dataset. The sample is just the first speaker in the set. A zip containing a sample for each speaker in the set can be downloaded here.


License: CC BY 4.0. This is the license of the original dataset. I impose no further license or restriction.


Downloads: .onnx file | config json file
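
For a multi-speaker voice like this one, each speaker is selected by a numeric ID. The sketch below shows how the speakers might be listed from the config json file. It assumes the config has `num_speakers` and `speaker_id_map` keys, which is my understanding of Piper's config layout and should be checked against the downloaded file. The speaker names in the excerpt are made up.

```python
import json

# Hypothetical excerpt of a multi-speaker config; a real file holds many more
# fields, and the key names here are an assumption to verify.
EXAMPLE_CONFIG = """
{
  "audio": {"sample_rate": 22050},
  "num_speakers": 3,
  "speaker_id_map": {"speaker_b": 1, "speaker_a": 0, "speaker_c": 2}
}
"""


def list_speakers(config_text: str) -> list[tuple[str, int]]:
    """Return (speaker name, speaker id) pairs sorted by id."""
    config = json.loads(config_text)
    speaker_map = config.get("speaker_id_map", {})
    return sorted(speaker_map.items(), key=lambda item: item[1])


speakers = list_speakers(EXAMPLE_CONFIG)
for name, speaker_id in speakers:
    print(f"speaker {name!r} -> id {speaker_id}")
```

With a real config, the same function can be fed the file's contents, for example `list_speakers(open("voice.onnx.json").read())`, where the file name is again only a placeholder.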
Cori (high)

UK English female voice. Single Speaker. Trained from scratch on high quality settings for 500 epochs. I put together the dataset, which ended up with about 24 hours of recordings. All recordings came from LibriVox.org.


License: public domain


Downloads: .onnx file | config json file | .ckpt file (for training)
Cori (medium)

UK English female voice. Single Speaker. Trained from scratch on medium quality settings for 640 epochs. I put together the dataset, which ended up with about 24 hours of recordings. All recordings came from LibriVox.org.


License: public domain


Downloads: .onnx file | config json file | .ckpt file (for training)
Kristin

US English female voice. Single Speaker. Trained from scratch on medium quality settings for 2000 epochs. I put together the dataset, which ended up with about 11.5 hours of recordings. All recordings came from LibriVox.org.


License: public domain


Downloads: .onnx file | config json file | .ckpt file (for training)
John

US English male voice. Single Speaker. Fine-tuned from Kristin (above) on medium quality settings for an additional 600 epochs. I put together the dataset, which ended up with about 12.5 hours of recordings. All recordings came from LibriVox.org.


License: public domain


Downloads: .onnx file | config json file | .ckpt file (for training)
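
When fine-tuning from a .ckpt file, the epoch limit usually has to cover the epochs already in the checkpoint plus the additional ones. The sketch below works through that arithmetic for the figures given here (2000 epochs for Kristin, 600 more for John). It assumes the epoch counter resumes from the checkpoint, as is typical with PyTorch Lightning, and it assumes the `piper_train` flag names shown, which should be checked against the Piper training guide. The paths are placeholders, and a real run needs further options (accelerator, batch size, and so on).

```python
def finetune_cmd(dataset_dir: str, checkpoint: str,
                 base_epochs: int, extra_epochs: int) -> list[str]:
    """Build a partial piper_train command that resumes from a checkpoint.

    Assumption: --max_epochs is a total, so it must include the epochs the
    checkpoint has already been trained for.
    """
    max_epochs = base_epochs + extra_epochs
    return [
        "python3", "-m", "piper_train",
        "--dataset-dir", dataset_dir,
        "--resume_from_checkpoint", checkpoint,
        "--max_epochs", str(max_epochs),
    ]


# Kristin was trained for 2000 epochs; John adds 600 on top of that.
cmd = finetune_cmd("training_dir/", "kristin.ckpt", base_epochs=2000, extra_epochs=600)
print(" ".join(cmd))
```

If the limit were set to 600 instead of 2600 under this assumption, training would stop immediately, because the checkpoint is already past that epoch.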
Bryce

US English male voice. Single Speaker. Fine-tuned for an additional 1000 epochs from an unreleased voice that had been trained for 2500 epochs. This is my own voice; I only recorded about 750 samples.


License: public domain


Downloads: .onnx file | config json file | .ckpt file (for training)
Norman

US English male voice. Single Speaker. Trained from scratch on medium quality settings for 1200 epochs. I put together the dataset, which ended up with about 15.5 hours of recordings. All recordings came from LibriVox.org. I forgot to save the ckpt file on this one, sorry.


License: public domain


Downloads: .onnx file | config json file

Note: Feel free to use these for any legal and ethical purpose. If somebody wants to upload these to HuggingFace or somewhere similar, you have my blessing.
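
As a starting point for trying a downloaded voice, here is a minimal sketch that calls the `piper` command-line program from Python. It assumes `piper` is on the PATH, that the config json file sits next to the model (otherwise it can be passed with `--config`), and that text is read from standard input. The file names and the `piper_cmd` and `synthesize` helpers are hypothetical.

```python
import subprocess
from typing import Optional


def piper_cmd(model_path: str, output_wav: str,
              speaker: Optional[int] = None) -> list[str]:
    """Build a piper command line for one voice model."""
    cmd = ["piper", "--model", model_path, "--output_file", output_wav]
    if speaker is not None:
        # Only meaningful for multi-speaker voices such as Clean 100.
        cmd += ["--speaker", str(speaker)]
    return cmd


def synthesize(text: str, model_path: str, output_wav: str,
               speaker: Optional[int] = None) -> None:
    """Send `text` to piper on stdin and write the audio to `output_wav`."""
    subprocess.run(piper_cmd(model_path, output_wav, speaker),
                   input=text.encode("utf-8"), check=True)


# Placeholder file names; substitute the files downloaded from this page.
single = piper_cmd("voice.onnx", "hello.wav")
multi = piper_cmd("multi_voice.onnx", "hello.wav", speaker=0)
print(" ".join(single))
print(" ".join(multi))
# Example call (requires piper and the model files):
#   synthesize("Hello from Piper.", "voice.onnx", "hello.wav")
```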