App: Local Text-To-Speech (text2speech_kokoro)
The text2speech_kokoro app is one of the apps that provide Text-To-Speech functionality in Nextcloud and act as a speech generation backend for the Nextcloud Assistant app and other apps making use of the core Text-To-Speech Task type. The text2speech_kokoro app specifically runs only open source models and does so entirely on-premises. Nextcloud can provide customer support upon request; please talk to your account manager about the possibilities.
This app uses Kokoro under the hood.
The model used supports the following languages:
American English
British English
Spanish
French
Italian
Hindi
Portuguese
Japanese
Mandarin
Requirements
Minimal Nextcloud version: 31
This app is built as an External App and thus depends on AppAPI v2.3.0
Nextcloud AIO is supported
We currently support x86_64 CPUs
We do not support GPUs
CPU Sizing
The more cores you have and the more powerful the CPU, the better; we recommend around 10 cores
The app will hog all cores by default, so it is usually better to run it on a separate machine
800MB RAM
Installation
Make sure the Nextcloud Assistant app is installed
Install the text2speech_kokoro “Local Text-To-Speech” ExApp via the “Apps” page in the Nextcloud web admin user interface
Scaling
It is currently not possible to scale this app; we are working on this. Based on our calculations, an instance has a rough capacity of 4h of speech generation throughput per minute (measured with 8 CPU threads on an Intel(R) Xeon(R) Gold 6226R). It is unclear how close this number is to real-world usage, so we appreciate real-world feedback on this.
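To put the capacity figure above into perspective, the following minimal sketch converts it into a real-time factor and a per-clip processing estimate. It assumes the figure means 4 hours of audio handled per minute of wall-clock time on a single instance, and that throughput scales linearly with audio length. Both are assumptions for illustration, and the function names are hypothetical and not part of the app.

```python
# Rough capacity quoted in this section, read as an assumption:
# 4 hours of audio per minute of wall-clock time on one instance.
CAPACITY_AUDIO_HOURS_PER_MINUTE = 4.0


def realtime_factor(audio_hours_per_minute: float = CAPACITY_AUDIO_HOURS_PER_MINUTE) -> float:
    """Return seconds of audio handled per second of wall-clock time."""
    # 4 h of audio = 240 min of audio, handled in 1 min of wall-clock time.
    return audio_hours_per_minute * 60.0


def estimated_processing_seconds(
    audio_seconds: float,
    audio_hours_per_minute: float = CAPACITY_AUDIO_HOURS_PER_MINUTE,
) -> float:
    """Estimate wall-clock seconds for a clip, assuming linear scaling."""
    return audio_seconds / realtime_factor(audio_hours_per_minute)


def audio_hours_per_day(audio_hours_per_minute: float = CAPACITY_AUDIO_HOURS_PER_MINUTE) -> float:
    """Upper bound on audio hours per day if the instance were busy 24/7."""
    return audio_hours_per_minute * 60.0 * 24.0


if __name__ == "__main__":
    print(f"real-time factor: {realtime_factor():.0f}x")
    print(f"10-minute clip: ~{estimated_processing_seconds(600):.1f} s")
    print(f"theoretical daily ceiling: {audio_hours_per_day():.0f} h of audio")
```

These numbers are only as reliable as the quoted figure itself, so treat them as an upper bound and measure on your own hardware before planning capacity.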
App store
You can also find this app in our app store, where you can write a review: https://apps.nextcloud.com/apps/text2speech_kokoro
Repository
You can find the app’s code repository on GitHub where you can report bugs and contribute fixes and features: https://github.com/nextcloud/text2speech_kokoro
Nextcloud customers should file bugs directly with our customer support.
Known Limitations
We currently only support languages supported by the underlying Kokoro model
The Kokoro models perform unevenly across languages and may show lower accuracy on low-resource and/or low-discoverability languages, or languages for which less training data was available.
Make sure to test the model with the language you are using it for to verify that it meets your use case’s quality requirements
Customer support is available upon request; however, we can’t solve false or problematic output, most performance issues, or other problems caused by the underlying model. Support is thus limited to bugs directly caused by the implementation of the app (connectors, API, front-end, AppAPI)