mirror of https://github.com/nextcloud/stt_whisper.git synced 2026-02-07 06:11:59 +01:00


Nextcloud bot a5d2e82d69 Fix(l10n): Update translations from Transifex Signed-off-by: Nextcloud bot <bot@nextcloud.com>		2024-05-02 01:32:16 +00:00
.github/workflows	workflow check upto php 8.3	2024-01-08 19:29:23 +05:30
.reuse	update reuse license	2024-01-08 19:29:23 +05:30
.tx	chore(l10n): Setup transifex	2023-05-08 14:45:13 +02:00
appinfo	release v1.0.8	2024-01-10 21:00:59 +05:30
bin	First implementation bits	2023-04-18 17:51:55 +02:00
img	add license for app-dark.svg	2023-05-02 14:26:34 +02:00
l10n	Fix(l10n): Update translations from Transifex	2024-05-02 01:32:16 +00:00
lib	silent setup check for binaries	2024-01-18 21:58:26 +05:30
LICENSES	fix(reuse): Remove unused license	2023-05-25 12:41:08 +02:00
models	First implementation bits	2023-04-18 17:51:55 +02:00
screenshots	Add logo	2023-08-08 16:12:36 +02:00
src	drop separate musl bin in favour of a static binary	2024-01-10 19:54:31 +05:30
stubs	composer update & psalm fixes	2024-01-08 19:29:23 +05:30
templates	Fix licensing information	2023-04-25 12:03:14 +02:00
test/fixtures	chore(ci): Fix test fixture	2023-04-20 13:48:35 +02:00
vendor-bin	composer update & psalm fixes	2024-01-08 19:29:23 +05:30
.eslintrc.js	Add eslintrc and stylelint.config	2023-04-25 11:59:45 +02:00
.gitignore	update .gitignore	2023-11-20 14:01:15 +05:30
.l10nignore	chore(l10n): Setup transifex	2023-05-08 14:45:13 +02:00
.php-cs-fixer.dist.php	composer update & psalm fixes	2024-01-08 19:29:23 +05:30
CHANGELOG.md	release v1.0.8	2024-01-10 21:00:59 +05:30
composer.json	composer update & psalm fixes	2024-01-08 19:29:23 +05:30
composer.lock	composer update & psalm fixes	2024-01-08 19:29:23 +05:30
Makefile	fixup for static build	2024-01-11 16:07:44 +05:30
package-lock.json	npm audit fix	2023-11-20 14:03:27 +05:30
package.json	v1.0.7	2023-10-12 20:01:46 +02:00
psalm-baseline.xml	composer update & psalm fixes	2024-01-08 19:29:23 +05:30
psalm.xml	composer update & psalm fixes	2024-01-08 19:29:23 +05:30
README.md	add a few gotchas to the readme	2023-11-01 23:16:25 +05:30
stylelint.config.js	Add eslintrc and stylelint.config	2023-04-25 11:59:45 +02:00
webpack.js	Fix licensing information	2023-04-25 12:03:14 +02:00

README.md

Whisper Speech-To-Text

Speech-To-Text provider for Nextcloud running OpenAI Whisper locally on CPU.

The model runs completely on your machine. No private data leaves your servers.

Requirements

Architecture: x86-64 with AVX support
OS: Linux
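
A quick way to check these requirements on a target machine is sketched below. It assumes that CPU flags are exposed through /proc/cpuinfo, as on typical Linux systems. The helper names `has_avx` and `is_linux_x86_64` are hypothetical and not part of this app.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given cpuinfo file lists the "avx" flag.
# The path defaults to /proc/cpuinfo (assumed to exist on Linux).
has_avx() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" 2>/dev/null | grep -qw avx
}

# Hypothetical helper: succeeds only on Linux running on x86-64.
is_linux_x86_64() {
    [ "$(uname -s)" = "Linux" ] && [ "$(uname -m)" = "x86_64" ]
}

if is_linux_x86_64 && has_avx; then
    echo "requirements met"
else
    echo "requirements not met"
fi
```

Passing a file path to `has_avx` makes it possible to check a saved copy of another machine's cpuinfo.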

Model sizes

Small: 500MB
Medium: 1.5GB
Large: 3.1GB

Ethical AI Rating

Rating: 🟡

Positive:

the software for training and inference of this model is open source
the trained model is freely available, and thus can be run on-premises

Negative:

the training data is not freely available, limiting the ability of external parties to check and correct for bias or optimise the model’s performance and CO2 usage.

Learn more about the Nextcloud Ethical AI Rating in our blog.

Install

Manual install
- Place this app in nextcloud/apps/
One click install
- Install from the Nextcloud Appstore

Download models

After installing this app you will need to run

occ stt_whisper:download-models [model-name]

where [model-name] is one of

small
medium (default)
large
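
On many installations occ is run as the web server user from the Nextcloud directory. The wrapper below is a minimal sketch under two assumptions that may not match your setup: Nextcloud is installed in /var/www/nextcloud and the web server user is www-data. The helper names are hypothetical; only the `stt_whisper:download-models` command and the model names listed above come from this README.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the argument is one of the documented model names.
is_valid_model() {
    case "$1" in
        small|medium|large) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumptions: adjust both values to match your installation.
NEXTCLOUD_DIR="${NEXTCLOUD_DIR:-/var/www/nextcloud}"
WEB_USER="${WEB_USER:-www-data}"

# Hypothetical wrapper: validates the model name, then calls occ.
download_model() {
    model="${1:-medium}"   # medium is the documented default
    if ! is_valid_model "$model"; then
        echo "unknown model: $model" >&2
        return 1
    fi
    sudo -u "$WEB_USER" php "$NEXTCLOUD_DIR/occ" stt_whisper:download-models "$model"
}

# Example: download_model small
```

Validating the name first avoids starting a download with a mistyped model name.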

Building the app

Build the app with the provided Makefile by running:

make

This requires the following tools to be installed:

make
which
tar: for building the archive
curl: used if phpunit and composer are not installed to fetch them from the web
npm: for building and testing everything JS; only required if a package.json is placed inside the js/ folder
gcc: for building whisper.cpp
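
Before running make, it can help to confirm that the tools above are on the PATH. The following is a small sketch; `missing_tools` is a hypothetical helper and not part of the Makefile.

```shell
#!/bin/sh
# Hypothetical helper: prints every named tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools make which tar curl npm gcc)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing build tools:" $missing
else
    echo "all build tools found"
fi
```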

NOTE

A few things to keep in mind.

Transcription must be enabled in the Talk app for calls to be transcribed by any Speech-To-Text provider (including this app). Enable it with this occ command:

occ config:app:set spreed call_recording_transcription --value yes

This app tends to be CPU-heavy. If that interferes with your normal workflow, you can limit the number of threads Whisper uses in the "Whisper Speech-To-Text" section of the admin settings.
The generated transcriptions may vary in accuracy based on the spoken language.
Per-participant transcription in calls is currently not available, but PRs are welcome!
