Whisper Speech-To-Text

Speech-To-Text provider for Nextcloud running OpenAI Whisper locally on CPU.

The model runs completely on your machine. No private data leaves your servers.

Requirements

  • Architecture: x86-64 with AVX support
  • OS: Linux
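
The two requirements above can be checked from a shell before installing. The following is a minimal sketch, not part of the app: check_requirements is a hypothetical helper, and it assumes that AVX support shows up as the avx entry on the flags line of /proc/cpuinfo, as it does on typical x86-64 Linux systems.

```shell
#!/bin/sh
# check_requirements is a hypothetical helper, not shipped with this app.
# With no arguments it inspects the live system; the three optional
# arguments (OS name, architecture, cpuinfo path) exist only for testing.
check_requirements() {
    os="${1:-$(uname -s)}"
    arch="${2:-$(uname -m)}"
    cpuinfo="${3:-/proc/cpuinfo}"

    [ "$os" = "Linux" ] || { echo "unsupported OS: $os"; return 1; }
    [ "$arch" = "x86_64" ] || { echo "unsupported architecture: $arch"; return 1; }

    # Assumption: the first "flags" line lists the CPU features.
    # -w matches "avx" as a whole word, so "avx2" alone does not count.
    grep -m1 '^flags' "$cpuinfo" | grep -qw avx || { echo "CPU lacks AVX"; return 1; }

    echo "requirements met"
}
```

Calling check_requirements with no arguments prints either "requirements met" or the first requirement that failed.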

Model sizes

  • Small: 500MB
  • Medium: 1.5GB
  • Large: 3.1GB

Ethical AI Rating

Rating: 🟡

Positive:

  • the software for training and inference of this model is open source
  • the trained model is freely available, and thus can be run on-premises

Negative:

  • the training data is not freely available, limiting the ability of external parties to check and correct for bias or optimise the model's performance and CO2 usage

Learn more about the Nextcloud Ethical AI Rating in our blog.

Install

  • Manual install
    • Place this app in nextcloud/apps/
  • One click install

Download models

After installing this app, you will need to run

occ stt_whisper:download-models [model-name]

where [model-name] is one of

  • small
  • medium (default)
  • large
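
As an illustration of the command above, the sketch below wraps it in a small function that rejects unknown model names before calling occ. download_model and the OCC variable are hypothetical names introduced here. The default of php occ assumes that the current directory is the Nextcloud root and that the shell runs as the web server user, so adjust it for your setup.

```shell
#!/bin/sh
# download_model is a hypothetical wrapper, not part of this app.
# OCC is an assumed override for how occ is invoked on your system,
# for example: OCC="sudo -u www-data php /var/www/nextcloud/occ"
download_model() {
    model="${1:-medium}"   # medium is the documented default
    case "$model" in
        small|medium|large) ;;
        *)
            echo "unknown model: $model (expected small, medium or large)" >&2
            return 1
            ;;
    esac
    # Left unquoted on purpose so a multi-word OCC value splits into words.
    ${OCC:-php occ} stt_whisper:download-models "$model"
}
```

For example, download_model small fetches the smallest model, and download_model with no argument falls back to medium.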

Building the app

Build the app with the provided Makefile by running:

make

This requires the following things to be present:

  • make
  • which
  • tar: for building the archive
  • curl: used to fetch phpunit and composer from the web if they are not installed
  • npm: for building and testing everything JS, only required if a package.json is placed inside the js/ folder
  • gcc: for building whisper.cpp
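
Before running make, it can help to confirm that the tools listed above are on the PATH. This is a minimal sketch: missing_tools is a hypothetical helper, not something the Makefile provides, and it only tests that each command exists, not which version is installed.

```shell
#!/bin/sh
# missing_tools is a hypothetical helper: it prints the name of every
# argument that cannot be found as a command on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The tool names are taken from the list above.
missing=$(missing_tools make which tar curl npm gcc)
if [ -n "$missing" ]; then
    # $missing is unquoted so the names print on a single line.
    echo "missing build tools:" $missing
else
    echo "all build tools found; run make"
fi
```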

NOTE

A few things to keep in mind.

  • To have calls transcribed by any Speech-To-Text provider (including this app), transcriptions must be enabled in the Talk app. Enable them with this occ command:
occ config:app:set spreed call_recording_transcription --value yes
  • This app tends to be heavy on CPU. If that becomes an issue in your normal workflow, you can limit the number of threads used by Whisper in the "Whisper Speech-To-Text" section of the admin settings.
  • The generated transcriptions may vary in accuracy based on the spoken language.
  • Per-participant transcription in calls is currently not available, but PRs are welcome!
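
To confirm that the Talk setting from the first note is in place, the stored value can be read back with occ config:app:get. The sketch below makes several assumptions: transcription_enabled and the OCC variable are hypothetical names, php occ presumes the Nextcloud root as the current directory, and a key that was never set is treated the same as disabled.

```shell
#!/bin/sh
# transcription_enabled is a hypothetical helper, not part of this app.
# It succeeds only if the stored value is exactly "yes".
# OCC is an assumed override for how occ is invoked on your system.
transcription_enabled() {
    # If occ fails (for example because the key is unset), fall back to empty.
    value=$(${OCC:-php occ} config:app:get spreed call_recording_transcription 2>/dev/null) || value=""
    [ "$value" = "yes" ]
}
```

A possible use is transcription_enabled || echo "call transcription is not enabled in Talk".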