chore: Document new GUI mode in README.
rmarrugat committed Jan 2, 2025
1 parent 7d0188c commit 6b2c33b
Showing 2 changed files with 22 additions and 3 deletions.
25 changes: 22 additions & 3 deletions README.md
@@ -62,11 +62,11 @@ Speech Center allows to easily convert an audio resource into its associated tex
The speech recognition CLI reads a locally stored audio file and streams it to Speech Center's gRPC interface. To run the client, run `recognizer_stream.py` from the `cli-client` directory. An example transcription request is shown below, followed by a description of each flag.


**Example for streaming**
**Example for CLI**

```shell
cd cli-client/
python3 recognizer_stream.py --audio-file file.wav --topic GENERIC --language en-US --host us.speechcenter.verbio.com --token token.file --asr-version V1 --label project1
python3 recognizer_stream.py --audio-file file.wav --topic GENERIC --language en-US --host us.speechcenter.verbio.com --token token.file --asr-version V2 --label project1 --formatting
```

You can use the `--help` flag to see more options.
@@ -75,7 +75,7 @@ This code will generate the following terminal output on success:
```text
[2023-04-04 12:28:29,078][INFO]:Running speechcenter streaming channel...
[2023-04-04 12:28:29,079][INFO]:Reading Speech Center JWT token from token.file...
[2023-04-04 12:28:29,080][INFO]:Connecting to pre.us.verbiospeechcenter.com using a secure channel...
[2023-04-04 12:28:29,080][INFO]:Connecting to us.speechcenter.verbio.com using a secure channel...
[2023-04-04 12:28:29,082][INFO]:Dividing audio of length 1022720 into 52 chunks of 20000 samples...
[2023-04-04 12:28:29,083][INFO]:Sending streaming message config
[2023-04-04 12:28:29,083][INFO]:Running response watcher
@@ -92,10 +92,21 @@ This code will generate the following terminal output on success:
"duration": 4.460000
[2023-04-04 12:28:35,412][INFO]:Stream inactivity detected, closing stream...
[2023-04-04 12:28:35,413][INFO]:Recognition finished
```
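
The log above reports 52 chunks of 20000 samples for audio of length 1022720, which is what a fixed-size split with a shorter final chunk would give. The following is a minimal sketch of that arithmetic, assuming the reported length is in samples and that the audio is sliced at fixed offsets; `chunk_bounds` is a hypothetical helper, not a function of the client, whose actual chunking code is not shown here.

```python
def chunk_bounds(total_samples, chunk_size):
    """Yield (start, end) offsets of fixed-size chunks; the last one may be shorter."""
    for start in range(0, total_samples, chunk_size):
        yield start, min(start + chunk_size, total_samples)


# Values taken from the log line above.
bounds = list(chunk_bounds(1022720, 20000))
print(len(bounds))   # 52
print(bounds[-1])    # (1020000, 1022720)
```

The last chunk holds the remaining 2720 samples, which is why 51 full chunks are not enough.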

**Example for CLI with GUI**

Alternatively, you can use the `--gui` parameter to activate GUI mode:

```shell
cd cli-client/
python3 recognizer_stream.py --audio-file file.wav --topic GENERIC --language en-US --host us.speechcenter.verbio.com --token token.file --asr-version V2 --label project1 --formatting --gui
```

On success, this command will display the following GUI:

![Real-Time Speech-to-Text GUI](img/STT_gui.png)

You can also run:
```shell
python3 recognizer_stream.py --help
@@ -242,6 +253,14 @@ This option allows for a one word argument to be sent so that the speech transcr
- **Argument must be the same each time for the same project. If there is a typo another project will be created.**
- **There is no limit on the number of projects that can be created.**

#### GUI Mode

```
--gui
```

This option enables GUI mode. Check out the **Example for CLI with GUI** above to understand the differences between the standard and the GUI modes.
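
If you launch the client from another script, the two modes differ only by the trailing `--gui` flag. The following is a minimal sketch, assuming the same flags and values used in the examples above; `build_recognizer_command` is a hypothetical helper and is not part of the client.

```python
import shlex


def build_recognizer_command(audio_file, token_file, host, gui=False):
    """Assemble the argument list from the README examples (hypothetical helper)."""
    cmd = [
        "python3", "recognizer_stream.py",
        "--audio-file", audio_file,
        "--topic", "GENERIC",
        "--language", "en-US",
        "--host", host,
        "--token", token_file,
        "--asr-version", "V2",
        "--label", "project1",
        "--formatting",
    ]
    if gui:
        # The only difference between the standard and the GUI invocation.
        cmd.append("--gui")
    return cmd


standard = build_recognizer_command("file.wav", "token.file", "us.speechcenter.verbio.com")
with_gui = build_recognizer_command("file.wav", "token.file", "us.speechcenter.verbio.com", gui=True)
print(shlex.join(with_gui))
```

The resulting list can be passed to `subprocess.run(with_gui, cwd="cli-client", check=True)` to start the client from the `cli-client` directory.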


## Text-To-Speech

Binary file added img/STT_gui.png