Readium Speech

Readium Speech is a TypeScript library for implementing a read aloud feature with Web technologies. It follows best practices gathered through interviews with members of the digital publishing industry.

While this project is still in a very early stage, it is meant to power the read aloud feature for two different Readium projects: Readium Web and Thorium.

Readium Speech was spun out as a separate project in order to facilitate its integration as a shared component, but also because of its potential outside of the realm of ebook reading apps.

Scope

  • Extracting Guided Navigation objects from a document (or a fragment of a document)
  • Generating utterances from these Guided Navigation objects
  • Processing utterances (prepending/appending text to utterances based on context, pronunciation through SSML/PLS…)
  • Voice selection
  • TTS playback
  • Highlighting

Current focus

For our initial work on this project, we're focusing on voice selection based on recommended voices.

The outline of this work has been explored in a GitHub discussion and through a best practices document.

Demo

A live demo of the voice selection API is available.

It demonstrates the following features:

  • fetching a list of all available languages, translating them to the user's locale and sorting them based on these translations
  • returning a list of voices for a given language, grouped by region and sorted based on quality
  • filtering languages and voices based on gender and offline availability
  • using embedded test utterances to demo voices
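
The first feature in this list (translating language names to the user's locale, then sorting on the translations) can be approximated with the standard `Intl.DisplayNames` and `Intl.Collator` APIs. The sketch below is only an illustration of the idea, not the library's implementation; `LanguageLabel` and `localizeAndSort` are hypothetical names, and language codes are assumed to be BCP 47 tags.

```typescript
// Hypothetical shape for illustration; not part of readium-speech.
interface LanguageLabel {
  code: string;
  label: string;
}

// Translate each language code into a display name for `locale`,
// then sort the results by that translated name.
function localizeAndSort(codes: string[], locale: string): LanguageLabel[] {
  const names = new Intl.DisplayNames([locale], { type: "language" });
  const collator = new Intl.Collator(locale);
  return codes
    .map((code) => ({ code, label: names.of(code) ?? code }))
    .sort((a, b) => collator.compare(a.label, b.label));
}

const englishLabels = localizeAndSort(["fr", "en", "de"], "en");
console.log(englishLabels.map((entry) => entry.label));
```

Sorting with a collator built for the same locale keeps the order consistent with what a user of that locale expects, which a plain string comparison would not guarantee for accented characters.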

QuickStart

npm install https://github.com/readium/speech#build

// Import the voice selection module:
import { voicesSelection } from "readium-speech";
console.log(voicesSelection);

// Or, with CommonJS (cjs) only:
const { getVoices } = require("readium-speech/cjs/voices.js");
console.log(getVoices);

// Or, with ES modules (mjs):
import { getVoices } from "readium-speech/mjs/voices.js";
console.log(getVoices);

// Then fetch the list of voices:
const voices = await voicesSelection.getVoices();
console.log(voices);

API

Interface

export interface IVoices {
    label: string;
    voiceURI: string;
    name: string;
    language: string;
    gender?: TGender | undefined;
    age?: string | undefined;
    offlineAvailability: boolean;
    quality?: TQuality | undefined;
    pitchControl: boolean;
    recommendedPitch?: number | undefined;
    recommendedRate?: number | undefined;
}

export interface ILanguages {
    label: string;
    code: string;
    count: number;
}

Parse and extract IVoices from the speechSynthesis Web API

function getVoices(preferredLanguage?: string[] | string, localization?: string): Promise<IVoices[]>

List languages from IVoices

function getLanguages(voices: IVoices[], preferredLanguage?: string[] | string, localization?: string | undefined): ILanguages[]
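
To make the `ILanguages` shape more concrete, here is a rough sketch of how a list of voices could be reduced to one entry per language with a count. It is not the library's code: `VoiceLike`, `LanguageCount` and `countByLanguage` are hypothetical stand-ins, and it assumes that `language` holds a BCP 47 tag such as `en-US`, which this README does not state.

```typescript
// Local stand-ins mirroring a subset of the README's interfaces.
interface VoiceLike {
  name: string;
  language: string;
}

interface LanguageCount {
  label: string;
  code: string;
  count: number;
}

// Count voices per primary language subtag and attach a localized label.
function countByLanguage(voices: VoiceLike[], localization = "en"): LanguageCount[] {
  const names = new Intl.DisplayNames([localization], { type: "language" });
  const counts = new Map<string, number>();
  for (const voice of voices) {
    // Assumption: tags look like "en-US"; some platforms use "en_US" instead.
    const code = voice.language.split(/[-_]/)[0].toLowerCase();
    counts.set(code, (counts.get(code) ?? 0) + 1);
  }
  return Array.from(counts, ([code, count]) => ({
    code,
    count,
    label: names.of(code) ?? code,
  }));
}

console.log(
  countByLanguage([
    { name: "Voice A", language: "en-US" },
    { name: "Voice B", language: "en-GB" },
    { name: "Voice C", language: "fr-FR" },
  ])
);
```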

helpers

function listLanguages(voices: IVoices[], localization?: string): ILanguages[]

function ListRegions(voices: IVoices[], localization?: string): ILanguages[]

function parseSpeechSynthesisVoices(speechSynthesisVoices: SpeechSynthesisVoice[]): IVoices[]

function getSpeechSynthesisVoices(): Promise<SpeechSynthesisVoice[]>
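
`getSpeechSynthesisVoices` returns a promise, presumably because browsers may populate the voice list asynchronously and signal it through the `voiceschanged` event. The following sketch shows one common way to wrap that behaviour. It is an assumption about the general technique rather than the library's actual code; `VoiceSource`, `loadVoices` and the timeout value are hypothetical. In a browser, `window.speechSynthesis` would be passed as the `synth` argument.

```typescript
// Minimal structural type covering the parts of SpeechSynthesis used here.
interface VoiceSource<V> {
  getVoices(): V[];
  addEventListener(
    type: "voiceschanged",
    listener: () => void,
    options?: { once?: boolean }
  ): void;
}

// Resolve with the voice list, waiting for "voiceschanged" if it is still empty.
function loadVoices<V>(synth: VoiceSource<V>, timeoutMs = 2000): Promise<V[]> {
  return new Promise((resolve) => {
    const initial = synth.getVoices();
    if (initial.length > 0) {
      resolve(initial);
      return;
    }
    // Defensive fallback in case the event never fires: resolve with
    // whatever is available once the timeout elapses.
    const timer = setTimeout(() => resolve(synth.getVoices()), timeoutMs);
    synth.addEventListener(
      "voiceschanged",
      () => {
        clearTimeout(timer);
        resolve(synth.getVoices());
      },
      { once: true }
    );
  });
}
```

Taking the synthesizer as a parameter, instead of reading the global directly, keeps the function usable outside a browser, for instance with a fake object in unit tests.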

groupBy

function groupByKindOfVoices(allVoices: IVoices[]): TGroupVoices

function groupByRegions(voices: IVoices[], language: string, preferredRegions?: string[] | string, localization?: string): TGroupVoices

function groupByLanguage(voices: IVoices[], preferredLanguage?: string[] | string, localization?: string): TGroupVoices
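
The `TGroupVoices` type is not defined in this README. As an illustration only, the sketch below models a grouping as a `Map` from region code to voices and uses the standard `Intl.Locale` API to split a language tag. `VoiceLike` and `groupByRegionSketch` are hypothetical names, and the real `groupByRegions` may behave differently (for example with respect to `preferredRegions` and `localization`, which are ignored here).

```typescript
// Local stand-in mirroring a subset of the README's IVoices interface.
interface VoiceLike {
  name: string;
  language: string;
}

// Group the voices of one language by the region subtag of their tag.
function groupByRegionSketch(
  voices: VoiceLike[],
  language: string
): Map<string, VoiceLike[]> {
  const groups = new Map<string, VoiceLike[]>();
  for (const voice of voices) {
    // Intl.Locale rejects underscores, so normalize "en_US" to "en-US" first.
    const locale = new Intl.Locale(voice.language.replace(/_/g, "-"));
    if (locale.language !== language.toLowerCase()) continue;
    // Voices without a region subtag end up under the empty-string key.
    const region = locale.region ?? "";
    const bucket = groups.get(region) ?? [];
    bucket.push(voice);
    groups.set(region, bucket);
  }
  return groups;
}

const byRegion = groupByRegionSketch(
  [
    { name: "Voice A", language: "en-US" },
    { name: "Voice B", language: "en-GB" },
    { name: "Voice C", language: "fr-FR" },
  ],
  "en"
);
console.log(Array.from(byRegion.keys()));
```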

sortBy

function sortByLanguage(voices: IVoices[], preferredLanguage?: string[] | string): IVoices[]

function sortByRegion(voices: IVoices[], preferredRegions?: string[] | string, localization?: string | undefined): IVoices[]

function sortByGender(voices: IVoices[], genderFirst: TGender): IVoices[]

function sortByName(voices: IVoices[]): IVoices[]

function sortByQuality(voices: IVoices[]): IVoices[]
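
As a rough illustration of what sorting on quality could involve, the sketch below ranks voices from best to worst and falls back to the name for ties. The `Quality` values are an assumption, since `TQuality` is not defined in this README, and `sortByQualitySketch` is a hypothetical helper rather than the library's `sortByQuality`.

```typescript
// Assumed quality levels; the library's TQuality may differ.
type Quality = "veryLow" | "low" | "normal" | "high" | "veryHigh";

// Local stand-in mirroring a subset of the README's IVoices interface.
interface VoiceLike {
  name: string;
  quality?: Quality;
}

const QUALITY_RANK: Record<Quality, number> = {
  veryLow: 0,
  low: 1,
  normal: 2,
  high: 3,
  veryHigh: 4,
};

// Return a new array sorted from highest to lowest quality.
// Voices without a known quality are placed last; ties are sorted by name.
function sortByQualitySketch(voices: VoiceLike[]): VoiceLike[] {
  const rank = (voice: VoiceLike): number =>
    voice.quality ? QUALITY_RANK[voice.quality] : -1;
  return voices
    .slice()
    .sort((a, b) => rank(b) - rank(a) || a.name.localeCompare(b.name));
}
```

Copying the array before sorting avoids mutating the caller's list, which matters when the same voices are sorted in several ways for display.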

filterOn

function filterOnRecommended(voices: IVoices[], _recommended?: IRecommended[]): TReturnFilterOnRecommended

function filterOnVeryLowQuality(voices: IVoices[]): IVoices[]

function filterOnNovelty(voices: IVoices[]): IVoices[]

function filterOnQuality(voices: IVoices[], quality: TQuality | TQuality[]): IVoices[]

function filterOnLanguage(voices: IVoices[], language: string | string[]): IVoices[]

function filterOnGender(voices: IVoices[], gender: TGender): IVoices[]
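
Because each filter takes a list of voices and returns a list, filters can be chained. The sketch below illustrates that pattern with two hypothetical helpers, `filterByLanguage` and `filterByGender`, written against a local `VoiceLike` type. The `Gender` values and the prefix matching on language tags are assumptions; the library's `filterOnLanguage` and `filterOnGender` may match differently.

```typescript
// Assumed gender values; the library's TGender may differ.
type Gender = "female" | "male" | "nonbinary";

// Local stand-in mirroring a subset of the README's IVoices interface.
interface VoiceLike {
  name: string;
  language: string;
  gender?: Gender;
}

// Keep voices whose tag equals one of the languages or starts with it ("en" matches "en-US").
function filterByLanguage(voices: VoiceLike[], language: string | string[]): VoiceLike[] {
  const wanted = (Array.isArray(language) ? language : [language]).map((l) =>
    l.toLowerCase()
  );
  return voices.filter((voice) => {
    const tag = voice.language.toLowerCase();
    return wanted.some((w) => tag === w || tag.startsWith(w + "-"));
  });
}

// Keep voices whose gender is known and equal to the requested one.
function filterByGender(voices: VoiceLike[], gender: Gender): VoiceLike[] {
  return voices.filter((voice) => voice.gender === gender);
}

const sampleVoices: VoiceLike[] = [
  { name: "Voice A", language: "en-US", gender: "female" },
  { name: "Voice B", language: "en-GB", gender: "male" },
  { name: "Voice C", language: "fr-FR", gender: "female" },
  { name: "Voice D", language: "en-US" },
];

// Chain the filters: English voices first, then only the female ones.
const femaleEnglish = filterByGender(filterByLanguage(sampleVoices, "en"), "female");
console.log(femaleEnglish.map((voice) => voice.name));
```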