Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SAPI 4 driver #17599

Merged
merged 29 commits into from
Jan 23, 2025
Merged

Fix SAPI 4 driver #17599

merged 29 commits into from
Jan 23, 2025

Conversation

SaschaCowley
Copy link
Member

@SaschaCowley SaschaCowley commented Jan 8, 2025

Link to issue number:

Fixes #17516

Summary of the issue:

After the move to exclusively Windows core audio APIs, the SAPI4 driver stopped working.

Description of user facing changes

The SAPI4 driver works again.

A warning is shown the first time the user uses SAPI4 informing them that it is deprecated.

Description of development approach

Implemented a function to translate between MMDevice Endpoint IDs and WaveOut device IDs, based on this Microsoft code sample.

Added a config key, speech.hasSapi4WarningBeenShown, which defaults to False.
Added a synthChanged callback that shows a dialog when the synth is set to SAPI4 if this config key is False and this is not a fallback synthesizer.

Testing strategy:

Ran NVDA, and used it with SAPI4. Changed the audio output device to ensure audio was routed as expected.

Known issues with pull request:

When first updating to a version with this PR merged, if the user uses SAPI4 as their primary speech synth, they will be warned about its deprecation in the launcher and when they first start the newly updated NVDA. This is unavoidable as we don't save config from the launcher.

The dialog is only shown once per config profile, so may be missed by some users.
Other options I have considered include:

  • Making this a nag dialog that appears, say, once a week or once a month.
  • Also making a dialog appear whenever the user manually sets their synth to SAPI4.
  • Adding a new dialog in 2025.4 (or the last release before 2026.1) that warns users that this will be the last release to support SAPI4.
  • Adding a dialog when updating to 2026.1 that warns users that they will no longer be able to use SAPI4.
  • Adding a Windows toast notification that appears every time NVDA starts with SAPI4 as the synth.

The warning dialog is shown after SAPI4 is loaded. In the instance that the user is already using SAPI4, this is correct behaviour. In the case of switching to SAPI4, perhaps a dialog should appear before we terminate the current synth and initialise SAPI4.

Code Review Checklist:

  • Documentation:
    • Change log entry
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • API is compatible with existing add-ons.
  • Security precautions taken.

@coderabbitai summary

source/speech/speech.py Outdated Show resolved Hide resolved
@SaschaCowley SaschaCowley marked this pull request as ready for review January 8, 2025 05:47
@SaschaCowley SaschaCowley requested a review from a team as a code owner January 8, 2025 05:47
@SaschaCowley SaschaCowley requested a review from seanbudd January 8, 2025 05:47
seanbudd
seanbudd previously approved these changes Jan 8, 2025
source/synthDrivers/_sapi4.py Outdated Show resolved Hide resolved
source/speech/speech.py Outdated Show resolved Hide resolved
source/config/configSpec.py Outdated Show resolved Hide resolved
source/synthDrivers/sapi4.py Outdated Show resolved Hide resolved
source/synthDrivers/sapi4.py Show resolved Hide resolved
source/synthDrivers/sapi4.py Show resolved Hide resolved
user_docs/en/changes.md Outdated Show resolved Hide resolved
@seanbudd seanbudd self-requested a review January 8, 2025 07:03
@seanbudd seanbudd dismissed their stale review January 8, 2025 07:03

accidental approval

user_docs/en/changes.md Outdated Show resolved Hide resolved
@seanbudd
Copy link
Member

seanbudd commented Jan 8, 2025

Could you confirm what the experience is like when in secure mode? both on the secure desktop, or with the environment variable set.
I imagine the warning dialog is always read when switching in secure mode, as settings cannot be saved.
This is probably good behaviour, but will be a little obnoxious I'm sure.

@CyrilleB79
Copy link
Collaborator

I guess we will have the same issue as the add-on store warning (#15261), i.e. the warning may be shown various times, maximum one time per profile.

@CyrilleB79
Copy link
Collaborator

Also, in the UG, the SAPI4 paragraph says:

SAPI 4 is an older Microsoft standard for software speech synthesizers. NVDA still supports this for users who already have SAPI 4 synthesizers installed. However, Microsoft no longer support this and needed components are no longer available from Microsoft.

Should we add here that SAPI4 usage in NVDA is deprecated and will be removed in the future?

@cary-rowen
Copy link
Contributor

For the user, does this warning dialog have any practical effect? As far as I know, many Chinese users rely on this synthesizer.
If we remove support for Sapi4 in the future, they will just be frustrated and unable to do anything else.

@zstanecic
Copy link
Contributor

@cary-rowen
Sorry about that, but what does it mean users will be frustrated?
I know about which synthesizer you are talking about... Let's keep no secrets here. We are talking about Ibm viavoice with the chinese support.
In croatia, there was similar experience. Many people used sapi4, because of the old WinTalker voice speech synthesizer.
I warned where i can, that the sapi4 will be discontinued, and that the synth which is no more developed, but sold is not a good thing to use in so fast development of technology.
That's why i am recommending people the alternatives...
By the way, for chinese, haven't you used aisound5?

@cary-rowen
Copy link
Contributor

Hi @zstanecic

I'm not complaining, I just thought about what effect this dialog can have on end users, so I made the above comment.
If there's nothing they can do and replacing a different speech synthesizer is the only option, is this dialog still that valuable?

I'd love to talk about the current state of TTS in Chinese, although it's a bit off topic, and it probably deserves a separate discussion.

Regarding the AISound you mentioned, it may be just a temporary solution:

  1. It responds slowly and consumes a lot of CPU resources.
  2. It can only pronounce Chinese characters. Even if it encounters English, it will only use Chinese voice to pronounce it.
  3. The speaking speed is extremely limited. Even if it is set to the fastest speaking speed, it will appear slow to skilled screen reader users.

Regarding IBM Viovoice TTS:

  1. Fast response.
  2. Can support more Chinese character pronunciations, which is compared to IBMTTS addon.

Regarding Eloquence:
As far as I know, it doesn't support Chinese. So there is nothing to say.

Of course, for me, Vocalizer is my only choice. As for Sapi5 and oneCore they are really slow to respond.

In summary, response speed is important. New technologies and new TTS are of course developing, but as you can see, Microsoft's natural voice is not yet supported. Although it can be supported through sapi5 through the Natural Voice Adapter, it is still limited by The response speed is not suitable for long-term use.

Hope this will be clearer

@zstanecic
Copy link
Contributor

Hi @cary-rowen, It is very interesting to hear the history about chinese speech synthesizer and what exists, what's popular and widely used.
It is not an off-topic. I think that it will shed more light on the situation we have now with sapi4.
Well, It is not that so much valuable, i think.
For example, we had Jaws for windows, which deprecated sapi4 for a long time ago. Nobody really complained about it.
Regarding response and speed? There is a Ibm viavoice driver, but it is somewhat gray zone.
Really, be honest. How many chinese users have optained their viavoice legally?
To conclude things: when it breaks, it will break, and we cannot do anything if this protocol is abandoned.
And. This dialog will not get any value, besides the fact that the users will be warned about thhe possible removal.

@SaschaCowley
Copy link
Member Author

@cary-rowen thank you for the questions. We have chosen to add this dialog as, while it may be irritating, we believe the experience of this being a surprise to users would be worse. Many users do not regularly read the changelog, which is not always translated into their language, so this is more likely to be noticed.

Regarding TTS support, does eSpeak-ng have support for Chinese languages? I see the following options in NVDA with eSpeak:

  • Chinese (Mandarin, latin as English)
  • Chinese (Mandarin, latin as Pinyin)
  • Chinese (Cantonese)
  • Chinese (Cantonese, latin as Jyutping)

Notwithstanding users' dislike of the sound of eSpeak (which is of course valid), are these voices unsatisfactory in other ways? For example, in English eSpeak just reads Chinese characters as "Chinese letter", does Chinese eSpeak have proper support for Chinese writing? Do other TTS engines support more Chinese languages?

@SaschaCowley
Copy link
Member Author

@LeonarddeR I've switched to using supportedSettings as you suggested. Let me know what you think

Copy link
Collaborator

@LeonarddeR LeonarddeR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greate overall, just some minor suggestions.

source/speech/speech.py Outdated Show resolved Hide resolved
source/speech/speech.py Outdated Show resolved Hide resolved
source/speech/speech.py Outdated Show resolved Hide resolved
@nishimotz
Copy link
Contributor

In Japan, there is a SAPI4 speech engine developed more than 20 years ago that is still preferred by users today.
Currently, there are screen readers that bundle this voice and are being legally sold.
Japanese users who use multiple screen readers are likely able to use the SAPI4 Japanese voice legally.

According to recent server logs of the Japanese version of NVDA, 7% of users who have opted to send data are using the SAPI4 driver.

It is important to clearly communicate the necessity of discontinuing SAPI4 support, as well as the benefits that will come in exchange for its removal.

source/synthDrivers/sapi4.py Outdated Show resolved Hide resolved
source/synthDrivers/sapi4.py Outdated Show resolved Hide resolved
@SaschaCowley SaschaCowley requested a review from seanbudd January 21, 2025 06:24
@XLTechie
Copy link
Collaborator

XLTechie commented Jan 22, 2025 via email

@seanbudd
Copy link
Member

@XLTechie our current policy is to not reveal details of a security risk until the flaw is fully patched out of NVDA. As such we cannot provide more information until SAPI4 has been fully removed

# Conflicts:
#	user_docs/en/changes.md
Copy link
Member

@Qchristensen Qchristensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reads well

@SaschaCowley SaschaCowley merged commit bf3ba01 into master Jan 23, 2025
5 checks passed
@SaschaCowley SaschaCowley deleted the fixSapi4 branch January 23, 2025 23:33
@github-actions github-actions bot added this to the 2025.1 milestone Jan 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-breaking-change conceptApproved Similar 'triaged' for issues, PR accepted in theory, implementation needs review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SAPI 4 synthesizer not loading anymore