Unity Accessibility Plugin – Update 5 – Text to Speech vs. Voice Recordings

In one of my last posts I went on a crusade about how important it is to require as little manual setup as possible from the users of your software. I stand by that, but I want to add a footnote: “Advanced Settings”.

A control panel with about one hundred buttons that all look the same.

Just turn the third dial from the right in the second row counterclockwise by 45 degrees.

A screen full of settings and options seems intimidating, sure. And people will end up using them wrong. But take 95% of those settings and hide them in a tab called “Advanced Settings”, and all is well. People who are easily scared off by this many choices will leave it well alone (and consequently not break the system). And those who would like a little more control will be excited that they can get it. Everybody is happy. Now let’s apply this to my plugin.

Customization Ability

The accessibility plugin needs very little setup so far. Import the package into Unity, put the Accessibility Manager prefab into your scene and add an Accessibility component to every UI element that you would like handled. I might even automate this last step. Provided you have given your buttons, toggles and sliders decent names, you are good to go. It’s fast and simple and I would like to keep it this way.

However, it wouldn’t be a good plugin if it weren’t customizable ad absurdum.

Text to Speech vs. Voice Recordings

One of the most important things I imagine users will want to customize is the voice output. By default, the native text-to-speech engine installed on the device will read out the text. And honestly, this might be good enough for your app.

Text-to-speech engines have come a long way since the old days. They now include different intonations and take context into account. The good systems can even occasionally fool you for a second into believing that the voice is real. Here are a few examples, in ascending quality:

They can even do accents now:

By default, the accessibility plugin will use the native text-to-speech available on the device. In the spirit of customizability, all calls to the speech synthesis are wrapped in four separate, dedicated functions. That will make it easy to use a different TTS engine. As I mentioned in a previous post, there are a number of TTS engines available on the Asset Store. I’ll make a section in the documentation on how to change the TTS engine. (Yes, I have already started on the documentation.)
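
To make the wrapper idea concrete, here is a minimal sketch. It is written in Python for brevity and is not the plugin's actual C# code. The four function names (initialize, speak, stop, is_speaking) are guesses made up for illustration, since the real ones are not listed here, and LoggingEngine and Announcer are likewise hypothetical. The point is only that the rest of the code talks to a small interface, so swapping the TTS engine means replacing one class.

```python
from abc import ABC, abstractmethod


class SpeechEngine(ABC):
    """Hypothetical four-function wrapper around a TTS engine.

    The function names are illustrative assumptions, not the plugin's API.
    """

    @abstractmethod
    def initialize(self) -> None: ...

    @abstractmethod
    def speak(self, text: str) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def is_speaking(self) -> bool: ...


class LoggingEngine(SpeechEngine):
    """Stand-in engine that records text instead of producing audio."""

    def __init__(self) -> None:
        self.spoken = []
        self._ready = False
        self._speaking = False

    def initialize(self) -> None:
        self._ready = True

    def speak(self, text: str) -> None:
        if not self._ready:
            raise RuntimeError("engine not initialized")
        self.spoken.append(text)
        self._speaking = True

    def stop(self) -> None:
        self._speaking = False

    def is_speaking(self) -> bool:
        return self._speaking


class Announcer:
    """All speech goes through the wrapper, never to an engine directly."""

    def __init__(self, engine: SpeechEngine) -> None:
        self.engine = engine
        self.engine.initialize()

    def announce(self, text: str) -> None:
        # Interrupt whatever is currently being read before starting anew.
        if self.engine.is_speaking():
            self.engine.stop()
        self.engine.speak(text)
```

Using a different engine would then be a one-line change where the Announcer is created, for example `Announcer(SomeOtherEngine())`, with everything else left untouched.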

Alternative to Artificial Speech

And in the spirit of not just regular, but maximum customizability, the plugin will also support custom pre-recorded audio instead of synthesized speech.
There are good reasons for going this way.

  • Stand out: Users get tired of hearing the same voice all the time, in all their accessible apps.
  • Improve immersion: Synthesized voices are often neutral, which is not great for storytelling. Also, if you have multiple characters, you don’t want them all having the same voice.
  • Avoid Fatigue: It’s hard to listen to artificial speech for longer texts, especially since the speech rate is often set high.
  • It’s cheaper than you might think: You can get good to professional voice acting on platforms such as Fiverr.com, and semi-professional acting on various audio forums. Or simply grab a decent microphone and some friends. It might still be better than artificial speech.
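
One way to combine recordings with synthesized speech is to prefer a recorded clip when one exists and fall back to TTS otherwise. The sketch below, in Python for illustration only, assumes that approach; the fallback behaviour, the function make_voice_output, and the clip name are all hypothetical and not taken from the plugin.

```python
from typing import Callable, Dict


def make_voice_output(
    recordings: Dict[str, str],
    play_clip: Callable[[str], None],
    synthesize: Callable[[str], None],
) -> Callable[[str], str]:
    """Build a say() function that prefers recordings over TTS.

    recordings maps the text to be read to the name of an audio clip.
    play_clip and synthesize are callbacks supplied by the caller.
    """

    def say(text: str) -> str:
        clip = recordings.get(text)
        if clip is not None:
            # A voice recording exists for this text, so play it.
            play_clip(clip)
            return "recording"
        # No recording available: fall back to synthesized speech.
        synthesize(text)
        return "tts"

    return say
```

With this arrangement, an app could start out on pure TTS and add recordings piece by piece, for example only for the story text where immersion matters most.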

Other Customizations

This post is already quite long, but here is a quick rundown of other optional customizations I am working on.

  • Custom Hints – Customize the hint text/audio either for an individual UI element, or for all elements of that type
  • Custom Values – Customize default value text/audio. This will come in especially handy for various toggles. For example, instead of just reading “On” or “Off”, the UI element can be customized to read “Hints are enabled/disabled”.
  • Custom Label Text – Customize what the label reads and what is being read by the TTS. Sometimes the text just needs to be different. Maybe you need to add a non-visual description, or replace the word “read” with “hear”, etc.
  • Keyboard – On Windows, you can choose the keys which will be used to navigate the menus and trigger buttons (although I recommend leaving the defaults alone).
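
The hint, value, and label customizations above all follow the same pattern: use the most specific override if there is one, otherwise fall back to something more general. Here is a minimal Python sketch of that lookup order. The class names, field names, and default hint strings are placeholders invented for this example, not the plugin's real data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Placeholder built-in hints per element type (invented for this sketch).
DEFAULT_HINTS = {
    "button": "Double tap to activate.",
    "toggle": "Double tap to switch.",
}


@dataclass
class Element:
    kind: str                                     # e.g. "button" or "toggle"
    label: str                                    # text shown on screen
    custom_hint: Optional[str] = None             # per-element hint override
    spoken_label: Optional[str] = None            # what is read instead of label
    value_text: Optional[Dict[bool, str]] = None  # custom on/off wording


@dataclass
class Settings:
    type_hints: Dict[str, str] = field(default_factory=dict)

    def hint_for(self, element: Element) -> str:
        # 1. Individual element, 2. all elements of that type, 3. built-in.
        if element.custom_hint is not None:
            return element.custom_hint
        if element.kind in self.type_hints:
            return self.type_hints[element.kind]
        return DEFAULT_HINTS.get(element.kind, "")


def toggle_text(element: Element, is_on: bool) -> str:
    """Return the custom value text if set, else plain On/Off."""
    if element.value_text is not None:
        return element.value_text[is_on]
    return "On" if is_on else "Off"


def label_to_read(element: Element) -> str:
    """Return the spoken label if one is set, else the visible label."""
    if element.spoken_label is not None:
        return element.spoken_label
    return element.label
```

A toggle configured with `value_text={True: "Hints are enabled", False: "Hints are disabled"}` would then read the longer phrase, while every other toggle keeps reading “On” or “Off”.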

If you have any other ideas for customization, please drop me a line!