Don't sound like a robot: use CSS to control Text-to-Speech

Ingo Steinke, web developer - Sep 13 '23 - - Dev Community

Websites often sound like a vintage science-fiction robot when a screen reader or other text-to-speech output reads their text aloud.

There might be a good reason for that: this tone of voice is easier to understand when played at double or triple speed. But that is also true for other ways of speaking.

Drawing of a cartoon robot saying: I am not a robot. Human character replies: but you sound like a robot.

"I am not a ro-bot" - "but you sound like a robot!"

Scholars, actors and opera singers use exaggerated articulation to facilitate understanding. Another example is the classic radio advertisement, where speakers often use a fast, high-pitched and overly excited voice to convey both emotion and a lot of information in a few seconds.

How does your website sound?

In his beyond Tellerrand 2022 talk "exclusive design", Vasilis van Gemert showed that you can add an individual touch to your website by using some sort of simple poetry or exclamations like "boing boing" in ALT attributes.

Some screen readers offer different voices, often at least one female and one male, but that's a client setting that doesn't change for a particular website and its content.

CSS Voice Control

We can use CSS to declare different voices much like we use CSS to declare font families and typographic details. So we could make a Q&A section sound like someone is asking and another voice is answering the questions.

The syntax is still a draft, so it might change before browsers support it, but the current specification draft (CSS Speech Module Level 1) looks quite similar to typography:

selector {
  voice-family: female;
  voice-pitch: medium;
}
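
The Q&A idea mentioned above could be sketched like this. The class names `.faq-question` and `.faq-answer` are hypothetical and not part of the original styles; the property values are taken from the keywords listed in the CSS Speech draft, and no browser is known to apply them yet, so treat this as an illustration rather than a working feature.

```css
/* Hypothetical class names, for illustration only */
/* stylelint-disable property-no-unknown */
.faq-question {
  /* one voice asks the question */
  voice-family: female;
  voice-pitch: high;
  pause-before: medium;
}

.faq-answer {
  /* another voice gives the answer */
  voice-family: male;
  voice-pitch: low;
  pause-after: strong;
}
/* stylelint-enable property-no-unknown */
```

A speech-capable user agent would pick whichever installed voices best match the requested gender, much like a browser picks a fallback font from a generic font family.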

As these properties are still experimental, stylelint did not recognize them at the time of writing. So let's explicitly disable the property-no-unknown rule only where we use them, by adding a stylelint-disable comment before and a stylelint-enable comment afterwards.

selector {
  /* stylelint-disable property-no-unknown */
  voice-family: female;
  voice-pitch: medium;
  voice-stress: moderate;
  voice-rate: fast;
  voice-volume: soft;
  pause-after: strong;
  voice-balance: left;
  /* stylelint-enable property-no-unknown */
}

That is pretty much what we already have to do with some other helpful styles that became common practice but are not standard yet, like optimizing text rendering for legibility rather than for rendering speed.

selector {
  /* stylelint-disable-next-line value-keyword-case */
  text-rendering: optimizeLegibility;
}

Altogether, some of our base styles might look like this.

html, body, main {
  background-color: var(--color-primary-background);
  color: var(--color-primary-foreground);
  font-family: var(--font-family-default);
  font-weight: var(--font-weight-regular);
  /* stylelint-disable-next-line value-keyword-case */
  text-rendering: optimizeLegibility;
  font-size: var(--font-size-16);
  line-height: 100%;
  /* prepare voice settings according to CSS speech draft */
  /* stylelint-disable property-no-unknown */
  voice-family: female;
  voice-pitch: medium;
  voice-stress: moderate;
  voice-rate: fast;
  voice-volume: soft;
  pause-after: strong;
  voice-balance: left;
  /* stylelint-enable property-no-unknown */
}
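
Building on these base styles, individual elements could override the defaults the same way they override typography. The following is only a sketch: the selectors are chosen for illustration, and it assumes the inheritance behavior described in the CSS Speech draft, where the voice-* properties inherit and the pause properties do not.

```css
/* Illustrative overrides on top of the base voice settings */
/* stylelint-disable property-no-unknown */
h1, h2 {
  /* slow down and stress headings, then pause before the body text */
  voice-rate: medium;
  voice-stress: strong;
  pause-after: x-strong;
}

blockquote {
  /* read quotations in another voice, panned to the other side */
  voice-family: male;
  voice-balance: right;
}
/* stylelint-enable property-no-unknown */
```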