SabiKo's Neural Voices: 37 Natural-Sounding Voices Across 5 Languages

An AAC user's voice is their voice. It's how they greet people, tell stories, ask for help, and say "I love you." The quality of that voice matters as much as the quality of a hearing aid or a pair of glasses. It needs to sound right.

SabiKo uses neural text-to-speech voices that sound natural and human-like. Six voices are included free. Pro users get access to all 37 voices across five languages. Every voice runs fully on the device, so communication works anywhere, with or without internet.

What Makes Neural Voices Different

If you've ever heard an older AAC device or a budget app, you know what robotic TTS sounds like. Flat tone. Awkward pauses. Emphasis in the wrong places. It sounds obviously artificial, and it changes how people respond to the person using it.

Neural voices are different. They're generated by neural networks trained on large datasets of human speech. The result is natural inflection, realistic rhythm, and appropriate emphasis. Many listeners find a high-quality neural voice hard to distinguish from a human recording.

For a deeper dive into how TTS technology works in AAC apps and why voice quality affects social interaction, see our guide to text-to-speech apps for kids.

Free vs. Pro Voices

Feature	Free	Pro
Total voices	6	37
English voices	2	Multiple
Other languages	1 voice per language	Multiple per language
Languages available	English, Spanish, French, German, Portuguese	English, Spanish, French, German, Portuguese
Offline	Yes	Yes
Speed control	Yes	Yes

The free tier includes 6 neural voices: 2 English voices and 1 voice for each of the other four languages. That's enough for most families to find a voice that works.

Pro expands the selection to 37 voices across the same five languages, with multiple options per language. More options means a better chance of finding a voice that matches the user's age, personality, and preference.
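The voice counts above can be checked with a few lines of Python. This is only an illustrative tally: the dictionary layout and names are assumptions, not SabiKo's actual data model, and the "additional voices" figure assumes the 6 free voices are part of the 37 that Pro unlocks.

```python
# Hypothetical tally of the free tier; only the counts come from the table above.
FREE_VOICES_PER_LANGUAGE = {
    "English": 2,
    "Spanish": 1,
    "French": 1,
    "German": 1,
    "Portuguese": 1,
}
PRO_TOTAL_VOICES = 37

# 2 English voices + 1 voice for each of the other four languages.
free_total = sum(FREE_VOICES_PER_LANGUAGE.values())

# Voices available only with Pro, assuming the free voices are a subset of the 37.
pro_only = PRO_TOTAL_VOICES - free_total

print(f"Free voices: {free_total}")
print(f"Additional voices with Pro: {pro_only}")
```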

Why Voice Choice Matters

Identity

For a speaking person, their voice is part of who they are. The same is true for an AAC user. A deep adult male voice coming from a 7-year-old girl feels wrong, to her and to the people she's talking to. A child's voice coming from a teenager feels equally wrong.

Having multiple voice options lets users find one that feels like theirs. That sense of ownership matters for long-term device adoption: a child who likes their voice is more likely to keep using the device.

Social acceptance

When an AAC user speaks with a natural-sounding voice, listeners engage more naturally. They respond to the content of the message instead of reacting to the sound of the voice. This is especially important for children in school settings where peer interaction is already challenging.

A natural voice won't eliminate the social differences that come with using AAC, but it removes one unnecessary barrier.

Multilingual families

For bilingual families, having voices in multiple languages means a child can communicate in each language with a voice that sounds appropriate for that language. A Spanish voice for communicating with grandparents. An English voice for school. Each voice sounds natural in its own language rather than forcing one language through a voice designed for another.

SabiKo supports five languages: English, Spanish, French, German, and Portuguese.

Fully Offline

Every SabiKo voice runs entirely on the device. There's no cloud processing, no internet dependency, and no delay caused by network requests.

This matters because communication doesn't wait for Wi-Fi. Your child needs to talk at the park, in the car, at the grocery store, at grandma's house, and during power outages. If the voice requires internet, it fails in exactly the situations where communication matters most.

SabiKo uses the sherpa-onnx engine, which runs fully on-device. Voices are downloaded once and stored locally. After that initial download, the voice works anywhere, always.
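For the technically curious, the "download once, use anywhere" pattern can be sketched in a few lines. This is a minimal illustration, not SabiKo's implementation: the function name `ensure_voice`, the cache folder, the `.onnx` file naming, and the injected `download` callable are all hypothetical.

```python
from pathlib import Path
from typing import Callable


def ensure_voice(
    voice_id: str,
    cache_dir: Path,
    download: Callable[[str], bytes],
) -> Path:
    """Return the local path of a voice model, downloading it only if missing.

    `download` is a hypothetical callable that fetches the model bytes.
    It is called at most once per voice; afterwards the stored file is
    reused, so no network access is needed.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    model_path = cache_dir / f"{voice_id}.onnx"  # assumed file naming
    if not model_path.exists():
        model_path.write_bytes(download(voice_id))
    return model_path
```

The first call for a given voice needs a connection. Every later call finds the file already on disk and returns immediately, which is why the voice keeps working at the park or in the car.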

Speed Control

Not every listener processes speech at the same rate. Not every environment is the same. SabiKo includes speed control that lets you adjust how fast the voice speaks.

Slower speeds help when communicating with someone who is hard of hearing, when the environment is noisy, or when the listener needs extra processing time.

Faster speeds work when the user is comfortable with the voice and wants conversations to flow more quickly. Some experienced AAC users prefer faster output because it feels more natural to them.

Speed control applies to all voice output: communication boards, the spelling keyboard, and Quick Phrases.
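One way to picture a single speed setting shared by every kind of speech output is the sketch below. The speed range (0.5x to 2.0x), the class name, and the `speak` stand-in are assumptions made for illustration; the article does not state SabiKo's actual limits or internals.

```python
from dataclasses import dataclass

MIN_SPEED = 0.5  # assumed lower bound, not a documented SabiKo value
MAX_SPEED = 2.0  # assumed upper bound, not a documented SabiKo value


@dataclass
class SpeechSettings:
    speed: float = 1.0  # 1.0 means the voice's normal rate

    def set_speed(self, requested: float) -> float:
        """Clamp the requested speed into the allowed range and store it."""
        self.speed = max(MIN_SPEED, min(MAX_SPEED, requested))
        return self.speed


def speak(text: str, source: str, settings: SpeechSettings) -> str:
    """Stand-in for speech output: describes what would be spoken and how fast."""
    return f"[{source}] {text!r} at {settings.speed:.2f}x"


settings = SpeechSettings()
settings.set_speed(0.8)  # slow down for a noisy room

# The same setting is used no matter where the message comes from.
for source in ("Communication boards", "Spelling keyboard", "Quick Phrases"):
    print(speak("Can I have more juice please", source, settings))
```

Keeping the speed in one shared settings object is what makes the voice behave consistently across boards, keyboard, and phrases.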

Choosing the Right Voice

Picking a voice is personal. Here are practical guidelines.

Let the user choose. Whenever possible, play voice samples and let the AAC user pick. This is their voice. Self-determination in this choice matters, even for young children who can indicate a preference by pointing or smiling.

Match age range. A child's voice for a child. An adult's voice for an adult. Mismatched voices create a disconnect that affects how peers and communication partners interact with the user.

Test in real settings. A voice that sounds great in a quiet room may be hard to hear in a noisy cafeteria. Try the voice at the volume and in the environments where your child actually communicates.

Try full sentences. Single words can sound fine, but full sentences reveal how the voice handles rhythm and phrasing. Listen to how it says "Can I have more juice please" and "I don't want to go to bed" before deciding.

Revisit periodically. As children grow, their voice preference may change. A voice that felt right at age 5 might not feel right at age 10. Check in every year or so.

Common Questions

Are the 6 free voices good enough?

Yes. The free voices are the same neural quality as the Pro voices. The difference is selection, not quality. If one of the 6 free voices works for your child, there's no need to upgrade for voice reasons alone.

Can I preview voices before selecting one?

Yes. You can listen to voice samples within the app before choosing. Try several and let the user participate in the decision.

Does the voice work with grammar correction?

Yes. When SabiKo's grammar correction turns a short phrase like "I want go park" into "I want to go to the park," the selected neural voice speaks the corrected sentence. The voice and grammar features work together to produce natural-sounding output.

Can different profiles use different voices?

Yes, with SabiKo Pro. Pro supports multiple user profiles, and each profile has its own voice, vocabulary, and settings. On a shared device in a classroom or clinic, every user can have a voice that fits them.
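As a rough mental model, per-profile settings can be thought of as a small record for each user. The field names, profile names, and voice identifiers below are hypothetical and only illustrate the idea of one voice, vocabulary, and settings bundle per profile.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    name: str
    voice_id: str                 # hypothetical identifier for the chosen voice
    speed: float = 1.0            # per-profile speech speed
    vocabulary: list[str] = field(default_factory=list)  # custom words


# Two users sharing one classroom device, each with their own voice.
profiles = {
    "Maya": Profile("Maya", voice_id="voice-es-01", vocabulary=["abuela"]),
    "Leo": Profile("Leo", voice_id="voice-en-02", speed=1.2),
}


def voice_for(profile_name: str) -> str:
    """Look up the voice that belongs to the active profile."""
    return profiles[profile_name].voice_id


print(voice_for("Maya"))
print(voice_for("Leo"))
```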

Does changing the voice change how words sound on the boards?

Yes. The selected voice is used for all speech output in SabiKo, including communication boards, the spelling keyboard, Quick Phrases, and any other spoken output.

How Neural Voices Fit with Other SabiKo Features

Communication boards use the selected voice to speak composed messages. Grammar correction produces natural sentences, and the neural voice delivers them naturally.
Spelling keyboard speaks typed text with the same voice. Whether you compose with symbols or type with letters, the voice is consistent.
Quick Phrases speak pre-built sentences with the selected voice. Emergency phrases, greetings, and self-advocacy phrases all sound natural.
Word forms change individual words (tense, plural), and the voice pronounces each form correctly.
Custom audio (Pro) lets you record your own voice for specific words or folders. This supplements the neural voice with personal recordings for names, nicknames, or inside jokes.

Getting Started

1. Open SabiKo and go to voice settings.
2. Listen to the available voices and select one.
3. Adjust the speech speed if needed.
4. Compose a message on the boards or type one on the keyboard.
5. Tap speak and hear the voice in action.

Your child's AAC voice should sound like a voice they're proud of. Neural TTS makes that possible.

Download SabiKo free and choose from 6 neural voices at no cost.