Text-to-speech technology helps produce more audiobooks for people who are blind or have low vision

Hongdandan’s founder Zheng Xiaojie shares some audiobooks with a group of school children.

Nowadays the library rolls out content via Microsoft Azure to 105 schools across China for students who are blind or have low vision. They can also access 1,000-plus titles on the library’s own app and a mini-program on WeChat, China’s popular social media platform.

Microsoft has been Hongdandan’s partner for around 15 years. And the center produces its audiobooks in line with Microsoft’s commitment to responsible AI, which safeguards against the misuse of the technology and prioritizes transparency, fairness, accountability, privacy and security.

“Microsoft has been in contact with us all the time,” says Zheng. “Supporting all aspects of the Eyes of the Soul Library, including the AI voice service we are using now, which was unimaginable for us before. In front-line jobs, we knew the needs of blind people, but we didn’t know how to use high-tech methods to solve their needs. In fact, technology is a particularly good method for the education of people who are blind or who have low vision. It brings us closer together.”

As well as teaching and volunteering, Dong is currently in a graduate program at the Communication University of China where she is researching the creation and use of synthetic voices. “As a blind person, the development of technology has changed my life,” she says.

So, with her experience and well-tuned ear for voices, how does she rate Microsoft’s AI creations, including her own?

“Microsoft’s Custom Neural Voice actually simulates a real voice much better than more general synthetic voices,” she says. “For example, there are some tone changes and more details to the voices—these details are really good.”

Dong says that whether real or synthetic, an ideal audio voice needs to sound warm and clear, with a sense of confidence and even a feeling of love and affection. “The most similar point between a human voice and Microsoft’s Custom Neural Voice is the timbre—the timbre of the Custom Neural Voice is really vivid.”

Both Dong and Zheng emphasize the importance of the Eyes of the Soul Library for improving education and employment prospects for people who are blind or have low vision. But they also see another crucial benefit: a sense of connection that instills confidence and self-reliance.

Zheng says many people who are blind or have low vision can now “seize opportunities in the internet era and find the professions and positions they are good at.

“We give them a channel to acquire knowledge and know the world. Having the companionship of a voice has eliminated the distance between them and the world, so many have become more positive and confident. They no longer have a sense of isolation or fear of the world. They believe that they can do a lot of things all by themselves.”

All images are courtesy of the Hongdandan Visually Impaired Service Center. TOP: Lina Dong in a recording booth. CENTER: Lina Dong (center) conducts a lesson with students.

This article has been trimmed; please visit the source to read the full article.

The post Text-to-speech technology helps produce more audiobooks for people who are blind or have low vision appeared first on Microsoft | The AI Blog.
