Integrating Azure Personal Voice with Home Assistant

Home Assistant has had the ability to convert text to speech (TTS) for a number of years, and specifically the ability to use Azure speech services since late 2017 – so you might think that what I’m writing about isn’t new.

In fact, if you’re a Nabu Casa subscriber and use the Voice Assist functionality – you’re already using Azure speech services.

The main benefit of defining your own TTS service is to give you more control over the voice itself. We see this through the numerous integrations that exist for other voice services from Google, Amazon, ElevenLabs, and even self-hosted methods.

At the Ignite conference in 2023, Microsoft announced the Personal Voice feature which allows services like Truecaller to have an “AI” answer and screen calls using your own voice.

Unlike some other voice generation services, the Azure Personal Voice service requires a specific statement to be recorded and uploaded, which is used to start generating your voice. You can add more samples in order for it to synthesize a more accurate version of your voice.

On a personal note, I would prefer to have my voice generated in my own private Azure subscription, rather than hosted with a third-party provider.

For this reason, when the Azure Personal Voice service became generally available around mid-2024, I jumped at the chance to play around with it.

Potential Use Cases

What could possibly be the use case of being able to generate my voice, you ask?

Trolling.

Trolling my family.

That’s it.

No real business value or any other purpose.

I wanted the ability to generate my voice via my home automation platform (Home Assistant) so that I could play messages throughout the house on various speakers and devices. Anything from calling one of my daughters a doofus while she is in her room, to telling them that I have picked up the takeaway and am heading home.

I could possibly try to talk to the dogs through it, or even to someone at the front door in conjunction with a doorbell camera system and conversational AI agent.

A colleague did suggest that, because I can generate up to 10 minutes of audio at a time, I could potentially combine it with an avatar in a Microsoft Teams meeting and have it present on my behalf.

Integrating with Home Assistant

For the past year I’ve had this working in a mobile app built using Power Apps and Power Automate, but for a while I have wanted to use it across the house for nefarious purposes.

While the native Azure TTS integration in Home Assistant supports most of what is required to work with Azure Personal Voice, unfortunately there are some differences I couldn’t work around.

The main one was the requirement for a Speaker ID value, but a couple of the other items also needed a bit more control than the native integration offers.

The way the Azure Personal Voice service works is that you send an HTTP request with an SSML body describing what you want it to say, in what voice, and how. It then returns the audio as part of the response body, which generally would need to be saved as an MP3 file before it could be played.
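To make that a little more concrete, here is a rough Python sketch of what such a request might look like. It is not the code from my custom component, just an illustration under a few assumptions: it uses the standard Azure Speech text-to-speech REST endpoint for a region, the DragonLatestNeural base model name and the mstts:ttsembedding element as shown in Microsoft's Personal Voice documentation, and an MP3 output format. The function names, the region, the key, and the speaker profile ID are all placeholders I've made up for the example.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, speaker_profile_id,
               voice_name="DragonLatestNeural", lang="en-US"):
    """Wrap plain text in SSML that points at a personal voice profile.

    Assumption: the element and attribute names follow Microsoft's
    published Personal Voice SSML sample; check the current docs.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="http://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice_name)}>"
        f"<mstts:ttsembedding speakerProfileId={quoteattr(speaker_profile_id)}>"
        f"{escape(text)}"  # escape &, < and > so the XML stays valid
        "</mstts:ttsembedding></voice></speak>"
    )


def build_request(ssml, region, subscription_key):
    """Prepare (but do not send) the POST request carrying the SSML body."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/ssml+xml",
        # Assumed output format; other formats are available.
        "X-Microsoft-OutputFormat": "audio-16khz-128kbitrate-mono-mp3",
        "User-Agent": "personal-voice-sketch",
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


def synthesize(ssml, region, subscription_key):
    """Send the request and return the raw audio bytes from the response body."""
    request = build_request(ssml, region, subscription_key)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling synthesize() with a real region, key, and speaker profile ID should hand back audio bytes that can be written to an .mp3 file or streamed to a player, which is the step I wanted Home Assistant to handle for me.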

Because I wanted to pipe the response directly to a media player, I went down the route of creating a custom component for Home Assistant. It can be imported using the Home Assistant Community Store (HACS) by adding my repository: https://github.com/loryanstrant/Azure-Personal-Voice-HA/

Before doing this, you’ll need to have signed up for the Azure Personal Voice service and created a voice profile.

Once you’ve done that, follow the steps on my GitHub repo page to install the custom component and configure it.

Then you can test it out via an ad-hoc action, add it to scripts, automations, use it with buttons, or whatever you like.
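If you would rather trigger it from outside Home Assistant, one option is the Home Assistant REST API. The sketch below is only an illustration and makes some big assumptions: that the component exposes a standard TTS entity usable with the built-in tts.speak action, and that the entity is called tts.azure_personal_voice (a name I've made up here, so check the repo for the real one). The base URL, token, and media player entity are placeholders too.

```python
import json
import urllib.request


def build_speak_call(base_url, token, tts_entity, media_player, message):
    """Prepare (but do not send) a call to Home Assistant's tts.speak action."""
    payload = {
        "entity_id": tts_entity,                 # hypothetical TTS entity name
        "media_player_entity_id": media_player,  # speaker to play the audio on
        "message": message,
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/services/tts/speak",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # long-lived access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_speak_call(
    "http://homeassistant.local:8123",
    "YOUR_LONG_LIVED_TOKEN",
    "tts.azure_personal_voice",
    "media_player.kitchen",
    "Someone left the hallway light on again.",
)
# urllib.request.urlopen(request) would actually send it.
```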

Here’s a demonstration of it in action:

For every father out there, who wouldn’t like the ability to automatically have a warning play across the house when a light has been left on in an area where nobody is present!? Or when someone adjusts the temperature on the heater?

Think of all the parenting you could do via automation!!!


Also published on Medium.


Discover more from Loryan Strant, Microsoft 365 MVP
