Echo & Alexa Forums

Hearing the Keyword - Anyone know why it's difficult to do for computers?

Snafs

Just a general question really.

Does anyone know why it seems so hard for a computing device to hear a keyword over, or through, other sound or music?

Is it down to physical things like the quality of the microphones, or to the computer's poor ability, compared with the human brain, to tell different types of sound apart?

I mean, you could be listening to, say, a pop tune, quite LOUD, and your wife speaks to you. You still notice her, and can probably hear what she says, because in your brain her voice is quite separate from the music, even though the music may be louder.

It's only pressure waves through the air hitting your eardrums of course, but we seem to be able to easily pick out and focus on sound coming from one source even though, in theory, it's all muddled up with other sounds.

Perhaps it's that we understand the other sounds, when a computer cannot?
If I hear, say, a Queen track, I know it's a Queen track: the sounds of the instruments, the beat, the rhythm etc.
If someone says your name behind you, you'd instantly turn, as that sound was not part of the music.

Perhaps that's the biggest problem to solve. It's like working out which word someone meant in speech: you need to understand the sentence and the context to get it right.
Filtering keywords out from other sounds may be a similarly hard thing to crack, as the mic and computer are just hearing sound waves and not understanding them well enough to detect something that's not part of what's coming out of the speakers or the other noise in the room.
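
One part of this is easier for a device than for a person: it knows exactly what it is sending to its own speaker, so it can try to subtract that from what the mic picks up. Below is a minimal sketch of that idea using a normalised LMS adaptive filter, a standard textbook technique. It is not a description of what the Echo actually runs. The room response, the random "music", the sine-wave "voice" and every function name here are made up for illustration.

```python
import math
import random

def nlms_cancel(mic, playback, taps=8, mu=0.2, eps=1e-8):
    """Subtract an adaptively filtered copy of `playback` from `mic`.

    Returns the residual, i.e. whatever the mic heard that the filter
    could not explain as an echo of the device's own playback.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # recent playback samples, newest first
    residual = []
    for m, p in zip(mic, playback):
        history = [p] + history[:-1]
        estimate = sum(w * x for w, x in zip(weights, history))
        error = m - estimate
        energy = sum(x * x for x in history) + eps
        # Normalised LMS update: nudge the weights to shrink the error.
        weights = [w + mu * error * x / energy
                   for w, x in zip(weights, history)]
        residual.append(error)
    return residual

def power(signal):
    """Mean squared amplitude of a list of samples."""
    return sum(s * s for s in signal) / len(signal)

def demo(n=4000, seed=0):
    """Simulate loud playback plus a quiet voice in the second half."""
    rng = random.Random(seed)
    playback = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    room = [0.0, 0.6, 0.3, -0.2]  # hypothetical echo path, speaker to mic
    echo = [sum(room[k] * playback[i - k]
                for k in range(len(room)) if i - k >= 0)
            for i in range(n)]
    voice = [0.1 * math.sin(2 * math.pi * 0.01 * i) if i >= n // 2 else 0.0
             for i in range(n)]
    mic = [e + v for e, v in zip(echo, voice)]
    cleaned = nlms_cancel(mic, playback)
    return mic, voice, cleaned

if __name__ == "__main__":
    mic, voice, cleaned = demo()
    tail = len(mic) * 3 // 4
    print("mic power (music + voice):", power(mic[tail:]))
    print("power after cancelling   :", power(cleaned[tail:]))
    print("power of the voice alone :", power(voice[tail:]))
```

This only works for sound the device itself is playing. A TV in the next room or a dog barking outside gives it no reference signal to subtract, which is where the harder "understanding" problem comes back in.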




jwlv

Re: Hearing the Keyword - Anyone know why it's difficult to do for computers?
« Reply #1 on: November 03, 2016, 02:58:32 pm »
The human brain can tune out or focus in on specific sounds. For example, have you ever been to someone's house where the smoke alarm makes a short chirp every minute? Yeah, they need to change the battery in the smoke alarm. The people who live there probably don't even hear that chirp any more because they've tuned it out.
A computer hears everything. That means it hears a car driving by, the TV in the next room, a plane flying overhead, the dog barking next door, and just about everything else. To process all that audio data, a computer has to work extremely hard to determine what is actually a voice and what is just noise.

mike27oct

Re: Hearing the Keyword - Anyone know why it's difficult to do for computers?
« Reply #2 on: November 03, 2016, 04:27:05 pm »
Also, computer microphones (other than perhaps on the newest models) are pretty low quality compared to the mic array in Alexa devices.
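
To give a rough idea of why an array of mics helps beyond the quality of each mic, here is a minimal delay-and-sum sketch. It lines up the channels on the talker and averages them, so the voice adds up while unrelated noise partly cancels. The mic count, the integer sample delays and the assumption that each mic's noise is independent are all simplifications chosen for illustration (real room noise is partly shared between mics, and real devices have to estimate the delays). None of this is taken from how Alexa devices actually work.

```python
import math
import random

def delay_and_sum(channels, delays):
    """Advance each channel by its delay (in samples) and average them."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
            for i in range(n)]

def simulate_array(n=2000, mics=6, noise_level=1.0, seed=1):
    """Hypothetical array: the voice reaches mic k exactly k samples late."""
    rng = random.Random(seed)
    voice = [math.sin(2 * math.pi * 0.02 * i) for i in range(n)]
    delays = list(range(mics))
    channels = []
    for d in delays:
        delayed = [0.0] * d + voice[: n - d]
        # Assumption: every mic gets its own independent Gaussian noise.
        channels.append([v + rng.gauss(0.0, noise_level) for v in delayed])
    return voice, channels, delays

def noise_power(signal, reference):
    """Mean squared difference between a signal and the clean voice."""
    return sum((s - r) ** 2 for s, r in zip(signal, reference)) / len(signal)

if __name__ == "__main__":
    voice, channels, delays = simulate_array()
    combined = delay_and_sum(channels, delays)
    print("noise power, one mic  :", noise_power(channels[0], voice))
    print("noise power, all mics :", noise_power(combined, voice))
```

Under these assumptions the leftover noise power drops roughly in proportion to the number of mics, which is one reason an array can pick a voice out better than a single laptop mic.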

Snafs

Re: Hearing the Keyword - Anyone know why it's difficult to do for computers?
« Reply #3 on: November 03, 2016, 04:39:19 pm »
Yes yes, that was the essence of what I was thinking.
For this to be great, you don't just need to hear the sounds, you need to understand the sounds.
That noise is a dog barking in the distance (I can picture a dog in my head and it's not near me, so I can ignore it). The TV is on, but I know what a TV is, and where it is, so again I can ignore it. However, when my child upstairs calls me, I know their voice, and I know it's not a sound that would be coming from something else like the radio, so I can focus on and hear that child's voice.