Sarmad Gardezi

a freelance developer



HTML5 Speech Search in website with Voice Recognition

The HTML5 Web Speech API has been around for a few years, but adding it to your website now takes slightly more work than it used to.

Earlier, you could add the attribute x-webkit-speech to any form input field and it would become voice-capable. The x-webkit-speech attribute has, however, been deprecated, and you now have to use the JavaScript API to add speech recognition. Here’s the updated code:


<!-- CSS Styles -->
<style>
.speech {border: 1px solid #DDD; width: 300px; padding: 0; margin: 0}
.speech input {border: 0; width: 240px; display: inline-block; height: 30px;}
.speech img {float: right; width: 40px }
</style>

<!-- Search Form -->
<form id="sarmad" method="get" action="https://www.google.com/search">
<div class="speech">
<input type="text" name="q" id="transcript" placeholder="Speak" />
<img onclick="startDictation()" src="//i.imgur.com/cHidSVu.gif" />
</div>
</form>

<!-- HTML5 Speech Recognition API -->
<script>
function startDictation() {

  // Chrome exposes the recognition constructor with a webkit prefix
  if ('webkitSpeechRecognition' in window) {

    var recognition = new webkitSpeechRecognition();

    recognition.continuous = false;      // stop after a single phrase
    recognition.interimResults = false;  // report only the final result
    recognition.lang = "en-US";

    // Attach the handlers before starting so that no event is missed
    recognition.onresult = function(e) {
      document.getElementById('transcript').value = e.results[0][0].transcript;
      recognition.stop();
      document.getElementById('sarmad').submit();
    };

    recognition.onerror = function(e) {
      recognition.stop();
    };

    recognition.start();
  }
}
</script>

We have the CSS that places the microphone image inside the input box, the form markup containing the text input, and the JavaScript that does all the heavy lifting.
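The support check at the top of startDictation() only looks for the prefixed constructor. As a minimal sketch, and assuming a browser might instead expose the unprefixed SpeechRecognition name from the specification, the check could be pulled into a small helper. The function name getSpeechRecognition is hypothetical and not part of the original code.

```javascript
// Hypothetical helper (not in the original post): returns the speech
// recognition constructor found on the given global object, or null.
function getSpeechRecognition(globalObject) {
  return globalObject.SpeechRecognition ||
         globalObject.webkitSpeechRecognition ||
         null;
}

// In a browser, pass `window`. The guard lets the snippet load elsewhere too.
if (typeof window !== 'undefined') {
  var Recognition = getSpeechRecognition(window);
  if (!Recognition) {
    console.log('Speech recognition is not available in this browser.');
  }
}
```

With a helper like this, startDictation() could hide the microphone image when null comes back instead of silently doing nothing.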

The Dictation App also uses the speech recognition API, though it writes the transcribed text to a textarea field instead of an input box.
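Writing to a textarea usually means listening continuously and appending each finished phrase. The following is a rough sketch of that idea and not the Dictation App's actual code: the helper name collectFinalTranscripts and the element id "notes" are assumptions made for illustration.

```javascript
// Hypothetical helper: joins the final transcripts from a list shaped like
// event.results, starting at index `from` (typically event.resultIndex).
function collectFinalTranscripts(results, from) {
  var text = '';
  for (var i = from; i < results.length; i++) {
    if (results[i].isFinal) {
      text += results[i][0].transcript;
    }
  }
  return text;
}

// Browser-only wiring. "notes" is an assumed id for the textarea.
if (typeof window !== 'undefined' && 'webkitSpeechRecognition' in window) {
  var recognition = new webkitSpeechRecognition();
  recognition.continuous = true;      // keep listening between phrases
  recognition.interimResults = true;  // results arrive before they are final
  recognition.onresult = function (e) {
    document.getElementById('notes').value +=
      collectFinalTranscripts(e.results, e.resultIndex);
  };
  // recognition.start() would be called from a click handler,
  // in the same way startDictation() is triggered by the microphone image.
}
```

Skipping results that are not yet final keeps half-recognized words out of the textarea.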

Some notes:

  1. If the page containing the HTML form / search box is served over HTTPS, the browser will not repeatedly ask for permission to use the microphone.
  2. You can change the value of the recognition.lang property from ‘en-US’ to another language (like hi-IN for Hindi or fr-FR for Français). See the complete list of supported languages.
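Rather than hard-coding recognition.lang, the value could be derived from the visitor's browser locale. The sketch below is one possible approach. The helper pickRecognitionLanguage is hypothetical, and the list of supported tags passed to it is only an example built from the languages mentioned above.

```javascript
// Hypothetical helper: chooses a recognition language for a browser locale,
// returning `fallback` when nothing in `supported` matches.
function pickRecognitionLanguage(locale, supported, fallback) {
  if (!locale) return fallback;
  var wanted = locale.toLowerCase();

  // 1. Exact match, ignoring case (e.g. "fr-fr" matches "fr-FR")
  for (var i = 0; i < supported.length; i++) {
    if (supported[i].toLowerCase() === wanted) return supported[i];
  }

  // 2. Match on the primary language subtag (e.g. "fr" matches "fr-FR")
  var primary = wanted.split('-')[0];
  for (var j = 0; j < supported.length; j++) {
    if (supported[j].toLowerCase().split('-')[0] === primary) return supported[j];
  }

  return fallback;
}

// Example list only. The real set depends on the browser.
var exampleLanguages = ['en-US', 'hi-IN', 'fr-FR'];
```

In the search form this could be used as recognition.lang = pickRecognitionLanguage(navigator.language, exampleLanguages, 'en-US').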

This piece of jQuery / JavaScript code powers the Dictation app.

// Written by Sarmad Gardezi
// See sarmadgardezi.com/dictation for live demo

$(document).ready(function () {

// Check if the user's web browser supports HTML5 Speech Input API
if(document.createElement('input').webkitSpeech == undefined) {
$(".answer").append("We are sorry but Dictation requires Google Chrome.");
}
else {

// Get the default locale of the user's browser (e.g. en-US, or de)
var language = window.navigator.userLanguage || window.navigator.language;
$("#speech").attr("lang", language).focus();

// Make the text region editable to easily fix transcription errors
$(".answer").click(function () {
$('.answer').attr('contentEditable', 'true');
});
}

// This is called when Chrome successfully transcribes the spoken word
$("#speech").bind("webkitspeechchange", function (e) {
var val = $(this).val();

// Did the user say Delete? Then clear the canvas.
if(val == "delete everything") {
$(".answer").text("");
return;
}

// For "new line" commands, add double line breaks.
if(val == "new line")
val = "<br><br>";
else {

// Capitalize the first letter of the sentence.
val = val.substr(0, 1).toUpperCase() + val.substr(1);

// If the last character is a letter, add a period (full stop)
if(val.match(/[a-zA-Z]$/))
val = val + ".";
}

// Append the transcribed text but set the focus to the hidden speech input.
// This enables keyboard shortcut Ctrl+Shift+Period (.) for speech mode.
$(".answer").append(val + " ").fadeIn();
$(this).val("").focus();
});
});
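The text clean-up inside the handler (voice commands, capitalization, trailing period) can be separated from the jQuery calls so it is easier to test. The sketch below is a hypothetical refactoring and not the app's actual code. The function name formatDictation and its return shape are made up for illustration, and it assumes the "new line" command maps to two br tags.

```javascript
// Hypothetical pure version of the clean-up rules in the handler above.
// Returns { clear: true } for the "delete everything" command, otherwise
// { clear: false, text: ... } holding the text to append.
function formatDictation(val) {
  if (val === 'delete everything') {
    return { clear: true, text: '' };
  }

  var text;
  if (val === 'new line') {
    // Assumption: a double line break is rendered with two <br> tags
    text = '<br><br>';
  } else {
    // Capitalize the first letter of the sentence
    text = val.charAt(0).toUpperCase() + val.slice(1);
    // Add a period only when the sentence ends in a letter
    if (/[a-zA-Z]$/.test(text)) {
      text += '.';
    }
  }

  // A trailing space separates this phrase from the next one
  return { clear: false, text: text + ' ' };
}
```

The handler would then clear .answer when clear is true, and append text otherwise.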


