In today’s world, voice technology has become very prevalent. The technology has grown, evolved and matured at a tremendous pace. From voice shopping on Amazon to the routine (and increasingly complex) tasks that personal voice assistant devices/speakers such as Amazon’s Alexa perform at the command of our voice, voice technology has found many practical uses in different spheres of life. One of the most important functionalities in any voice technology implementation is a speech to text (STT) engine that recognizes the voice and converts it into text.

We can build a very basic STT engine using a simple Python script. Let’s go through the sequence of steps required.

NOTE: I worked on this proof-of-concept (PoC) project on my local Windows machine, and I therefore assume that readers try out all instructions pertaining to this PoC on a system running Microsoft Windows OS.

Step 1: Installation of Specific Python Libraries

We will start by installing the Python libraries, namely: speechrecognition, wheel, pipwin and pyaudio. Open your Windows command prompt or any other terminal that you are comfortable using and execute the following commands in sequence, running each command only after the previous one has completed successfully.

    pip install speechrecognition
    pip install wheel
    pip install pipwin
    pipwin install pyaudio

Step 2: Code the Python Script That Implements a Very Basic STT Engine

Let’s name the Python script file STT.py. Save the file anywhere on your local Windows machine. The Python script code looks like the one referenced below in Figure 1.

    # Python Program that helps translate Speech to Text
    import speech_recognition

    UserVoiceRecognizer = speech_recognition.Recognizer()

    while True:
        try:
            with speech_recognition.Microphone() as UserVoiceInputSource:
                # Calibrate the energy threshold against the ambient noise.
                UserVoiceRecognizer.adjust_for_ambient_noise(UserVoiceInputSource, duration=0.5)

                # The Program listens to the user voice input.
                UserVoiceInput = UserVoiceRecognizer.listen(UserVoiceInputSource)

                # Send the audio to Google Speech Recognition and print the text.
                UserVoiceInput_converted_to_Text = UserVoiceRecognizer.recognize_google(UserVoiceInput)
                UserVoiceInput_converted_to_Text = UserVoiceInput_converted_to_Text.lower()
                print(UserVoiceInput_converted_to_Text)

        except KeyboardInterrupt:
            print('A KeyboardInterrupt encountered. Terminating the Program !!!')
            break

        except speech_recognition.UnknownValueError:
            print("No User Voice detected OR unintelligible noises detected OR the recognized audio cannot be matched to text !!!")

Figure 1. Python script code that helps translate Speech to Text

The while loop makes the script run indefinitely, waiting to listen to the user voice. A KeyboardInterrupt (pressing CTRL+C on the keyboard) terminates the program gracefully. Your system’s default microphone is used as the source of the user voice input.

The code allows for ambient noise adjustment. Depending on the surrounding noise level, the script waits for a minuscule amount of time so that the Recognizer can adjust the energy threshold it applies to the recording of the user voice. To handle ambient noise, we use the adjust_for_ambient_noise() method of the Recognizer class. The adjust_for_ambient_noise() method analyzes the audio source for the time specified as the value of the duration keyword argument (the default value of the argument being one second; the script in Figure 1 passes 0.5). So, after the Python script has started executing, you should wait for approximately that long for the adjust_for_ambient_noise() method to do its thing, and then try speaking into the microphone.

The SpeechRecognition documentation recommends using a duration no less than 0.5 seconds. In some cases, you may find that durations longer than the default of one second generate better results. The minimum value you need for the duration keyword argument depends on the microphone’s ambient environment. The default duration of one second should be adequate for most applications, though.

The translation of speech to text is accomplished with the aid of Google Speech Recognition (Google Web Speech API), and for it to work, you need an active internet connection.

The Python script to translate speech to text is ready and it’s now time to see it in action. Open your Windows command prompt or any other terminal that you are comfortable using and CD to the path where you have saved the Python script file, then run the script. Speak something and you will see your voice converted to text and printed on the console window. Figure 2 below captures a few of my utterances.

    C:\Users\User1\Desktop>python "STT.py"
    hai
    hay
    how are you doing

Figure 2.
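To make the ambient noise adjustment described earlier a little more concrete, here is a minimal sketch of the general idea: measure the energy of a short stretch of background audio, then move an energy threshold toward a multiple of that level so that only louder sounds count as speech. This is a simplified illustration in plain Python, not the SpeechRecognition library's actual implementation. The function names (rms, calibrate_threshold, is_speech), the default values, and the synthetic "audio" are all assumptions made up for this example.

```python
import math
import random


def rms(samples):
    """Root-mean-square energy of one chunk of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(chunks, chunk_seconds, duration=1.0,
                        start_threshold=300.0, damping_per_second=0.15,
                        ratio=1.5):
    """Move an energy threshold toward `ratio` times the ambient energy.

    Only the first `duration` seconds of audio are examined. Every
    default value here is an illustrative assumption.
    """
    threshold = start_threshold
    # Number of whole chunks that fit into the calibration window.
    max_chunks = round(duration / chunk_seconds)
    # Convert the per-second damping factor into a per-chunk factor.
    damping = damping_per_second ** chunk_seconds
    for chunk in chunks[:max_chunks]:
        target = rms(chunk) * ratio
        # Exponential smoothing: keep part of the old threshold and
        # move the rest of the way toward the new target.
        threshold = threshold * damping + target * (1 - damping)
    return threshold


def is_speech(chunk, threshold):
    """Treat a chunk as speech only if its energy exceeds the threshold."""
    return rms(chunk) > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic "room noise": 20 quiet chunks of 0.05 seconds each.
    noise = [[rng.uniform(-50, 50) for _ in range(800)] for _ in range(20)]
    # Synthetic "speech": one much louder chunk.
    loud = [rng.uniform(-2000, 2000) for _ in range(800)]

    threshold = calibrate_threshold(noise, chunk_seconds=0.05, duration=0.5)
    print("threshold after calibration:", round(threshold, 1))
    print("noise counted as speech?", is_speech(noise[-1], threshold))
    print("loud chunk counted as speech?", is_speech(loud, threshold))
```

In this toy model, a longer calibration window lets the threshold settle closer to the real background level, which is one way to see why a longer duration can help in a noisy room, at the cost of a longer wait before you can start speaking.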