DataDrivenInvestor

empowerment through data, knowledge, and expertise. Join DDI community at https://join.datadriveninvestor.com

How Natural Language Processing Helped Me Code My New Sidekick

How to create a Virtual Assistant using Python

7 min read · Feb 20, 2019


Admit it… everyone has been lazy. Your alarm rings in the morning and you hit snooze. You could drive to work, but you find someone to drive you. You could cook at home, but you order take-out. Instead of dialing someone’s number, you ask Siri to call them. Wait. Is that laziness, or is that taking advantage of the 21st century?

Honestly, I’m pretty conflicted. On one hand, I see the advantage of having a virtual assistant on standby: easy access in emergencies, and you don’t need to unlock your phone to speak to someone. On the other hand, you no longer need to memorize numbers, which could lead to a weaker memory, and you could become so reliant on your virtual assistant that without it you would be clueless about your own phone (layout, speed, type, etc.).

Like every good feature, it has its pros and cons. Nevertheless, I decided to create something that does the same job.

Different companies’ virtual assistants. From left to right: Amazon Alexa, Google Assistant, Apple’s Siri, and Samsung’s Bixby.

Using machine learning and natural language processing (NLP), you can easily make something similar. It pays to get the right environment in place before you start coding, so that the program has a better chance of working the first time you run it. To set up the environment, install Anaconda from its website. Anaconda makes this easy because Python and many common libraries come preinstalled, and any library that is missing can be added with the pip command.
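Before writing any assistant code, it can help to confirm that the environment really has everything. The sketch below is one possible check, not part of the original project: the helper name missing_modules is hypothetical, and the module list is simply taken from the import statements described later in this article.

```python
import importlib.util

# Module names taken from the import statements used later in this article.
REQUIRED_MODULES = [
    "gtts", "speech_recognition", "os", "re",
    "webbrowser", "smtplib", "requests",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Install these before running the assistant:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

Anything the script reports as missing would need to be installed before the assistant can run.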

Natural Language Processing (NLP)

What is NLP?

Natural language processing (NLP) is a form of artificial intelligence that focuses on analyzing human languages to draw insights, create advertisements, aid you in texting and more.

NLP is an emerging technology that drives many forms of AI that many people are not exposed to. NLP has many different applications that can benefit almost every single person on this planet. This is why I used a basic form of NLP to build this amazing virtual assistant.

Every day, humans say millions of words, and other people can usually interpret what we are saying with ease. Fundamentally, it’s a simple relay of words, but words run much deeper than that, because we derive a different context from anything anyone says. They could imply something with their body language or in how frequently they mention something. While NLP doesn’t focus on voice inflection, it does draw on contextual patterns.

How NLP Works

The most difficult part of NLP is understanding or providing meaning to the natural language that the computer received.

First, the computer must take natural language (humans speaking English) and convert it into a form it can process. This is what speech recognition, or speech-to-text, does. Once the information is in text form, natural language understanding (NLU) can take place to try to work out the meaning of that text.

Natural Language Understanding (NLU)

Many speech recognition systems are based on Hidden Markov Models (HMMs). These are statistical models that turn your speech into text by using probabilities to figure out what you most likely said.

An HMM-based recognizer does this by listening to you speak and breaking the audio down into small units (usually 10–20 milliseconds). It then compares each unit to acoustic models built from pre-recorded speech. Next, it looks at the resulting series of phonemes (the distinct units of sound in a language, like the p in pat) and statistically determines the most likely words and sentences you were saying. It outputs this information in the form of text.
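To make the "statistically determines the most likely" step more concrete, here is a minimal sketch of the Viterbi algorithm, a standard way to find the most probable hidden-state sequence in an HMM. Everything in the toy model is an assumption made up for illustration: the three phoneme states for "pat", the frame labels, and all the probabilities. A real recognizer works with far larger models learned from data.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that gives the highest path probability
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Trace back from the most probable final state
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Toy model (all numbers are invented for illustration)
STATES = ["p", "ae", "t"]
START_P = {"p": 0.8, "ae": 0.1, "t": 0.1}
TRANS_P = {
    "p": {"p": 0.3, "ae": 0.6, "t": 0.1},
    "ae": {"p": 0.1, "ae": 0.4, "t": 0.5},
    "t": {"p": 0.2, "ae": 0.2, "t": 0.6},
}
EMIT_P = {
    "p": {"burst": 0.7, "voiced": 0.1, "closure": 0.2},
    "ae": {"burst": 0.1, "voiced": 0.8, "closure": 0.1},
    "t": {"burst": 0.3, "voiced": 0.1, "closure": 0.6},
}

frames = ["burst", "voiced", "voiced", "closure"]
print(viterbi(frames, STATES, START_P, TRANS_P, EMIT_P))  # ['p', 'ae', 'ae', 't']
```

Each frame label stands in for one short slice of audio, and the decoded path shows which phoneme the model believes produced each slice.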

The next and hardest step of NLU is the actual understanding part. Different NLP systems use different techniques, but the process is generally similar. First, the computer must work out what each word is: whether it’s a noun or a verb, whether it’s in the past or present tense, and other grammatical features. This is called part-of-speech (POS) tagging.
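As a rough illustration of POS tagging, the sketch below tags words with a tiny hand-written lexicon and two suffix rules. This is a deliberately simplified stand-in: the lexicon, the tag names, and the fallback rules are all assumptions for this example, and production taggers are trained on large annotated corpora instead.

```python
import re

# Tiny hand-written lexicon (an assumption for this example)
LEXICON = {
    "the": "DET", "a": "DET", "my": "PRON",
    "open": "VERB", "is": "VERB",
    "website": "NOUN", "weather": "NOUN",
}


def tag(sentence):
    """Return (word, tag) pairs using the lexicon, then simple suffix rules."""
    tagged = []
    for token in re.findall(r"[a-z']+", sentence.lower()):
        if token in LEXICON:
            tagged.append((token, LEXICON[token]))
        elif token.endswith("ed"):
            tagged.append((token, "VERB-PAST"))    # e.g. "opened"
        elif token.endswith("ing"):
            tagged.append((token, "VERB-GERUND"))  # e.g. "opening"
        else:
            tagged.append((token, "NOUN"))         # default guess
    return tagged


print(tag("Open my website"))
```

The default guess of NOUN shows why rules like these are not enough on their own: any unknown word that is not a noun gets mislabeled.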

Natural Language Generation (NLG)

NLG is simpler to accomplish. It translates a computer’s artificial language into text. That text can then be converted to speech (text-to-speech), which is what this virtual assistant does.

First, the NLP system determines what information to translate into text. If you asked your computer a question about the weather, it most likely did an online search to find your answer. From there, it decides that the temperature, wind, and humidity are the parts that should be read aloud to you.

Then, it organizes the way it will say it. This is similar to NLU, except that NLU works out what was said and NLG generates what to say. Using the English lexicon and a set of grammar rules, an NLG system can form full sentences.
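The two NLG steps described here, choosing what to say and then arranging it into a sentence, can be sketched with simple templates. The function name describe_weather and the dictionary keys below are hypothetical, and real NLG systems use much richer grammar rules than string templates.

```python
def describe_weather(report):
    """Turn a dict of weather readings into one English sentence."""
    # Step 1: decide which pieces of information are worth saying
    parts = []
    if "temperature_c" in report:
        parts.append(f"the temperature is {report['temperature_c']} degrees Celsius")
    if "wind_kph" in report:
        parts.append(f"the wind is blowing at {report['wind_kph']} kilometres per hour")
    if "humidity_pct" in report:
        parts.append(f"the humidity is {report['humidity_pct']} percent")
    if not parts:
        return "I could not find any weather information."
    # Step 2: organize the pieces into a grammatical sentence
    if len(parts) == 1:
        body = parts[0]
    else:
        body = ", ".join(parts[:-1]) + " and " + parts[-1]
    return "Right now, " + body + "."


print(describe_weather({"temperature_c": 21, "humidity_pct": 40}))
# Right now, the temperature is 21 degrees Celsius and the humidity is 40 percent.
```

The returned string is what would then be handed to the text-to-speech engine.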

Finally, if you want the text to be read aloud, text-to-speech takes over. The text-to-speech engine (in this case it is Google’s Text-To-Speech module) analyzes the text using a prosody model, which determines breaks, duration, and pitch. Then, using a speech database (recordings from a voice actor), the engine puts together all the recorded phonemes to form one comprehensible form of speech.

Importing Libraries

Almost every Python program that goes beyond the basics imports libraries, which provide ready-made functionality the program can build on. For example, when I was writing this code, I needed a library that would let the computer relay messages back to me as speech. That’s when I used the Google Text-to-Speech library. To import the library, all you need to type is from gtts import gTTS.

Next, the computer needs to be able to recognize your speech. This is where the SpeechRecognition library comes in: it gives the program access to the microphone and passes the recorded audio to a speech recognition engine, which returns what was said as text. This is a less complicated form of natural language processing. Basically, NLP lets the computer take in information, analyze it, and respond in the way it was programmed to. To use this library, all that needs to be typed is import speech_recognition as sr.

The next five libraries are supporting libraries that help create the environment and help the first two libraries do their jobs:

  • The os library lets you interact with whichever operating system you have on your computer (Windows, Mac or Linux), for example by running system commands. To use this module, type import os.
  • The re library provides regular expressions, which are patterns for searching text. Here it is used to pick out the useful part of a spoken command once it has been converted to text. To import it, type import re.
  • The webbrowser library opens web pages in a browser. By default it uses your system’s default browser, and you can choose a specific one such as Google Chrome, Safari or Firefox. Code it in using import webbrowser.
  • The smtplib library defines an SMTP client session object that can be used to send mail to any Internet machine running an SMTP server. To use it, type import smtplib.
  • The requests library sends HTTP requests, so the assistant can fetch information from websites and web services and relay it to the user. To access this library, type import requests.
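Of these supporting libraries, re is the one that does the most work on the recognized text. As a small sketch of the idea, the hypothetical helper extract_target below pulls out whatever follows a trigger phrase in a command. The trigger phrase and the sample commands are assumptions for illustration.

```python
import re


def extract_target(command, trigger):
    """Return the words that follow `trigger` in `command`, or None."""
    # re.escape keeps any special characters in the trigger literal
    match = re.search(re.escape(trigger) + r"\s+(.+)", command)
    return match.group(1).strip() if match else None


print(extract_target("please open website reddit.com", "open website"))  # reddit.com
print(extract_target("what time is it", "open website"))                 # None
```

Returning None when nothing matches lets the calling code decide whether to ask the user to repeat the command.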

Different Functions in the Code

To write a successful program, you need to be able to define your own functions that do the work of whatever you are building. For this virtual assistant, I wrote (and defined) three functions that I used throughout the code to make it easier to follow in fewer lines.

import os

def talkToMe(audio):
    "speaks audio passed as argument"
    print(audio)
    # Speak one line at a time using the macOS `say` command
    for line in audio.splitlines():
        os.system("say " + line)

This function, essentially, lets the computer speak to us. It prints the text it was given onto the screen and then reads it aloud, one line at a time, using the say command that comes with macOS. This is an example of text-to-speech communication.
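Because the function above builds a shell command by joining strings, text containing quotes or other shell characters (for example "It's done") can break the command. One possible alternative, sketched here under the assumption that the macOS say command is available, passes the text as a separate argument through subprocess so that no shell is involved. The names build_say_commands and speak are hypothetical.

```python
import subprocess


def build_say_commands(text):
    """Return one argument list per non-empty line of text."""
    return [["say", line] for line in text.splitlines() if line.strip()]


def speak(text, runner=subprocess.run):
    """Print the text, then hand each line to the speech command."""
    print(text)
    for args in build_say_commands(text):
        # An argument list is passed straight to the program, with no shell parsing
        runner(args, check=False)


print(build_say_commands("Hello\n\nIt's done"))
# [['say', 'Hello'], ['say', "It's done"]]
```

The runner parameter exists only so the function can be tried out without actually producing sound.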

import speech_recognition as sr

def myCommand():
    "listens for commands"
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Ready...')
        r.pause_threshold = 1
        # Sample background noise for one second to calibrate the recognizer
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)
    try:
        command = r.recognize_google(audio).lower()
        print('You said: ' + command + '\n')
    # loop back to continue to listen for commands if unrecognizable speech is received
    except sr.UnknownValueError:
        print('Your last command couldn\'t be heard')
        command = myCommand()
    return command

This function gives us access to the computer’s built-in microphone, records what we say, and passes the audio to Google’s speech recognition service (through recognize_google) to turn it into text. If the speech can’t be understood, the function tells us so and listens again, giving us the opportunity to repeat ourselves.
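Since myCommand calls itself every time the speech is not understood, a long run of failed attempts keeps nesting calls. A loop with a cap on attempts is one alternative, sketched below. The helper listen_with_retries and its listen_once argument are hypothetical: listen_once stands for any function that records once and returns the recognized text, or None when the speech could not be understood.

```python
def listen_with_retries(listen_once, max_attempts=3):
    """Call listen_once until it returns text or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        command = listen_once()
        if command is not None:
            return command
        print(f"Attempt {attempt}: your last command couldn't be heard")
    return None  # the caller decides what to do after repeated failures


# Simulated microphone: fails twice, then "hears" a command
responses = iter([None, None, "open personal website"])
print(listen_with_retries(lambda: next(responses)))  # open personal website
```

In the real assistant, listen_once could wrap the listen and recognize_google calls and return None when sr.UnknownValueError is raised.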

def assistant(command):
    "if statements for executing commands"

Even though this function looks very short here, it is the whole reason we can command our computer to do anything. For example, if we want the computer to open a website, we add an “else if” branch (elif) and can then verbally tell the computer to open that site.

    elif 'open personal website' in command:
        # Optionally capture a page path spoken after the trigger phrase
        reg_ex = re.search('open personal website ?(.*)', command)
        if reg_ex:
            domain = reg_ex.group(1)
            url = 'https://armaanmerchant.com/' + domain
            webbrowser.open(url)
            print('Done!')
        else:
            pass

If I want my virtual assistant to open my personal website, I just need to program it in. I’ve used the webbrowser library to command my computer to open http://armaanmerchant.com/ (my personal website) in the browser. It then prints “Done!” once the page has been sent to the browser.
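As more commands are added, a long chain of elif branches can get hard to follow. One possible way to organize them, shown here only as a sketch, is a table that maps trigger phrases to handler functions. The names dispatch and HANDLERS and both handlers are hypothetical, and the handlers just return a description instead of performing a real action.

```python
def open_personal_website(command):
    # Placeholder action: a real handler might call webbrowser.open(...)
    return "opening personal website"


def tell_time(command):
    # Placeholder action for a second, made-up command
    return "telling the time"


# Hypothetical phrase-to-handler table, checked in order
HANDLERS = [
    ("open personal website", open_personal_website),
    ("what time", tell_time),
]


def dispatch(command, handlers=HANDLERS):
    """Run the first handler whose trigger phrase appears in the command."""
    for phrase, handler in handlers:
        if phrase in command:
            return handler(command)
    return None  # no known command matched


print(dispatch("please open personal website"))  # opening personal website
print(dispatch("sing a song"))                   # None
```

Adding a new command then means writing one function and adding one line to the table.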

while True:
    assistant(myCommand())

Finally, this loop keeps the whole program running, so we can use the virtual assistant continually without terminating the session and restarting it.
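A loop that runs forever can only be stopped by interrupting the program. If a spoken way out is wanted, one option is to check for a stop word on each pass, as in the sketch below. The function run_assistant, its parameters, and the stop words are assumptions for illustration. In the real program, get_command would be myCommand and handle would be assistant.

```python
def run_assistant(get_command, handle, stop_words=("quit", "exit", "goodbye")):
    """Handle commands until a stop word is heard; return how many were handled."""
    handled = 0
    while True:
        command = get_command()
        if command.strip() in stop_words:
            return handled
        handle(command)
        handled += 1


# Simulated session: one real command, then a stop word
commands = iter(["open personal website", "quit"])
seen = []
print(run_assistant(lambda: next(commands), seen.append))  # 1
```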

At its core, NLP is a translator for computers. Just like how in foreign countries you rely on Google Translate to help you speak another language, computers rely on NLP as their translator because human language is foreign to them.

NLP, NLU, NLG, ML, and AI are really amazing concepts alone, and when they work together, so much can be accomplished!

Key Takeaways

  • Using machine learning and natural language processing (NLP), you can easily make a simple virtual assistant.
  • Natural language processing (NLP) is a form of artificial intelligence that focuses on analyzing languages and providing a suggested output.
  • The computer must take natural language (humans speaking English) and convert it into a form it can process. Speech recognition handles this step, and natural language understanding (NLU) then works out what the text means.
  • Many speech recognition systems are based on Hidden Markov Models (HMMs). They break your speech down into small units (usually 10–20 milliseconds), then compare each unit to acoustic models built from pre-recorded speech.
  • Translating a computer’s artificial language into text is easier. The computer outputs its artificial language as text and reads off the text. This is called natural language generation (NLG).
  • Each computer function does something another can’t. When the functions work together, they can accomplish a lot!

Thanks for reading my article, please stay tuned for more and please give me some claps!! Thanks!

Check out my other articles here! Also, feel free to add me on LinkedIn, Armaan Merchant.

Published in DataDrivenInvestor