# AI For Trading: Term 2 NLP (78)

## Industry Experts

We’ve worked with experts from the industry to put together course materials that will introduce you to the exciting and fast-moving world of quant trading!

## NLP Overview

Language is an important medium for human communication. It allows us to convey information, express our ideas, and give instructions to others.

Some philosophers argue that it enables us to form complex thoughts and reason about them. It may turn out to be a critical component of human intelligence. Now consider the various artificial systems we interact with every day: phones, cars, websites, coffee machines.

It's natural to expect them to be able to process and understand human language, right? Yet computers are still lagging behind. No doubt we have made some incredible progress in the field of natural language processing, but there is still a long way to go. And that's what makes this an exciting and dynamic area of study.

In this lesson, you will not only get to know more about the applications and challenges in NLP, but also learn how to design an intelligent application that uses NLP techniques and deploy it on a scalable platform. Sounds fun? Let's get started.

## Counting Words

Let's implement a simple function that is often used in natural language processing: counting word frequencies.

Consider this passage of text:

As I was waiting, a man came out of a side room, and at a glance I was sure he must be Long John. His left leg was cut off close by the hip, and under the left shoulder he carried a crutch, which he managed with wonderful dexterity, hopping about upon it like a bird. He was very tall and strong, with a face as big as a ham—plain and pale, but intelligent and smiling. Indeed, he seemed in the most cheerful spirits, whistling as he moved about among the tables, with a merry word or a slap on the shoulder for the more favoured of his guests.

— Excerpt from Treasure Island, by Robert Louis Stevenson.

In the following coding exercise, we have provided code to load the text from a file, call the function `count_words()` (which you need to implement) to obtain word counts, and print the 10 most common and 10 least common unique words.

Complete the portions marked as TODO to count how many times each unique word occurs in the text.

    """Count words."""


    def count_words(text):
        """Count how many times each unique word occurs in text."""
        counts = dict()  # dictionary of { <word>: <count> } pairs to return

        # TODO: Convert to lowercase
        text = text.lower()

        # TODO: Split text into tokens (words), leaving out punctuation
        # (Hint: Use regex to split on non-alphanumeric characters)
        # NOTE: str.split() splits on whitespace only, so punctuation stays
        # attached to tokens (e.g. "dexterity,").
        text_list = text.split()

        # TODO: Aggregate word counts using a dictionary
        for word in text_list:
            counts[word] = counts.get(word, 0) + 1

        return counts


    def test_run():
        with open("input.txt", "r") as f:
            text = f.read()  # read the whole file into a single string
        counts = count_words(text)

        # Sort (word, count) pairs by count, most frequent first
        sorted_counts = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

        print("10 most common words:\nWord\tCount")
        for word, count in sorted_counts[:10]:
            print("{}\t{}".format(word, count))

        print("\n10 least common words:\nWord\tCount")
        for word, count in sorted_counts[-10:]:
            print("{}\t{}".format(word, count))


    if __name__ == "__main__":
        test_run()


    10 most common words:
    Word        Count
    a           9
    he          6
    the         6
    and         5
    was         4
    as          4
    with        3
    i           2
    left        2

    10 least common words:
    Word        Count
    on          1
    wonderful   1
    dexterity,  1
    waiting,    1
    glance      1
    like        1
    hopping     1
    upon        1
    but         1
    room,       1
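
Tokens such as `dexterity,` and `room,` in the output above still carry their trailing punctuation, because splitting on whitespace alone does not remove it. The hint in the exercise suggests using a regular expression instead. The following is a minimal sketch of that idea, not the course's reference solution: the function name `count_words_regex` is hypothetical, and the pattern `[a-z0-9]+` assumes plain ASCII English text (it would split a contraction like "don't" into two tokens and drop accented letters). It also uses `collections.Counter` from the standard library in place of a hand-rolled dictionary.

```python
import re
from collections import Counter


def count_words_regex(text):
    """Count word occurrences, ignoring case and punctuation.

    Hypothetical variant of count_words(); not part of the course code.
    """
    # Lowercase first so that "As" and "as" are counted together.
    text = text.lower()
    # Keep runs of ASCII letters/digits; every other character acts as a separator.
    tokens = re.findall(r"[a-z0-9]+", text)
    # Counter is a dict subclass that tallies how often each token occurs.
    return Counter(tokens)


if __name__ == "__main__":
    sample = "A crutch, which he managed with wonderful dexterity, hopping about."
    counts = count_words_regex(sample)
    print(counts["dexterity"])    # the comma is no longer part of the token
    print(counts.most_common(3))  # (word, count) pairs, highest count first
```

Because `Counter` is a dictionary subclass, the result can be passed to the same sorting and printing code in `test_run()` without changes.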


