Natural Language Processing-Nov-7-Dec-6-Raghav

I am not able to filter nouns in my topic modelling project. I am trying to do it inside a DataFrame column:

import re
nouns = []
def pos_noun(text):
    #for sent in nltk.pos_tag(text):
    #nouns.append([token for token in sent if re.search("NN.*", token[1])])
    for word,pos in nltk.pos_tag(text):
        if (pos == 'NN' or pos == 'NNP' or pos == 'NNS' or pos == 'NNPS'):
            text = nouns.append(word)
    #text = nouns
    return text

voc['review_pos_noun'] = voc.review_token.apply(pos_noun)
voc.head(10)

It returns None for every row.
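A likely cause is the line `text = nouns.append(word)`: `list.append()` changes the list in place and returns None, so `text` ends up as None. The shared module-level `nouns` list would also mix nouns from all rows. Below is a minimal sketch of one possible fix, assuming `review_token` holds a list of tokens per row. The `tagger` parameter and `fake_tagger` are hypothetical additions so the sketch runs without the NLTK tagger data; in real use the default `nltk.pos_tag` applies.

```python
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}  # Penn Treebank noun tags


def pos_noun(tokens, tagger=None):
    """Return the nouns found in one list of tokens."""
    if tagger is None:
        # Default to NLTK; its tagger model must be downloaded beforehand.
        import nltk
        tagger = nltk.pos_tag
    nouns = []  # fresh list per call, so rows do not leak into each other
    for word, pos in tagger(tokens):
        if pos in NOUN_TAGS:
            nouns.append(word)  # append() mutates in place and returns None
    return nouns  # return the list itself, not the result of append()


# Hypothetical stand-in tagger, only so the sketch runs without NLTK data.
def fake_tagger(tokens):
    tags = {"phone": "NN", "batteries": "NNS", "is": "VBZ", "great": "JJ"}
    return [(token, tags.get(token, "XX")) for token in tokens]


print(pos_noun(["phone", "is", "great"], tagger=fake_tagger))  # ['phone']
```

With this version, `voc['review_pos_noun'] = voc.review_token.apply(pos_noun)` should give each row its own list of nouns.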
 
Please guide me on how to start with the following task.
Note: the original file is in xlsx format. I was not able to attach an Excel file, so I copied the text snippets from it into a text document.

Task:

Develop a simple sentiment analyzer



Part 1


The task is to create code that is able to read a text snippet and identify the sentiment of the input text snippet based on a dictionary.



This dictionary needs to store 2 sets of words: one set indicating positive sentiment (e.g., good, great, happy, etc.) and the other indicating negative sentiment (e.g., bad, angry, unhappy, sad, etc.). You will need to build this dictionary and take care that simple transformations of words, such as case changes or different noun or verb forms, do not affect the functioning of your code. You may also extend the dictionary to include phrases or terms containing multiple words to make it more general.
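One way to approach the dictionary lookup is sketched below. It is only a rough illustration: the two word sets are tiny stand-ins, and the suffix list in `SUFFIXES` is a crude assumption of mine for handling word forms. A proper stemmer or lemmatizer (for example from NLTK) would probably be more robust.

```python
import re

# Tiny stand-in lexicons; real ones would be far larger.
POSITIVE = {"good", "great", "happy"}
NEGATIVE = {"bad", "angry", "unhappy", "sad"}

SUFFIXES = ("ing", "ed", "es", "s", "ly")  # crude, illustrative only


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def candidates(word):
    """Yield the lowercased word, then versions with a common suffix removed."""
    word = word.lower()
    yield word
    for suffix in SUFFIXES:
        # Keep at least 3 characters so short words are not mangled.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            yield word[: -len(suffix)]


def in_lexicon(word, lexicon):
    """True if the word or one of its stripped forms is in the lexicon."""
    return any(form in lexicon for form in candidates(word))


for token in tokenize("GREAT phone, sadly the charger is BAD"):
    if in_lexicon(token, POSITIVE):
        print(token, "-> positive")
    elif in_lexicon(token, NEGATIVE):
        print(token, "-> negative")
```

Here "GREAT" matches through lowercasing and "sadly" matches "sad" through suffix stripping. Multi-word phrases are not handled in this sketch.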



In case the input text snippet includes both negative and positive words, you may count the number of positive and negative terms and give the snippet a sentiment score that ranges from -1 to 1. An example of such a function is given next:

Sentiment score = (number of positive words - number of negative words) / max{number of positive words, number of negative words}



A score close to 0 indicates a neutral statement, whereas a score close to 1 or -1 indicates a positive or negative sentiment, respectively.
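The score formula can be sketched as a small function. The task text does not say what to do when a snippet contains no sentiment words at all (the denominator would be 0), so returning 0.0 in that case is my assumption.

```python
def sentiment_score(n_pos, n_neg):
    """Score in [-1, 1] from counts of positive and negative words."""
    denom = max(n_pos, n_neg)
    if denom == 0:
        # Assumption: no sentiment words at all is treated as neutral.
        return 0.0
    return (n_pos - n_neg) / denom


print(sentiment_score(3, 1))  # (3 - 1) / 3, about 0.67
print(sentiment_score(0, 2))  # -1.0
print(sentiment_score(2, 2))  # 0.0
```

Because the denominator is the larger of the two counts, any snippet with words of only one polarity scores exactly 1 or -1, however many such words it has.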



You may use the attached sample text snippets and the dictionary links provided with this mail, and extend them to make them more general. Feel free to think about and implement other enhancements that will improve the sentiment analyser. Remember, the idea is to capture sentiment accurately for customer feedback or customer service comments.



Part 2


Extend the code created in Part 1 to read multiple text snippets (the number may vary from 1 to a few thousand) and identify the sentiment of each snippet based on the stored dictionary.
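For Part 2, one option is to wrap the single-snippet scorer in a generator that walks over any iterable of snippets. The sketch below assumes one snippet per item and uses a tiny stand-in lexicon with exact word matching only. The names `score_snippet` and `score_all` are my own.

```python
import re

# Tiny stand-in lexicon; replace with the full word lists.
POSITIVE = {"good", "great", "happy"}
NEGATIVE = {"bad", "angry", "unhappy", "sad"}


def score_snippet(text):
    """Score one snippet with (pos - neg) / max(pos, neg); 0.0 if no hits."""
    words = re.findall(r"[a-z']+", text.lower())
    n_pos = sum(word in POSITIVE for word in words)
    n_neg = sum(word in NEGATIVE for word in words)
    denom = max(n_pos, n_neg)
    return (n_pos - n_neg) / denom if denom else 0.0


def score_all(snippets):
    """Yield (snippet, score) for each non-empty snippet."""
    for snippet in snippets:
        snippet = snippet.strip()
        if snippet:
            yield snippet, score_snippet(snippet)


snippets = [
    "Great service, very happy!",
    "Bad experience. The agent was angry and I am sad.",
    "The parcel arrived on Tuesday.",
]
for text, score in score_all(snippets):
    print(f"{score:+.2f}  {text}")
```

Since `score_all` only iterates, an open text file with one snippet per line could be passed in directly. If the snippets stay in the xlsx file, loading a column with pandas `read_excel` and iterating over it should work the same way.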



Note:

You can start with the dictionary based on the following links:

https://ptrckprry.com/course/ssd/data/positive-words.txt

https://ptrckprry.com/course/ssd/data/negative-words.txt
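Once the two files are downloaded, they can be read into sets. The sketch below assumes the files contain one word per line, with blank lines and header lines starting with `;` to be skipped. Please check this against the actual files. The `latin-1` default encoding is also an assumption.

```python
def parse_word_list(lines):
    """Build a set of lowercased words, skipping blanks and ';' comments."""
    words = set()
    for line in lines:
        line = line.strip().lower()
        if line and not line.startswith(";"):
            words.add(line)
    return words


def load_word_list(path, encoding="latin-1"):
    """Read a word list file from disk (the encoding is an assumption)."""
    with open(path, encoding=encoding) as handle:
        return parse_word_list(handle)


# Small in-memory sample standing in for a downloaded file.
sample = [";; header comment", "", "a+", "Abound", "abounds"]
print(sorted(parse_word_list(sample)))  # ['a+', 'abound', 'abounds']
```

After downloading, something like `positive = load_word_list("positive-words.txt")` would give the positive set, and the same call with the other file gives the negative set.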

Thanks
Manjari
 

Attachments

  • sample text.txt