Sentiment Analysis
What is Sentiment Analysis?
Sentiment Analysis refers to the use of natural language processing (NLP), text analysis, computational linguistics, and biometrics to automatically identify, extract, quantify, and qualify subjective information. It is also known as Opinion Mining or Emotion AI, and it is widely applied to voice-of-the-customer materials such as product and service reviews, survey responses, and social media posts.
This text analysis technique tries to capture the essence of a free-text excerpt: by interpreting the words used in the text, it classifies the writer's sentiment (feeling) as positive, negative, or neutral.
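To make the idea concrete, here is a minimal sketch of word-based classification. It is only a toy: the two word lists and the `classify` function are made up for illustration, and they are far smaller and cruder than what real sentiment libraries rely on.

```python
import re

# Tiny, hand-picked word lists (hypothetical; real lexicons hold thousands of entries).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def classify(text: str) -> str:
    """Label text as positive, negative, or neutral by counting known words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sample in (
    "I love this great phone",
    "Terrible service, I hate it",
    "The box arrived on Monday",
):
    print(f"{sample!r} -> {classify(sample)}")
```

A counter like this ignores negation ("not good") and intensity ("very good"), which is one reason to rely on an established library rather than a hand-rolled word list.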
Basic Example of the Sentiment Analysis Return
The initial idea — project
The scope of the project was basically:
(1) connect with some data sources on the web and scrape data (web scraping)
(2) store the scraped data in a cloud environment
(3) perform sentiment analysis for the collected data
(4) develop dashboards to visualize the analysis, considering the different data sources and the final results
Architecture Proposed
With the proposed architecture, the project will be able to:
(1) connect with some data sources on the web and scrape up-to-date content from news agencies such as NYT, Toronto Star, CNN, and BBC
(2) generate a CSV file with all the scraped data
(3) use the Python library TextBlob to perform sentiment analysis and get the polarity (score) of each text
(4) store the CSV file in an Azure Blob Storage instance
(5) use Azure Data Factory (ADF) to ingest the data and push all the data to an Azure SQL database
(6) consume the data available in the Azure SQL database using Power BI or another visualization tool
(7) alternatively, use the additional scenario (3_PROCESSING), which brings in Azure Cognitive Services, Azure Stream Analytics, and Azure Machine Learning to complete the architecture and perform additional analysis on the data
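Steps (1) to (3) above can be sketched locally with the standard library alone. This is an illustration under assumptions, not the project's actual code: it assumes headlines live in `<h2>` tags (real news sites differ and need their own selectors), it parses a hard-coded HTML string instead of fetching a page, and the names `HeadlineParser`, `extract_headlines`, and `build_csv` are hypothetical. The scorer is passed in as a function so any sentiment library can be plugged in.

```python
import csv
import io
from html.parser import HTMLParser
from typing import Callable, Iterable


class HeadlineParser(HTMLParser):
    """Collect the text of every <h2> element (assumed here to hold headlines)."""

    def __init__(self) -> None:
        super().__init__()
        self.headlines = []
        self._in_h2 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        if self._in_h2:
            self.headlines[-1] += data


def extract_headlines(html: str) -> list:
    """Return the non-empty headline strings found in the HTML."""
    parser = HeadlineParser()
    parser.feed(html)
    return [h.strip() for h in parser.headlines if h.strip()]


def build_csv(source: str, headlines: Iterable[str], score_fn: Callable[[str], float]) -> str:
    """Return CSV text with one row per headline: source, text, polarity."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "text", "polarity"])
    for text in headlines:
        writer.writerow([source, text, score_fn(text)])
    return buffer.getvalue()


sample_html = "<html><body><h2>Markets rally</h2><p>...</p><h2>Storm hits coast</h2></body></html>"
headlines = extract_headlines(sample_html)
# Placeholder scorer that always returns 0.0; swap in a real polarity function.
print(build_csv("example-news", headlines, score_fn=lambda text: 0.0))
```

Keeping the scorer as a parameter means the scraping and CSV code does not change if the analysis later moves from TextBlob to another service.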
TextBlob details
SENTIMENT ANALYSIS
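from textblob import TextBlob
obj = TextBlob("I love this product")  # example text to analyze
sentiment = obj.sentiment.polarity  # float between -1.0 (negative) and 1.0 (positive)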
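A polarity score is easier to chart once it is turned into a label. The sketch below wraps TextBlob's `sentiment` property, which returns a polarity and a subjectivity value. The helper names `polarity_label` and `analyze`, and the 0.05 neutral band, are assumptions chosen for illustration; the right threshold depends on the data.

```python
def polarity_label(polarity: float, neutral_band: float = 0.05) -> str:
    """Map a polarity in [-1.0, 1.0] to a label; the band width is an arbitrary choice."""
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


def analyze(text: str) -> dict:
    """Score one text with TextBlob's default analyzer."""
    # Imported here so polarity_label stays usable without TextBlob installed.
    from textblob import TextBlob

    sentiment = TextBlob(text).sentiment  # namedtuple: (polarity, subjectivity)
    return {
        "text": text,
        "polarity": sentiment.polarity,          # -1.0 (negative) to 1.0 (positive)
        "subjectivity": sentiment.subjectivity,  # 0.0 (objective) to 1.0 (subjective)
        "label": polarity_label(sentiment.polarity),
    }
```

Calling `analyze("some headline")` returns a dictionary that can be written straight into a CSV row, one per scraped text.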
TextBlob Features
- Noun phrase extraction
- Part-of-speech tagging
- Sentiment analysis
- Classification (Naive Bayes, Decision Tree)
- Language translation and detection powered by Google Translate (deprecated in TextBlob 0.16 and removed in 0.18)
- Tokenization (splitting text into words and sentences)
- Word and phrase frequencies
- Parsing
- n-grams
- Word inflection (pluralization and singularization) and lemmatization
- Spelling correction
- Add new models or languages through extensions
- WordNet integration
$ pip install -U textblob
$ python -m textblob.download_corpora
STEP#1 — Python code running
(1) Scraping the data
(2) Generating CSV file with the data
(3) Sending CSV file to Azure Blob Storage
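The upload in step (3) might look like the sketch below, using the `azure-storage-blob` package. Treat it as an assumption-laden example rather than the project's code: the `blob_name_for` naming scheme, the `upload_csv` helper, and the choice to read the connection string from the `AZURE_STORAGE_CONNECTION_STRING` environment variable are all illustrative.

```python
import os
from datetime import date


def blob_name_for(source: str, day: date) -> str:
    """Build a dated blob name (the naming scheme is hypothetical)."""
    return f"news/{day.isoformat()}/{source}.csv"


def upload_csv(csv_text: str, container: str, blob_name: str) -> None:
    """Upload CSV text to Azure Blob Storage, replacing any blob with the same name."""
    # Requires: pip install azure-storage-blob
    from azure.storage.blob import BlobServiceClient

    # Read the connection string from the environment instead of hard-coding it.
    conn_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=blob_name)
    blob.upload_blob(csv_text.encode("utf-8"), overwrite=True)


print(blob_name_for("bbc", date(2024, 1, 31)))
```

Dated blob names keep each day's scrape separate, which makes it simpler for a downstream ingestion job to pick up only new files.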
STEP#2 — Data Ingestion — Data Factory | CSV to SQL
(1) Collecting the workload stored on Azure Blob Storage
(2) Ingesting this data into a SQL database
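Azure Data Factory performs this copy as a configured pipeline, not as Python code. Still, the effect of the step can be illustrated locally. The sketch below uses an in-memory SQLite database as a stand-in for Azure SQL; the `articles` table, its columns, and the sample rows are made up for the example.

```python
import csv
import io
import sqlite3


def load_csv_into_sql(csv_text: str, conn: sqlite3.Connection) -> int:
    """Insert CSV rows (source, text, polarity) into an 'articles' table; return rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles (source TEXT, content TEXT, polarity REAL)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["source"], r["text"], float(r["polarity"])) for r in reader]
    conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Made-up sample data in the same shape as the scraped CSV.
sample_csv = (
    "source,text,polarity\n"
    "bbc,Markets rally,0.4\n"
    "bbc,Storm hits coast,-0.2\n"
    "cnn,Talks resume,0.0\n"
)
conn = sqlite3.connect(":memory:")
print(load_csv_into_sql(sample_csv, conn), "rows loaded")

# The kind of aggregate a dashboard might chart: average polarity per source.
for source, avg in conn.execute(
    "SELECT source, AVG(polarity) FROM articles GROUP BY source ORDER BY source"
):
    print(source, round(avg, 2))
```

Once the rows sit in a SQL table, a visualization tool only needs queries like the aggregate above to build charts per data source.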
Azure Resources
STEP#3 — Power BI Dashboards
The dashboards were developed in Power BI, connected to the Azure SQL database.
Power BI Dashboards
The Purpose
The purpose of this work is ONLY to present a simple solution for the proposed scope using free resources (or at least free for students) available on the web.
There IS NO intention to show the most optimized solution or to showcase the best tools. It is also important to recognize that, in some cases, more steps than needed were applied in order to test and validate different features and services.
The main idea of this project is to explore possibilities, resources, and solutions that could serve as alternatives for solving the presented problem/scope effectively.
Thank you! ;-)