Introduction
Enormous volumes of consumer data are posted to social media every minute, and social media analysis has become a critical component of audience analysis, competitive research, and product research.
Social media analytics tools help organizations around the world understand currently trending topics. Trending topics are the subjects and attitudes that generate a high volume of posts on a social media platform. Sentiment analysis uses social media analytics tools to determine users' attitudes toward a product or an idea.
In this post we will see how to build a social media sentiment analysis solution by bringing real-time Twitter events into Azure Event Hubs. Real-time Twitter trend analysis is a great example of an analytics tool because the hashtag model lets you listen for specific keywords (hashtags) and develop sentiment analysis of those feeds.
We are going to identify trending topics on Twitter in real time, which requires real-time analytics on tweet volume and sentiment for the key topics we choose.
To perform this exercise we will need the following prerequisites:
- An Azure subscription
- A Twitter account
- TwitterClientCore, which forwards the live Twitter feed; you can download it from the URL below
https://github.com/Azure/azure-stream-analytics/tree/master/DataGenerators/TwitterClientCore
- .NET Core CLI version 2.1
Event Hub creation for streaming input of data
The demo application generates events and sends them to an event hub. Azure Event Hubs is the preferred method for ingesting events into Stream Analytics. If you are new to Event Hubs, please refer to my previous article about them.
Next you will have to create an Event Hubs namespace and an event hub. Since there are plenty of resources on the web covering this, I am going to skip the detailed walkthrough of the following steps:
- Create namespace
- Create event hub
- Grant access to the event hub (make a note of the connection string from the shared access policies in this step; we will need it later)
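The connection string you copy from the shared access policies pane follows a fixed format. As a quick sanity check, here is a small Python sketch that assembles one; the namespace, policy, key, and event hub names below are placeholders, not values from this tutorial:

```python
# Assemble an Event Hubs connection string from its parts.
# All values passed in below are placeholders; substitute your own
# namespace, shared access policy name, key, and event hub name.
def build_connection_string(namespace: str, policy: str, key: str, event_hub: str) -> str:
    return (
        f"Endpoint=sb://{namespace}.servicebus.windows.net/;"
        f"SharedAccessKeyName={policy};"
        f"SharedAccessKey={key};"
        f"EntityPath={event_hub}"
    )

conn = build_connection_string("mynamespace", "RootManageSharedAccessKey", "abc123", "streaminput2")
print(conn)
```

The `EntityPath` segment appears when the policy is scoped to a single event hub; a namespace-level connection string omits it.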
Initiate the Twitter client application
For the application to receive tweet events directly, it needs permission to call the Twitter streaming APIs. We will walk through that in the following steps.
Creating a Twitter app
For the sake of simplicity I will not explain how to create a Twitter app here. One thing to note: you must copy the consumer API key and consumer API secret key when you create the app, as they are shown only once.
Configuring our application
Before starting our application, which collects tweets about a topic using the Twitter API, we have to provide certain information, including the Twitter consumer keys and the Event Hub connection string.
- Download the TwitterClientCore application from GitHub and save it in a local folder
- Open the app.config file in Notepad and change the appSettings values as follows:
oauth_consumer_key | Twitter Consumer Key (API key)
oauth_consumer_secret | Twitter Consumer Secret (API secret key)
oauth_token | Twitter Access token
oauth_token_secret | Twitter Access token secret
EventHubNameConnectionString | Event hub connection string (primary key)
EventHubName | Event hub name
- After completing the above changes, go to the directory where the TwitterClientCore app is saved. Build the project using the 'dotnet build' command, then start the app with 'dotnet run'. You should now see the app sending tweets to your event hub.
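For reference, the edited app.config should look roughly like the sketch below. This assumes the standard .NET appSettings layout; every value shown is a placeholder to be replaced with your own credentials:

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <appSettings>
    <!-- Placeholder values: replace with your own Twitter app credentials -->
    <add key="oauth_consumer_key" value="YOUR_CONSUMER_API_KEY" />
    <add key="oauth_consumer_secret" value="YOUR_CONSUMER_API_SECRET" />
    <add key="oauth_token" value="YOUR_ACCESS_TOKEN" />
    <add key="oauth_token_secret" value="YOUR_ACCESS_TOKEN_SECRET" />
    <!-- Copied from the event hub's shared access policies -->
    <add key="EventHubNameConnectionString" value="Endpoint=sb://YOUR_NAMESPACE.servicebus.windows.net/;SharedAccessKeyName=YOUR_POLICY;SharedAccessKey=YOUR_KEY" />
    <add key="EventHubName" value="YOUR_EVENT_HUB" />
  </appSettings>
</configuration>
```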
If you receive an error when trying to build the project, there is a minor workaround. I initially hit a build error that cost me some time to figure out; to resolve it, follow the steps below:
- Download the .NET Core SDK from the Microsoft site and install it on your system (I used the .NET Core 3.1 SDK installer)
- Run 'dotnet build'
- Run 'dotnet run'
Now that we can see the app sending tweets into the event hub successfully, we can proceed with setting up Stream Analytics.
Stream Analytics
Create a new 'Stream Analytics job' resource in your resource group.
Create Input
Once the job is created, you will see that it is in a 'Stopped' state, which is normal. The job can be started only after proper inputs and outputs have been configured, which we will do now.
In our newly created Stream Analytics job, select 'Inputs' under the 'Job topology' category in the left menu.
If the connection test fails when creating the input, make sure you have selected 'Connection string' as the authentication mode.
It's now time to create the input query. Go to the Overview page and click the edit button on the query box.
For testing, I provide a simple query: "select * from streaminput2". Remember, streaminput2 is the name we gave when creating the input of the Stream Analytics job. Once you have typed the query, you can see the output in the 'Input preview' pane below.
You can also click the 'Test query' button with a different query and see the results in the pane below.
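Beyond `select *`, a Stream Analytics query can aggregate the feed directly. As a sketch, a tumbling-window count of tweets per topic might look like the following; the `Topic` and `CreatedAt` field names are assumptions about the tweet payload, not confirmed fields from TwitterClientCore:

```sql
-- Count tweets per topic over 10-second tumbling windows.
-- 'Topic' and 'CreatedAt' are assumed field names in the incoming events.
SELECT
    Topic,
    COUNT(*) AS TweetCount,
    System.Timestamp() AS WindowEnd
FROM
    streaminput2 TIMESTAMP BY CreatedAt
GROUP BY
    Topic,
    TumblingWindow(second, 10)
```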
Create Output
So far we have created an event stream, an event input, and a query to transform the incoming data. One final step is to configure the outputs for the Stream Analytics job.
We can push the results to an Azure SQL database, Azure Blob storage, a Power BI dashboard, Azure Table storage, or even another event hub, depending on our needs. Here I will demo pushing to both a SQL database and Azure Blob storage.
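When a job has more than one output, the query can route results to each of them with separate SELECT ... INTO statements. A sketch, assuming the two outputs were named sqloutput and bloboutput when they were created (those names are placeholders):

```sql
-- Route the same input stream to both outputs.
-- 'sqloutput' and 'bloboutput' are assumed output alias names.
SELECT * INTO sqloutput FROM streaminput2
SELECT * INTO bloboutput FROM streaminput2
```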
Now that we have successfully configured the input, query, and output, we can go ahead and start the Stream Analytics job. Before starting it, make sure the Twitter client application we started earlier is still running.
We can see the data being stored as JSON files in blob storage; the file sizes grow gradually each time we refresh, which confirms that data is being stored.
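Stream Analytics writes blob output as line-delimited JSON. As a quick local sanity check on files downloaded from the container, you could tally the stored events like this; the sample records and their field names are made up for illustration, not the exact Twitter payload:

```python
import json

# Sample lines shaped like line-delimited JSON blob output.
# The 'Topic' and 'Text' field names here are illustrative assumptions.
sample_blob = """\
{"Topic": "Azure", "Text": "Loving Azure Event Hubs!"}
{"Topic": "Azure", "Text": "Stream Analytics is neat"}
{"Topic": "Python", "Text": "Tutorial time"}
"""

def count_by_topic(blob_text: str) -> dict:
    """Parse line-delimited JSON and count events per topic."""
    counts: dict = {}
    for line in blob_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        event = json.loads(line)
        topic = event.get("Topic", "unknown")
        counts[topic] = counts.get(topic, 0) + 1
    return counts

print(count_by_topic(sample_blob))  # prints {'Azure': 2, 'Python': 1}
```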
Summary
Phew… thanks for staying with me through this article. I always try to keep these short, but since this is an end-to-end project it took a while. Kudos to you all!
What next?
We have introduced ourselves to Azure Stream Analytics, through which we can easily capture live stream data into blob storage. In the future we will see how to save the live feed data into a SQL database and then create a real-time Power BI dashboard from a live Twitter feed, which I believe will be interesting topics.