Can you predict who will win the US election?

Samvel Gevorgyan
CEO, CYBER GATES
I cover cybercrime, privacy and security in digital form.

Who would win the battle for the White House to become the next President of the United States was a topic of hot debate in 2012.

Much of that debate was taking place online, with plenty of people blogging, tweeting or updating social media with their thoughts on Mitt Romney versus Barack Obama.


This provided a rich source of information about what people were thinking and feeling about the election race. So today I've decided to cover the digital data analysis techniques that are used to predict the US election. The 2012 election may well be remembered as the first in which big data analysis played a crucial role and had a tremendous impact on the outcome.

I am fairly familiar with these techniques because I had the opportunity to meet the CEO of EMC in January 2013 in Singapore. EMC was one of a select few companies that Twitter had entrusted to syndicate and provide access to the full Twitter feed for use in internal analytics applications for Obama's campaign in 2012. In my humble opinion, that was the reason Dell agreed in 2015 to buy the company for $67B, the largest deal in tech history.

The techniques of big data analysis remain the same, so let’s jump to 2016 and see what social media data is used to predict the US election today.

What Does Big Data Look Like?

Facebook

  • 293,000 statuses are updated / minute
  • 31.25 million messages are sent / minute

Twitter

  • 440,640 new tweets go online / minute

However, any data stored (posted, sent, etc.) on the Internet is messy and noisy, which is why we need to process the data to get value from it.

Processing the data

  • Data: raw data that has not been processed for use
  • Information: data that are processed to be useful; provides answers to "who", "what", "where", and "when" questions
  • Knowledge: application of data and information; answers "how" questions
  • Understanding: appreciation of "why"
  • Wisdom: evaluated understanding

Text Mining and Sentiment Analysis

The first technique used to find out what we are thinking and feeling about the election race is called “Text Mining and Sentiment Analysis”. The objective is to classify tweets as Positive, Neutral, or Negative by analyzing each word (or emoticon) in a post.
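To make the idea concrete, here is a minimal sketch of word-by-word classification. The word lists, the `classify` function and the scoring rule are my own illustrative assumptions, not the method of any particular analytics vendor; real systems rely on far larger lexicons or trained models.

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"great", "love", "win", "hope", "strong", ":)"}
NEGATIVE = {"bad", "hate", "lose", "liar", "weak", ":("}

# Match the two simple emoticons first, then ordinary words.
TOKEN_RE = re.compile(r":[()]|[a-z']+")


def classify(post):
    """Label a post Positive, Negative, or Neutral by counting lexicon hits."""
    tokens = TOKEN_RE.findall(post.lower())
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    if positive_hits > negative_hits:
        return "Positive"
    if negative_hits > positive_hits:
        return "Negative"
    return "Neutral"


print(classify("I love this candidate :)"))
print(classify("What a liar, I hate this :("))
print(classify("The debate is tonight"))
```

A post with no lexicon hits, or with equal hits on both sides, falls into Neutral. That is a deliberate simplification in this sketch.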

An example of a positive post

An example of a negative post

US presidential candidate Twitter analysis, according to Datameer:

  • Overall candidate sentiment level
    • Clinton: 48%
    • Trump: 48%
  • Positive and negative terms
    • Clinton: 5,522 positive, 6,098 negative
    • Trump: 3,254 positive, 3,550 negative
Note: based on 5,465 users, 11,283 tweets, and the past 30 days of data.
Note: sentiment level is measured as the number of positive words divided by the total number of positive and negative words in tweets (100% being the most positive).
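
The sentiment level defined in the note above can be recomputed from the positive and negative term counts listed for each candidate. The snippet below is only a sketch of that arithmetic; the function name `sentiment_level` is my own.

```python
def sentiment_level(positive, negative):
    """Percentage of positive terms among all positive and negative terms."""
    total = positive + negative
    if total == 0:
        raise ValueError("no positive or negative terms counted")
    return 100.0 * positive / total


# Term counts taken from the list above.
counts = {"Clinton": (5522, 6098), "Trump": (3254, 3550)}
levels = {name: sentiment_level(p, n) for name, (p, n) in counts.items()}

for name, level in levels.items():
    print(f"{name}: {level:.1f}%")
```

Both values round to the 48% shown above, which suggests the overall sentiment level was derived from these term counts.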

But wait a minute, what about the people who have not posted anything about the presidential candidates?

Facebook Reactions

In 2009, Facebook introduced a button that allowed people to give feedback on their friends’ posts. Facebook called it Like, and people liked it a lot.


On February 24, 2016, Facebook launched Facebook Reactions, which allows users to respond to posts with one of several reactions in addition to "liking" them.

So, how can we use Facebook Reactions to find out what other people think about the US presidential candidates?

Assume you hate Donald Trump and you post some negative thoughts about him online. Can we predict your friends’ thoughts based on their likes or other reactions? If your answer is yes, then you're absolutely right.
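As a rough sketch of that idea, the snippet below tallies friends' reactions to a single critical post. The mapping from reaction to opinion, the function `infer_opinions` and the sample names are all my own assumptions for illustration. Only Like and Love are treated as agreement with the author, because the other reactions are too ambiguous to read.

```python
from collections import Counter

# Assumed reading of each reaction on a post that criticises a candidate.
# This mapping is illustrative, not Facebook's or any campaign's method.
REACTION_MEANING = {
    "like": "agrees",
    "love": "agrees",
    "haha": "unclear",
    "wow": "unclear",
    "sad": "unclear",
    "angry": "unclear",
}


def infer_opinions(reactions):
    """Tally inferred opinions from a {friend: reaction} mapping for one post."""
    return Counter(
        REACTION_MEANING.get(reaction, "unclear")
        for reaction in reactions.values()
    )


# Hypothetical reactions to one negative post about a candidate.
sample = {"alice": "like", "bob": "angry", "carol": "love", "dave": "wow"}
tally = infer_opinions(sample)
print(tally["agrees"], "of", len(sample), "friends appear to share the author's view")
```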


The next step is to gather some "juicy" information about the audience, including gender, age, location and interests. That’s it!
