Predicting the Election with Twitter

Want to know who’s going to win the presidential election? A Georgia Southern study suggests you might find the answer on Twitter.

Georgia Southern data science graduates Garrett Neidlinger, Chris Bergin and Jacob Dahlberg created a senior capstone project that compared the political opinions of Twitter users with Democratic and Republican primary results. They captured almost 17 million tweets over a month and a half surrounding Super Tuesday on March 1, using search terms based on primary and caucus names, candidate names and slogans.

After the data was analyzed, one of the study’s models suggested a strong correlation between the opinions of verified Twitter users and the number of delegates a candidate received in each election. In short, when verified users tweeted favorably about a candidate before an election, that candidate won the primary.

“It really speaks to the influence of the verified users,” said Neidlinger. “It’s interesting because of how much influence these accounts have. They have so many more followers and more people retweeting them that they can share their influence with everybody else.”

Unlike a normal user account, a verified Twitter account is approved through a stringent application process. Verification doesn’t depend on the user’s follower count or tweet count. According to Twitter, verified accounts belong to “highly sought users in music, acting, fashion, government, politics, religion, journalism, media, sports, business and other key interest areas.”

“You have to have some degree of influence,” said Neidlinger. “People have to know who you are.”

While the data showed the strongest link between verified users’ sentiment and election results, it also showed weaker connections between those results and both the number of retweets and the overall number of tweets about a particular candidate.

“The subset of the population that are voting and are on social media is reflective of the actual population,” said Neidlinger.

And if Donald Trump’s presidential campaign in particular has demonstrated anything this year, it is that a sea change is underway in how candidates use Twitter and other social media platforms.

The truth is out there

Studies like this one demonstrate the potential of “big data” not just for analyzing politics, but also for making predictions, creating efficiency and solving problems in businesses, organizations and society as a whole.

More than just a buzzword, big data refers to the massive amounts of structured data (as in a database) and unstructured data (longform text in different formats) that businesses and organizations collect and store on a regular basis.

Just how much data is being collected? Consider this. The global internet population presently represents 2.4 billion people — all of whom are not only consuming data, but creating it in unprecedented volume. In fact, every minute of the day, YouTube users upload 72 hours of new video, Twitter users tweet 277,000 times, Instagram users post 216,000 new photos and Facebook users share 2,460,000 pieces of new content.

In addition to social media data, there are corporations, hospitals, grocery stores, fitness clubs, automobiles and a myriad of personal devices collecting massive amounts of data about you each time you visit a website, make a purchase or just walk around the block.

Cheryl Aasheim, professor of Information Technology and one of the architects of the undergraduate data science concentration at Georgia Southern, believes that with all this data out there accumulating exponentially, we’re only scratching the surface of what can be done with it.

“A lot of companies are still overwhelmed with their structured data,” she said. “They don’t know what to do with this other data. Even the amount of structured data that we produce is more than we used to. Just think about your FitBit, and all these other health applications on your phone. Those are all collecting data. Imagine if they merged with insurance companies.”

The problem facing most organizations in the U.S. and abroad is finding people with the skills to parse and analyze the data. According to a widely cited report from McKinsey & Company, the U.S. faces a shortage of 140,000 to 190,000 people with deep analytical skills, and needs some 1.5 million managers and analysts who can analyze big data and make decisions based on their findings.

Big data has unlocked vast new treasures for marketing professionals who can now pinpoint advertising based on a specific user’s demographics and preferences. But what could this data unlock in the future for scientists? For corporations? For law enforcement? There’s no way to tell for sure, but Aasheim says one thing is for certain.

“I think we all realize the amount of data is not going to get smaller,” she said.

— Doy Cave
