As a psychological scientist, data enthusiast, and novice programmer, one of the things I’ve been really interested in lately is applying text-mining tools to social media to learn more about public opinion on important news stories and current events.
I know it’s cliche to say it, but social media is an incredibly powerful tool. Not only does it obviously allow friends, family, and colleagues to easily communicate and share information with one another in near real-time, but it also provides a rich storehouse of communications for researchers and data geeks, such as myself, to comb through and mine for interesting patterns in human behavior and human thought.
For instance, I’ve previously written about research demonstrating how Twitter can be used to predict the risk of dying from a heart attack in particular regions of the country. And more recently, I’ve done a bit of text-mining in Twitter to try to learn more about our President’s tweeting habits, such as the time of day he generally prefers to tweet (usually around 8:00 am EST), the most frequent words he uses when he tweets (“thank,” “great,” and “Hillary”),* and whether his tweets include mostly positive or mostly negative words (on average it’s split pretty evenly, actually).
So, with my political and scientific interests being what they are, I figured I would turn to Twitter to try to learn a little more about how people have perceived and reacted to the continuous flood of news stories that has been coming out of North Carolina recently.
Wait, what’s that? You haven’t heard about the latest major breaking news coming from our state? Well, for those who don’t live here or who haven’t been paying much attention – or for those who maybe just need a little reminder – let’s begin with a brief recap on some of the major things that have gone down in the Tar Heel State during the past year or so.
*Note: Hmm…what are you trying to tell us, Donald?
A Brief Recap of Major #NCPOL Headlines
House Bill 2
In February 2016, the city of Charlotte passed an ordinance that added gay and transgender people to the list of classes protected against discrimination. The ordinance also allowed transgender residents to use the public restroom that corresponds to the gender with which they identify when in a government building. Of course, Republican state lawmakers weren’t happy with that at all. So almost immediately afterwards, in March of that same year, the Republican-led General Assembly gathered for a special session and passed the now infamous House Bill 2, also known as HB2 for short.
Though frequently referred to in the media as North Carolina’s “Bathroom Bill,” HB2 – or The Public Facilities Privacy and Security Act, as it’s formally known – actually went far beyond anything having to do with public restrooms. The law not only forced individuals in government buildings to use public restrooms and changing facilities that correspond to the biological sex on their birth certificate, it also limited anti-discrimination protections for LGBT North Carolinians and restricted towns and cities in North Carolina from passing their own non-discrimination ordinances – ordinances such as the one passed by Charlotte, which attempted to go above and beyond the minimal protections already established by the state.
Advocacy groups and rights organizations, such as the ACLU and Equality NC, argued that HB2 was highly discriminatory against LGBT citizens and, in particular, members of the transgender community. Many others agreed, and so the law brought a torrent of negative press for the state and drew national backlash from all parts of the country.
Indeed, by early December 2016, a total of 43 states, counties, and cities in the U.S., including the District of Columbia, had banned official, taxpayer-funded government travel to our state in direct response to HB2. In addition to this, many businesses, professional conferences, concerts, and sporting events (most notably the NCAA tournament and the NBA All-Star Game) pulled out of North Carolina, or removed North Carolina from consideration as a host location, until HB2 was repealed. The Associated Press estimated earlier this year that, all totaled, HB2 likely cost the state about $3.76 billion or more in lost business.
Then finally this past March, after an earlier attempt at repeal went down in flames spectacularly, the state legislature reached a deal with our newly elected Democratic Governor (more on that in a bit) to pass HB142, which once and for all, at long last, repealed HB2 – well, sort of. In fact, many complain that HB2 has been repealed only in name and that HB142 does not go far enough in restoring protections for LGBT citizens. Nor does it do enough to restore the rights of towns and cities to pass non-discrimination protections for their own citizens and workers. As such, progressive advocacy groups lament that North Carolinians are, for the time being, stuck with something akin to HB2.0.
The 2016 Election and Subsequent Coup
Now let’s back things up a bit to this past fall and this past year’s gubernatorial election.
In November, before the partial, kinda-sorta repeal of HB2, Republican Governor Pat McCrory lost his bid for re-election when he was defeated at the polls by Democratic challenger and former State Attorney General Roy Cooper.
McCrory took a lot of the heat for HB2, and it’s probably the major reason he was defeated. And in further embarrassment to North Carolina, McCrory didn’t exactly accept defeat graciously. Instead, he dragged things out for a month after the election by requesting recounts based on unconfirmed voting irregularities. Then, when he finally did concede in December and it became clear that a Democrat was soon going to take control over the Governor’s mansion, Republican state lawmakers intervened with what some have described as a “Legislative Coup” and stripped Cooper, the incoming Democratic Governor, of some of his executive powers – powers which, by the way, were enjoyed without question by McCrory.
Now, North Carolina is no longer considered by some to be a fully free and independent democracy.
North Carolina v. The U.S. Supreme Court
Fast forward to the present (May 2017, as I write this), and we find that North Carolina is still in the national spotlight.
Earlier this month, the U.S. Supreme Court declined to hear an appeal of a lower court ruling that struck down key parts of our restrictive and controversial voter ID law. The law had previously been nullified by the 4th U.S. Circuit Court of Appeals because it appeared intentionally designed to discriminate against black voters.
Then, just a few days later on May 22, the Supreme Court acted again to strike down two of North Carolina’s congressional districts – Districts 1 and 12 – after concluding they amounted to unconstitutional racial gerrymandering. The Court’s 5-3 decision in Cooper v. Harris appears likely to be a big win for voting rights advocates and opponents of partisan gerrymandering, as described below by the “failing” New York Times:
Writing for the majority, Justice Elena Kagan said states did not have unlimited leeway in drawing districts in a claimed attempt to comply with the voting law [the Voting Rights Act]. With the decision, the court was also trying to solve a constitutional puzzle: how to disentangle the roles of race and partisanship when black voters overwhelmingly favor Democrats. The difference matters because the Supreme Court has said only racial gerrymandering is constitutionally suspect.
Some election law experts said the ruling, which upheld a lower court decision, would make it easier to challenge voting districts based partly on partisan affiliations and partly on race.
“This will lead to many more successful racial gerrymandering cases in the American South and elsewhere,” said Richard L. Hasen, a law professor at the University of California, Irvine.
In the same piece, Eric H. Holder Jr., the former U.S. Attorney General and chairman of the National Democratic Redistricting Committee, described the ruling as “a watershed moment in the fight to end racial gerrymandering,” given that “North Carolina’s maps were among the worst racial gerrymanders in the nation.”
Phew, That’s A Lot of News!
So yeah, between the discriminatory laws, racial gerrymandering, contentious elections, and legislative coups, North Carolina certainly has made its way into the national headlines a lot lately.
But what if you’re one of those people who thinks all this negative press directed toward our state is just a bunch of made up “fake news” concocted by crooked, goofy liberal elites? I mean, surely the dishonest media are the only ones who care about any of this stuff, right? What about “real people?” What do they think?
Let’s turn now to Twitter to try to find out.
What Does Twitter Think of North Carolina Politics?
To try to learn more about what people think of North Carolina politics – and what news stories about North Carolina have gained the most traction as of late – I wrote a program in R, an open-source statistical programming language, to collect and compile a bunch of publicly available tweets that included the hashtag #ncpol (short for North Carolina Politics). All totaled, I collected 5,000 tweets, spanning from May 17, 2017 to May 24, 2017.
1. Who tweets the most about North Carolina politics?
Between May 17 and May 24, there were 2,519 accounts that tweeted using the hashtag #ncpol.
The graphic below shows the accounts that tweeted most frequently during this time frame. By quite a wide margin, the one who tweeted most frequently about North Carolina politics was @jmsexton_, with 112 tweets. He was followed by @timothypeck (59 tweets), @NC_Zero (53 tweets), @RaleighReporter (46 tweets), and @ncFortyEight (43 tweets).
Of course, if you’re interested in keeping up with North Carolina politics and you’re on Twitter, then you might want to check out a few of these accounts.
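For the technically curious, a ranking like the one above boils down to counting tweets per account. My own analysis was done in R, so the Python below is only a minimal sketch of the idea, not the code I actually ran. The `screen_name` field and the sample accounts are made up for illustration.

```python
from collections import Counter

def top_tweeters(tweets, n=5):
    """Return the n most active accounts as (screen_name, tweet_count) pairs."""
    counts = Counter(tweet["screen_name"] for tweet in tweets)
    return counts.most_common(n)

def unique_accounts(tweets):
    """Count how many distinct accounts appear in the collection."""
    return len({tweet["screen_name"] for tweet in tweets})

# Hypothetical stand-in data; the real collection held 5,000 #ncpol tweets.
sample_tweets = [
    {"screen_name": "account_a", "text": "first tweet #ncpol"},
    {"screen_name": "account_b", "text": "second tweet #ncpol"},
    {"screen_name": "account_a", "text": "third tweet #ncpol"},
    {"screen_name": "account_c", "text": "fourth tweet #ncpol"},
    {"screen_name": "account_a", "text": "fifth tweet #ncpol"},
    {"screen_name": "account_b", "text": "sixth tweet #ncpol"},
]

if __name__ == "__main__":
    print(unique_accounts(sample_tweets))
    print(top_tweeters(sample_tweets, n=2))
```

Run over the full collection, the first function would give the number of distinct accounts and the second the most active ones.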
2. What are the most frequent words that appear in tweets about North Carolina politics?
The next graphic shows a word cloud of the most frequent terms that appeared in tweets about North Carolina politics. Keep in mind that the size of each word in the cloud represents the number of times it appeared in my collection of 5,000 tweets.
As you can see, the most frequent term was “gerrymandering,” with 614 occurrences across 5,000 tweets. That works out to about 12 occurrences for every 100 tweets included in my analysis. Among the ten most frequent terms, this was followed by “state” (604), “just” (574), “map” (503), “power” (466), “struck” (456), “Jim” (450; this is in reference to Jim Womack, who is running for NCGOP Chair), “exemplifies” (448), “congressional” (378), and “districts” (327).
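If you’re wondering how term counts like these are produced, the basic recipe is to clean each tweet, split it into words, throw out filler words, and tally what’s left. Here is a rough Python sketch of that recipe (again, my actual analysis was in R). The tiny stop-word list and the cleaning rules are my assumptions for this example, and a real analysis would use a much longer list.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "of", "in", "to", "and", "is", "rt"}

def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across a list of tweets."""
    counts = Counter()
    for text in texts:
        # Drop URLs, @mentions, and hashtags before splitting into words.
        cleaned = re.sub(r"(https?://\S+|[@#]\w+)", " ", text.lower())
        words = re.findall(r"[a-z']+", cleaned)
        counts.update(word for word in words if word not in stop_words)
    return counts

def per_hundred_tweets(occurrences, n_tweets):
    """Express a term count as occurrences per 100 tweets."""
    return round(occurrences / n_tweets * 100, 1)

# Hypothetical tweets, not taken from the real data set.
sample_texts = [
    "Court struck down the map #ncpol",
    "Gerrymandering struck again https://example.com",
    "RT @someone: gerrymandering in the state",
]

if __name__ == "__main__":
    print(word_frequencies(sample_texts).most_common(3))
    print(per_hundred_tweets(614, 5000))
```

The last line simply redoes the arithmetic from the paragraph above: 614 occurrences in 5,000 tweets.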
Obviously, the fact that “gerrymandering” was the most frequent term means there was a lot of discussion on Twitter this past week about the recent Supreme Court ruling that struck down the 1st and 12th congressional districts. Indeed, the tweet shown below, from @PoliticsWolf, was the most frequently retweeted post during the time window I analyzed:
— Stephen Wolf (@PoliticsWolf) May 22, 2017
3. Are tweets about North Carolina politics mostly positive, mostly negative, or neutral?
Okay, so a lot of people talking about North Carolina politics on Twitter recently were talking about the Supreme Court ruling in Cooper v. Harris. But what was the overall “emotional tone” of these conversations? Were most tweets positive, negative, or neutral?
In addition to looking at how frequently all words appeared in my collection of #ncpol tweets, I also performed something called a sentiment analysis to determine how many of the words in each tweet were positive and how many were negative. Examples of positive words are things like “love,” “respect,” and “admire,” whereas examples of negative words are things like “hate,” “terror,” and “disgrace.”
As a technical aside, “sentiment” is defined here as the number of positive words per tweet minus the number of negative words per tweet, and words are defined as either positive or negative based on Bing Liu’s Opinion Lexicon. Tweets with more positive words than negative words are assumed to express positive emotion, whereas tweets with more negative words than positive words are assumed to express negative emotion.
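To make that definition concrete, here is a small Python sketch of the score (my analysis itself was done in R). The two word lists are just the example words from this post standing in for Bing Liu’s Opinion Lexicon, which contains thousands of entries, and the function name is my own.

```python
import re

# Stand-ins for the real lexicon: only the example words mentioned in this post.
POSITIVE = {"love", "respect", "admire"}
NEGATIVE = {"hate", "terror", "disgrace"}

def sentiment_score(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    n_positive = sum(word in positive for word in words)
    n_negative = sum(word in negative for word in words)
    return n_positive - n_negative

if __name__ == "__main__":
    for tweet in [
        "I love and respect this ruling",
        "What a disgrace",
        "I love it but hate the delay",
    ]:
        print(sentiment_score(tweet), tweet)
```

A score above zero means more positive words than negative words, a score below zero means the reverse, and zero means the two are balanced (or that no lexicon words appeared at all).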
All totaled, about half of the tweets in my set (52%) included either positive or negative words. Moreover, as shown in the graphic below, 57% of these tweets were flagged as predominantly negative, meaning they included more negative words than positive words. Meanwhile, 35% were predominantly positive, with more positive words than negative words, and only 8% were neutral, with an equal number of positive and negative words.
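Percentages like these come from labeling each tweet and then tallying the labels. The Python sketch below shows one way that tally could work, assuming (as described above) that tweets with no lexicon words are set aside before the shares are calculated. The word lists are stand-ins for the real lexicon and the sample tweets are invented.

```python
import re
from collections import Counter

# Stand-ins for Bing Liu's Opinion Lexicon (an assumption for this sketch).
POSITIVE = {"love", "respect", "admire"}
NEGATIVE = {"hate", "terror", "disgrace"}

def classify(text):
    """Label a tweet, or return None when it contains no lexicon words."""
    words = re.findall(r"[a-z']+", text.lower())
    n_positive = sum(word in POSITIVE for word in words)
    n_negative = sum(word in NEGATIVE for word in words)
    if n_positive == 0 and n_negative == 0:
        return None
    if n_positive > n_negative:
        return "positive"
    if n_negative > n_positive:
        return "negative"
    return "neutral"  # equal, nonzero counts of positive and negative words

def tone_shares(texts):
    """Return (share of tweets with any lexicon word, share of each label among those)."""
    labels = [classify(text) for text in texts]
    scored = [label for label in labels if label is not None]
    coverage = len(scored) / len(texts) if texts else 0.0
    counts = Counter(scored)
    shares = {
        label: (counts[label] / len(scored) if scored else 0.0)
        for label in ("negative", "positive", "neutral")
    }
    return coverage, shares

# Hypothetical tweets for illustration only.
sample_texts = [
    "New map released today",
    "Committee meets at noon",
    "Session starts Monday",
    "Vote count is final",
    "This law is a disgrace",
    "I hate this bill",
    "Love this decision",
    "I respect him but hate the policy",
]

if __name__ == "__main__":
    print(tone_shares(sample_texts))
```

In this toy example, half of the tweets contain a lexicon word, and of those, half are negative, a quarter are positive, and a quarter are neutral.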
The most frequent word among these emotionally charged tweets was “struck,” as in “the U.S. Supreme Court struck down two of North Carolina’s congressional districts” (see below).
Yet, this single word didn’t fully account for all the negativity in this set of tweets. Even when I excluded the word “struck” from the analysis, the percentage of tweets categorized as negative (48%) still exceeded the percentage of tweets categorized as either positive (43%) or neutral (9%).
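A check like this one amounts to re-running the classification while ignoring a chosen word. Here is a hedged Python sketch of that idea. It assumes “struck” counts as a negative word, as the results above suggest, and the word lists and sample tweets are stand-ins rather than my real data.

```python
import re

# Stand-in lexicon; treating "struck" as negative is an assumption for this sketch.
POSITIVE = {"love", "respect", "admire"}
NEGATIVE = {"hate", "terror", "disgrace", "struck"}

def classify(text, exclude=frozenset()):
    """Label a tweet while ignoring any words listed in `exclude`."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in exclude]
    n_positive = sum(word in POSITIVE for word in words)
    n_negative = sum(word in NEGATIVE for word in words)
    if n_positive == 0 and n_negative == 0:
        return None  # no lexicon words left, so the tweet is not scored
    if n_positive > n_negative:
        return "positive"
    if n_negative > n_positive:
        return "negative"
    return "neutral"

def negative_share(texts, exclude=frozenset()):
    """Share of scored tweets labeled negative, optionally ignoring some words."""
    labels = [classify(text, exclude) for text in texts]
    scored = [label for label in labels if label is not None]
    return scored.count("negative") / len(scored) if scored else 0.0

# Hypothetical tweets for illustration only.
sample_texts = [
    "Supreme Court struck down two districts",
    "I respect the ruling that struck down the map",
    "This law is a disgrace",
    "Love this decision",
]

if __name__ == "__main__":
    print(negative_share(sample_texts))
    print(negative_share(sample_texts, exclude={"struck"}))
```

In the toy data, dropping “struck” removes one tweet from the scored set entirely and flips another from neutral to positive, so the negative share falls.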
So, What’s The Take Home Message in All This?
Clearly, the analysis I’ve provided here is far from perfect, as it’s difficult for a computer program to capture the precise nature and tone of conversations among real people. For one thing, my sample of Twitter users might not be representative of the larger population of people who live in North Carolina and pay attention to local and state-level politics. Therefore, the opinions of people on Twitter might not accurately reflect the opinions of those who are not on Twitter.
Furthermore, there’s no guarantee that the tweets I analyzed from between May 17 and May 24 are representative of tweets from other points in time throughout the year. Perhaps, with news of the Supreme Court striking down two North Carolina congressional districts for racial gerrymandering, tweets during this time window were more negative than usual. Third and finally, it’s possible that merely counting up the number of positive and negative words per tweet doesn’t always provide a very good measure of real human emotion. Indeed, my analysis would completely miss the point of a sarcastic tweet along the lines of, “@realDonaldTrump is the greatest president in history and has the largest, most beautiful hands of any of the leaders in the western world.” And let’s face it, there’s a lot of sarcasm and snark on Twitter.
But although imperfect, an analysis such as this one can be useful if we think of it as merely providing a rough estimate of what people on Twitter have to say. And I would argue that an imperfect estimate is usually better than no estimate at all. And when viewed in this way, it certainly seems that most Twitter conversations about North Carolina politics are indeed generally negative.
Now is this surprising? Probably not, given what we’ve seen transpire in our state over the past year. Moreover, maybe this is not even unique to North Carolina. Perhaps nowadays, people are expressing lots of negative emotion about U.S. politics no matter where they live.
This post was also published over at Stronger North Carolina.
Brian Kurilla is a psychological scientist with a Ph.D. in cognitive psychology. You can follow Brian on Twitter @briankurilla