Methodology

https://legacy.pewresearch.org/short-reads/2014/01/30/on-twitter-criticism-exceeds-praise-for-obamas-speech/

This analysis of the Twitter reaction to the 2014 State of the Union address employed media research methods that combined Pew Research’s content analysis rules with computer coding software developed by Crimson Hexagon (CH). The report is based on an examination of more than 1.6 million tweets and is a follow-up to a similar report produced after the 2013 State of the Union speech.

Crimson Hexagon is a software platform that identifies statistical patterns in words used in online texts. Researchers enter key terms using Boolean search logic so the software can identify relevant material to analyze. Pew Research draws its analysis sample from all public Twitter posts. Then a researcher trains the software to classify documents using examples from those collected posts. Finally, the software classifies the rest of the online content according to the patterns derived during the training.

This analysis contains two parts. The first is an analysis of the sentiment or tone of the response on Twitter. The second is an analysis of the most discussed topics using keyword searches.

The time frame for this report was 9 p.m. ET on January 28, 2014, to 1 a.m. ET on January 29, 2014.

The Boolean search used to identify tweets about the State of the Union address was: (state AND union) OR Obama OR SOTU.
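As a rough illustration only (not Crimson Hexagon's implementation), the logic of that Boolean filter could be expressed in a few lines of Python:

```python
import re

def matches_sotu_query(tweet_text: str) -> bool:
    """Return True if a tweet matches the Boolean query
    (state AND union) OR Obama OR SOTU, case-insensitively.
    Illustrative sketch; the actual filtering runs inside the CH platform."""
    words = set(re.findall(r"\w+", tweet_text.lower()))
    return ({"state", "union"} <= words) or ("obama" in words) or ("sotu" in words)

# Example
print(matches_sotu_query("Watching the State of the Union tonight"))  # True
print(matches_sotu_query("Great speech by Obama"))                    # True
print(matches_sotu_query("What's for dinner?"))                       # False
```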

Tone of Twitter Response

Reaction on Twitter can often be at wide variance with public opinion. A Pew Research Center analysis last March compared the results of national polls to the tone of tweets about eight major news events and found that the Twitter conversation can be more liberal than survey responses, while at other times it is more conservative. During the 2012 presidential campaign, Twitter sentiment was much more critical of Republican candidate Mitt Romney than of President Obama.

Researchers classified more than 250 documents in order to “train” this specific Crimson Hexagon monitor. All documents were put into one of four categories: positive, neutral, negative or jokes. A tweet was considered positive if it clearly praised President Obama, his speech or his policy positions. A tweet was considered negative if it was clearly critical of Obama.
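Crimson Hexagon's classification algorithm is proprietary, but the general idea of training a monitor on hand-coded examples can be sketched with an off-the-shelf text classifier. The sample tweets and the use of scikit-learn below are our own assumptions, purely for illustration:

```python
# Minimal, hypothetical sketch of fitting a four-category text classifier
# to hand-coded tweets. This is NOT Crimson Hexagon's algorithm.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical hand-coded training examples (in practice, 250+ tweets).
train_texts = [
    "Great speech tonight, strong plan on jobs",          # positive
    "The president is speaking about the minimum wage",   # neutral
    "Nothing but empty promises in that speech",          # negative
    "Did he just wink at the camera? #SOTU lol",           # joke
]
train_labels = ["positive", "neutral", "negative", "joke"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Once trained, the model labels the remaining (unlabeled) material.
print(model.predict(["That was an inspiring address"]))
```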

CH monitors examine the entire discussion in the aggregate. To do that, the algorithm breaks all relevant texts into subsections. Rather than dividing the material into stories, paragraphs, sentences or words, CH treats the “assertion” as the unit of measurement; posts are divided into assertions by the computer algorithm. Consequently, the results are not expressed as a percent of the newshole or a percent of stories. Instead, they are the percent of assertions out of the entire body of material identified by the original Boolean search terms. We refer to the entire collection of assertions as the “conversation.”

The numbers reported in this analysis reflect only the share of the conversation that had a clearly positive or negative tone.
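Once each assertion carries a tone label, the reported shares are simple proportions of the full conversation. A minimal sketch, assuming a hypothetical list of labeled assertions:

```python
from collections import Counter

# Hypothetical tone labels for the assertions making up the "conversation".
assertion_labels = ["positive", "negative", "negative", "neutral",
                    "joke", "negative", "positive"]

counts = Counter(assertion_labels)
total = len(assertion_labels)

# Only the clearly positive and clearly negative shares are reported.
for tone in ("positive", "negative"):
    share = 100 * counts[tone] / total
    print(f"{tone}: {share:.0f}% of the conversation")
```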

Volume of Twitter Subjects

In order to discover which topics were most discussed on Twitter, researchers searched the entirety of public posts to see which words or phrases were used most often in tweets about the speech. For a tweet to be counted, it must have included at least one of the search terms along with the words “state” and “union,” or “Obama,” or “SOTU.”

Researchers created a list of topics to follow based on an examination of the subjects Obama discussed, along with phrases often used in media coverage before and after the event. For each subject, multiple search terms were used to identify appropriate tweets. For example, for the subject of “jobs,” a tweet was counted if it included the term “jobs,” “employment,” “unemployment” or “jobless.”
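The counting rule can be illustrated with a short sketch. The “jobs” terms come from the example above; the second topic and its terms, and the sample tweets, are hypothetical, and this is not the software actually used:

```python
import re
from collections import Counter

# Topic keyword lists; "jobs" terms are from the report, others hypothetical.
TOPIC_TERMS = {
    "jobs": {"jobs", "employment", "unemployment", "jobless"},
    "minimum wage": {"minimum", "wage"},  # hypothetical
}

def tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))

def about_sotu(words: set) -> bool:
    # Same Boolean filter as the overall monitor: (state AND union) OR Obama OR SOTU.
    return ({"state", "union"} <= words) or ("obama" in words) or ("sotu" in words)

def count_topics(tweets):
    counts = Counter()
    for tweet in tweets:
        words = tokens(tweet)
        if not about_sotu(words):
            continue  # must also be about the speech
        for topic, terms in TOPIC_TERMS.items():
            if words & terms:  # at least one topic term present
                counts[topic] += 1
    return counts

print(count_topics([
    "Obama talked a lot about jobs tonight",
    "Unemployment is still too high #SOTU",
    "Raise the minimum wage",  # not counted: no speech-related terms
]))
```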

The table below shows the exact terms used for the top 30 subjects according to our searches.

[Table: search terms used for the top 30 subjects]