This report on media coverage of religion in the 2012 presidential campaign uses data derived from two different methodologies. Data regarding the coverage in the mainstream press were derived from the Project for Excellence in Journalism’s in-house coding operation. (Click here for details on how that project, also known as PEJ’s News Coverage Index, is conducted.)

Data regarding the tone of conversation on social media (Twitter and Facebook) were derived by combining PEJ’s traditional media research methods, based on long-standing rules regarding content analysis, with computer coding software developed by Crimson Hexagon. That software is able to analyze the textual content from millions of posts on social media platforms. Crimson Hexagon (CH) classifies online content by identifying statistical patterns in words.

Human Coding of Mainstream Media

Sample Design

The mainstream media content was based on coverage originally captured as part of PEJ’s weekly News Coverage Index (NCI). 

Each week, the NCI examines the coverage from 52 outlets in five media sectors, including newspapers, online news, network TV, cable TV, and radio. Following a system of rotation, between 25 and 28 outlets each weekday are studied as well as 3 newspapers each Sunday.

In total, the 52 media outlets examined for this campaign study were as follows:

Newspapers (Eleven in all)

Coded two out of these four every weekday; one on Sunday 
The New York Times 
Los Angeles Times 
USA Today 
The Wall Street Journal 

Coded two out of these four every weekday; one on Sunday
The Washington Post
The Denver Post
Houston Chronicle
Orlando Sentinel 

Coded one out of these three every weekday and Sunday
Traverse City Record-Eagle (MI)
The Daily Herald (WA)
The Eagle-Tribune (MA)

Web sites (Coded six of twelve each weekday)

Yahoo News 
Google News 
Wall Street Journal Online

Network TV (Seven in all, Monday-Friday)

Morning shows – coded one or two every weekday
ABC – Good Morning America 
CBS – Early Show 
NBC – Today

Evening news – coded two of three every weekday 
ABC – World News Tonight 
CBS – CBS Evening News 
NBC – NBC Nightly News

Coded two consecutive days, then skip one
PBS – NewsHour

Cable TV (Fifteen in all, Monday-Friday)

Daytime (2:00 to 2:30 pm) coded two out of three every weekday
Fox News 

Nighttime CNN – coded one or two out of the four every day

Situation Room (5 pm)
Situation Room (6 pm) 
Erin Burnett OutFront
Anderson Cooper 360

Nighttime Fox News – coded two out of the four every day 
Special Report w/ Bret Baier 
Fox Report w/ Shepard Smith 
O’Reilly Factor 

Nighttime MSNBC – coded one or two out of the four every day 
Hardball (7 pm) 
The Rachel Maddow Show 
The Ed Show

Radio (Seven in all, Monday-Friday)

NPR – Coded one of the two every weekday

Morning Edition
All Things Considered

Talk Radio
Rotate between:

Rush Limbaugh
Sean Hannity

Coded every other day
Ed Schultz

Radio News 
ABC Headlines
CBS Headlines

From that sample, the study included all relevant stories:

  • On the front page of newspapers
  • In the entirety of commercial network evening newscasts and radio headline segments
  • In the first 30 minutes of network morning news and all cable programs
  • The first 30 minutes of talk radio programs
  • A 30-minute segment of NPR’s broadcasts or PBS’ NewsHour (rotated between the first and second half of the programs)
  • The top 5 stories on each website at the time of capture

Click here for the full methodology regarding the News Coverage Index and the justification for the choices of outlets studied.

Sample Selection

To arrive at the sample for this particular study of campaign coverage, we first gathered all stories from August 13, 2011-November 6, 2012, that were coded as campaign stories, meaning that 50% or more of the story was devoted to discussion of the ongoing presidential campaign.

This process resulted in a sample of 12,726 stories. That sample was then further narrowed to include only campaign stories that contained at least a reference to religion. This resulted in 793 stories.
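The two-stage narrowing described above can be sketched in a few lines of Python. This is only an illustration: the field names `campaign_share` and `mentions_religion` are hypothetical and do not come from PEJ's coding system. Only the 50% threshold and the two story counts are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Story:
    campaign_share: float    # hypothetical field: fraction of the story about the campaign
    mentions_religion: bool  # hypothetical field: True if any religion reference was found


def select_sample(stories):
    """Return (campaign stories, campaign stories that reference religion)."""
    campaign = [s for s in stories if s.campaign_share >= 0.5]
    religion = [s for s in campaign if s.mentions_religion]
    return campaign, religion


def share(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Counts reported in this methodology: 793 of 12,726 campaign stories.
religion_share = share(793, 12726)
print(f"{religion_share:.1f}% of campaign stories contained a religion reference")
```

In other words, roughly 6% of the campaign stories in the sample contained at least one reference to religion.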

Coding of Mainstream Press Religion-Related Campaign Stories

The baseline data in this study, derived from PEJ’s regular Index coding, were created by a team of seven experienced coders. We have tested all of the variables derived from the regular weekly Index coding, and all of them reached a level of agreement of 80% or higher. For specific information about those tests, see the methodology section for the NCI.

Unit of Analysis

The unit of analysis for this study was the religion reference. Anytime a story contained even a passing reference to religion (a word, a sentence, a paragraph or a section of the story), it was coded using five variables unique to this study.

An additional set of five variables, focused on religion, was designed as follows:

Trigger variable

This variable designates the action, event or editorial decision that caused religion to become news in any particular campaign story. Possible triggers include reporters, political candidates, religious figures, and poll releases, among others.

Theme variable

This variable determines the type of broad subject matter addressed by the discussion of religion in a campaign story, such as religious voter support of candidates, candidate beliefs, or impact of faith on policy and governance.

Religion events variable

This variable tracks whether a reference to faith in a campaign story touches on an event or isolated incident, as opposed to religion generically. Examples include the Jeremiah Wright controversy, the gathering of religious leaders in Texas, or Robert Jeffress’ comments about Mormonism being a cult.

Religion newsmaker variable

This variable designates which candidate, surrogate, figure or part of the electorate is the primary focus of the religion reference in a campaign story.

Religious faith focus variable

This variable designates the religious faith tradition that is the focus of the religion reference in a campaign story. If two faiths were discussed in the story, coders selected the one that received more space or time.

Coding Process

Testing of all variables used to determine campaign stories has shown levels of agreement of 80% or higher. For specific information about those tests, see the methodology on intercoder testing.

During coder training for this particular study, intercoder reliability tests were conducted for all the religion-specific variables.

For this study, each of the three coders was trained on the tone coding methodology and then given a set of 30 stories to code for the five religion-specific variables. The rate of intercoder reliability for each variable was as follows:

  • Trigger variable: 83%
  • Theme variable: 80%
  • Religion events variable: 85%
  • Religion newsmaker variable: 83%
  • Religious faith focus variable: 80%
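The report does not say which agreement statistic produced these figures. As a minimal sketch, assuming simple percent agreement averaged over all pairs of coders (an assumption, not PEJ's documented procedure), the calculation could look like the following. The category labels in the example are made up for illustration.

```python
from itertools import combinations


def percent_agreement(codings):
    """Average pairwise percent agreement.

    codings: one list of category labels per coder, aligned by story.
    Assumes simple percent agreement; the source does not name the statistic.
    """
    n_items = len(codings[0])
    if any(len(c) != n_items for c in codings):
        raise ValueError("every coder must code the same stories")
    pair_scores = []
    for a, b in combinations(codings, 2):
        matches = sum(x == y for x, y in zip(a, b))
        pair_scores.append(matches / n_items)
    return 100.0 * sum(pair_scores) / len(pair_scores)


# Hypothetical trigger-variable codes from three coders on five stories.
coder_a = ["candidate", "reporter", "poll", "candidate", "reporter"]
coder_b = ["candidate", "reporter", "poll", "reporter", "reporter"]
coder_c = ["candidate", "poll", "poll", "candidate", "reporter"]

agreement = percent_agreement([coder_a, coder_b, coder_c])
print(f"{agreement:.1f}% average pairwise agreement")
```

Percent agreement is easy to interpret but does not correct for agreement that would occur by chance, which is one reason other statistics are sometimes preferred.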

Coding of the Conversation in Social Media Using a Computer Algorithm

The section of this report that dealt with the social media discussion of the candidates and religion employed media research methods that combine PEJ’s content analysis rules developed over more than a decade with computer coding software developed by Crimson Hexagon. The analysis was based on examination of nearly 670,000 tweets and 76,000 Facebook posts about Obama and nearly 400,000 tweets and 29,000 Facebook posts about Romney.

Crimson Hexagon is a software platform that identifies statistical patterns in words used in online texts. Researchers enter key terms using Boolean search logic so the software can identify relevant material to analyze. PEJ draws its analysis samples from all public Twitter posts and a random sample of publicly available Facebook posts (and for other PEJ studies, blog posts). Then a researcher trains the software to classify documents using examples from those collected posts. Finally, the software classifies the rest of the online content according to the patterns derived during the training.  

According to Crimson Hexagon: "Our technology analyzes the entire social internet (blog posts, forum messages, Tweets, etc.) by identifying statistical patterns in the words used to express opinions on different topics." Crimson Hexagon publishes further information on the tool itself and on its in-depth methodologies.

Crimson Hexagon measures text in the aggregate and the unit of measure is the ‘statement’ or assertion, not the post or Tweet. One post or Tweet can contain more than one statement if multiple ideas are expressed. The results are determined as a percentage of the overall conversation.

Monitor Creation and Training

Each individual study or query related to a set of variables is referred to as a "monitor."

The process of creating a new monitor consists of four steps. There were four monitors created for this study – two for Obama (Twitter and Facebook) and two for Romney (Twitter and Facebook).

First, PEJ researchers decide what timeframe and universe of content to examine. The timeframe for this study was August 23, 2011-November 6, 2012. PEJ only includes English-language content.

Second, the researchers enter key terms using Boolean search logic so the software can identify the universe of posts to analyze. The following terms were used:

  • Obama Facebook monitor: Obama AND (Muslim OR Moslem OR Islam OR Islamic OR Islamist)
  • Obama Twitter monitor: Obama AND (Muslim OR Moslem OR Islam OR Islamic OR Islamist)
  • Romney Facebook monitor: Romney AND (Mormon OR Mormonism OR LDS OR "Latter-Day Saints")
  • Romney Twitter monitor: Romney AND (Mormon OR Mormonism OR LDS OR "Latter-Day Saints")
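The search terms above can be read as simple Boolean filters. The sketch below is one possible interpretation, assuming case-insensitive, whole-word matching. How Crimson Hexagon actually tokenizes text or handles plurals and punctuation is not described in this report, so the function names and matching rules here are hypothetical.

```python
import re

OBAMA_TERMS = {"muslim", "moslem", "islam", "islamic", "islamist"}
ROMNEY_TERMS = {"mormon", "mormonism", "lds"}
ROMNEY_PHRASE = "latter-day saints"


def tokens(text):
    """Lowercase alphabetic tokens (assumed tokenization, not CH's)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matches_obama_monitor(text):
    """Obama AND (Muslim OR Moslem OR Islam OR Islamic OR Islamist)."""
    words = tokens(text)
    return "obama" in words and bool(words & OBAMA_TERMS)


def matches_romney_monitor(text):
    """Romney AND (Mormon OR Mormonism OR LDS OR "Latter-Day Saints")."""
    words = tokens(text)
    has_term = bool(words & ROMNEY_TERMS) or ROMNEY_PHRASE in text.lower()
    return "romney" in words and has_term


print(matches_obama_monitor("Obama is not a Muslim"))
print(matches_romney_monitor("LDS members discuss the election"))
```

Under these assumptions the first call matches and the second does not, because the second text never mentions Romney.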

Third, researchers define categories appropriate to the parameters of the study. The categories were as follows:

  • Obama Facebook monitor: Obama is a Muslim; General description of ‘Muslim’ rumors; Obama is not a Muslim; Obama has Muslim sympathies
  • Obama Twitter monitor: Obama is a Muslim; General description of ‘Muslim’ rumors; Obama is not a Muslim; Obama has Muslim sympathies
  • Romney Facebook monitor: Positive; Neutral; Negative
  • Romney Twitter monitor: Positive; Neutral; Negative (general); Negative (jokes)

Fourth, researchers "train" the CH platform to analyze content according to specific parameters they want to study. The PEJ researchers in this role have gone through in-depth training at two different levels. They are professional content analysts fully versed in PEJ’s existing content analysis operation and methodology. They then undergo specific training on the CH platform including multiple rounds of reliability testing.

The monitor training itself is done with a random selection of posts collected by the technology. One at a time, the software displays posts and a human coder determines which category each example best fits into. In categorizing the content, PEJ staff follows coding rules created over the many years that PEJ has been content analyzing the news media. If an example does not fit easily into a category, that specific post is skipped. The goal of this training is to feed the software with clear examples for every category.

For each new monitor, human coders categorize at least 250 distinct posts. Typically, each individual category includes 20 or more posts before the training is complete. To validate the training, PEJ has conducted numerous intercoder reliability tests (see below) and the training of every monitor is examined by a second coder in order to discover errors.

The training process consists of researchers showing the algorithm stories in their entirety that are unambiguous in tone. Once the training is complete, the algorithm analyzes content at the assertion level, to ensure that the meaning is similarly unambiguous. This makes it possible to analyze and proportion content that contains assertions of differing tone. This classification is done by applying statistical word patterns derived from posts categorized by human coders during the training process.

The monitors are then reviewed by a second coder to ensure there is agreement. Any questionable posts are removed from the sample.
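The training thresholds mentioned above (at least 250 posts per monitor, and typically 20 or more per category) lend themselves to a simple check. The sketch below is a hypothetical helper, not part of the CH platform. It treats the per-category figure as a target even though the text describes it only as typical.

```python
from collections import Counter

MIN_TOTAL_POSTS = 250      # "at least 250 distinct posts"
TYPICAL_PER_CATEGORY = 20  # "20 or more posts" per category, described as typical


def training_shortfalls(labels, categories):
    """Return how many more training posts are needed overall and per category."""
    counts = Counter(labels)
    shortfalls = {}
    if len(labels) < MIN_TOTAL_POSTS:
        shortfalls["total"] = MIN_TOTAL_POSTS - len(labels)
    for category in categories:
        if counts[category] < TYPICAL_PER_CATEGORY:
            shortfalls[category] = TYPICAL_PER_CATEGORY - counts[category]
    return shortfalls


# Hypothetical training labels for the Romney Facebook monitor.
categories = ["Positive", "Neutral", "Negative"]
labels = ["Positive"] * 120 + ["Neutral"] * 115 + ["Negative"] * 15
gaps = training_shortfalls(labels, categories)
print(gaps)
```

An empty result would indicate that both thresholds are met.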

How the Algorithm Works

To understand how the software recognizes and uses patterns of words to interpret texts, consider a simplified example regarding an examination of the tone of coverage regarding Mitt Romney. As a result of the example stories categorized by a human coder during the training, the CH monitor might recognize that portions of a story with the words "Romney," "faithful" and "committed" near each other are likely positive for Romney. However, a section that includes the words "Romney," "secretive" and "corrupt" is likely to be negative for Romney.
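To make the idea of learning word patterns from human-coded examples concrete, here is a toy bag-of-words scorer in the style of naive Bayes with add-one smoothing. It is emphatically not Crimson Hexagon's algorithm, which this report does not specify in detail. It ignores word proximity, and the training sentences are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def words(text):
    """Lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """examples: list of (text, category). Returns word counts per category."""
    counts = defaultdict(Counter)
    for text, category in examples:
        counts[category].update(words(text))
    return counts


def classify(text, counts):
    """Pick the category whose training words best match the text."""
    vocab = {w for wc in counts.values() for w in wc}
    best, best_score = None, -math.inf
    for category, wc in counts.items():
        total = sum(wc.values())
        # Add-one smoothing so unseen words do not zero out a category.
        score = sum(
            math.log((wc[w] + 1) / (total + len(vocab))) for w in words(text)
        )
        if score > best_score:
            best, best_score = category, score
    return best


# Invented training examples echoing the words used in the text above.
training = [
    ("Romney is faithful and committed", "positive"),
    ("Romney seems secretive and corrupt", "negative"),
]
model = train(training)
label = classify("a committed and faithful Romney", model)
print(label)
```

With only two training sentences the model is trivially small, but it shows how words seen in human-coded examples can drive the classification of new text.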

Unlike most human coding, CH monitors do not measure each story as a unit, but examine the entire discussion in the aggregate. To do that, the algorithm breaks up all relevant texts into subsections. Rather than using the story, paragraph, sentence or word as the unit of measurement, CH uses the "assertion." Thus, posts are divided up by the computer algorithm. If 40% of a post fits into one category, and 60% fits into another, the software will divide the text accordingly. Consequently, the results are not expressed in percent of newshole or percent of posts. Instead, the results are the percent of assertions out of the entire body of stories identified by the original Boolean search terms. We refer to the entire collection of assertions as the "conversation."
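The aggregation step can be illustrated with a short sketch. It assumes each post has already been split into category fractions and that every post carries equal weight. Both are simplifying assumptions, since the report does not describe how CH weights posts of different lengths.

```python
from collections import defaultdict


def conversation_shares(posts):
    """Aggregate per-post category fractions into percent of the conversation.

    posts: list of dicts mapping category -> fraction of that post's
    assertions (each post's fractions are assumed to sum to 1).
    """
    totals = defaultdict(float)
    for fractions in posts:
        for category, fraction in fractions.items():
            totals[category] += fraction
    grand_total = sum(totals.values())
    return {c: 100.0 * v / grand_total for c, v in totals.items()}


# Hypothetical posts; the first mirrors the 40%/60% split described above.
posts = [
    {"positive": 0.4, "negative": 0.6},
    {"positive": 1.0},
    {"neutral": 1.0},
]
shares = conversation_shares(posts)
for category, pct in sorted(shares.items()):
    print(f"{category}: {pct:.1f}%")
```

Note that the split post contributes to two categories at once, which is why the results describe shares of the conversation rather than shares of posts.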

Testing and Validity

Extensive testing by Crimson Hexagon has demonstrated that the tool is 97% reliable; that is, in 97% of cases analyzed, the technology’s coding has been shown to match human coding. PEJ spent more than 12 months testing CH, and our own tests comparing coding by humans and the software came up with similar results.

In addition to validity tests of the platform itself, PEJ conducted separate examinations of human intercoder reliability to show that the training process for complex concepts is replicable. The first test had five researchers each code the same 30 stories, which resulted in an agreement of 85%.

A second test had each of the five researchers build their own separate monitors to see how the results compared. This test involved not only testing coder agreement, but also how the algorithm handles various examinations of the same content when different human trainers are working on the same subject. The five separate monitors came up with results that were within 85% of each other.

Unlike polling data, the results from the CH tool do not have a sampling margin of error since there is no sampling involved. For the algorithmic tool, reliability tested at 97% meets the highest standards of academic rigor.