Data in this study came from two main sources: 1) analysis of Facebook posts from a set of 30 science-related pages based on data downloaded from the public Facebook Graph API from Jan. 1, 2014, to June 30, 2017, and 2) human content analysis coding by Pew Research Center staff of a random selection of Facebook posts produced by each of these pages from Jan. 1 to June 30, 2017.

Selection of science-related Facebook pages

For a comparison of Facebook pages from Facebook-primary and multiplatform organizations, the Center identified science-related public pages (in English only) with a large number of followers. Throughout this report, the term “follower” refers to a user who “likes” a page using the thumbs up icon, and the number of followers is used interchangeably with the number of page likes.

The selection of “science-related” pages was based on each page’s self-statement that it covers content about science or any of the following science topics: health/medicine, food/nutrition, astronomy, physics, biology/animal science, neurology, chemistry, technology/engineering, energy/environment, geosciences, math, or social and behavioral sciences. The categories broadly align with the major fields of scientific inquiry as defined by the National Science Foundation.

Commercial pages aimed primarily at selling consumer products were excluded, as were advocacy pages such as The Breast Cancer Site and PETA. Pages that covered a range of health/medicine topics were eligible for selection but those that focused exclusively on exercise or recipes were not.

There is no definitive list of science-related Facebook pages. (Facebook offers a list of science pages on its site, but the list is not exhaustive.) To create the list of popular pages analyzed in this study, five researchers searched for pages with a large number of page likes using a variety of methods in June 2017. Using Facebook’s search function, numerous blogs and articles, and results from searches on sites such as Google and trackalytics.com, the Center compiled a list of more than 200 English-language science-related pages that met the above criteria. Only pages with at least 2 million page likes were recorded since pages with fewer followers would not have enough to qualify among the top 30 pages. Each page was classified into one of two groups: a Facebook-primary page or a multiplatform page. The top 15 most popular pages from each group were selected for this study.

A page was considered a Facebook-primary page if it was run by an individual or an organization that used Facebook as their primary way of disseminating information. In some cases, such as IFLScience and ScienceAlert, Facebook was central to the creation and growth of their content and audience. In other cases, such as Stephen Hawking, Bill Nye or Neil deGrasse Tyson, prominent scientists were public figures prior to the creation of Facebook, yet the social media site has enabled these people to reach larger audiences.

Multiplatform pages were run by organizations whose primary method of communication was a media outlet that existed prior to the growth of Facebook as a social media platform. National Geographic, for example, has had a popular magazine and television channel for years. Women’s Health and Popular Science are best known for their traditional magazines rather than their social media presence. The pages for NASA and NASA Earth represent a government agency that has existed for decades.
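The selection rule described above (keep pages with at least 2 million page likes, classify each into one of two groups, then take the 15 most popular per group) can be sketched as follows. This is only an illustration, not the Center's actual procedure: the record keys `name`, `group` and `likes` are hypothetical, and the real classification into groups was made by researchers rather than by code.

```python
from collections import defaultdict

MIN_LIKES = 2_000_000      # threshold stated in the text
PAGES_PER_GROUP = 15       # top 15 pages from each group


def select_top_pages(pages, per_group=PAGES_PER_GROUP, min_likes=MIN_LIKES):
    """Return the most-liked pages in each group.

    `pages` is an iterable of dicts with hypothetical keys
    'name', 'group' ('facebook-primary' or 'multiplatform') and 'likes'.
    """
    by_group = defaultdict(list)
    for page in pages:
        # Pages below the threshold were never recorded in the first place.
        if page["likes"] >= min_likes:
            by_group[page["group"]].append(page)
    # Within each group, rank by page likes and keep the top `per_group`.
    return {
        group: sorted(members, key=lambda p: p["likes"], reverse=True)[:per_group]
        for group, members in by_group.items()
    }
```

Ranking within each group, rather than across the combined list, is what guarantees an equal number of Facebook-primary and multiplatform pages in the final sample.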

Content Analysis of Facebook pages

Historical data from Facebook Graph API

Pew Research Center downloaded the details for all posts from the selected pages from Jan. 1, 2014, to June 30, 2017, from version 2.8 of the Facebook Graph API. Using Python scripts, details were collected regarding each post’s creation date, ID, description, caption, link, permalink and the total number of shares, comments, likes and other reactions. In total, details were collected for 340,333 posts from the 30 Facebook pages.
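The Center's Python scripts are not published in this report, so the following is only a sketch of how a request for these post details might be assembled. The field names are assumptions based on Graph API naming conventions for version 2.8, and the page ID, date strings and access token are placeholders. The function only builds a URL and makes no network call.

```python
from urllib.parse import urlencode

GRAPH_API_ROOT = "https://graph.facebook.com/v2.8"

# Assumed field names covering the details listed in the text:
# creation date, ID, description, caption, link, permalink,
# and summary counts for shares, comments and reactions.
POST_FIELDS = [
    "id",
    "created_time",
    "description",
    "caption",
    "link",
    "permalink_url",
    "shares",
    "comments.summary(true).limit(0)",
    "reactions.summary(true).limit(0)",
]


def build_posts_url(page_id, since, until, access_token):
    """Build a request URL for one page's posts in a date window."""
    query = urlencode({
        "fields": ",".join(POST_FIELDS),
        "since": since,          # e.g. "2014-01-01"
        "until": until,          # e.g. "2017-06-30"
        "limit": 100,            # assumed page size for paginated results
        "access_token": access_token,
    })
    return f"{GRAPH_API_ROOT}/{page_id}/posts?{query}"
```

Requesting only summary counts for comments and reactions, rather than every individual comment or reaction, keeps each response small when collecting hundreds of thousands of posts.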

For posts created from January 2014 to March 2017, the details were downloaded from the API during the months of April and May 2017. For posts created from April to June 2017, the details were downloaded during July 2017. Because the numbers of comments, shares, and likes and other reactions can increase over time, the numbers included in this study reflect the numbers at the time of capture.

Data from the following dates/pages were not available from the Facebook API: Interesting Engineering from Jan. 1-April 13, 2014; Dr. Mehmet Oz from Jan. 1-April 29, 2014; Science Channel from Jan. 1-Sept. 19, 2015; and BBC Earth from Jan. 31-March 27, 2014.

Posts for the page Daily Health Tips were missing from the Facebook API for Jan. 7-April 11, 2017. However, researchers were able to manually capture the ID for each of those missing posts and consequently download the accompanying details of all posts during that time. Therefore, those Daily Health Tips posts are included in the study.

Facebook’s API is missing data for large numbers of posts in the second half of 2017 for at least 20 of the 30 pages in the sample. In its forums in early 2018, Facebook acknowledged problems with the API that resulted in the absence of some posts. Therefore, in this report the annual number of posts for 2017 is estimated by doubling the volume of posts that appeared in the first six months of the year.
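As an illustration of the 2017 estimate, the sketch below counts posts per calendar year and doubles the count for the partial year. The function name and its inputs are hypothetical; only the doubling rule and the June 30, 2017, cutoff come from the text.

```python
from collections import Counter
from datetime import date


def annual_post_counts(post_dates, partial_year=2017, cutoff=date(2017, 6, 30)):
    """Count posts per year, doubling the half-year count for `partial_year`.

    `post_dates` is an iterable of datetime.date objects, one per post.
    Posts after `cutoff` are ignored because API data for that period
    were incomplete.
    """
    counts = Counter(d.year for d in post_dates if d <= cutoff)
    if partial_year in counts:
        # Estimate the full year as twice the January-June volume.
        counts[partial_year] *= 2
    return dict(counts)
```

The doubling assumes that a page posted at roughly the same rate in the second half of 2017 as in the first half.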

Posts that were produced during the sample time period but were removed by the pages themselves prior to the collection of data by the Center were not included in this study.

The total number of interactions for each post was the sum of all comments, shares and reactions (including clicks on the icon for ‘like’ and for other reactions such as ‘wow,’ ‘sad’ or ‘love’). In this study, the total number of interactions is the primary metric used to indicate audience engagement.

For some posts, the numbers of shares were not available through the Facebook API. For the sake of consistency, the numbers of shares were considered zero when compiling the total number of interactions for those posts.
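The interaction metric can be expressed in a few lines. This is a minimal sketch that assumes each post is a dict with hypothetical keys `comments`, `shares` and `reactions`, where `shares` may be missing or `None` when the API did not return it.

```python
def total_interactions(post):
    """Sum comments, shares and reactions for one post.

    A missing share count is treated as zero, as described in the text.
    """
    shares = post.get("shares")
    if shares is None:
        shares = 0
    return post["comments"] + shares + post["reactions"]
```

For example, a hypothetical post with 10 comments, no share count and 30 reactions yields `total_interactions({"comments": 10, "reactions": 30})`, which equals 40.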

The numbers of shares and comments reflect activity on the Facebook page URL only. If a page posted a link to a video located on another Facebook page, any data regarding shares and comments on that secondary page were not counted. For example, an April 11, 2016, video on Bill Nye’s page linked to a video on the GQ Facebook page. That post received 1,962 comments and seven shares on Bill Nye’s page at the time of capture, which were counted for this study. On the GQ page, that video received around 75,000 shares and more than 3,000 comments. Those roughly 78,000 interactions were not included in this study since they did not appear on the specific science-related page in this sample.

The link provided by the Facebook API was the URL of the main link featured in each post. In many cases this was a link to the website of the same organization as the Facebook page. In other cases, the link was to another website produced by a different organization. In a few cases, posts did not include any links at all.

Human coding

In order to examine the specific content and format of posts from these Facebook pages, researchers performed detailed content analysis coding of a random sample of posts from Jan. 1 to June 30, 2017. To control for the different frequency of posts, the Center coded an equal number of posts (250) appearing during that six-month period from each page. Half of the randomly selected posts appeared during the first three months of 2017, while the other half came from the next three months of 2017.

For pages that did not have at least 250 posts during those six months, all the posts appearing during the six-month period were coded.
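The sampling rule above can be sketched as follows. The input format, the fixed seed and the handling of a quarter with too few posts (the shortfall is drawn from the other quarter) are assumptions for illustration; the report does not describe how such cases were handled.

```python
import random
from datetime import date

SAMPLE_SIZE = 250
SECOND_QUARTER_START = date(2017, 4, 1)


def sample_posts(posts, sample_size=SAMPLE_SIZE, seed=0):
    """Pick post IDs to code for one page.

    `posts` is a list of (post_id, created_date) tuples, already limited
    to Jan. 1 through June 30, 2017.
    """
    # Pages without more than `sample_size` posts: code every post.
    if len(posts) <= sample_size:
        return [post_id for post_id, _ in posts]

    first = [pid for pid, d in posts if d < SECOND_QUARTER_START]
    second = [pid for pid, d in posts if d >= SECOND_QUARTER_START]

    # Aim for half from each quarter; if one quarter is short
    # (an assumption, not stated in the report), fill from the other.
    n_first = min(sample_size // 2, len(first))
    n_second = min(sample_size - n_first, len(second))
    n_first = min(sample_size - n_second, len(first))

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(first, n_first) + rng.sample(second, n_second)
```

Drawing a fixed number of posts per page, rather than a fixed share, keeps high-volume pages from dominating the coded sample.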

For some posts originally selected for the sample, the Facebook post was available, but the content linked to from that post was no longer active. In those cases, those posts were excluded from the coding sample and replaced by another randomly selected post. (The details for these posts were still included in the historical data in this study.)

For each post, coders considered any text or video that appeared on the original Facebook page, along with any text or other information that appeared in the content that was linked to by that Facebook post. Comments were excluded. In total, 6,582 posts were included as part of the human coding sample. See the Appendix for details on the number of sampled posts.

Coded variables in this study were as follows:

  • Primary science topic: The topic or research area that best fits the content of the post. There were 22 topics, although several were combined for the final analysis. Some posts were classified as having a “non-science” topic. If more than one topic area was discussed, the area that received more time or space was coded as primary.
  • Primary storyline: The specific topic or theme of the post. There were 36 storylines, although some were later combined. Storylines were sometimes quite specific, such as the “March for Science on April 22, 2017,” and sometimes related to broader themes, such as vaccines or climate change. Many posts did not include mentions of any of the storylines and were given the equivalent of an “NA” for this variable. If two or more storylines were mentioned in a post, the storyline that received the most mentions was coded as primary.
  • Primary frame: The main goal or focus of the post. There were 15 frames coded. If two or more frames were present in a post, researchers coded the focus that received the most time or space as primary.
  • Link to external evidentiary research: The presence or absence of a link to external research in the post or the accompanying article. This usually included a link to a scientific journal article but could also include links to original research by government agencies or other institutions. For a link to be counted, it had to either have a hyperlink to the research or provide enough clear bibliographical information that a reader could easily find the research in question. Only research links external to the particular Facebook page counted. Therefore, if a post on NASA’s Facebook page included a link to original research that was conducted by NASA itself, it did not count as an external link. While some pages, such as Hashem Al-Ghaili’s Science Nature Page, occasionally place links to a scientific publication in their comment section, these were not included as external links in the post.
  • Producer of content: The organization responsible for the creation of the content in the post. For many posts, the content was written and published by the same Facebook page where the post appeared. In some cases, posts link to articles that were originally produced by another organization. Coders logged the name of the organization responsible for the original text or video in the post. Once the coding was completed, researchers categorized the producers assigned to each post as either produced by the same organization running the Facebook page or a different organization. The producer code was considered to be the same organization if the post and accompanying website were both owned by the same company. For example, links from posts on the MythBusters Facebook page to discovery.com were considered the same organization since they are both owned by Discovery Communications Inc. When a post linked to a website that included content from many producers, such as YouTube or Twitter, researchers followed the link to determine if the material appearing on that site was created by the same organization as the Facebook post or by a different organization.

To test the validity of the coding, four researchers classified the same set of 121 posts on five variables. For the three more complex variables, an additional 35 posts were also coded by each person. Intercoder agreement ranged from 80% to 97% across these five classifications. Krippendorff’s alpha ranged from .71 to .84.
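The report does not say which software or which variant of these statistics was used. As a rough illustration only, the sketch below computes average pairwise percent agreement and Krippendorff’s alpha for nominal categories, which is an assumed (though common) choice for categorical codes such as topic or frame. Each inner list holds the codes that the coders assigned to one post.

```python
from collections import Counter
from itertools import combinations, permutations


def percent_agreement(units):
    """Share of coder pairs that assigned the same code, pooled over posts."""
    agree = total = 0
    for codes in units:
        for a, b in combinations(codes, 2):
            total += 1
            agree += (a == b)
    return agree / total


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data via the coincidence matrix."""
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # a post coded by one person carries no agreement info
        # Each ordered pair of codes from different coders adds 1 / (m - 1).
        for a, b in permutations(codes, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return 1.0  # only one category was ever used
    return 1 - observed / expected
```

Percent agreement does not correct for agreement expected by chance, which is why a chance-corrected measure such as alpha is typically reported alongside it.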

Data from Twitter accounts

Data regarding the number of followers for Twitter accounts discussed in this report was collected as of Jan. 31, 2018. Data regarding the number of tweets posted in 2017 was collected using Crimson Hexagon.