How external researchers struggle to understand the ‘black box’ of Facebook

Facebook, like other social media companies, controls all of its data, so researchers either work with Facebook to access what data they can, or struggle on the outside.

Facebook CEO Mark Zuckerberg arrives to testify before a joint hearing of the Commerce and Judiciary Committees on Capitol Hill in Washington, Tuesday, April 10, 2018, about the use of Facebook data to target American voters in the 2016 election. (AP Photo/Carolyn Kaster)

This story is from The Pulse, a weekly health and science podcast.

Find it on Apple Podcasts, Spotify, or wherever you get your podcasts.


In 2021, computer scientist Laura Edelson got banned from Facebook.

She said that became a problem in her personal life. Edelson lives in a small town where people use Facebook to find out about school delays, lost pets, and town meetings.

Edelson’s research colleague, Damon McCoy, also got kicked off.

“It’s kind of annoying,” Edelson said. “I used to say it was us and Trump but now — it’s just us.”

All this happened because of their research on Facebook, which the company says used data-gathering methods that broke its terms of service.

Facebook remains one of the biggest social media networks in the U.S., and around the world.

Over the years, researchers, parents, and politicians have had many questions about the effects of Facebook: Is misinformation really more engaging than other posts? Are particular groups of people more likely to see harmful ads, like scams? How effective is the platform at driving users to accurate information? Facebook, like other social media companies, controls all of its data, so researchers either work with Facebook to access what data they can, or struggle on the outside.

Edelson and McCoy got interested in Facebook after the 2016 election, when the platform was under a lot of scrutiny over whether the company had mishandled user data, and swayed the outcome of the U.S. presidential election. CEO Mark Zuckerberg testified before Congress in 2018, where he said, “We didn’t take a broad enough view of our responsibility, and that was a big mistake.”

After that, Facebook released a continually updated archive of political ads in the U.S. so people can see how much each advertiser spent on their ads, and whom the ads reached.

Edelson and McCoy, who were both at New York University at the time, wanted to use this data to understand Facebook’s powerful recommendation engine, and the effects the ads can have on society.

They quickly published work in 2018 showing that then-President Donald Trump was the biggest political advertiser on Facebook. Facebook welcomed their work, telling The New York Times that this was exactly how it hoped people would use the tool.

But Edelson says they quickly realized there were some important details missing from the archive, like the targeting data that shows how advertisers aimed their ads at particular audiences. This data was valuable because they wanted to study whom the advertisers were trying to reach, and whom they were actually reaching.

“If you want to understand patterns across the entire ad ecosystem, you need information about the entire ad ecosystem,” Edelson said.

They got around this by working with investigative news outlet ProPublica to make a research tool that allowed them to collect information that Facebook provides to users. Facebook users can click on an ad and see the targeting criteria the company used to show them the ad.

Edelson and McCoy created a tool that people could download and use to voluntarily send the researchers information about the ads they were seeing.
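For readers curious what such a crowdsourced tool might look like under the hood, here is a minimal, hypothetical sketch in TypeScript. It is not the researchers’ actual extension; the endpoint, data fields, and example values are invented for illustration. The idea is simply that a volunteer’s browser records the ad details Facebook already shows that user, and forwards them to a research server.

```typescript
// Hypothetical sketch of a crowdsourced ad-observation tool.
// RESEARCH_ENDPOINT, AdObservation, and all example values are invented.

interface AdObservation {
  advertiser: string;        // name shown on the sponsored post
  adText: string;            // visible ad copy
  targetingCriteria: string; // text from the "Why am I seeing this ad?" panel
  observedAt: string;        // ISO timestamp of when the volunteer saw the ad
}

// Endpoint run by the research team; volunteers opt in to sending data.
const RESEARCH_ENDPOINT = "https://example.org/ad-observations";

async function reportAd(observation: AdObservation): Promise<void> {
  // Send only ad metadata the platform already showed to this user;
  // no information about other users is collected.
  await fetch(RESEARCH_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(observation),
  });
}

// Example: a volunteer's tool spots a sponsored post and reports it.
reportAd({
  advertiser: "Example PAC",
  adText: "Vote on Tuesday!",
  targetingCriteria: "People ages 18-65+ who live in Pennsylvania",
  observedAt: new Date().toISOString(),
}).catch(console.error);
```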

Facebook ordered them to stop in 2020, and cut off their access completely in 2021. The company said in a press release that it did so because Edelson and McCoy were collecting data about Facebook users in a way that broke its terms of service.

Edelson is now at Northeastern University. She and McCoy are still doing their research, but now they need to rely on a research partner to get them the data.

A spokesperson for Meta, the parent company of Facebook, Instagram, and WhatsApp, said he could not tell me more details about this specific case. He also pointed out that the company has a track record of working with outside researchers on a variety of topics.

But even if researchers work with Meta, getting access to data can still be a challenge.

“Researcher access to Meta systems has really been on a downward trajectory over the past decade,” said Deen Freelon, professor of communications at the University of Pennsylvania. He is part of a research collaboration with Meta called Social Science One.

He says it would be unreasonable to expect complete access. “If you have a fundamental problem with that, then you need to get out of social media research because there’s no way around that.”

However, he added that researchers can still do valuable work “under the assumption that the process that produces the data is a black box, but that the output of that black box can be evaluated productively and usefully.”


Dannagal Young, a political scientist at the University of Delaware, said that in some cases, Facebook knows what it can do to help researchers, but just doesn’t do it.

She recalled that in 2019, Facebook invited her and other academics to meet some of Facebook’s in-house researchers. One of the Facebook researchers said they had been working internally on a mock-up of Facebook that researchers could use to run experiments on eye movements, or on how people respond to posts. Young told the group that it would be amazing for researchers to have, only to find out that it would not be shared outside Facebook.

“That to me is sort of the story of how things go with Facebook. They want to engage with academic researchers outside of the organization. They want academics and scholars to know that they have people internally asking the hard questions and doing the work,” Young said. “But it is so clear to all of us who work in this space that they’re not actually interested in having the answers to those questions inform the way the platform operates.”

In response to the critiques from researchers, a Meta spokesperson wrote, “We’re committed to supporting rigorous, independent research and have a track record of working closely with researchers to ensure they have access to the right tools and data to better understand our impact. We’re also continuing to partner with academic researchers to better understand our role in elections. To this end, we recently rolled out additional tools for researchers to access more publicly available content across Facebook and Instagram.”

Young said the overall problem is that Facebook and social media form big parts of our digital media diets, but researchers don’t know what’s inside, and there’s no equivalent of the Food and Drug Administration to make the companies explain.

She said unless researchers know what kind of data Facebook has on its users, and understand how the platform works, there is no way to even come up with meaningful research questions.

“How can we as outsiders ask: ‘Can you run the following analyses or correlations between I don’t know what, and I definitely don’t know what?’”

“So what are you then left with as a researcher? You’re left with information that is meaningless and gets you farther from truth than closer to truth.”
