
M-RCBG Associate Working Paper No. 260

Break Glass: Structural Safeguards For Social Media

John Fiske
Satwik Mishra

Introduction

As of early 2025, the challenges of social media governance are back in the headlines. Some countries are embracing an ‘anything goes’ approach, emphasizing freedom of speech and the absence of censorship. Other countries are doubling down on protections to minimize online harm and offensive content. While each country sorts out its own path for the near future, the importance of this decision is not in any doubt. Social media channels are a hugely powerful force driving many societal outcomes, which we have witnessed repeatedly over the past decade. Social media has been instrumental in connecting family and friends, sharing art, rescuing lost loved ones, forming new high-growth businesses, informing people, globalizing culture, enabling bullying of children, deepening partisan divides, influencing electoral outcomes, causing riots, toppling governments, enabling criminals and fomenting genocide. It’s a mixed record, to say the least.

Social media is now the planet’s most prevalent method for sharing information. As of 2024, more than 5 billion users spend an average of 2 hours and 23 minutes per day on these platforms (more time than with traditional TV or any other form of media). In the next two years, social commerce—the process of selling products directly on social media platforms or within social-first commerce marketplaces—is projected to surge to $1.2 trillion, a figure that would rank it as the equivalent of the world’s 17th largest economy.

Despite many social benefits, this rapidly expanding ecosystem is fertile ground for misinformation, hate speech, fraud, varied harms to youth and many other issues. The rise of AI-generated content and of AI-powered, human-like agents is further amplifying these risks. In response, some governments have taken steps to combat online harm, typically by requiring platforms to adopt protective measures. As an initial 'stop-gap' approach, these requirements have helped reduce harm, but they have not been entirely successful. The regulatory approach has several shortcomings:

  • Empowering any regulator to (directly or indirectly) dictate acceptable speech raises the specter of government censorship.
  • Platforms use varied thresholds for harm, and define harmful content differently than regulators do, creating legal and operational disputes.
  • Placing the onus on platforms fails to adequately account for the responsibility of the individual - both creators and consumers - in enabling harm.
  • Ensuring online safety imposes significant costs on platforms, which inadvertently reinforces platform monopolies, as only the largest networks can afford compliance.

Can we do better? How can we allow information to flow freely while limiting harm? How should we balance conflicting rights such as freedom of speech, privacy, and personal safety and security? How should we better protect youth? What expectations can we reasonably place on platforms, regulators, content producers or consumers? How can we decide any of this in a fractured global regulatory environment, where governments take significantly different views on these issues, yet information flows are globalized?

In our view, the coming technological changes demand not just incremental adjustments to oversight, but a bold reimagining of our online safeguards. Instead of tightening the regulatory grip on social media (or taking a hands-off approach), we need a smarter solution—one that focuses on improving the experiences of users themselves, and lets them ultimately manage the boundaries of their online experience. We frame our argument around five expectations that reasonable users should demand in the coming years:

I. Certainty that counterparties are not misrepresenting themselves. Anonymous or pseudonymous services are fine - but misrepresentation of oneself is not!

II. Standards-based guidance about whether information is trustworthy or not. Public standards-based classifiers will allow platforms, users and regulators to all use the same yardstick to assess whether information is trustworthy, toxic, etc.

III. The ability to set boundaries on one’s own information flows. Once universal standards are established, users should be able to set guidelines for their own information flows.

IV. Visibility into profile and content scores (both their own and those of other entities). Users and businesses should be able to review and appeal their scores, in order to understand why certain content may be downranked or filtered for certain audiences.

V. The right to justice after suffering online harms. Too many bad actors escape punishment because of the cross-jurisdictional challenges that online harms present.

To reframe these expectations as pillars of a long-term strategy, we suggest that regulators and other stakeholders focus on four critical areas of improvement:

1. Identity Assurance Protocols – Establish a global identity assurance framework to underpin accountability (in a privacy-protective way). This framework will enable users and platforms to confirm, via consistent standards, the identity claims of the publishers, content creators, people and bots they interact with, and will support important protections such as age verification, fraud prevention and information lineage.

2. Public Classifiers and Scoring Standards – Establish a public library of classifiers and a scoring rubric to systematically measure the integrity, trustworthiness and toxicity of content and publishers. The scores of specific accounts and their content should be made visible to account owners, and an open appeals process should be created.

3. Transparent Distribution Protocols – Require platforms to be more transparent about their content distribution mechanisms (using the public classifiers) and give users more power to set limits on the integrity, trustworthiness and toxicity of the content they receive; a minimal illustrative sketch follows this list.

4. Bad Actor Accountability – Develop better mechanisms to hold bad actors accountable, both within and across jurisdictions.
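
To make pillars 2 and 3 more concrete, the short Python sketch below pairs standards-based content scores with user-set limits. It is purely illustrative: the three score dimensions, the ContentScore and UserPolicy structures, and the example threshold values are our own assumptions for this sketch, not a specification proposed in the paper.

    # Illustrative sketch only: the score dimensions, data structures and
    # threshold values are hypothetical assumptions, not a proposed standard.
    from dataclasses import dataclass


    @dataclass
    class ContentScore:
        """Scores (0.0-1.0) from a hypothetical public, standards-based classifier."""
        integrity: float        # e.g., provenance and authenticity signals
        trustworthiness: float  # e.g., source-reliability signals
        toxicity: float         # e.g., abusive or harassing language


    @dataclass
    class UserPolicy:
        """Boundaries a user sets on their own information flow (pillar 3)."""
        min_integrity: float = 0.4
        min_trustworthiness: float = 0.5
        max_toxicity: float = 0.3


    def allowed(score: ContentScore, policy: UserPolicy) -> bool:
        """Return True if the content falls within the user's own boundaries."""
        return (
            score.integrity >= policy.min_integrity
            and score.trustworthiness >= policy.min_trustworthiness
            and score.toxicity <= policy.max_toxicity
        )


    # A feed filtered by the user's policy rather than by an opaque platform rule.
    feed = [
        ("verified news report", ContentScore(0.9, 0.8, 0.1)),
        ("anonymous rage post", ContentScore(0.3, 0.2, 0.7)),
    ]
    policy = UserPolicy()
    print([title for title, score in feed if allowed(score, policy)])
    # -> ['verified news report']

The point of the sketch is the locus of control: the scores come from public, shared classifiers, but the filtering decision is made against limits the user has chosen, which is what supports the virtuous cycle of incentives described below.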

This proposal, then, is not a call for more regulation. Instead, it advocates for relevant industry leaders, regulators, and civil society in each country to reexamine their strategies for addressing online challenges. In each country the mix of stakeholders will be somewhat different - in some countries, government will lead; in others, private-sector firms or NGOs can drive change.

In the short term, these four levers can help platforms strengthen safeguards against misinformation, hate speech, harms to youth and other abuses. Longer term, their broader impact lies in empowering users to understand and guardrail their own information flows. A virtuous cycle of incentives should emerge as users themselves filter out obnoxious and untrustworthy content, organically ‘downranking’ it. This restructuring of incentives and rebalancing of power is, in our view, the only globally scalable and sustainable path to a more trustworthy social media ecosystem.
