How much worse is misuse risk from open foundation models?
Can we measure whether open foundation models actually increase misuse risk beyond what bad actors could already accomplish with existing technology? Current research hasn't adequately answered this question across cyber, biotech, and information warfare domains.
The open-vs-closed release debate is heated and under-evidenced. This position paper clarifies it by defining open foundation models as models with broadly available weights (for example Llama 2 and Stable Diffusion XL) and identifying five distinctive properties (greater customizability, deeper inspectability, poor monitoring, etc.) that drive both their benefits (innovation, competition, distributed decision-making power, transparency) and their risks. Its analytical contribution is a marginal-risk framework: assess misuse not in absolute terms but relative to pre-existing technology such as search engines and prior models. Applying the framework across misuse vectors (cyberattacks, bioweapons, disinformation), the paper finds current research insufficient to characterize the marginal risk, and shows that past disagreements stem from focusing on different parts of the framework under different assumptions.
The keeper is the marginal reframing: the policy question is not "could an open model help a bad actor?" but "how much does it help beyond what they could already do?" — and on that question the evidence is mostly missing, which is itself the finding.
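The marginal reframing can be sketched as a simple difference against a baseline. The snippet below is a minimal illustration, not the paper's method: the `RiskEstimate` type, the `marginal_risk` function, and the numbers are all hypothetical placeholders, and a missing estimate is returned as `None` to mirror the point that the evidence is often absent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RiskEstimate:
    """Hypothetical container for one misuse vector (e.g. disinformation).

    Each field is an estimated probability of a harmful outcome, or None
    when no credible measurement exists.
    """
    with_open_model: Optional[float]
    with_existing_tech: Optional[float]  # baseline: search engines, prior models


def marginal_risk(est: RiskEstimate) -> Optional[float]:
    """Return risk added beyond the baseline, or None if evidence is missing."""
    if est.with_open_model is None or est.with_existing_tech is None:
        return None  # cannot characterize marginal risk without both sides
    return est.with_open_model - est.with_existing_tech


# Illustrative placeholder numbers only; they are not drawn from any study.
measured = RiskEstimate(with_open_model=0.12, with_existing_tech=0.10)
unmeasured = RiskEstimate(with_open_model=0.12, with_existing_tech=None)

print(marginal_risk(measured))    # small positive difference
print(marginal_risk(unmeasured))  # None: the baseline was never measured
```

The design choice worth noting is the `None` branch: an absolute figure for the open model alone says nothing about marginal risk until the baseline has been measured as well.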
This is a discourse/governance anchor for the vault. It complements the empirical risk register in "Where do frontier AI models actually pose the greatest risk today?": both insist on measured marginal risk over speculation. It also informs the open-weights side of the alignment-and-society conversation.
Related concepts in this collection
- Where do frontier AI models actually pose the greatest risk today?
  Current AI safety discourse focuses on autonomous R&D and self-replication, but empirical risk assessment may reveal a different priority. Where should mitigation efforts concentrate?
  Both demand measured marginal risk over speculative misuse narratives.
- How soon do AI researchers expect artificial general intelligence?
  A survey of 2,778 AI researchers reveals how expert timelines for human-level AI have shifted over the past year, and what factors drive disagreement among specialists on this critical timeline.
  The broader risk-discourse context where open-model debates sit.
Related papers in this collection
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- On the Societal Impact of Open Foundation Models
- Agentic Misalignment: How LLMs Could Be Insider Threats
- Foundation Priors
- Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report
- Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
- LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring
- Seemingly Conscious AI Risks
- The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History?
Original note title
open foundation models need a marginal-risk framework because current evidence cannot characterize their misuse risk relative to existing technology