What stops language models from improving themselves autonomously?

How LLMs function as agents, align with values, and attempt self-improvement within structural limits.

Topic Hub · 9 linked notes · 2 sections
Sub-Topic Maps

2 notes

Why do multi-agent systems fail despite individual capability?

Multi-agent systems often underperform individual models despite coordinating multiple reasoning instances. What structural failures emerge when multiple LLMs deliberate together, and what ecosystem conditions are required for effective autonomous cooperation?

What actually constrains AI systems from behaving badly?

Explores whether alignment comes from matching human preferences, adopting normative standards, or confronting fundamental limits such as the generation-verification gap. Examines how safety evaluation reveals whether constraints are real or merely performative.
