Bottom-up Domain-specific Superintelligence: A Reliable Knowledge Graph is What We Need

Paper · arXiv 2507.13966 · Published July 18, 2025
Knowledge Graphs · Domain Specialization · Novel Architectures

Language models, traditionally used for cross-domain generalization in natural language understanding and generation, have recently demonstrated task-specific reasoning through inference-time scaling. However, their top-down training approach on general text corpora is insufficient for acquiring the domain-specific abstractions required for deep expertise in a particular domain. This may require a bottom-up approach that acquires deep expertise by explicitly learning to compose simple concepts of a domain into more complex ones. A knowledge graph (KG) provides such an abstraction, where domain primitives are captured by head-relation-tail triples. A KG path formed by such triples captures a higher-level concept. We present a task generation pipeline that directly synthesizes tasks from these domain-specific primitives, enabling the model to explicitly acquire and compose them for reasoning. We fine-tune language models on the resultant bottom-up KG-grounded curriculum to demonstrate domain-specific superintelligence. Although our approach is readily applicable to a wide variety of domains, we validate it in the context of medicine, where reliable KGs are available. Applying our proposed pipeline to a medical KG, we curate a dataset of 24,000 high-quality reasoning tasks paired with structured thinking traces derived from diverse medical primitives. We fine-tune the QwQ-32B model on this bottom-up curriculum to obtain QwQ-Med-3, which takes a step towards medical superintelligence. We also introduce an evaluation suite, ICD-Bench, to quantify domain-specific capabilities of models on reasoning tasks across 15 medical domains. Our experiments demonstrate that QwQ-Med-3 significantly outperforms state-of-the-art open-source and proprietary reasoning models on all categories of ICD-Bench.

The industry’s approach to artificial general intelligence (AGI) centers on breadth of acquired expertise. By contrast, we envision a future in which a compositional model of AGI emerges from interacting superintelligent agents, much as human society hierarchically builds ever deeper expertise by combining the expertise of individuals in adjacent domains or super-domains. Furthermore, since language models fine-tuned for superintelligence can be relatively small (e.g., 32B parameters), this bottom-up approach may also significantly cut training and inference energy costs.

Recent advances in language modeling [1–8] have made significant strides towards a cognitive system [9, 10] capable of performing a wide spectrum of tasks with human-like proficiency [11–14]. Yet human-level generality may only be a waypoint on the path to advanced intelligent systems that exceed the cognitive performance of humans: superintelligence [15, 16]. While achieving the breadth of human cognition is one goal of advanced artificial intelligence, superintelligence might be orthogonally characterized by depth: outperforming the best human experts in specialized domains [17–23], such as proving unsolved conjectures in number theory, developing novel kinase inhibitors for rare cancer subtypes, or discovering new ferromagnetic semiconductors that operate at room temperature. Consequently, advancing towards superintelligence might require fine-tuning general cross-domain intelligence into specialized domain-specific expertise.

Language models (LMs) have achieved remarkable cross-domain performance in natural language understanding and generation, largely through scaled pre-training [24, 25]. Most recently, scaling inference-time compute [26–28] in pre-trained models, via reinforcement learning [29] or post-training on high-quality data [30], has been shown to elicit deeper task-specific reasoning. This emergent capacity for specialized reasoning within generalist LMs suggests that they could serve as a foundation for scaling toward superintelligent specialists [31]. However, these models are fundamentally limited by their top-down approach to learning: they acquire general abstractions of the world through self-supervised learning on vast datasets that may predominantly capture surface-level regularities of a domain [32–37]. Acquiring deep expertise in a field instead necessitates a bottom-up understanding: starting with axioms that capture fundamental relationships among the concepts of a domain and composing them to build upwards to a higher-order understanding [38–40]. This kind of bottom-up organization is difficult to find in, and acquire from, Internet-derived general text corpora. For example, a student builds expertise by following the pedagogical structure of a textbook, beginning with foundational chapters and gradually progressing to more advanced ones, not by merely reading encyclopedic summaries. Past pioneering works in neuro-symbolic reasoning [41] and probabilistic graph inference [42] have attempted to develop hierarchical domain expertise from primitives but have failed to generalize beyond synthetic regimes. Conversely, LMs generalize remarkably well but lack grounding in structured knowledge. This motivates the central question of our work:

Can explicitly training LMs on structured domain knowledge via a bottom-up curriculum elicit the emergence (if any) of a domain-specific superintelligence?

Naturally, the question then arises: how do we organize domain knowledge into a structured curriculum from which an LM can effectively learn? Knowledge graphs (KGs) [43] offer a useful scaffold for this. A KG organizes information as a rich graph database in which nodes represent semantically meaningful entities of the domain and edges denote the relationships between them. Each edge typically captures a primitive relation as a (head entity, relation, tail entity) triple. For example, (Methane, Contains Element, Carbon) encodes the axiomatic fact that methane molecules contain carbon atoms. Edges further support composite relational reasoning via multi-hop paths, i.e., chains of interconnected edges. For example, the path (Methane, Contains Bond, C-H Bond), (C-H Bond, Is Type Of, Sigma Bond), (Sigma Bond, Has Property, Single Covalent Bond) captures the bonding structure of methane, where C-H bonds are sigma bonds with the property of being single covalent bonds. A KG comprises many such paths, and their local topology naturally induces a bottom-up curriculum that begins with atomic relations and composes them into more complex reasoning chains, as the sketch below illustrates.
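To make this curriculum intuition concrete, here is a minimal Python sketch, not the paper's actual pipeline: the toy triples mirror the methane example above, and the `enumerate_paths` helper and hop-count ordering are illustrative assumptions about how path length could induce a bottom-up ordering from axiomatic facts to higher-level concepts.

```python
# Minimal sketch (illustrative, not the authors' pipeline): compose KG
# triples into multi-hop paths, then order paths by hop count so that
# axiomatic 1-hop relations precede more complex reasoning chains.

from collections import defaultdict

# Primitive relations as (head entity, relation, tail entity) triples.
triples = [
    ("Methane", "Contains Element", "Carbon"),
    ("Methane", "Contains Bond", "C-H Bond"),
    ("C-H Bond", "Is Type Of", "Sigma Bond"),
    ("Sigma Bond", "Has Property", "Single Covalent Bond"),
]

# Index outgoing edges by head entity for traversal.
adjacency = defaultdict(list)
for head, relation, tail in triples:
    adjacency[head].append((relation, tail))

def enumerate_paths(entity, max_hops):
    """Depth-first enumeration of all paths of 1..max_hops edges from entity."""
    stack = [(entity, [])]
    while stack:
        node, path = stack.pop()
        if path:
            yield path
        if len(path) < max_hops:
            for relation, tail in adjacency[node]:
                stack.append((tail, path + [(node, relation, tail)]))

# Sorting by hop count yields the bottom-up curriculum: 1-hop paths are
# domain primitives; longer paths are concepts composed from them.
curriculum = sorted(enumerate_paths("Methane", max_hops=3), key=len)
for path in curriculum:
    chain = " -> ".join(f"{h} [{r}] {t}" for h, r, t in path)
    print(f"{len(path)}-hop: {chain}")
```

Ordered this way, the 1-hop paths supply axiomatic facts, while the longer paths supply the compositional concepts from which progressively harder reasoning tasks can be synthesized.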