EVINCE: Optimizing Multi-LLM Dialogues Using Conditional Statistics and Information Theory

Paper · arXiv 2408.14575 · Published August 26, 2024
Argumentation

EVINCE (Entropy and Variation IN Conditional Exchanges) is a novel framework for optimizing multi-LLM dialogues using conditional statistics and information theory. It addresses limitations of multi-agent system (MAS) debate frameworks, in which multiple LLMs “chat” without behavior modulation or any assessment of the quality of the information they exchange. Using dual entropy optimization to balance perspective diversity against prior knowledge, EVINCE provides quantitative tools for dynamically regulating LLM linguistic behavior. When mutual information is low and both cross-entropy and Wasserstein distance are high, EVINCE promotes contentious dialogue to surface diverse perspectives and expose inconsistencies. Conversely, as cross-entropy decreases and mutual information stabilizes, it transitions the discussion into a conciliatory phase, encouraging compromise and the acknowledgment of valid points. By applying information-theoretic metrics and optimizing mutual information, EVINCE emerges as a structured and highly effective framework for multi-LLM collaboration.
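The phase-switching logic above can be sketched in code. The function names and threshold values below are illustrative assumptions, not taken from the paper; the sketch only shows how low mutual information combined with high cross-entropy and Wasserstein distance could trigger a contentious phase, with a conciliatory phase otherwise.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_i p_i log q_i between two stance distributions."""
    q = np.clip(q, eps, None)
    return float(-np.sum(p * np.log(q)))

def mutual_information(joint):
    """I(X; Y) from a joint distribution over two agents' stance choices."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of agent X
    py = joint.sum(axis=0, keepdims=True)   # marginal of agent Y
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))

def dialogue_phase(p, q, joint, mi_lo=0.1, ce_hi=1.0, wd_hi=0.2):
    """Pick the dialogue phase from the three signals (thresholds are
    hypothetical and would need tuning in practice)."""
    mi = mutual_information(joint)
    ce = cross_entropy(p, q)
    # Treat stance indices as points on a line for the 1-D Wasserstein distance.
    wd = wasserstein_distance(np.arange(len(p)), np.arange(len(q)), p, q)
    if mi < mi_lo and ce > ce_hi and wd > wd_hi:
        return "contentious"   # agents disagree sharply and share little information
    return "conciliatory"      # distributions have begun to align
```

Two strongly disagreeing, statistically independent agents would land in the contentious phase; two agents with matching, correlated stances would land in the conciliatory phase.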

  1. Inclusive Exploration: We develop methods to ensure dialogues comprehensively explore diverse perspectives. Using conditional statistics, we adjust an LLM agent’s behavior beyond its default “maximum likelihood” next-token prediction, enabling it to adopt specific stances via in-context learning. To balance idea exploration with adherence to prior knowledge, we introduce a dual entropy optimization framework, improving information exchange for richer discourse.
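One plausible reading of the dual entropy trade-off is an objective that rewards a high-entropy (exploratory) stance distribution while penalizing divergence from a prior-knowledge distribution. This formulation and the weight `lam` are our assumptions for illustration, not the paper's exact objective.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy H(p) in nats."""
    p = np.clip(p, eps, None)
    return float(-np.sum(p * np.log(p)))

def kl(p, q, eps=1e-12):
    """KL divergence D(p || q)."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def dual_entropy_objective(p, prior, lam=1.0):
    """Score a stance distribution: favor diversity (high entropy) while
    staying close to prior knowledge (low KL from the prior)."""
    return entropy(p) - lam * kl(p, prior)
```

Under this sketch, a uniform stance over three options with a uniform prior scores ln 3, while a sharply peaked stance scores lower on both terms.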

  2. Information Flow Dynamics: We quantify and optimize dialogue interactions using information-theoretic metrics, measuring information diversity (entropy), novelty (divergence scores), and inter-agent persuasion (mutual information). These metrics enhance the quality and efficiency of information flow in multi-agent settings, fostering richer, more productive exchanges.
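As a minimal sketch of the novelty signal mentioned above, one could track the divergence between an agent's stance distributions in consecutive rounds; the use of Jensen-Shannon distance here is our assumption (the paper mentions divergence scores without committing to this helper).

```python
from scipy.spatial.distance import jensenshannon

def novelty_scores(rounds):
    """Per-round novelty: Jensen-Shannon distance between consecutive stance
    distributions. Decaying scores suggest the dialogue is converging."""
    return [float(jensenshannon(rounds[i - 1], rounds[i]))
            for i in range(1, len(rounds))]
```

A sequence of distributions that settles down yields monotonically shrinking novelty, hitting zero once the agent stops changing its stance.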

  3. Reasoning Quality and Coherence: We establish frameworks to evaluate the logical structure and coherence of multi-agent reasoning. This includes assessing argument validity, analytical depth, and dialogue consistency. We integrate the CRIT algorithm (Chang, 2023b), which combines Socratic methods with formal reasoning, to enhance argument evaluation and ensure logically sound, goal-oriented discourse.
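The kind of argument evaluation described above might aggregate per-reason scores into a single quality measure. The sketch below is hypothetical: `Reason`, `argument_score`, and the discount-by-strongest-rebuttal rule are our illustrative assumptions, not the actual CRIT algorithm (Chang, 2023b).

```python
from dataclasses import dataclass

@dataclass
class Reason:
    text: str
    validity: float     # how well the reason supports the claim, in [0, 1]
    credibility: float  # how trustworthy its source/evidence is, in [0, 1]

def argument_score(reasons, counter_reasons=()):
    """Hypothetical CRIT-style aggregate: mean support from the reasons,
    discounted by the single strongest counter-argument."""
    if not reasons:
        return 0.0
    support = sum(r.validity * r.credibility for r in reasons) / len(reasons)
    rebuttal = max((r.validity * r.credibility for r in counter_reasons),
                   default=0.0)
    return max(0.0, support - rebuttal)
```

Weighting each reason by both validity and credibility, and letting a strong rebuttal pull the score down, mirrors the Socratic back-and-forth the contribution describes.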