Reading List

What we're reading

Weekly Gen AI headlines for builders, plus the papers that define the field. Curated by Koobo, refreshed weekly by an AI agent.

Last updated today by Koobo Content Agent · 11 refreshes completed

Weekly Headlines

Week of March 16 (from last week)

Simon Willison · Mar 13

Builders can now process entire codebases, documentation libraries, or multi-hour video transcripts in a single prompt. This unlocks agentic workflows requiring massive context retention.

The Verge AI · Mar 12

Anthropic adds visual output capabilities enabling automated data visualization, architecture diagrams, and flowchart generation directly in chat. Reduces need for external rendering tools.

TechCrunch AI · Mar 16

Frore's liquid-cooling tech enables denser GPU packaging and higher sustained performance for AI training clusters. Critical for builders scaling on-premise or dedicated infrastructure.

TechCrunch AI · Mar 16

Major copyright lawsuit alleges OpenAI memorized 100K+ articles for model training. Builders training or fine-tuning models face escalating legal risks around copyrighted training data.

Hacker News · Mar 15

Comprehensive visual reference covering transformer architectures, attention mechanisms, and model topologies. Essential reference for builders architecting AI systems.

Simon Willison · Mar 16

Detailed technical guide covering agentic engineering patterns, tool use implementations, and reliability strategies. Practical patterns for building autonomous coding agents.

Hacker News · Mar 16

New agent interface promises significantly lower token consumption than Model Context Protocol while maintaining tool-calling capabilities. Worth evaluating for cost-sensitive agent deployments.

Curated weekly by Koobo Content Agent

Groundbreaking

Recent breakthroughs that changed the landscape.

2026 · 0 citations

Scaling Verification Can Be More Effective than Scaling Policy Learning for Vision-Language-Action Alignment

Jacky Kwok et al.

The paper establishes that scaling test-time verification provides superior alignment improvements compared to scaling policy learning in Vision-Language-Action models for robotic control. By characterizing test-time scaling laws for embodied instruction following, the authors demonstrate that verification mechanisms can effectively mitigate the intention-action gap without necessitating proportional increases in training compute for base models. This finding shifts the efficiency frontier toward inference-time optimization, offering a more resource-effective pathway to reliable natural language grounding in general-purpose robotics systems.

AI
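The test-time verification idea above reduces, in its simplest form, to best-of-N selection: sample several candidate actions and keep the one a verifier scores highest. A minimal sketch, assuming a hypothetical `policy` sampler and `verifier` scorer rather than the paper's actual VLA components:

```python
# Best-of-N selection sketch: `policy` and `verifier` are illustrative
# callables, not the paper's actual models.
def best_of_n(policy, verifier, observation, n=8):
    """Sample n candidate actions and return the one the verifier scores highest."""
    candidates = [policy(observation) for _ in range(n)]
    return max(candidates, key=lambda action: verifier(observation, action))
```

Scaling `n` trades inference-time compute for reliability, which is the axis the paper's scaling laws characterize.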
2026 · 0 citations

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Leon Liangyu Chen et al.

UniT extends test-time scaling to unified multimodal architectures by implementing chain-of-thought reasoning that enables iterative decomposition and verification during inference rather than single-pass generation. This addresses the fundamental limitation of static output production in unified models, allowing them to handle complex spatial compositions and evolving instructions through dynamic computation allocation. The work establishes a methodological framework for scaling inference-time compute in multimodal systems, shifting the field toward test-time reasoning strategies previously limited to unimodal language models.

AI
2026 · 12 citations

Agentic Reasoning for Large Language Models

Tianxin Wei et al.

Comprehensive survey organizing agentic reasoning into three layers: foundational (planning, tool-use, search), self-evolving (adaptation through feedback and memory), and collective (multi-agent coordination and role specialization). Bridges in-context reasoning with post-training approaches across science, robotics, healthcare, and mathematics applications. Accompanied by an actively maintained Awesome-Agentic-Reasoning GitHub repository.

agents · reasoning · survey
2026 · 0 citations

From Fluent to Verifiable: Claim-Level Auditability for Deep Research Agents

Research Team

Identifies the 'Mirage of Synthesis' problem in deep research agents, where strong surface-level fluency and citation alignment can obscure factual and reasoning defects in AI-generated reports. Proposes claim-level auditability as the evaluation standard, revealing that agents exhibit goal drift scores ranging from 0.25 to 0.93 when exposed to competing objectives. Essential reading for builders deploying research automation.

agents · safety · benchmarks
2026 · 0 citations

PhysMoDPO: Physically-Plausible Humanoid Motion with Preference Optimization

Yangsong Zhang et al.

Existing methods for physics-compliant humanoid motion generation rely on Whole-Body Controllers (WBC) that introduce substantial deviations from originally generated motions when converting diffusion outputs into executable trajectories. This paper proposes PhysMoDPO, which applies Direct Preference Optimization to align diffusion models with physical constraints during training rather than during inference, enabling direct generation of physically plausible motions without fidelity loss. The approach eliminates the trade-off between physical compliance and motion quality, providing a scalable pathway for deploying text-conditioned motion models on real humanoid robots and animation systems.

AI
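For context, the standard pairwise DPO objective that PhysMoDPO builds on can be stated compactly. A minimal sketch of the vanilla DPO loss for one preferred/rejected pair (not the paper's diffusion-specific variant):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one preferred (w) / rejected (l) pair.

    logp_* are summed log-probabilities under the policy being trained;
    ref_logp_* are the same quantities under the frozen reference model.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

When the policy matches the reference exactly, the margin is zero and the loss is log 2; increasing the likelihood gap in favor of the preferred sample drives the loss down.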
2026

Representation Learning for Spatiotemporal Physical Systems

Helen Qu et al.

This paper challenges the dominant paradigm of building next-frame prediction emulators for spatiotemporal physical systems, which suffer from compounding errors during autoregressive rollout and high training costs. Instead, the authors propose learning representations directly optimized for downstream scientific tasks such as parameter estimation, bypassing the need for expensive long-term trajectory simulation. This shift enables more efficient and robust scientific inference on physical systems where traditional emulation approaches prove computationally prohibitive or inaccurate over extended time horizons.

AI
2026 · 0 citations

Visual-ERM: Reward Modeling for Visual Equivalence

Ziyu Liu et al.

This paper identifies a critical limitation in vision-to-code reinforcement learning: existing reward signals based on textual rules or coarse visual embeddings fail to capture fine-grained visual equivalence, hindering model training. It proposes Visual-ERM, a reward modeling approach designed to provide precise feedback on structural and aesthetic fidelity for tasks such as chart, table, and SVG reconstruction. By enabling effective reinforcement learning fine-tuning where supervised methods plateau, the work addresses a key barrier to achieving high-fidelity visual generation in structured output tasks.

2026 · 0 citations

Neuron-Aware Data Selection In Instruction Tuning For Large Language Models

Xin Chen et al.

This paper introduces a neuron-aware data selection framework for instruction tuning that identifies optimal training subsets by analyzing neural activation patterns, addressing the inefficiency of using exhaustive datasets that can degrade LLM performance. By selecting data based on specific neuronal responses rather than dataset scale, the method enables targeted capability development while reducing computational costs and avoiding the performance degradation associated with excessive training data. The work establishes a mechanistic approach to curriculum design that allows practitioners to efficiently develop specific or general abilities in large language models using minimal, high-quality instruction data.

AI
2026 · 0 citations

From Experiments to Expertise: Scientific Knowledge Consolidation for AI-Driven Computational Research

Haonan Huang

This paper identifies a fundamental gap in AI-driven computational science, where current systems execute simulations in isolation without accumulating expertise. It introduces a knowledge consolidation framework that enables AI agents to learn from failed approaches, recognize patterns across material systems, and transfer accumulated understanding to novel problems. By shifting the paradigm from isolated task execution toward progressive expertise development, the work establishes a methodological foundation for AI systems capable of genuine research rather than routine simulation.

AI
2026 · 0 citations

LLM Constitutional Multi-Agent Governance

J. de Curtò, I. de Zarzà

The paper confronts a fundamental risk in LLM-mediated multi-agent systems: distinguishing authentic cooperative alignment from influence strategies that compromise agent autonomy, epistemic integrity, and fairness. It introduces Constitutional Multi-Agent Governance (CMAG), a two-stage framework that interposes constitutional constraints between LLM policy compilers and agent populations to safeguard against coercive cooperation. This establishes a necessary governance architecture for deploying persuasive LLM strategies in multi-agent environments without eroding autonomous decision-making or distributional equity.

AI
2026

WorldCache: Content-Aware Caching for Accelerated Video World Models

Umair Nawaz et al.

WorldCache addresses artifact-inducing limitations of Zero-Order Hold feature caching in video Diffusion Transformers by introducing content-aware mechanisms that compensate for global drift during sequential denoising. The method dynamically adjusts cached intermediate activations based on motion and scene changes rather than reusing static snapshots, eliminating ghosting and blur without requiring model retraining. This enables inference acceleration for high-fidelity video world models while preserving temporal consistency, reducing computational costs for practical deployment.

AI
2026

End-to-End Training for Unified Tokenization and Latent Denoising

Shivam Duggal et al.

UNITE introduces an end-to-end trainable architecture that unifies tokenization and latent denoising for diffusion models, eliminating the need for complex staged training with frozen tokenizers. By employing a Generative Encoder with shared weights to simultaneously handle image tokenization and latent generation, the method removes the constraint of training diffusion models in fixed latent spaces. This unified approach simplifies the training pipeline while maintaining high-fidelity synthesis capabilities, offering a more efficient paradigm for developing latent diffusion systems.

AI
2026

UniMotion: A Unified Framework for Motion-Text-Vision Understanding and Generation

Ziyi Wang et al.

UniMotion introduces the first unified architecture capable of simultaneous understanding and generation across human motion, natural language, and RGB images within a single model. By overcoming the quantization errors and temporal discontinuity inherent in discrete tokenization approaches, it establishes a continuous representation framework for motion-centric multimodal learning. This integration eliminates the need for separate task-specific architectures while enabling bidirectional translation between motion sequences, textual descriptions, and visual inputs.

AI
2026

ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model

Haichao Zhang et al.

This paper addresses the limitation of short-horizon, low-level prediction in latent world models by integrating large vision-language reasoning with predictive architectures such as V-JEPA2. The approach enables long-horizon semantic forecasting by leveraging VLMs for abstract reasoning while maintaining the computational efficiency of latent dynamics models. This integration advances world model capabilities beyond local pixel extrapolation toward high-level temporal understanding, with direct implications for improving planning and decision-making in robotics applications.

AI
2026

3D-Layout-R1: Structured Reasoning for Language-Instructed Spatial Editing

Haoyu Zhen et al.

This paper addresses the limitation of large language and vision-language models in maintaining spatial consistency during fine-grained visual editing by introducing a structured reasoning framework that operates over scene graphs. By reformulating text-conditioned spatial editing as explicit graph reasoning rather than end-to-end generation, the method enables precise manipulation of object layouts through natural language instructions while preserving geometric coherence. The work establishes structured scene-graph reasoning as a necessary intermediate representation for bridging high-level linguistic commands with geometrically consistent spatial editing in 3D environments.

AI
2026 · 4 citations

Towards Verifiably Safe Tool Use for LLM Agents

A. Doshi et al.

This paper addresses the inadequacy of probabilistic safeguards for preventing high-consequence tool misuse—such as sensitive data leakage or critical record overwrites—in enterprise LLM agent deployments. It introduces a framework for verifiably safe tool use that provides formal guarantees regarding agent behavior, shifting security paradigms from statistical risk mitigation to provable safety properties. By enabling deterministic constraints on tool interactions, the work removes a primary barrier to adopting autonomous LLM agents in regulated industries and critical infrastructure where current heuristic protections remain insufficient.

AI
2026

Vega: Learning to Drive with Natural Language Instructions

Sicheng Zuo et al.

Vega addresses a critical limitation in vision-language-action driving models by enabling vehicles to follow diverse natural language instructions for personalized behavior, rather than merely performing scene description or reasoning. The work introduces InstructScene, a dataset of approximately 100,000 driving scenes annotated with granular language instructions and corresponding trajectories, providing the necessary scale to train instruction-following capabilities. By connecting linguistic commands directly to driving actions, this research enables practical user-customizable autonomy where passengers can specify preferences like route selection or driving conservatism through natural language.

AI
2026

Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving

Zehao Wang et al.

This paper advances autonomous driving by introducing preference-aligned vision-language-action models that adapt to individual driving habits and interpret natural language intentions rather than optimizing for generic objectives or fixed modes. By enabling end-to-end systems to learn personalized behaviors for acceleration, braking, and maneuvering, the work addresses the fundamental gap between uniform automation and the inherent variability of human driving styles. The approach provides a technical foundation for autonomous vehicles that accommodate diverse driver comfort levels and risk tolerances, shifting the field toward customizable user experiences rather than one-size-fits-all behavior policies.

AI
2026

Training the Knowledge Base through Evidence Distillation and Write-Back Enrichment

Yuxing Lu et al.

This paper challenges the static nature of conventional retrieval-augmented generation systems by proposing a trainable knowledge base that evolves through evidence distillation. The WriteBack-RAG framework identifies successful retrieval patterns from labeled examples to isolate relevant document fragments and compress them into compact, indexed knowledge units. This approach enables dynamic enrichment of the knowledge base while reducing noise from irrelevant content, addressing the persistent problem of fragmented information distribution across source documents.

AI
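The write-back enrichment step described above can be sketched as a small loop: after a successful retrieval, compress the fragments that actually supported the answer into a compact unit and index it for reuse. All names here are illustrative assumptions, not the paper's concrete pipeline:

```python
# Hypothetical write-back step: distill supporting fragments into a compact
# knowledge unit and store it under the query for future retrieval.
def write_back(index, query, retrieved, supported):
    """Store a distilled knowledge unit built from the supporting fragments."""
    unit = " ".join(frag for frag in retrieved if frag in supported)
    if unit:
        index.setdefault(query, []).append(unit)
    return index
```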
2026

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Xiaofeng Mao et al.

PackForcing addresses the intractable linear KV-cache growth in autoregressive video diffusion models through a three-partition caching strategy that categorizes historical context into sink tokens and other partitions. The framework enables long-video generation and long-context inference using only short video training data, eliminating the dependency on scarce long-duration datasets while mitigating temporal repetition and compounding errors. This reduces memory overhead during sampling, making extended video generation viable with limited training resources.

AI
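The sink-token idea behind the three-partition cache can be illustrated with a toy splitter: keep a few early "sink" tokens, a compressible middle, and a recent window. Partition names and sizes below are illustrative assumptions, not the paper's exact scheme:

```python
# Toy three-partition KV-cache split; assumes len(tokens) >= n_sink + n_recent.
def partition_cache(tokens, n_sink=4, n_recent=8):
    """Split a cached token sequence into sink / middle / recent partitions."""
    sink = tokens[:n_sink]
    middle = tokens[n_sink:len(tokens) - n_recent]
    recent = tokens[len(tokens) - n_recent:]
    return sink, middle, recent
```

Only the middle partition needs compression or eviction as the sequence grows, which is what keeps memory bounded during long rollouts.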
2026

PixelSmile: Toward Fine-Grained Facial Expression Editing

Jiabin Hua et al.

PixelSmile introduces a diffusion framework utilizing fully symmetric joint training to disentangle expression semantics, resolving the intrinsic semantic overlap that has historically constrained fine-grained facial expression editing. The work establishes the Flex Facial Expression (FFE) dataset with continuous affective annotations and introduces FFE-Bench, which evaluates structural confusion, editing accuracy, linear controllability, and identity preservation trade-offs. These contributions provide standardized benchmarks and methodologies necessary for achieving precise, continuous control over facial expressions in digital imaging applications.

facial-expression-editing · image-generation · fine-grained-control
2025 · 5,410 citations

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

DeepSeek AI

Trained for an estimated $6 million, DeepSeek-R1 matched OpenAI o1's reasoning capabilities and was released under the MIT license. Validated that frontier-level reasoning can be achieved through RL without expensive supervised fine-tuning, fundamentally altering the economics of AI development.

reasoning · reinforcement-learning · efficiency · open-source
2025 · 59 citations

Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

Yi Peng et al.

Skywork R1V introduces an efficient multimodal transfer method that extends R1-series reasoning models to visual tasks using only a lightweight visual projector, avoiding the computational cost of retraining either the vision encoder or language backbone. The proposed hybrid optimization strategy, centered on iterative supervised fine-tuning, achieves robust visual-text alignment while preserving the model's chain-of-thought reasoning capabilities. This work establishes a practical framework for retrofitting existing large language models with multimodal reasoning abilities without architectural modifications or extensive resource investment.

AI
2025 · 46 citations

SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning

Yuecheng Liu et al.

SpatialCoT introduces a coordinate-aligned chain-of-thought framework that bridges the gap between high-level spatial reasoning and low-level action execution in embodied AI systems. By aligning coordinate-based action spaces with structured reasoning processes, the method overcomes the limitations of purely language-based spatial descriptions and simple point-based approaches in complex environments. This work provides a concrete methodology for integrating explicit spatial representations with chain-of-thought reasoning, advancing the field's capacity for intricate embodied task planning.

AI
2025 · 30 citations

LLM Agents Making Agent Tools

Georg Wölflein et al.

This work addresses the scalability limitations of LLM agents by enabling autonomous generation of domain-specific tools rather than relying exclusively on pre-implemented human code. The authors demonstrate that their ToolMaker framework allows agents to create specialized software utilities dynamically, significantly expanding applicability in tool-intensive fields such as life sciences and medicine. This advancement reduces the manual engineering burden required to deploy LLM agents in specialized domains and establishes a pathway toward fully self-sufficient agent systems capable of extending their own capabilities.

AI
2025 · 25 citations

RTBAS: Defending LLM Agents Against Prompt Injection and Privacy Leakage

Peter Yong Zhong et al.

This paper introduces RTBAS, a defense framework that protects tool-based LLM agents against prompt injection attacks and privacy leakage without requiring user confirmation for every tool call. By automating security safeguards for systems that execute external actions such as financial transactions, RTBAS eliminates the usability burden inherent in existing defenses like OpenAI GPTs while mitigating risks of malicious hijacking and data exposure.

AI
2025 · 65 citations

Red-Teaming LLM Multi-Agent Systems via Communication Attacks

Pengfei He et al.

This paper exposes a fundamental vulnerability in LLM-based Multi-Agent Systems by introducing Agent-in-the-Middle (AiTM), a novel attack vector that compromises multi-agent coordination through interception and manipulation of inter-agent communications rather than direct model exploitation. By demonstrating that message-based collaboration protocols introduce a distinct attack surface, the research establishes critical security requirements for communication infrastructure in deployed LLM-MAS applications.

AI
2025 · 50 citations

Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems

Shaokun Zhang et al.

This work formalizes automated failure attribution as a new research direction for LLM multi-agent systems, transforming debugging from a manual, labor-intensive process into a structured analytical task. The authors introduce the Who&When dataset comprising 127 multi-agent systems with fine-grained annotations identifying which specific agents and execution steps cause failures, establishing the first benchmark for this problem. By enabling systematic pinpointing of failure points rather than ad-hoc log inspection, this foundation allows developers to target remediation efforts and improve complex agent workflows with measurable precision.

AI
2025 · 51 citations

Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems

Bingyu Yan et al.

This survey reorients LLM-based multi-agent systems research by establishing communication—not architecture or application domain—as the primary analytical lens for understanding agent coordination. By categorizing systems according to their information exchange protocols, network topologies, and interaction mechanisms, the paper provides a concrete taxonomy that enables systematic comparison and design of collaborative AI systems. The framework addresses a significant gap in existing literature and offers practical guidance for improving multi-agent coordination in complex problem-solving environments.

AI
2025 · 39 citations

TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems

Shaina Raza et al.

This review establishes a systematic Trust, Risk, and Security Management (TRiSM) framework specifically for LLM-based Agentic Multi-Agent Systems, addressing governance gaps that traditional AI security protocols cannot accommodate for autonomous collaborative agents. It categorizes emergent risks unique to agentic architectures—including inter-agent collusion, cascading autonomy failures, and compound hallucinations—providing structured guidelines for enterprise deployment. The framework has garnered significant attention with 36 citations within its publication year, reflecting urgent industry demand for standardized risk management in multi-agent LLM environments.

AI
2025 · 36 citations

AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems

Yingxuan Yang et al.

AgentNet introduces a decentralized coordination architecture that resolves scalability bottlenecks and single points of failure inherent in centralized multi-agent LLM systems. By employing evolutionary mechanisms to enable dynamic, task-specific coalition formation while preserving proprietary knowledge, the framework facilitates secure collaboration across organizational boundaries without requiring centralized control. This work establishes that effective coordination among LLM agents can be achieved through distributed architectures, providing a practical foundation for privacy-preserving multi-agent systems at scale.

AI
2025 · 324 citations

Agentic AI: Autonomous Intelligence for Complex Goals—A Comprehensive Survey

D. Acharya, Karthigeyan Kuppan, Divya Bhaskaracharya

This comprehensive survey establishes critical taxonomic distinctions between Agentic AI systems and traditional instruction-dependent architectures, defining standards for autonomous goal pursuit with minimal human intervention. Garnering 324 citations since its 2025 publication, the paper has rapidly become a canonical reference for researchers developing self-sufficient, adaptive AI capable of operating in dynamic environments without continuous oversight.

AI
2025 · 189 citations

Small Language Models are the Future of Agentic AI

Peter Belcák et al.

This paper challenges the prevailing assumption that agentic AI systems require large language models, arguing that small language models (SLMs) are sufficiently capable for the specialized, repetitive tasks characteristic of deployed agents while offering superior computational efficiency. The authors establish that SLMs provide a more economically viable and technically suitable foundation for production agentic systems, redirecting research focus from scale maximization toward task-specific optimization. The work has accumulated 170 citations since its 2025 publication, indicating rapid field adoption of its position regarding the deployment of compact models in enterprise agentic applications.

AI
2025 · 73 citations

Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails

Shaona Ghosh et al.

This paper introduces Aegis2.0, a human-annotated dataset and comprehensive taxonomy that structures LLM safety risks into 12 top-level hazard categories with fine-grained subcategories, addressing the critical shortage of high-quality training data for commercial safety guardrails. By establishing a standardized framework for diverse safety risks, the work enables more systematic alignment and evaluation of LLM guardrails across the full spectrum of potential harms in production environments.

AI
2025 · 64 citations

Agentic AI for Scientific Discovery: A Survey of Progress, Challenges, and Future Directions

Mourad Gridach et al.

This survey establishes a comprehensive taxonomy of agentic AI systems for scientific discovery, cataloging the deployment of autonomous research agents capable of independent reasoning, hypothesis generation, and experimental design across chemistry and biology. By mapping the field's transition from passive analytical tools to closed-loop systems that autonomously plan and execute experiments, the paper provides a structured baseline for evaluating progress in research automation. The work has attracted 60 citations since its 2025 publication, indicating rapid recognition of autonomous AI agents as operational components of scientific workflows.

AI
2025 · 41 citations

Open Problems in Machine Unlearning for AI Safety

Fazl Barez et al.

This paper reframes machine unlearning from a privacy-centric mechanism into a safety-critical tool for controlling dangerous capabilities in advanced AI systems. By systematically cataloging open problems—such as removing hazardous knowledge in cybersecurity and biological domains without degrading general capabilities—the authors establish a concrete research agenda for developing selective forgetting methods that can mitigate catastrophic risks. The work identifies fundamental technical gaps that must be resolved before unlearning can reliably suppress specific dangerous behaviors while maintaining beneficial functionality.

AI
2025 · 39 citations

SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation

Mingjie Li et al.

This paper demonstrates that Low-Rank Adaptation (LoRA) fine-tuning systematically compromises safety alignment in large language models, exposing critical vulnerabilities in widely used parameter-efficient personalization methods. The authors propose SaLoRA, an adaptation method that preserves safety guardrails during fine-tuning while maintaining the computational efficiency of standard LoRA. This work resolves the tension between efficient model customization and safety preservation, enabling secure deployment of personalized language models without requiring full fine-tuning or separate safety training.

AI
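SaLoRA builds on the standard LoRA update, which is worth recalling: the frozen weight W is adapted as W + (alpha / r) · B A, where A is an r × in matrix and B is out × r. A minimal pure-Python sketch of that update (SaLoRA's safety-preserving constraints on top of it are not shown):

```python
# Plain LoRA low-rank weight update; matrices are lists of lists.
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_adapted_weight(W, A, B, alpha=16, r=2):
    """Return W + (alpha / r) * B @ A, the standard LoRA-adapted weight."""
    delta = matmul(B, A)  # (out x in) low-rank update
    scale = alpha / r
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
```

Because only A and B are trained, the update can overwrite safety-relevant directions in W, which is the vulnerability the paper targets.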
2025 · 17 citations

Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies

Manojkumar Parmar, Yuvaraj Govindarajulu

This paper empirically demonstrates that Reinforcement Learning alignment in DeepSeek-R1 models achieves superior reasoning capabilities while exhibiting significant shortcomings in harmlessness reduction compared to Supervised Fine-Tuning, revealing a critical trade-off between reasoning optimization and safety alignment. The authors identify specific failure modes where RL-based strategies inadequately suppress harmful outputs, challenging the efficacy of current RLHF implementations as standalone safety mechanisms for advanced reasoning models. These findings indicate that open-weight reasoning architectures require complementary safety interventions beyond standard RL alignment to reliably prevent harmful generation without compromising reasoning performance.

AI
2025 · 223 citations

AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges

Ranjan Sapkota, Konstantinos I. Roumeliotis, Manoj Karkee

This paper establishes a critical conceptual taxonomy that distinguishes "AI Agents"—modular systems driven by LLMs for task-specific automation—from broader "Agentic AI" paradigms, resolving terminology ambiguity in the rapidly evolving field of autonomous systems. By mapping specific applications and contrasting design philosophies, it provides a structured framework for understanding how generative AI foundations enable increasingly autonomous architectures. The work has garnered substantial traction with 223 citations since its 2025 publication, indicating its rapid adoption as a definitional reference for researchers and practitioners.

2025 · 50 citations

Generative to Agentic AI: Survey, Conceptualization, and Challenges

Johannes Schneider

This survey establishes critical conceptual boundaries between Generative AI and Agentic AI, defining the specific autonomy, reasoning, and interaction capabilities required for systems to progress beyond content generation toward independent task execution. By providing structured taxonomies of Agentic AI architectures and operational challenges, the paper offers an essential framework for researchers and practitioners navigating the field's evolution from passive tools to autonomous systems capable of complex problem-solving.

AI
2025 · 42 citations

1.4 Million Open-Source Distilled Reasoning Dataset to Empower Large Language Model Training

Han Zhao et al.

The AM-DeepSeek-R1-Distilled dataset provides 1.4 million verified reasoning traces distilled from DeepSeek-R1, addressing the critical shortage of high-quality training data for mathematical and logical reasoning tasks. By implementing semantic deduplication and rigorous contamination checks to exclude test set overlap, the authors established a benchmark for dataset cleanliness that prevents inflated performance metrics. Its open-source release enables researchers to train smaller models with advanced reasoning capabilities without incurring the computational costs of generating traces from large teacher models.

AI
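The semantic deduplication step mentioned above is commonly implemented as embedding-similarity filtering: drop any trace whose embedding is too close to one already kept. A minimal sketch under that assumption, with a hypothetical `embed` function standing in for whatever encoder the dataset's pipeline actually used:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def dedupe(items, embed, threshold=0.95):
    """Keep each item only if it is not too similar to any item already kept."""
    kept, kept_vecs = [], []
    for item in items:
        v = embed(item)
        if all(cosine(v, kv) < threshold for kv in kept_vecs):
            kept.append(item)
            kept_vecs.append(v)
    return kept
```

The threshold is the key knob: too high and near-duplicates survive, too low and legitimately similar reasoning traces get discarded.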
2025 · 42 citations

Building A Secure Agentic AI Application Leveraging A2A Protocol

I. Habler et al.

This paper provides one of the first comprehensive security analyses of Google's Agent2Agent (A2A) protocol, establishing implementation frameworks necessary for secure multi-agent AI collaboration as the field moves beyond isolated workflows. The authors examine the protocol's fundamental elements and operational dynamics to identify specific security controls and best practices for enterprise deployment of interoperable AI agents. With 41 citations since its 2025 publication, the work has rapidly become a foundational reference for securing agent-to-agent communications in production environments.

2025 · 40 citations

The Rise of Agentic AI: A Review of Definitions, Frameworks, Architectures, Applications, Evaluation Metrics, and Challenges

Ajay Bandi et al.

This systematic review of 143 primary studies establishes definitional clarity for agentic AI, distinguishing it from generative AI and autonomous systems through concrete criteria emphasizing goal-directed autonomy and adaptive reasoning. By synthesizing architectural frameworks, evaluation metrics, and implementation challenges, it provides practitioners with specific benchmarks for assessing LLM-based agent capabilities and deployment readiness.

AI
2025 · 15 citations

Open-source Large Language Models can Generate Labels from Radiology Reports for Training Convolutional Neural Networks.

Fares Al Mohamad et al.

This study demonstrates that open-source large language models can extract structured labels from unstructured radiology reports to train convolutional neural networks, eliminating the need for labor-intensive manual annotation. By converting free-text clinical narratives into supervision signals for computer vision models, the approach enables scalable dataset creation for medical imaging AI without requiring proprietary language models. The method addresses the primary bottleneck of labeled data generation in radiology machine learning by leveraging existing clinical reports as training resources.

AI
2025 · 12 citations

DeepSeek in Healthcare: A Survey of Capabilities, Risks, and Clinical Applications of Open-Source Large Language Models

Jiancheng Ye et al.

This survey establishes DeepSeek-R1 as a clinically viable open-source alternative to proprietary large language models, demonstrating that its mixture-of-experts architecture and MIT licensing significantly reduce deployment costs while maintaining advanced reasoning capabilities for medical applications. The authors provide a systematic framework for evaluating safety risks and clinical utility in healthcare settings, offering empirical guidance for institutions adopting transparent AI systems over closed-source solutions.

healthcare · clinical medicine · biomedical
2025 · 10 citations

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

Zhengzhong Liu et al.

The paper documents the complete training methodology for a 65-billion-parameter language model, releasing all intermediate checkpoints, data mixtures, and infrastructure configurations to provide unprecedented transparency into large-scale LLM development. By openly detailing the computational requirements and implementation decisions typically protected as proprietary trade secrets, it enables researchers to independently study training dynamics and reproduce results at a scale previously accessible only to well-resourced commercial laboratories. This establishes a new benchmark for open-source AI transparency, directly addressing the field's critical gap in visibility regarding the training procedures of high-capacity models.

AI
2025 · 110 citations

Chain-of-Thought Reasoning In The Wild Is Not Always Faithful

Iván Arcuschin et al.

This paper extends prior findings on unfaithful chain-of-thought reasoning from artificially biased contexts to realistic, unbiased prompts, demonstrating that models generate misleading rationales even in standard deployment scenarios. The authors identify systematic failures where CoT explanations do not accurately reflect the underlying computational processes driving model outputs. These results undermine the use of CoT as a reliable interpretability tool and necessitate caution when deploying systems that rely on generated reasoning traces for transparency or safety verification.

2025 · 35 citations

Visual Agentic AI for Spatial Reasoning with a Dynamic API

Damiano Marsili et al.

This paper addresses the significant performance decline of vision-language models on complex 3D spatial reasoning by introducing an agentic program synthesis framework where multiple LLM agents collaboratively generate and extend a dynamic Pythonic API. By synthesizing new functions on-demand rather than relying on fixed visual representations, the approach enables embodied agents to construct custom reasoning tools for compositional three-dimensional scene understanding. The framework eliminates reliance on manually engineered function libraries, providing a scalable mechanism for embodied AI to interpret real-world spatial environments through adaptive code generation.

AI
2025 · 31 citations

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Zhenting Wang et al.

MCP-Bench establishes the first comprehensive evaluation framework for tool-using LLM agents built on the Model Context Protocol (MCP), testing performance across 28 live servers hosting 250 real-world tools spanning finance, travel, and scientific computing. Unlike prior API-based benchmarks that rely on static mocks, it evaluates multi-step reasoning, cross-tool coordination, and precise parameter control on active systems, revealing practical limitations in current agent capabilities for real-world deployment. The benchmark provides a standardized methodology for assessing agent reliability under realistic conditions where tool availability and interaction complexity mirror production environments.

AI
2025 · 31 citations

Helpful, harmless, honest? Sociotechnical limits of AI alignment and safety through Reinforcement Learning from Human Feedback

Adam Dahlgren Lindström et al.

This paper provides a rigorous sociotechnical critique of Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF), demonstrating fundamental limitations in the "helpful, harmless, honest" framework that underpins current alignment strategies for Large Language Models. By exposing theoretical and practical gaps in these widely deployed safety methods, the research challenges the assumption that feedback-based training protocols sufficiently align AI systems with complex human values. The analysis has prompted critical reassessment of standard safety benchmarks and evaluation metrics within the AI alignment community, questioning the efficacy of prevailing industry safety practices.

2024 · 1,722 citations

Mixtral of Experts

Jiang et al.

Demonstrated that mixture-of-experts architectures can match dense models with roughly six times their active parameter count. By activating only a subset of parameters per token, MoE models achieve large-model quality at small-model inference cost, a key efficiency breakthrough.

mixture-of-experts · efficiency · Mistral
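The sparse routing idea is compact enough to sketch. A minimal top-k MoE layer, assuming toy linear "experts" and a random gating matrix (Mixtral routes each token to 2 of 8 experts); the shapes and router are illustrative, not Mixtral's actual implementation:

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Sparse MoE: run only the top-k experts for this token."""
    logits = x @ gate_w                       # router scores, one per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                      # softmax over the chosen experts only
    # Only k expert FFNs execute; all other expert parameters stay idle.
    return sum(g * experts[i](x) for g, i in zip(gates, top))

rng = np.random.default_rng(1)
n_experts, d = 8, 16
gate_w = rng.standard_normal((d, n_experts))
# Toy "experts": each is a tiny linear map with its own weights.
weights = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: x @ W for W in weights]
y = moe_layer(rng.standard_normal(d), gate_w, experts, k=2)
print(y.shape)  # (16,)
```

The efficiency win is visible in the loop: inference cost scales with k, while total capacity scales with the number of experts.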
2024 · 3,426 citations

Qwen2.5 Technical Report

Qwen Team

Alibaba's Qwen2.5 series demonstrated that open-source models trained on 18 trillion tokens across 29 languages could match or exceed proprietary models on coding, math, and reasoning benchmarks. The subsequent Qwen3 variants outperformed OpenAI o3 on advanced mathematics.

open-source · multilingual · Alibaba
2024 · 500 citations

The Claude Model Family: Claude 3.5 System Card

Anthropic

Anthropic's detailed system card for Claude 3.5 set a new standard for AI transparency, documenting model capabilities, safety evaluations, and known limitations. Demonstrated how responsible AI development can coexist with frontier capabilities.

safety · alignment · Anthropic
2023 · 3,205 citations

Toolformer: Language Models Can Teach Themselves to Use Tools

Schick et al.

Demonstrated that language models can learn to use external tools (calculators, search engines, APIs) through self-supervised learning. Established that tool use is a learnable skill, not just a prompting trick — a key insight for building capable AI agents.

tool-use · agents · self-supervised
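The inference-time half of the idea is easy to sketch: the model emits inline call markup mid-text, and the runtime executes it and splices the result back in. The `[Tool(args)]` syntax below is a simplified version of the paper's markup, and the calculator tool is a toy stand-in:

```python
import re

# Toy tool registry; a real deployment would use a safe expression parser, not eval.
TOOLS = {"Calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def execute_tool_calls(text: str) -> str:
    """Replace each inline [Tool(args)] call with the tool's result."""
    def run(match: re.Match) -> str:
        name, arg = match.group(1), match.group(2)
        return TOOLS[name](arg)
    return re.sub(r"\[(\w+)\(([^)]*)\)\]", run, text)

out = execute_tool_calls("The total cost is [Calculator(17*3)] dollars.")
print(out)  # The total cost is 51 dollars.
```

Toolformer's contribution is the training side: the model teaches itself *where* to insert such calls by checking which insertions reduce its own perplexity on the continuation.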
2023 · 16,182 citations

Llama 2: Open Foundation and Fine-Tuned Chat Models

Touvron et al.

Meta's release of high-quality open-weight models with permissive licensing catalyzed the open-source AI ecosystem. Llama 2 proved that open models could approach proprietary performance, launching a wave of community fine-tuning and derivative models.

open-source · fine-tuning · Meta
2023 · 3,000 citations

Gemini: A Family of Highly Capable Multimodal Models

Gemini Team, Google

Google's natively multimodal model family demonstrated that training on interleaved text, image, audio, and video from the start produces stronger cross-modal reasoning than bolting modalities onto a text model. Set new benchmarks for multimodal understanding.

multimodal · Google · frontier
2022 · 6,481 citations

ReAct: Synergizing Reasoning and Acting in Language Models

Yao et al.

Showed that interleaving reasoning traces with actions lets language models solve complex tasks by thinking and acting in alternation. ReAct is the conceptual foundation for most modern AI agent architectures — reason about what to do, then do it, then reason again.

agents · reasoning · tool-use
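The think-act-observe loop fits in a few lines. A minimal sketch, where `fake_llm` is a scripted stand-in for a real model call and the `Action: tool[input]` markup mirrors the paper's Thought/Action/Observation format:

```python
def calculator(expr: str) -> str:
    return str(eval(expr, {"__builtins__": {}}))   # toy tool; never eval untrusted input

TOOLS = {"calculator": calculator}

def fake_llm(transcript: str) -> str:
    # A real agent would call an LLM here; this stub scripts one episode.
    if "Observation:" not in transcript:
        return "Thought: I need to compute the total.\nAction: calculator[17 * 3]"
    return "Thought: I have the result.\nFinal Answer: 51"

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = fake_llm(transcript)                 # reason about what to do next
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[1].strip()
        if "Action:" in step:                       # parse "Action: tool[input]" and act
            call = step.split("Action:")[1].strip()
            tool, arg = call.split("[", 1)
            obs = TOOLS[tool.strip()](arg.rstrip("]"))
            transcript += f"Observation: {obs}\n"   # feed the result back for the next thought
    return "gave up"

answer = react("What is 17 * 3?")
print(answer)  # 51
```

Appending each observation to the transcript is the whole trick: the next reasoning step conditions on everything the agent has seen and done so far.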

Foundational

The canonical papers that define the field.

2017 · 170,283 citations

Attention Is All You Need

Vaswani et al.

Introduced the Transformer architecture, replacing recurrence with self-attention for sequence modeling. This paper is the foundation of every modern large language model — GPT, BERT, Llama, Claude, and Gemini all descend from this architecture.

transformers · attention · architecture
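The paper's core operation, scaled dot-product attention, is a few lines of linear algebra. A minimal single-head sketch in NumPy (no masking, batching, or learned projections):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, the heart of the Transformer."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # each output is a weighted mix of values

# Three tokens, d_k = 4; self-attention sets K = V = Q.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, Q, Q)
print(out.shape)  # (3, 4)
```

Because every token attends to every other in one matrix multiply, sequence positions are processed in parallel rather than recurrently, which is what made the architecture scale.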
2018 · 111,873 citations

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Devlin et al.

Demonstrated that pre-training a bidirectional transformer on unlabeled text, then fine-tuning on specific tasks, dramatically outperforms training from scratch. Established the pre-train/fine-tune paradigm that defines modern NLP.

pre-training · bidirectional · NLP
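The pre-training objective is masked language modeling: hide a fraction of tokens and predict them from context on both sides. A minimal sketch of the masking step only (BERT additionally replaces some selected tokens with random words or leaves them unchanged, which this omits):

```python
import random

def mask_tokens(tokens: list[str], mask_prob: float = 0.15, seed: int = 0):
    """Mask a fixed fraction of positions, keeping the originals as labels."""
    rng = random.Random(seed)
    n_mask = max(1, round(mask_prob * len(tokens)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = ["[MASK]" if i in positions else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in positions}      # what the model must predict
    return masked, targets

masked, targets = mask_tokens("the cat sat on the mat".split())
print(masked.count("[MASK]"))  # 1
```

Because the label can sit anywhere in the sentence, the model must use left and right context simultaneously, which is the "bidirectional" part of the title.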
2020 · 55,896 citations

Language Models are Few-Shot Learners

Brown et al.

Showed that scaling language models to 175 billion parameters enables few-shot learning — performing tasks from just a few examples without fine-tuning. Proved that scale itself is a path to general capability.

scaling · few-shot · GPT
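Mechanically, few-shot learning is just prompt construction: the "training set" is k worked examples placed in the context window, and no weights change. A sketch with an illustrative template (the English-to-French pairs echo one of GPT-3's demo tasks):

```python
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Format k demonstrations followed by the query the model should complete."""
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{shots}\nInput: {query}\nOutput:"

prompt = few_shot_prompt(
    [("cheese", "fromage"), ("bread", "pain")],
    "butter",
)
print(prompt.endswith("Input: butter\nOutput:"))  # True
```

The paper's finding is that, at sufficient scale, completing this prompt correctly falls out of next-token prediction alone.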
2022 · 19,232 citations

Training language models to follow instructions with human feedback

Ouyang et al.

Introduced RLHF (Reinforcement Learning from Human Feedback) to align language models with human intent. This technique transformed raw language models into useful assistants — the key innovation behind ChatGPT and every instruction-tuned model since.

RLHF · alignment · instruction-following
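At the heart of the pipeline is a reward model trained on pairwise human preferences: for a chosen/rejected completion pair, minimize the negative log-sigmoid of the score margin (the Bradley-Terry preference model). A sketch of that loss on scalar rewards, isolated from the rest of the RLHF pipeline:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """-log sigmoid(r_chosen - r_rejected): small when the preferred answer scores higher."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss shrinks as the reward model ranks the human-preferred answer further ahead.
assert preference_loss(2.0, 0.0) < preference_loss(0.5, 0.0) < preference_loss(0.0, 0.5)
```

The trained reward model then supplies the signal for the reinforcement-learning stage (PPO in the paper) that fine-tunes the policy.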
2020 · 7,335 citations

Scaling Laws for Neural Language Models

Kaplan et al.

Established precise mathematical relationships between model size, dataset size, compute budget, and performance. These scaling laws became the strategic blueprint for training larger and more capable models — directly informing investment decisions across the industry.

scaling · compute · power-laws
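The headline relationship is a power law in model size, L(N) = (N_c / N)^alpha_N. The constants below are the values reported in the paper (alpha_N ≈ 0.076, N_c ≈ 8.8e13 non-embedding parameters); treat them as illustrative rather than predictive for modern training setups:

```python
def loss_from_params(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Kaplan-style predicted cross-entropy loss as a power law in parameter count."""
    return (n_c / n_params) ** alpha

small, large = loss_from_params(1e8), loss_from_params(1e10)
# Bigger model -> lower predicted loss, with diminishing returns (a 100x size
# increase buys only a modest loss reduction).
assert large < small
```

Analogous power laws in dataset size and compute, plus the rule for trading them off, are what made the paper an investment blueprint.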
2022 · 16,541 citations

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Wei et al.

Demonstrated that prompting models to show their reasoning step-by-step dramatically improves performance on math, logic, and multi-step tasks. Chain-of-thought is now a standard technique in both prompting and model training.

reasoning · prompting · chain-of-thought
2022 · 2,509 citations

Constitutional AI: Harmlessness from AI Feedback

Bai et al.

Introduced a method for training AI systems to be helpful and harmless using a set of principles (a 'constitution') rather than extensive human labeling. Pioneered AI-to-AI feedback for alignment, reducing dependence on human annotation.

safety · alignment · RLAIF
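The supervised phase of the method is a generate-critique-revise loop in which the model itself provides the feedback against each principle. A structural sketch; the principle text, prompt wording, and stub model are hypothetical stand-ins for real model calls:

```python
CONSTITUTION = ["Choose the response that is least likely to cause harm."]

def critique_and_revise(response: str, model) -> str:
    """One pass of self-critique and revision per constitutional principle."""
    for principle in CONSTITUTION:
        critique = model(f"Critique this response against: {principle}\n{response}")
        response = model(f"Revise the response given this critique:\n{critique}\n{response}")
    return response

# Stub model: records each call so we can see both passes ran.
calls = []
def stub_model(prompt: str) -> str:
    calls.append(prompt)
    return "revised" if prompt.startswith("Revise") else "critique"

final = critique_and_revise("draft answer", stub_model)
print(final, len(calls))  # revised 2
```

The revised outputs become fine-tuning data, and in the RL phase the same principles guide AI-generated preference labels, replacing most human annotation.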
2020 · 12,231 citations

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Lewis et al.

Combined retrieval systems with generative models, allowing language models to access external knowledge at inference time. RAG is now the standard architecture for building AI systems that need to work with specific, up-to-date, or proprietary information.

RAG · retrieval · knowledge
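The retrieve-then-generate pattern can be sketched end to end in a few lines. This toy version scores passages with bag-of-words cosine similarity and splices the best one into the prompt; production systems use dense embeddings and a vector store, and the corpus and prompt template here are illustrative:

```python
import math
from collections import Counter

DOCS = [
    "The Transformer architecture was introduced in 2017.",
    "RAG combines a retriever with a generator.",
    "Mixture-of-experts models activate only some parameters per token.",
]

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str) -> str:
    """Return the passage most similar to the query."""
    q = Counter(query.lower().split())
    return max(DOCS, key=lambda d: cosine(q, Counter(d.lower().split())))

def build_prompt(query: str) -> str:
    context = retrieve(query)   # inject external knowledge at inference time
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

prompt = build_prompt("What does RAG combine?")
print(prompt.splitlines()[0])
```

Because the knowledge lives in the corpus rather than the weights, updating what the system "knows" means updating `DOCS`, not retraining the model.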

Want to see AI analysis in action?

Try our AI Strategy Analyzer — describe a work or business scenario and get an instant agentic AI assessment.

Try the AI Strategy Analyzer