We keep an eye on new AI papers on arXiv, pick one or two that really matter each day, and share the key ideas — no hype, just clear explanations.
2026-03-29 · 10 min · 13.0 MB
Excerpt — Heart diseases remain a leading cause of morbidity and mortality worldwide, necessitating accurate and trustworthy differential diagnosis. However, existing artificial intelligence-based diagnostic methods are often…
2026-03-29 · 10 min · 11.8 MB
Excerpt — Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific…
2026-03-28 · 10 min · 13.5 MB
Excerpt — Applying large, proprietary API-based language models to text-to-SQL tasks poses a significant industry challenge: reliance on massive, schema-heavy prompts results in prohibitive per-token API costs and high latency,…
2026-03-28 · 10 min · 13.2 MB
Excerpt — LLM agents like Claude Code can not only write code but also be used for autonomous AI research and engineering [rank2026posttrainbench, novikov2025alphaevolve]. We show that an autoresearch-style pipeline…
2026-03-27 · 10 min · 15.3 MB
Excerpt — Context. Nowadays, artificial intelligence agent systems are transforming from single-tool interactions to complex multi-agent orchestrations. As a result, two competing communication protocols have emerged: a tool…
2026-03-27 · 10 min · 13.9 MB
Excerpt — Retrieval-Augmented Generation (RAG) improves the reliability of large language model applications by grounding generation in retrieved evidence, but it also introduces a new attack surface: corpus poisoning. In this…
2026-03-26 · 10 min · 13.4 MB
Excerpt — Empowering large language models with long-term memory is crucial for building agents that adapt to users' evolving needs. However, prior evaluations typically interleave preference-related dialogues with irrelevant…
2026-03-26 · 10 min · 11.5 MB
Excerpt — We introduce a new agentic artificial intelligence (AI) platform for portfolio management. Our architecture consists of three layers. First, two large language model (LLM) agents are assigned specialized tasks: one…
2026-03-25 · 10 min · 12.9 MB
Excerpt — Agentic multimodal large language models (MLLMs) (e.g., OpenAI o3 and Gemini Agentic Vision) achieve remarkable reasoning capabilities through iterative visual tool invocation. However, the cascaded perception,…
2026-03-25 · 10 min · 11.4 MB
Excerpt — Large Language Models (LLMs) are deployed in high-stakes settings but can show demographic, gender, and geographic biases that undermine fairness and trust. Prior debiasing methods, including embedding-space…
2026-03-24 · 10 min · 12.0 MB
Excerpt — We wish to measure the information coverage of an ad hoc retrieval algorithm, that is, how much of the range of available relevant information is covered by the search results. Information coverage is a central aspect…
2026-03-23 · 10 min · 13.0 MB
Excerpt — Explainable AI (XAI) research has experienced substantial growth in recent years. Existing XAI methods, however, have been criticized for being technical and expert-oriented, motivating the development of more…
2026-03-23 · 10 min · 11.3 MB
Excerpt — As large language models (LLMs) evolve into autonomous agents, persistent memory at the API layer is essential for enabling context-aware behavior across LLMs and multi-session interactions. Existing approaches force…
2026-03-22 · 10 min · 11.6 MB
Excerpt — While scaling individual Large Language Models (LLMs) has delivered remarkable progress, the next frontier lies in scaling collaboration through multi-agent systems (MAS). However, purely autonomous MAS remain "closed-…
2026-03-22 · 10 min · 12.4 MB
Excerpt — Large Language Model (LLM)-based coding agents show promise in automating software development tasks, yet they frequently fail in ways that are difficult for developers to understand and debug. While general-purpose…