We keep an eye on new AI papers on arXiv, pick one or two that really matter each day, and share the key ideas — no hype, just clear explanations.
2026-04-05 · 10 min · 7.0 MB
Excerpt — Steering vectors offer a training-free mechanism for controlling reasoning behaviors in large language models, but constructing effective vectors requires identifying genuine behavioral signals in the model's hidden…
2026-04-05 · 10 min · 7.1 MB
Excerpt — Large Language Model (LLM)-based agents have achieved notable success on short-horizon and highly structured tasks. However, their ability to maintain coherent decision-making over long horizons in realistic and dynamic…
2026-04-04 · 10 min · 12.1 MB
Excerpt — Regulatory documents encode legally binding obligations that LLM-based systems must respect. Yet converting dense, hierarchically structured legal text into machine-readable rules remains a costly, expert-intensive…
2026-04-03 · 10 min · 14.0 MB
Excerpt — Retrieval-augmented generation (RAG) improves language model (LM) performance by providing relevant context at test time for knowledge-intensive situations. However, the relationship between parametric knowledge…
2026-04-03 · 10 min · 11.0 MB
Excerpt — Rerankers play a pivotal role in refining retrieval results for Retrieval-Augmented Generation. However, current reranking models are typically optimized on static human-annotated relevance labels in isolation,…
2026-04-02 · 10 min · 8.3 MB
Excerpt — Long-horizon dialogue systems suffer from semantic drift and unstable memory retention across extended sessions. This paper presents a Multi-Layer Memory Framework that decomposes dialogue history into working, episodic,…
2026-04-02 · 10 min · 6.7 MB
Excerpt — Existing benchmarks measure capability -- whether a model succeeds on a single attempt -- but production deployments require reliability -- consistent success across repeated attempts on tasks of varying duration. We…
2026-04-01 · 10 min · 10.5 MB
Excerpt — Accurate privacy evaluation of textual data remains a critical challenge in privacy-preserving natural language processing. Recent work has shown that large language models (LLMs) can serve as reliable privacy…
2026-04-01 · 10 min · 14.0 MB
Excerpt — Large language models (LLMs) facilitate the development of autonomous agents. As a core component of such agents, task planning aims to decompose complex natural language requests into concrete, solvable sub-tasks.…
2026-04-01 · 10 min · 8.4 MB
Excerpt — Cloud-native software delivery platforms orchestrate releases through complex, multi-stage pipelines composed of dozens of independently versioned tasks. When code is promoted between environments -- development to…
2026-03-30 · 10 min · 6.2 MB
Excerpt — Artificial intelligence is increasingly catalyzing scientific automation, with multimodal large language model (MLLM) agents evolving from lab assistants into self-driving lab operators. This transition imposes…
2026-03-30 · 10 min · 7.3 MB
Excerpt — Reinforcement Learning from Human Feedback (RLHF) has become the standard for aligning Large Language Models (LLMs), yet its efficacy is bottlenecked by the high cost of acquiring preference data, especially in low-…
2026-03-29 · 10 min · 13.0 MB
Excerpt — Heart diseases remain a leading cause of morbidity and mortality worldwide, necessitating accurate and trustworthy differential diagnosis. However, existing artificial intelligence-based diagnostic methods are often…
2026-03-29 · 10 min · 11.8 MB
Excerpt — Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific…