ArXiv AI: Weekly Top Picks

Coverage: 2026-04-12 → 2026-04-19

This week in AI papers

We keep an eye on new AI papers on arXiv, pick one or two that really matter each day, and share the key ideas — no hype, just clear explanations.

Unpacked by our trio: Alex the plain-language host, Marc the hands-on power user, and Jamie the senior ML engineer.

LLM Daily – EvoSkills: Self-Evolving Agent Skills via Co-Evolutionary Verification

2026-04-16 · 10 min · 14.3 MB

Excerpt — Anthropic proposes the concept of skills for LLM agents to tackle multi-step professional tasks that simple tool invocations cannot address. A tool is a single, self-contained function, whereas a skill is a structured…

LLM Daily – RuleForge: Automated Generation and Validation for Web Vulnerability Detection a

2026-04-16 · 10 min · 11.3 MB

Excerpt — Security teams face a challenge: the volume of newly disclosed Common Vulnerabilities and Exposures (CVEs) far exceeds the capacity to manually develop detection mechanisms. In 2025, the National Vulnerability Database…

LLM Daily – Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Polici

2026-04-15 · 10 min · 11.1 MB

Excerpt — Test-Time Learning (TTL) enables language agents to iteratively refine their performance through repeated interactions with the environment at inference time. At the core of TTL is an adaptation policy that updates the…

LLM Daily – Reasoning-Driven Synthetic Data Generation and Evaluation

2026-04-14 · 10 min · 6.6 MB

Excerpt — Although many AI applications of interest require specialized multi-modal models, relevant data to train such models is inherently scarce or inaccessible. Filling these gaps with human annotators is prohibitively…

LLM Daily – An Isotropic Approach to Efficient Uncertainty Quantification with Gradient Norm

2026-04-14 · 10 min · 7.4 MB

Excerpt — Existing methods for quantifying predictive uncertainty in neural networks are either computationally intractable for large language models or require access to training data that is typically unavailable. We derive a…

LLM Daily – ATP-Bench: Towards Agentic Tool Planning for MLLM Interleaved Generation

2026-04-13 · 10 min · 10.8 MB

Excerpt — Interleaved text-and-image generation represents a significant frontier for Multimodal Large Language Models (MLLMs), offering a more intuitive way to convey complex information. Current paradigms rely on either image…

LLM Daily – When Is Collective Intelligence a Lottery? Multi-Agent Scaling Laws for Memetic

2026-04-12 · 10 min · 10.9 MB

Excerpt — Multi-agent systems powered by large language models (LLMs) are increasingly deployed in settings that shape consequential decisions, both directly and indirectly. Yet it remains unclear whether their outcomes reflect…
