The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


Mitosis Detection in the Wild: Multi-Tumor and Context-Aware Generalization in the MIDOG 2025 Challenge

Automated mitosis detection is a well-established task in computational pathology. While previous benchmarks focused on scanner-induced domain shift, clinical "real-world" application requires models to be robust across the vast variance to be expected in the histological landscape. The MItosis DOmain Generalization (MIDOG) 2025 challenge was designed to evaluate algorithmic performance across unprecedented biological and contextual diversity. We curated a test dataset of 365 cases, encompassing 12 distinct human, canine and feline tumor types, digitized across multiple scanning platforms. Mo...

Marc Aubreville·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Self-evolving LLM agents with in-distribution Optimization

Large Language Models (LLMs) have recently emerged as powerful controllers for interactive agents in complex environments, yet training them to perform reliable long-horizon decision making remains a fundamental challenge. A key difficulty lies in credit assignment: agents often receive delayed rewards only at the end of episodes. In this paper, we propose Q-Evolve, a self-evolving framework for LLM agents that unifies automatic process-reward labeling and policy learning within a principled in-distribution reinforcement learning paradigm. In each evolving iteration, our method learns an in-d...

Yudi Zhang·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Dash2Sim: Closed-Loop Driving Simulation from in-the-wild Dashcam Videos

Self-driving simulations typically rely on data collected in a small number of cities or on hand-authored synthetic scenarios. Dashcam videos cover a far broader range of locations and situations, including rare or long-tailed scenarios. They are considered less usable for simulation because it is difficult to recover accurate 4D scenes from monocular in-the-wild videos. Work zones are one such class of long-tailed situations that dashcams capture. We present Dash2Sim, a framework that turns in-the-wild monocular dashcam videos into metric, geo-referenced 4D driving logs compatible with exist...

Anurag Ghosh·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A robust PPG foundation model using multimodal physiological supervision

Photoplethysmography (PPG), a non-invasive measure of changes in blood volume, is widely used in both wearable devices and clinical settings. Recent PPG foundation models either use open-source ICU datasets with pretraining paradigms that require curated data and thus complicate generalization to field-like data, or use closed-source field-like PPG data. In contrast, we propose a PPG foundation model that does not require high-quality or field-like pretraining data, and instead leverages accompanying electrocardiogram and respiratory signals in ICU datasets to select contrastive samples durin...

Eloy Geenjaar·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Breaking the Ice: Analyzing Cold Start Latency in vLLM

As scalable inference services become popular, the cold start latency of an inference engine becomes important. Today, vLLM has evolved into the de facto inference engine of choice for many inference workloads. Although popular, due to its complexity and rapid evolution, there has not been a systematic study of its startup latency. With major architectural innovations such as the V1 API and the introduction of torch.compile, this paper presents the first detailed performance characterization of vLLM startup latency. We break down the startup process into six foundational steps and demonstrate...

Huzaifa Shaaban Kabakibo·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

DirectAudioEdit: Inversion-Free Text-Guided Audio Editing via Diffusion Prediction Contrast

Text-guided audio editing aims to modify the language-specified acoustic content while preserving edit-irrelevant source components. Existing training-free methods typically rely on inversion-based editing. While inversion-free editing is appealing as it decreases computational overhead and reconstruction errors, it remains largely unexplored for audio editing. The key challenge is to construct a source-to-target editing path through diffusion denoising dynamics. In this paper, we introduce DirectAudioEdit, the first attempt to develop a training-free and inversion-free method for audio editi...

Zhengkun Ge·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SleepExplain: Explainable Non-Rapid Eye Movement and Rapid Eye Movement Sleep Stage Classification from EEG Signal

Classification of sleep stages is one of the most important diagnostic approaches for a variety of sleep-related disorders. Electroencephalography (EEG) is regarded as a powerful tool for examining the association between neurological effects and sleep phases since it correctly identifies sleep-related neurological alterations. During Non-Rapid Eye Movement (NREM) and Rapid Eye Movement (REM) sleep phases, a number of nerve and bodily functions are affected and therefore hold an important role both in their functionalities. This work aims to classify NREM and REM sleep stages from sleep EEG d...

Rafsan Jany·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

TabSwift: An Efficient Tabular Foundation Model with Row-Wise Attention

Tabular foundation models, exemplified by TabPFN, perform prediction via in-context learning, inferring test labels directly from labeled training examples. They have demonstrated competitive performance, particularly on small-to-medium datasets. However, recent tabular foundation models often improve accuracy with increasingly complex architectures, incurring higher inference cost and limiting practical deployment. In this work, we revisit the original TabPFN design and show that a lightweight row-wise attention-only backbone can remain highly competitive with two simple enhancements: a gate...

Si-Yang Liu·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LLM-Guided Evolution for Medical Decision Pipelines

Adapting large language models (LLMs) to clinical workflows often requires costly fine-tuning or manual prompt and pipeline engineering. We study LLM-guided MAP-Elites evolution as an inference-time alternative for discovering medical decision strategies and provide an implementation repository at https://github.com/univanxx/llm_guided_evo_medical. We formulate urgency triage, interactive consultation, and medical image classification as evolutionary searches over executable artifacts optimized by task-specific fitness functions. Across all three settings, evolution improves over manually des...

Ivan Sviridov·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

How Far Can Chord-Symbol Time-Series Adaptation Carry Genre Identity? Capabilities and Boundaries in Multi-Genre Chord-Symbol Modeling

Harmony is a compact symbolic layer where mathematical pitch relations, acoustic consonance, and musical convention meet. This report treats chord-symbol sequences not as a complete representation of music, but as an interpretable, controllable time series for genre-local harmonic modeling. Starting from a frozen pop-jazz Music Transformer checkpoint, I evaluate how far small adaptation interfaces can extend the model to eleven target genres: blues, bossa nova, Bach chorales, country, electronic, folk, funk, gospel, hip-hop, R&B/soul, and rock. The main evaluation compares LoRA, IA3, BitFit, ...

Jinju Lee·19 days ago

TechCrunch AI· PRESS

The token bill comes due: Inside the industry scramble to manage AI’s runaway costs

"The whole conversation shifted from tokenmaxxing and 'go fast' to 'we need guardrails, how do we control this?'"

Rebecca Bellan·19 days ago

Google AI (Gemma)· FRONTIER

The latest AI news we announced in May 2026

Here are Google’s latest AI updates from May 2026

The Keyword Team·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Temporal Spatial Minimax Rate for Smoothly-Varying Distributions in Wasserstein Space

We study the minimax rate of estimating a future value $\mu_{t_n+h}$ of a curve $t\mapsto\mu_t$ in the $2$-Wasserstein space $\mathcal{P}_2(\mathbb{R}^d)$ from finitely many noisy snapshots of its past, under an adiabatic bound $\|\nabla_t^k v\|\le\varepsilon$ on the $k$-th covariant derivative of the velocity field. Our central result is a unified temporal-spatial minimax lower bound: over regular, locally transport-rich subclasses, every estimator incurs $W_2$-risk with $M$-exponent $\gamma_d(k+1)/(k+1+\gamma_d)$, $\gamma_d=\min(1/d,1/2)$ ($M$ the total sample size). It follows from a temporal-to-spatial redu...

Munsik Kim·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Hierarchical Certified Semantic Commitment for Byzantine-Resilient LLM-Agent Collaboration

Byzantine collaboration among large-language-model agents requires a finality-control primitive: given delivered stochastic, structured natural-language proposals, the protocol must decide whether the round supports a commit, what kind of commit, or a typed safe abort. Naive aggregation hides this choice behind a single verdict; classical Byzantine fault tolerance hides it behind byte-identity that LLM proposals do not satisfy. We introduce Hierarchical Certified Semantic Commitment (H-CSC), a BFT-inspired protocol that converts embedding-derived finality signals over verdict-conditioned prop...

Haoran Xu·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SV-Detect: AI-generated Text Detection with Steering Vectors

Detecting machine-generated text is especially difficult under distribution shift, such as transfer across domains, source models, and editing attacks. We propose a fake-text detector based on steering vectors extracted from the hidden representations of a frozen language model. At each layer, we construct a direction that separates human-written from machine-generated text, and represent each input by its layer-wise alignment with these directions. A lightweight classifier trained on these projection features yields the final detection score. Our method achieves strong performance both in-di...

Mikhail Vishnyakov·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CULTURESCORE: Evaluating Cultural Faithfulness in Video Generation Models

As video generation models like Veo 3.1 and LTX-2 advance, their ability to accurately represent diverse global cultures remains a critical yet understudied frontier. Current metrics, such as VideoScore, only measure visual quality but offer no mechanism for assessing cultural faithfulness. Consequently, a model that replaces a Namaste with a handshake receives the same score as one that generates the gesture correctly. We propose CultureScore, a compositional evaluation framework that decomposes cultural faithfulness into three granular dimensions: Identity (who is represented), Context (cul...

Anku Rani·19 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Acoustic Cue Alignment in Audio Language Models for Speech Emotion Recognition

Instruction-following audio language models (ALMs) can be augmented with explicit acoustic cues, yet it remains unclear whether such cues are used in a grounded way when the raw audio is already available. We study this question in speech emotion recognition (SER) by deriving six interpretable acoustic concept tokens from the standardised eGeMAPS paralinguistic feature set. These tokens summarise energy, pitch, dynamics, brightness, formants, and voice quality, and are appended to the textual prompt while the audio input is kept unchanged. Across the widely used FAU-Aibo and IEMOCAP benchmark...

Iosif Tsangko·19 days ago

TechCrunch AI· PRESS

The ‘together tech’ wave might be the most intriguing startup bet of 2026

While the AI fundraising machine keeps breaking its own records, some founders are building in the other direction. Mirror founder Brynn Putnam just raised money for Board, a startup focused on bringing people together through in-person games and social experiences. Cyberdeck creators are going viral crafting whimsical DIY computers that literally encourage users to touch grass. Unlike the AI-free browser crowd, this doesn’t just feel like backlash, […]

Theresa Loconsolo, Anthony Ha, Kirsten Korosec, Sean O'Kane·19 days ago

The Verge AI· PRESS

Can AI tell if your script will make a hit film?

When Quilty hit the industry trades earlier this year, the AI startup promised that its tool could accurately predict a film's success just by reading the script. When people actually got a chance to experiment with Quilty's product, though, they were left skeptical. Even with all the available data in the world, it predicted the script for Christy, which would go on to be a box office flop, would outperform the script for Sinners, which became an Oscar-winning blockbuster. As many AI execs have pitched before, Quilty's founders believe that it can help "democratize" their industry by giving up-...

Charles Pulliam-Moore·19 days ago

TechCrunch AI· PRESS

AirTrunk commits $30B to build 5GW of AI data centers in India

The Australian data center operator plans to set up 5GW of capacity in India.

Jagmeet Singh·19 days ago

Simon Willison· ANALYST

Quoting Andreas Kling

We will no longer accept public pull requests. [...] A substantial patch used to imply substantial effort, and that effort was a reasonable proxy for good faith. That assumption no longer holds. [...] Whether code was typed by hand is beside the point. What matters is who is responsible for it once it enters the browser. Ladybird is becoming a browser for real users. The people introducing changes to it must be the people who decide those changes belong in the project, and who will answer for the consequences. — Andreas Kling, Changing How We Develop Ladybird

Simon Willison·19 days ago

MIT Tech Review· PRESS

The Meta hack shows there’s more to AI security than Mythos

On June 5, 404 Media reported that attackers had been using Meta’s AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses that they controlled, and the agent complied. One attacker broke into the dormant Obama White House account and made pro-Iran…

Grace Huckins·19 days ago

Latent Space· ANALYST

[AINews] not much happened today

a quiet day

Latent Space·19 days ago

TechCrunch AI· PRESS

Mira Murati steps back into the spotlight, carefully

In the current environment, remaining heads down has diminishing returns; at some point, you have to make some noise just to remind the market you exist.

Connie Loizos·19 days ago

Cohere· FRONTIER

Unpacking AI Safety for Enterprises

Navigate AI safety with confidence. This guide explains 7 key enterprise AI safety themes to help you address biases, misinformation, and other real-world risks.

Cohere·19 days ago

Simon Willison· ANALYST

AI enthusiasts are in a race against time, AI skeptics are in a race against entropy

Charity Majors neatly captures the dynamic between AI enthusiasts and AI skeptics, both of whom are trying to build great software, often in the same teams: The enthusiasts are not wrong. We are starting to see real, non-imaginary, discontinuous leaps in capabilities from teams that lean in hard to working with AI. And this does not feel like a normal technology cycle where you can wait for the dust to settle; teams that sit this out while competitors are hustling could be out of business before the dust set...

Simon Willison·19 days ago

TechCrunch AI· PRESS

Ahead of its IPO, Anthropic’s Daniela Amodei shrugs off doubts about AI’s returns

The AI giant's co-founder explained why the company may tap the public market for capital and why tokenmaxxing pushback isn't a concern.

Marina Temkin·19 days ago

TechCrunch AI· PRESS

Airbnb’s Brian Chesky plans to launch a new AI lab

The Airbnb CEO said last year it hasn't struck an LLM partnership because existing products weren't quite ready.

Tim Fernholz·19 days ago

Ars Technica AI· PRESS

The skeptic’s guide to humanoid robots going viral on the Internet

Robot demonstrations can distort public perceptions of robotic capabilities.

Jeremy Hsu·19 days ago

TechCrunch AI· PRESS

Defense tech, AI, and fundraising take center stage at StrictlyVC Los Angeles on June 18

With just two weeks to go, StrictlyVC Los Angeles is quickly approaching. On Thursday, June 18, at The Aerospace Corporation Campus in El Segundo, investors, founders, and tech leaders will gather for an evening of conversations exploring some of the most consequential shifts taking place across venture capital, defense technology, artificial intelligence, and advanced industry. Secure your spot here. For […]

TechCrunch Events·20 days ago

30 stories
