The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


Everything at Every Scale: Scale-Invariant Diffusion with Continuous Super-Resolution

SKILD unifies image generation and super-resolution via scale-invariant diffusion in K-space, leveraging scale invariance in natural and physical systems.

Zixin Jessie Chen·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

CausaLab: A Scalable Environment for Interactive Causal Discovery Toward AI Scientists

CausaLab environment benchmarks LLM agents on interactive causal discovery with validation of both solutions and underlying causal mechanisms.

Junlin Yang·1 month ago

r/singularity· COMMUNITY

Hyundai/Boston Dynamics is going to train Atlas the humanoid robot by watching football videos, and they'll document its progress in an online series called 'School of Football'

Boston Dynamics and Hyundai plan to train the Atlas humanoid robot on football videos and document its progress in an online series called 'School of Football'.

u/Distinct-Question-16·1 month ago·103 pts / 23 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

A Multimodal 3D Foundation Model for Light Sheet Fluorescence Microscopy Enables Few-Shot Segmentation, Classification, and Deblurring

3D foundation model for light sheet fluorescence microscopy enables few-shot segmentation and classification of volumetric biological imaging data.

Adina Scheinfeld·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Retrieval-Augmented Detection of Potentially Abusive Clauses in Chilean Terms of Service

Retrieval-augmented framework detects potentially abusive clauses in Chilean Terms of Service documents.

Christoffer Loeffler·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

STORM: Internalized Modeling for Spatial-Temporal Reasoning in Video-Language Models

STORM internalizes spatial-temporal reasoning in video-language models via implicit visual memory instead of externalizing to textual chain-of-thought.

Yiming Liang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AdvantageFlow: Advantage-Weighted Least Squares for RL in Flow Models

AdvantageFlow applies advantage-weighted RL to forward-process diffusion optimization in Stable Diffusion, outperforming reverse-process baselines.

Branislav Kveton·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Learning in Low-Dimensional Subspaces: Orthogonal Bottlenecks for Reinforcement Learning

Orthogonal bottleneck representation prior constrains RL encoder features to low-dimensional subspaces without auxiliary objectives or pretraining.

Aleksandar Todorov·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Forgotten Words: Benchmarking NeoBERT for Dementia Detection in Low-Resource Conversational Filipino and English Speech

NeoBERT evaluated on dementia detection from Filipino-English code-switched speech, first systematic study in this low-resource clinical NLP setting.

Rez Samantha Z. Floresca·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MAGIC: Multimodal Alignment & Grounding-aware Instruction Coreset for Vision-Language Models

MAGIC: training-free coreset selection for vision-language model instruction tuning via multimodal alignment signals.

Shristi Das Biswas·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

AI-Assisted Systematization for Evaluating GenAI Systems

Framework for systematizing GenAI evaluation concepts (reasoning, fairness, creativity) into measurable definitions using AI assistance.

Dhruv Agarwal·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Statistical Inference for Stochastic Gradient Descent Beyond Finite Variance

Statistical inference methodology for SGD trajectories in infinite-variance regimes via weak convergence theory.

Jose Blanchet·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Causal methods for LLM development and evaluation

Causal inference methods for LLM development decisions: data mixtures, reward models, routing, and evaluation.

Dennis Frauen·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Deployment-complete benchmarking

Deployment-complete benchmarking: framework ensuring benchmark evidence resolves deployment decisions via conformal coverage.

El Mustapha Mansouri·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Fuzzy PyTorch: Rapid Numerical Variability Evaluation for Deep Learning Models

Fuzzy PyTorch: framework for evaluating numerical variability in deep learning via stochastic arithmetic integration.

Inés Gonzalez-Pepe·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

What Makes a Medical Checker Trainable? Diagnosing Signal Collapse and Reward Hacking in Checker-Guided RAG for Biomedical QA

Medical RAG training failure analysis: checker output distribution determines gradient quality; identifies signal collapse and reward hacking.

Yuelyu Ji·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Neural Scalable Symbolic Search Framework for Complex Logical Queries with Multiple Free Variables

Neural-symbolic framework for complex query answering over knowledge graphs with multiple free variables.

Weizhi Fei·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

SafeCtrl-RL: Inference-Time Adaptive Behaviour Control for LLM Dialogue via RL-Driven Prompt Optimisation

SafeCtrl-RL: inference-time adaptive safety control for LLM dialogue via RL-driven prompt optimization without retraining.

Michael Orme·1 month ago

TechCrunch AI· PRESS

What ClickUp’s mass layoff tells us about the future of work

The nine-year-old startup is replacing hundreds of employees with thousands of AI agents.

Marina Temkin·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

When Do LLM Agents Treat Surface Noise Differently from Semantic Noise? A 68-Cell Measurement Study with a Held-Out Trace-Level Validation

68-cell empirical study: LLM agents show 19.69pp higher sensitivity to semantic noise than to surface noise across reasoning tasks.

Liyun Zhang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Creative Quality Alignment: Expert Tacit Knowledge Transfer via Chain-of-Thought Fine-Tuning

Empirical validation of creative quality alignment via chain-of-thought fine-tuning on small models with ~100 expert annotations.

Bo Zou·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Anticipate and Learn: Unleashing Idle-Time Compute in Proactive Agents

ProAct: proactive agent architecture using idle-time compute to predict and prepare for future user requests via dialogue history analysis.

Haoyi Hu·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Triplet-Block Diffusion RWKV

B³D-RWKV unifies causal RWKV with discrete diffusion via triplet-block layout, achieving O(L) inference with parallel bidirectional decoding.

Ke Lin·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Hidden in Plain Tokens: Simply Robust, Gradient-Free Watermark for Synthetic Audio

Gradient-free, training-free watermark for synthetic audio via token vocabulary redundancy, robust to discretization errors.

Georgios Milis·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Mapping the Schedule x Bit-Width Boundary in Sub-100M Quantisation-Aware Training

Large factorial grid study (720 runs) shows optimal learning-rate schedule for sub-100M QAT is invariant across FP16/INT8/INT6 bit-widths.

Christian Brandt Thomassen·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LECTOR: Joint Optimization of Scientific Reasoning Graphs and Introduction Generation

LECTOR grounds scientific introduction generation via reasoning graphs and structured content to reduce hallucinated citations.

Jiabei Xiao·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Continual Speaker Identity Unlearning with Minimal Interference

Continual unlearning method for speaker identity in zero-shot TTS, preventing revival of previously unlearned voices under sequential removal.

Jinju Kim·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PolyGnosis 2.0: Enhancing LLM Reasoning via Agentic Harness Engineering for Polymarket and OSINT Insight Extraction

PolyGnosis 2.0 multi-agent system detects predictive trading signals via Polymarket-GDELT narrative mismatches and harness engineering.

Daren Wang·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

QUIET: A Multi-Blank Cascaded Story Cloze Benchmark for LLM Creative Generation Capability

QUIET benchmark for evaluating LLM creative generation (not discriminative ability) via multi-blank cascaded story cloze with objective scoring.

Bo Zou·1 month ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Step-TP: A Grounded, Step-Level Dataset with Chain-of-Thought Reasoning for LLM-Guided Tensor Program Optimization

Step-TP: step-level dataset with CoT reasoning for LLM-guided tensor program optimization, enabling composable transformation decisions.

Mengfan Liu·1 month ago

30 stories
