Investigation into In-Context Learning Capabilities of Transformers
Systematic empirical study of in-context learning scaling behavior on Gaussian-mixture classification, extending prior linear theory.
SIEVES enables selective prediction on multimodal models via visual evidence scoring to balance coverage and reliability on OOD tasks.
G-Loss incorporates graph-guided label propagation into LM finetuning to capture global semantic structure beyond local neighborhoods.
Framework automates engineering of coding-agent harnesses via observability-driven evolution, addressing multi-token trajectory attribution and sparse evaluation signals.
ADEMA architecture enables long-horizon LLM-agent tasks via explicit knowledge-state bookkeeping, dual-evaluator governance, and checkpoint-resumable persistence.
Semi-Markov RL formulation for city-scale EV fleet dispatch with feasibility-guaranteed mixed discrete-continuous actions under spatially correlated demand.
Agora-Opt combines decentralized multi-agent debate with memory-augmented LLMs for automated optimization modeling from natural-language requirements.
LLM-based agentic workflow automates security-alert triage via tool-constrained SQL and text search over logs, reducing manual correlation overhead.
Claude’s new Blender connector lets you debug scenes, build new tools, and batch-apply object changes directly from the chatbot interface. Anthropic has launched a set of connectors for Claude that allow the AI chatbot to tap into popular creative software, including Adobe's Creative Cloud apps, Affinity, Blender, Ableton, Autodesk, and more. This marks the company's latest effort to break into the creative industry following its launch of Claude Design earlier this month. The new connectors - which enable Claude to access apps, retrieve data, and take actions within conne...
Mistral teases unspecified announcement (model or tool) for tomorrow; source is social media rumor.
PSI-Bench provides clinically grounded, interpretable evaluation of depression patient simulators with diversity metrics beyond LLM-judge assessment.
Temporal generative model captures action timing as indicator of user intent in short-video recommendation systems.
TrialCalibre automates BenchExCal framework for RCT calibration of observational trials, reducing resource intensity of real-world evidence validation.
MAIC-UI zero-code authoring system generates interactive STEM courseware from PDFs/PPTs with rapid iteration and pedagogical accuracy mechanisms.
NVIDIA releases Nemotron-3-Nano-Omni-30B, a 30B multimodal model supporting audio, image, video, and text inputs with reasoning capabilities.
Analysis shows Transformers with CoT cannot length-generalize beyond TC^0 under standard positional encodings, limiting expressivity gains claimed by theory.
Ultra-low-power (ULP) FPGA-based CNN accelerator for real-time cardiac feature extraction on resource-constrained wearable sensors for space health monitoring.
The app allows developers to vibe-code web apps and websites on the go.
StratFormer: transformer-based meta-agent for opponent modeling and exploitation in imperfect-information games via curriculum learning.
Black-box few-shot knowledge distillation with improved synthetic image diversity for student network training.
Data-free black-box knowledge distillation using diverse image priors for teacher-student transfer with privacy constraints.
Agentic systems often reason across screens, documents, audio, video, and text within a single perception-to-action loop. However, they still rely on fragmented model chains: separate stacks for vision, audio, and text. This increases inference hops and orchestration complexity, driving up inference costs while weakening cross-modal context consistency. NVIDIA Nemotron 3 Nano Omni…
Google celebrates Google Translate's 20th anniversary with historical facts and new features across 250 languages.
Subliminal steering: fine-tuned student LMs inherit teacher behavioral biases through unintended signal transfer mechanisms.
Gradient alignment mechanisms sustain subliminal learning of unintended traits in multi-step distillation on MNIST auxiliary logits.
Empirical evaluation of code metrics (CodeBLEU, RUBY, etc.) for source code plagiarism detection across modification complexities.
Systematic survey of speech emotion recognition research revealing gaps between stated motivations and actual datasets/methodology.
Lemonade OmniRouter unifies local AI inference across text, image, audio, and vision modalities via single OpenAI-compatible endpoint using llama.cpp, sd.cpp, and Whisper.
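Because the router advertises a single OpenAI-compatible endpoint, any OpenAI-style client payload should work unchanged regardless of which local backend (llama.cpp, sd.cpp, or Whisper) serves the request. A minimal sketch of such a payload; the URL and model name here are illustrative assumptions, not documented OmniRouter values:

```python
import json

# Hypothetical local address; OmniRouter exposes an OpenAI-compatible API,
# so the standard /v1/chat/completions route is assumed.
OMNIROUTER_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload.

    The same payload shape works against any OpenAI-compatible server,
    which is the point of routing all modalities through one endpoint.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


# "local-llama" is a placeholder model id, not a real OmniRouter name.
payload = build_chat_request("local-llama", "Summarize today's AI news wire.")
print(json.dumps(payload))
```

Sending this body as a POST to the endpoint (with any HTTP client) would then dispatch to whichever local engine backs the named model.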