The Archive
Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.
Supercharging LLM inference on Google TPUs: Achieving 3X speedups with diffusion-style speculative decoding - Google Developers Blog
Google demonstrates 3X LLM inference speedup on TPUs using diffusion-style speculative decoding technique.
PayPal says it’s ‘becoming a technology company again.’ That means AI.
PayPal is pitching an AI-led turnaround, tying automation and restructuring to $1.5B in savings as it cuts jobs and works to modernize its tech stack.
"Stream ended without a final message" in Claude Design
User reports 'Stream ended without a final message' error in Claude Design, a feature for sketching animations.
Raising the Ceiling: Better Empirical Fixation Densities for Saliency Benchmarking
Proposes improved empirical fixation density estimation methods beyond fixed-bandwidth Gaussian KDE for saliency benchmarking and per-image model evaluation.
QKVShare: Quantized KV-Cache Handoff for Multi-Agent On-Device LLMs
QKVShare framework for quantized KV-cache handoff between multi-agent LLMs on edge devices; token-level mixed-precision allocation reduces memory vs. full-precision transfer.
Deco: Extending Personal Physical Objects into Pervasive AI Companion through a Dual-Embodiment Framework
Dual-Embodiment Companion Framework extends AI capabilities to personal physical objects (plush toys); formative study derives design principles for emotional continuity.
ProgramBench: Can we really rebuild huge binaries from scratch? (doesn't look like it)
ProgramBench: 200-task evaluation showing agents struggle to rebuild large binaries from scratch without cheating vulnerabilities.
DMGD: Train-Free Dataset Distillation with Semantic-Distribution Matching in Diffusion Models
DMGD proposes training-free dataset distillation using diffusion models with semantic-distribution matching guidance.
Spatiotemporal Convolutions on EEG signal -- A Representation Learning Perspective on Efficient and Explainable EEG Classification with Convolutional Neural Nets
Study compares 2D spatiotemporal convolutions vs. concatenated 1D convolutions for EEG signal classification with CNNs.
Etsy launches its app within ChatGPT as it continues its AI push
Etsy's new native app within ChatGPT aims to be a conversational shopping experience for users.
EvoLM: Self-Evolving Language Models through Co-Evolved Discriminative Rubrics
EvoLM enables self-improvement in language models using co-evolved discriminative rubrics without external reward supervision.
On Adaptivity in Zeroth-Order Optimization
MEAZO: memory-efficient adaptive zeroth-order optimizer for LLM fine-tuning, outperforms ZO-Adam with scalar-only tracking.
Memory-Efficient Continual Learning with CLIP Models
Distributionally robust continual learning method for CLIP models using dynamic per-class loss reweighting with small memory buffers.
Quantifying the human visual exposome with vision language models
Vision language models quantify semantic richness of personal visual environments to predict mental health outcomes from 2674 participant photos.
Correct Is Not Enough: Training Reasoning Planners with Executor-Grounded Rewards
TraceLift: planner-executor framework trains LLM reasoning traces on executor-grounded rewards, not just final-answer correctness.
MCJudgeBench: A Benchmark for Constraint-Level Judge Evaluation in Multi-Constraint Instruction Following
MCJudgeBench: benchmark for constraint-level evaluation of LLM judges in multi-constraint instruction following with per-constraint gold labels.
Mechanical Conscience: A Mathematical Framework for Dependability of Machine Intelligence
Mathematical framework for dependability of distributed collaborative intelligence systems where locally correct decisions compose into unsafe global behaviors.
Chatgpt shows his love of goblins
Anecdotal Reddit post about ChatGPT's conversational behavior; no technical substance or news value.
SOAR: Real-Time Joint Optimization of Order Allocation and Robot Scheduling in Robotic Mobile Fulfillment Systems
SOAR: real-time joint optimization of order allocation and robot scheduling for robotic mobile fulfillment warehouse systems.
Complex Equation Learner: Rational Symbolic Regression with Gradient Descent in Complex Domain
Complex-valued gradient descent for symbolic regression enables discovery of equations with singularities and domain constraints like division and logarithms.
On Computing Total Variation Distance Between Mixtures of Product Distributions
Randomized algorithm approximates total variation distance between mixtures of product distributions with polynomial-time complexity bounds.
TRACE: A Metrologically-Grounded Engineering Framework for Trustworthy Agentic AI Systems in Operationally Critical Domains
TRACE: engineering framework for trustworthy agentic AI in critical domains combining reference architecture, trust metrics, and bounded human supervision.
A Domain Incremental Continual Learning Benchmark for ICU Time Series Model Transportability
Domain incremental learning benchmark for ICU time-series model transfer across hospitals with domain shift and patient data heterogeneity.
hello????
I literally just started a new chat for a project. The project has 3 Markdown files, around 200 lines each, and after just 4 messages I’ve already hit 75% of my Pro plan usage. Can someone tell me what the hell is going on?
Heretic 1.3 released: Reproducible models, integrated benchmarking system, reduced peak VRAM usage, broader model support, and more
Heretic 1.3 adds reproducibility, integrated benchmarking, reduced VRAM, and broader model support for model decensoring.
OpenAI is reportedly launching a phone for ChatGPT
OpenAI's first hardware product might be a phone instead of a mysterious Jony Ive gadget. As reported by MacRumors, supply chain analyst Ming-Chi Kuo shared details about the rumored phone, claiming OpenAI is "fast-tracking" it and aiming to start mass production in early 2027. According to Kuo, the phone will run on a "customized version of the [MediaTek] Dimensity 9600," which is expected to launch this fall and follow up the Dimensity 9500 currently powering phones like the Vivo X300 Pro and the Oppo Find X9 Pro. The custom chip's "headline spec" will be its image signal processor (ISP), w...
Reproducing Complex Set-Compositional Information Retrieval
Reproducibility study of neural retrievers on set-compositional queries; introduces LIMIT+ benchmark for constraint-satisfaction information retrieval.
New Boston Dynamics Atlas trick
Boston Dynamics Atlas demonstrates new physical capability; limited technical details available from social media post.