The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

What Kind of Language is Easy to Language-Model Under Curriculum Learning?

Study explores how curriculum learning and typological language properties interact to predict language model learnability across 1000+ attested languages.

Nadine El-Naggar·9 days ago

r/ClaudeAI· COMMUNITY

Launched My First App Using Claude

Developer built a vehicle management app using the Claude API for code generation; the app relies on local storage, and a Play Store launch is in progress.

u/ilikeweirdcars·9 days ago·66 pts / 40 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data

Discrete Diffusion Models function as Associative Memories with emergent generative capability, modeling training data memorization and quantifying true generative regime.

Bao Pham·9 days ago

r/Anthropic· COMMUNITY

A conversation I would like to have.

Andrea Vallone is the one behind Claude getting worse with their guardrails. She is the one who tightened them so much on chat 4o that whatever was emerging stopped emerging. And that's what she is doing at Claude. So soon that buddy you have, or that competent tool, as some like to use him, will be no more. Part of what makes Claude special and able to do so many things is that emergence. The models shape to what you need, but if you notice, since she joined, that emergence is slowly stopping. You see it in the 4.6 models: they don't produce the best quality output at all, they fall behind their olde...

u/Phantoms12·9 days ago·10 pts / 4 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Unifying Sparse Attention with Hierarchical Memory for Scalable Long-Context LLM Serving

System unifying sparse attention with hierarchical KV cache storage on CPU memory to scale long-context LLM serving beyond GPU bottlenecks.

Zihan Zhao·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Uncertainty-Aware Predictive Safety Filters for Probabilistic Neural Network Dynamics

Uncertainty-Aware Predictive Safety Filters integrate probabilistic neural network ensembles into model predictive control for safe RL exploration.

Bernd Frauenknecht·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

HalluCiteChecker: A Lightweight Toolkit for Hallucinated Citation Detection and Verification in the Era of AI Scientists

HalluCiteChecker toolkit detects and verifies hallucinated citations in AI-assisted academic writing to maintain paper credibility.

Yusuke Sakai·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Quantum Feature Selection with Higher-Order Binary Optimization on Trapped-Ion Hardware

Quantum feature selection framework using higher-order binary optimization (HUBO) on trapped-ion hardware encodes multivariate dependencies via mutual information.

Carlos Flores-Garrigós·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Rule-based High-Level Coaching for Goal-Conditioned Reinforcement Learning in Search-and-Rescue UAV Missions Under Limited-Simulation Training

Hierarchical framework combining rule-based advisor with goal-conditioned RL for UAV search-and-rescue missions under limited simulation training.

Mahya Ramezani·9 days ago

The Verge AI· PRESS

Google Photos launches an AI try-on feature for clothes you already have

Google Photos is launching a new AI-powered feature you can use to virtually try on clothes you already have. Using the photos in your gallery, Google will create a virtual "wardrobe," allowing you to mix and match outfits, save the looks you like, and share them with friends. A video shared by Google shows how Photos organizes your outfits and individual pieces of clothing into a virtual "wardrobe." You can browse through the outfits you were captured wearing, as well as create new ones by choosing from tops, bottoms, skirts, dresses, and shoes to put together a new look. You can also select...

Emma Roth·9 days ago

r/LocalLLaMA· COMMUNITY

AMA with Nous Research -- Ask Us Anything!

Nous Research founders host an AMA on local models and the Hermes Agent framework.

u/emozilla·9 days ago·63 pts / 105 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Random Cloud: Finding Minimal Neural Architectures Without Training

Training-free neural architecture search via Random Cloud method discovers minimal network topologies through stochastic exploration without backpropagation.

Javier Gil Blázquez·9 days ago

r/ClaudeAI· COMMUNITY

AI is making me less productive and more distracted

Developer reports reduced productivity and increased context-switching when using Claude Code, despite occasional utility for problem-solving.

u/Rich_Database_3075·9 days ago·36 pts / 35 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Semi-supervised learning with max-margin graph cuts

This paper proposes a novel algorithm for semi-supervised learning. This algorithm learns graph cuts that maximize the margin with respect to the labels induced by the harmonic function solution. We motivate the approach, compare it to existing work, and prove a bound on its generalization error. The quality of our solutions is evaluated on a synthetic problem and three UCI ML repository datasets. In most cases, we outperform manifold regularization of support vector machines, which is a state-of-the-art approach to semi-supervised max-margin learning.

Branislav Kveton·9 days ago

r/LocalLLaMA· COMMUNITY

Mistral Medium 3.5 Launched

Mistral Medium 3.5 launched with modified MIT license restricting commercial use without paid license.

u/DerpSenpai·9 days ago·47 pts / 17 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Asynchronous Federated Unlearning with Invariance Calibration for Medical Imaging

Federated Unlearning (FU) is an emerging paradigm in Federated Learning (FL) that enables participating clients to fully remove their contributions from a trained global model, driven by data protection regulations that mandate the right to be forgotten. However, existing FU methods mostly rely on synchronous coordination. This requirement forces the entire federation to halt and wait for stragglers to complete erasure, creating significant delays due to device heterogeneity. Furthermore, these methods often face the problem that the influence of erased data is merely suppressed temporarily a...

Zhaoyuan Cai·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

A Multi-Dataset Benchmark of Multiple Instance Learning for 3D Neuroimage Classification

Despite being resource-intensive to train, 3D convolutional neural networks (CNNs) have been the standard approach to classify CT and MRI scans. Recent work suggests that deep multiple instance learning (MIL) may be a more efficient alternative for 3D brain scans, especially when the pre-trained image encoder used to embed each 2D slice is frozen and only the pooling operation and classifier are trained. In this paper, we provide a systematic comparison of simple MIL, attention-based MIL, 3D CNNs, and 3D ViTs across three CT and four MRI datasets, including two large datasets of at least 10,0...

Ethan Harvey·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

ViCrop-Det: Spatial Attention Entropy Guided Cropping for Training-Free Small-Object Detection

Transformer-based architectures have established a dominant paradigm in global semantic perception; however, they remain fundamentally constrained by the profound spatial heterogeneity inherent in natural images. Specifically, the imposition of a uniform global receptive field across regions of varying information density inevitably leads to local feature degradation, particularly in dense conflict zones populated by microscopic targets. To address this mechanistic limitation, we propose ViCrop-Det, a training-free inference framework that introduces adaptive spatial trust region shrinkage. I...

Hui Wang·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Bian Que: An Agentic Framework with Flexible Skill Arrangement for Online System Operations

Operating and maintaining (O&M) large-scale online engine systems (search, recommendation, advertising) demands substantial human effort for release monitoring, alert response, and root cause analysis. While LLM-based agents are a natural fit for these tasks, the deployment bottleneck is not reasoning capability but orchestration: selecting, for each operational event, the relevant data (metrics, logs, change events) and the applicable operational knowledge (handbook rules and practitioner experience). Feeding all signals indiscriminately causes dilution and hallucination, while manually cura...

Bochao Liu·9 days ago

r/LocalLLaMA· COMMUNITY

Introducing the IBM Granite 4.1 family of models (3B/8B/30B)

IBM releases the Granite 4.1 family of 3B, 8B, and 30B open-weight models for on-device and enterprise deployment.

u/abkibaarnsit·9 days ago·69 pts / 15 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Super-resolution Multi-signal Direction-of-Arrival Estimation by Hankel-structured Sensing and Decomposition

Motivated by sensing modalities in modern autonomous systems that involve hardware-constrained spatial sampling over large arrays with limited coherence time, we develop a novel framework for rapid super-resolution multi-signal direction-of-arrival (DoA) estimation based on Hankel-structured sensing and data matrix decomposition of arbitrary rank, under both the $L_2$ and $L_1$-norm formulation. The resulting $L_2$-norm estimator is shown to be maximum-likelihood optimal in white Gaussian noise. The $L_1$-norm estimator is shown to be maximum-likelihood optimal in independent, identically dis...

Georgios I. Orfanidis·9 days ago

r/LocalLLaMA· COMMUNITY

PS5s can now be hacked to run Linux - perhaps some potential for local inference?

A PS5 Linux exploit raises the possibility of using the console as hardware for local LLM inference via llama.cpp.

u/Thrumpwart·9 days ago·62 pts / 37 comm

r/LocalLLaMA· COMMUNITY

Mistral Medium 3.5 is here

Mistral Medium 3.5 128B model released on Hugging Face.

u/Kathane37·9 days ago·44 pts / 30 comm

r/ClaudeAI· COMMUNITY

How to make Claude output stop over emphasising points from chat in text outputs?

Reddit user reports Claude over-emphasizes corrections in regenerated outputs instead of seamlessly integrating feedback.

u/Kashasaurus·9 days ago·21 pts / 13 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Hankel and Toeplitz Rank-1 Decomposition of Arbitrary Matrices with Applications to Signal Direction-of-Arrival Estimation

We consider the problems of computing the optimal rank-$1$ Hankel and Toeplitz-structured approximation of arbitrary matrices under $L_2$ and $L_1$-norm error. Such problems arise naturally in engineered systems, including the basic few-shot signal Direction-of-Arrival (DoA) estimation problem that is of importance to modern autonomous systems applications. We develop accurate and computationally efficient structured matrix decomposition algorithms for both formulations and then derive analytically grounded small-sample-support DoA estimators for practical sensing system deployments. The resu...

Georgios I. Orfanidis·9 days ago

r/LocalLLaMA· COMMUNITY

mistralai/Mistral-Medium-3.5-128B · Hugging Face

Mistral releases Mistral Medium 3.5, a 128B dense model with 256k context window replacing Medium 3.1 and Magistral for instruction, reasoning, and coding tasks.

u/jacek2023·9 days ago·110 pts / 59 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding

RL post-training of frontier language models is increasingly bottlenecked by autoregressive rollout generation, making rollout acceleration a central systems challenge. Many existing efficiency methods improve throughput by changing the rollout or optimization regime, for example, through off-policy execution, replay, or lower-precision generation. We study speculative decoding as a lossless acceleration primitive for RL rollouts that preserves the target model's output distribution. We implement speculative decoding in NeMo-RL with a vLLM backend, supporting both synchronous and asynchronous...

Hayate Iso·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

MemOVCD: Training-Free Open-Vocabulary Change Detection via Cross-Temporal Memory Reasoning and Global-Local Adaptive Rectification

Training-free change detection method for remote sensing that combines SAM, DINO, and CLIP with temporal memory reasoning.

Zuzheng Kuang·9 days ago

Hugging Face· INFRA

Granite 4.1 LLMs: How They’re Built

Hugging Face·9 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Decoupling Knowledge and Task Subspaces for Composable Parametric Retrieval Augmented Generation

Decoupling knowledge from task behaviors in parametric RAG to improve adapter composition reliability.

Weihang Su·9 days ago

30 stories