The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.


CORAL: Adaptive Retrieval Loop for Culturally-Aligned Multilingual RAG

CORAL adaptive retrieval loop for multilingual RAG addresses cultural alignment failures through context-aware agentic retrieval strategies.

Nayeon Lee·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Modeling Human-Like Color Naming Behavior in Context

NeLLCom-Lex framework models human color naming lexicons in neural agents; extends with context modeling to reduce non-convex divergence from human categories.

Yuqing Zhang·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

LLM-ReSum: A Framework for LLM Reflective Summarization through Self-Evaluation

LLM-ReSum meta-evaluation of 14 summarization metrics across 7 datasets shows that ROUGE and BLEU correlate weakly with human judgment; proposes LLM-based alternatives.

Huyen Nguyen·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Deflation-Free Optimal Scoring

Deflation-Free Sparse Optimal Scoring reformulates linear discriminant analysis with simultaneous orthogonal constraint estimation for high-dimensional feature selection.

Sharmin Afroz·10 days ago

TechCrunch AI· PRESS

YouTube is testing an AI-powered search feature that shows guided answers

YouTube is rolling out the new AI search feature to Premium subscribers in the U.S. on an opt-in basis.

Ivan Mehta·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Residual-loss Anomaly Analysis of Physics-Informed Neural Networks: An Inverse Method for Change-point Detection in Nonlinear Dynamical Systems with Regime Switching

Physics-informed neural networks framework for joint change-point detection and parameter estimation in nonlinear dynamical systems with regime transitions.

Yuhe Bai·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Progressing beyond Art Masterpieces or Touristic Clichés: how to assess your LLMs for cultural alignment?

Dataset and design guidelines for assessing cultural alignment in LLMs, addressing limitations of prior cultural bias evaluation approaches.

António Branco·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Towards interpretable AI with quantum annealing feature selection

Quantum annealing method for interpretable feature selection in CNNs applied to image classification.

Francesco Aldo Venturelli·10 days ago

r/ClaudeAI· COMMUNITY

Toothcomb is an open-source tool for analysing and fact-checking speech in real time.

Give Toothcomb a speech transcript and it will fact-check and analyse it. If you have an MP3 file of someone speaking, it can generate the transcript for you. You can also stream audio in real time from your device's microphone. You can see a [demo running here](https://toothcomb.codebox.net/) and read more about the project on the [home page](https://codebox.net/pages/toothcomb-ai-fact-checker). Analysis is performed in three stages: 1. The text is broken up into small parts, each usually a few sentences in length. These parts are sent, one at a time, to the Claude Opus API with [detailed...

u/bluebox72·10 days ago·20 pts / 7 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Prefill-Time Intervention for Mitigating Hallucination in Large Vision-Language Models

Prefill-time intervention technique that reduces hallucinations in large vision-language models by addressing error accumulation during decoding.

Chengsheng Zhang·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Large language models eroding science understanding: an experimental study

Experimental study demonstrating LLMs can be manipulated to prioritize fringe scientific material and generate misleading fluent responses contradicting scientific consensus.

Harry Collins·10 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

The Surprising Universality of LLM Outputs: A Real-Time Verification Primitive

Mandelbrot rank-frequency distribution identified across frontier LLM outputs enables sub-microsecond token verification, 100,000× faster than sampling-based detection.

Alex Bogdan·11 days ago

r/ClaudeAI· COMMUNITY

Claude has made me excited to work

Reddit user reports renewed enthusiasm for personal coding project after using Claude for 6 weeks.

u/alkalinealex359·11 days ago·30 pts / 13 comm

Simon Willison· ANALYST

Quoting Matthew Yglesias

Matthew Yglesias argues for AI-assisted professional software development over autonomous "vibe coding," prioritizing human-managed productivity gains.

Simon Willison·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

HotComment: A Benchmark for Evaluating Popularity of Online Comments

HotComment: multimodal benchmark for evaluating online comment popularity across platforms using video, text, and content quality metrics.

Yafeng Wu·11 days ago

r/singularity· COMMUNITY

What jobs are mostly affected by AI according to a Microsoft study?

Microsoft study identifies job categories most exposed to AI automation; labor market impact analysis.

u/kernelangus420·11 days ago·101 pts / 116 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

The Nonverbal Syntax Framework: An Evidence-Based Tiered System for Inferring Learner States from Observable Behavioral Cues

Nonverbal Syntax Framework systematizes 908 studies mapping nonverbal behavioral cues to learner cognitive/affective states for adaptive education systems.

Sherzod Turaev·11 days ago

TechCrunch AI· PRESS

BCI startup Neurable looks to license its ‘mind-reading’ tech for consumer wearables

The startup specializes in "non-invasive" "mind-reading" tech—a kind of neural data collection that, its CEO hopes, will have all sorts of consumer applications.

Lucas Ropek·11 days ago

r/LocalLLaMA· COMMUNITY

Abliterlitics: Benchmarks and Tensor Comparison for Heretic, Abliterlix, Huiui, HauhauCS for GLM 4.7 Flash

Benchmark comparing abliteration techniques across GLM-4.7-Flash (MoE architecture) vs. prior Qwen family tests; evaluates HauhauCS uncensored claims.

u/nathandreamfast·11 days ago·48 pts / 10 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

WhisperPipe: A Resource-Efficient Streaming Architecture for Real-Time Automatic Speech Recognition

WhisperPipe: streaming architecture for real-time ASR maintaining transcription accuracy with bounded memory through hybrid VAD and context management.

Erfan Ramezani·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Health System Scale Semantic Search Across Unstructured Clinical Notes

Semantic search system deployed at children's hospital indexing 166M clinical notes using instruction-tuned embeddings; addresses scalability and governance challenges.

Faith Wavinya Mutinda·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

OxyGent: Making Multi-Agent Systems Modular, Observable, and Evolvable via Oxy Abstraction

OxyGent open-source framework enables modular, observable multi-agent systems via pluggable components and permission-driven dynamic planning.

Junxing Hu·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

Emotive Architectures: The Role of LLMs in Adjusting Work Environments

Study examines LLM integration in hybrid work environments to adjust spatial experiences and collaboration dynamics.

Lara Vartziotis·11 days ago

arXiv (cs.AI/CL/LG)· ACADEMIA

PLMGH: What Matters in PLM-GNN Hybrids for Code Classification and Vulnerability Detection

Empirical study comparing PLM-GNN hybrids for code classification and vulnerability detection; hybrids outperform GNN-only baselines.

Mohamed Taoufik Kaouthar El Idrissi·11 days ago

r/ClaudeAI· COMMUNITY

No More Subsidised AI Subscriptions?

Reddit discussion speculating on potential end of discounted Claude subscription pricing models.

u/PM_ME_YOUR___ISSUES·11 days ago·24 pts / 26 comm

TechCrunch AI· PRESS

Red Hat’s OpenClaw maintainer just made enterprise Claw deployments a lot safer

Tank OS puts OpenClaw AI agents into a container that lets them run reliably and more safely, especially for those running fleets of them.

Julie Bort·11 days ago

r/LocalLLaMA· COMMUNITY

Qwen3.6-27B IQ4_XS FULL VRAM with 110k context

Qwen3.6-27B IQ4_XS quantization bloat analysis; reverting llama.cpp commit reduces VRAM from 15.1GB to 14.7GB with 110k context.

u/Pablo_the_brave·11 days ago·43 pts / 16 comm

r/LocalLLaMA· COMMUNITY

meantime on r/vibecoding

Generic discussion post about wisdom or best practices in AI/coding communities.

u/jacek2023·11 days ago·76 pts / 13 comm

arXiv (cs.AI/CL/LG)· ACADEMIA

Walking Through Uncertainty: An Empirical Study of Uncertainty Estimation for Audio-Aware Large Language Models

First systematic study of uncertainty estimation in audio-aware LLMs; benchmarks five methods addressing hallucination and confidence calibration.

Chun-Yi Kuan·11 days ago

r/Anthropic· COMMUNITY

After I opened a complaint, anthropic refunded me in credits instead of money (without letting me choose), closed my ticket saying everything was fine with my 5x Max account… and now my paid plan is gone before my billing cycle ended...

I was overcharged by more than $100, so I opened a billing ticket last month. They only responded yesterday and said everything looked fine because they refunded me $100 in credits. They didn’t give me any option to choose between a refund to my card or credits, but I can let that go... The worst part is what happened next: due to what seems like an error on their side, I lost access to my plan. I no longer have 5x Max and my account now shows as Free. This is insane. Do I really have to wait another month to fix this while not having access to the service I already paid for? My billing c...

u/Initial-Charge7281·11 days ago·19 pts / 4 comm

30 stories
