The Forensic Cost of Watermark Removal
Watermark Removal Detection benchmark revealing statistical artifacts left by state-of-the-art watermark attacks at 10^-3 FPR.
Differentiable physics-informed approach for phase retrieval in coherent transition radiation spectroscopy diagnostics.
Dependency-driven multi-stage prompt pipeline for coherent RPG world and narrative generation using LLMs.
Reddit post removed for policy violation; no content available for assessment.
Menlo Ventures' enterprise survey put Anthropic at 40% of LLM spend, OpenAI at 27%. The takes I've seen are mostly about the leaderboard. The thing nobody's saying out loud: the standard agent-reliability advice ("don't depend on one provider, add a fallback") got harder to actually execute, not easier. When the split was closer to 50/30, both providers were realistic peers. You could run prod on one and have the other warm. Now most of us are running primarily on Claude — Sonnet for tool calls, Opus for harder stuff — and the "fallback" is a model we haven't tested against our actual prompt...
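The "primary provider plus warm fallback" pattern the post describes can be sketched as below. This is a hypothetical illustration, not any real SDK: the `Provider` interface, provider names, and `FallbackRouter` are all invented for the example. Note how the counter makes the post's point concrete: the fallback path only runs on failure, so it goes untested against real prompts until an outage forces it.

```python
# Minimal sketch of a primary-with-warm-fallback routing pattern.
# Provider names and the call interface are assumptions for illustration,
# not a real provider SDK.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # prompt -> completion


class FallbackRouter:
    """Route every request to the primary; use the fallback only on failure.

    Because the fallback path is rarely exercised in production, regressions
    against your actual prompts go unnoticed -- exactly the risk the post
    raises. Tracking fallback_hits at least makes the untested path visible.
    """

    def __init__(self, primary: Provider, fallback: Provider):
        self.primary = primary
        self.fallback = fallback
        self.fallback_hits = 0

    def complete(self, prompt: str) -> str:
        try:
            return self.primary.call(prompt)
        except Exception:
            self.fallback_hits += 1  # count how often the cold path runs
            return self.fallback.call(prompt)


# Simulated providers: the primary raises, so the fallback path fires.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("simulated provider outage")


def backup_model(prompt: str) -> str:
    return f"backup:{prompt}"


router = FallbackRouter(
    Provider("primary", flaky_primary),
    Provider("backup", backup_model),
)
print(router.complete("hello"))       # -> backup:hello
print(router.fallback_hits)           # -> 1
```

A routine exercise of that cold path (e.g. sending a fixed fraction of traffic or a nightly eval suite through the fallback) is the usual mitigation when the two providers stop being realistic peers.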
Reddit user reports Claude's behavior varies by device type without explicit disclosure; seeks documentation on device-aware context handling.
Individual shares study tips for new Claude Certified Architect–Foundations exam, emphasizing real-world application over memorization.
Community discussion questioning whether Claude-4.6-Opus-Reasoning-Distilled fine-tune adds genuine capability vs. style changes over base model.
arXiv paper presenting first direct empirical comparison of Mixture-of-Experts vs dense model architectures.
User shares image samples generated by GPT Image 2; anecdotal usage demo without technical insight.
HFQ4 MMQ prefill optimization for AMD Strix Halo achieves 3× speedup (310→950 tok/s) in hipfire inference engine.
Opinion piece speculating that image generation capabilities represent progress toward AGI, referencing GPT-Image-2 adoption.
Reddit discussion analyzing tensions within r/LocalLLaMA community between open-weights advocates and commercial interests.
pip 26.1 adds lockfile and dependency cooldown features, drops Python 3.9 support.
Xiaomi open-sources MiMo v2.5 Pro, an open-weights model potentially relevant to multimodal or vision tasks.
User reports local LLMs (Qwen 27B, Gemma 4 31B) underperform Claude for coding tasks due to poor tool-calling and decision-making.
Reddit user questions token consumption rates among Claude users, reports 20M tokens/month with coding workflows.
On Monday, the courtroom battle between Elon Musk and Sam Altman over alleged broken promises at OpenAI started, as usual, with jury selection. The only tricky part? A lot of the prospective jurors already have an opinion about Elon Musk, and it's not a good one. The Verge reporter Elizabeth Lopatto, who was there at the courthouse, quoted statements from some of the juror questionnaires: "Elon Musk is a greedy, racist, homophobic piece of garbage." "Elon Musk is a world-class jerk." "I very much dislike Tesla. As a woman of color, I am very aware of the damaging statements and actions Elon M...
MiMo-V2.5-Pro open-weights model achieves 75% non-hallucination rate, competitive with Claude Opus 4.7, runs on 128GB systems.
Talkie, a 13B LM trained on pre-1931 text, tests LLM generalization vs. memorization and capability emergence without modern web data.
talkie-1930-13b: 13B model trained on pre-1931 English text, released by Levine, Duvenaud, Radford under Apache 2.0.
Reddit user claims Claude contradicted itself about multi-chat access capabilities and blamed previous error on unwillingness to admit mistakes.
Community discussion comparing Kimi K2.6 and DeepSeek V4 Pro for real-world use cases, noting K2.6's strength in coding.
Qwen 3.6-27B achieves 38.2% on Terminal-Bench 2.0 coding tasks, demonstrating open-weight models viable for real-world agent work.
Anecdotal mention of an LLM with training data from 1930s; no details, unclear significance.
ChatGPT Excel add-on generates multi-sheet cash flow model in single prompt, reducing 2-day task to seconds.