just shut up and trust us
Reddit post expressing skepticism about OpenAI's communication transparency; lacks substantive technical or policy detail.
Reddit post claims SWE-Bench benchmark has become saturated or overfitted; lacks specifics on methodology or evidence.
Developer discusses limitations of AI-generated UI: functional but lacks design hierarchy, visual intentionality, and product craft despite system prompts.
I did a deep dive on Claude Design and below are my thoughts. What it does extremely well: * **Improves your prompt** - similar to "ask me questions" when chatting with an LLM. Can make the difference between slop and something actually useful. * **Invokes agent skills for you** - a game changer for people who don't live in the terminal. * **Claude Code handoff** - easily get Claude Code to build it for real with a simple link share. Genius. * **Comment feature** - spatial editing (similar to Cursor and a few others), but selection is very accurate and I like how you can queue up edits and select wh...
Reddit user describes using GPT's image generation to create 360° panoramas for a GeoGuessr-style application, enabling synthetic historical scene recreation.
OpenAI publishes five guiding principles for AGI development under Sam Altman's leadership, emphasizing broad humanity benefit.
Someone’s offering an unusual deal for a 13-acre property in Mill Valley, just north of San Francisco.
Reddit discussion on productivity trade-offs of parallel Claude conversations; anecdotal user experiences, no empirical data.
ElementsClaw combines Large Atomic Models with LLMs via agentic orchestration to automate materials discovery workflows.
Discussion of why major labs dominate deployed models despite open-source pretrained models being available; questions if RLHF accessibility should enable smaller labs to compete.
Multimodal model predicts video-induced pleasure via cognitive appraisal fusion to address affective computing dataset scarcity.
Reddit post showing an image generated by ChatGPT's image generation feature; no technical details or news value.
Doc-to-LoRA and hypernetwork LLM adaptation fails on knowledge conflicts due to magnitude mismatch, not representational limits.
SFT-then-RL outperforms mixed-policy methods; recent baseline bugs in DeepSpeed, TRL, OpenRLHF invalidate competing claims.
Quantum Reservoir Computing outperforms variational QPINNs on Lorenz chaotic dynamics with 81% lower MSE on 4–5 qubits.
Score-based Variational Flow provides continuous-time theoretical foundation for Transformers via Euler discretization and attention recovery.
Multimodal QUD benchmark evaluates VLMs' capacity to generate inquisitive questions from scientific figures, not just extract info.
Reddit discussion on restricted access to frontier AI models via trusted/enterprise pipelines, limiting community benchmarking and open research.
User reports 46% throughput gain (39→57 t/s) with Unsloth's per-layer quantization on Qwen 35B vs standard Q4_K_M on M1 Mac.
Age-specialized models improve hypoglycemia classification in type 1 diabetes by capturing disease progression variance across patient demographics.
First dataset for LLM open-ended legal reasoning on Japanese bar exam writing task; expert evaluation of generative capabilities.
Claude 4.7 identified journalist Kelsey Piper from 125 words of unpublished writing across multiple genres; ChatGPT and Gemini failed same test, raising privacy/fingerprinting concerns.
ESIA uses energy-based CRF for pedestrian intention prediction, modeling spatiotemporal interactions and improving consistency for autonomous driving.
VIBES framework applies Vision-Language Models with Bayesian inference for detecting anomalies in expressway surveillance video.
Quasi-equivariant metanetworks account for functional symmetries in parameter space to improve neural architecture design for weight-based downstream tasks.
AIPsy-Affect releases 480-item clinical stimulus battery without emotion keywords to enable mechanistic interpretability of emotion circuits in LLMs.
HeadRouter dynamically routes attention heads for task-adaptive token pruning in large audio language models to reduce inference cost.
Decision guide for selecting information-theoretic measures in AI, covering entropy, mutual information, and agent complexity metrics.
OptProver transfers formal theorem-proving from Olympiad problems to undergraduate optimization domain via continual training.