[New Optimizer] 🌹 Rose: low VRAM, easy to use, great results, Apache 2.0 [P]
Rose: stateless PyTorch optimizer with low VRAM footprint and fast convergence, released under Apache 2.0.
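Rose's actual update rule isn't described in the blurb, so as a sketch of why a stateless optimizer has a low VRAM footprint, here is a sign-SGD-style step in plain Python: it keeps no per-parameter buffers between steps, whereas Adam carries two FP32 moment tensors (roughly 8 extra bytes per parameter). The function name and update rule are illustrative, not Rose's.

```python
def sign_sgd_step(params, grads, lr=1e-3):
    """Stateless update: move each parameter by -lr * sign(grad).
    No optimizer state survives the call, so extra memory is ~0 --
    contrast with Adam, which stores two moment buffers per parameter."""
    return [p - lr * (1 if g > 0 else -1 if g < 0 else 0)
            for p, g in zip(params, grads)]

params = sign_sgd_step([0.5, -0.2], [0.1, -0.3], lr=0.01)
# params[0] ≈ 0.49, params[1] ≈ -0.19
```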
DeepSeek V4 Pro shows weaker-than-expected performance on LMSYS Arena user preference voting, a crowdsourced benchmark distinct from capability measurement.
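Arena-style preference voting turns pairwise human votes into a rating rather than measuring capability directly. A simplified online Elo update sketches the aggregation (Chatbot Arena itself fits a Bradley-Terry model over all votes; this is the single-vote view):

```python
def elo_update(r_a, r_b, winner, k=4.0):
    """Apply one pairwise vote. expected_a is the standard Elo
    win probability; k controls how far one vote moves the ratings."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    score_a = 1.0 if winner == "a" else 0.0
    r_a += k * (score_a - expected_a)
    r_b += k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a, r_b

# Two equally rated models: one win moves each rating by k/2.
ra, rb = elo_update(1000.0, 1000.0, "a")
# ra -> 1002.0, rb -> 998.0
```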
Technical deep-dive on DeepSeek V4 architecture: hybrid sparse attention, manifold-constrained connections, and FP4 quantization innovations vs. V3.
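DeepSeek's FP4 format is not detailed in this summary; generic symmetric 4-bit quantization illustrates the basic trade it makes — one shared scale per block of weights, integer codes in [-7, 7], and roughly a 4x memory cut versus FP16 at some precision cost. This is a teaching sketch, not V4's actual scheme.

```python
def quantize_4bit(xs):
    """Symmetric 4-bit quantization of one block of floats:
    pick a shared scale so the largest value maps to +/-7,
    then round every value to an integer code in [-7, 7]."""
    scale = max(abs(x) for x in xs) / 7.0 or 1.0  # avoid scale=0 for all-zero blocks
    q = [max(-7, min(7, round(x / scale))) for x in xs]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

q, s = quantize_4bit([0.5, -2.0, 3.5])
# s = 0.5, q = [1, -4, 7]; dequantize recovers the inputs exactly here
```

With real weights the round-trip is lossy; formats like FP4 spend the 4 bits non-uniformly to reduce that loss.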
User shares local hardware build specs for AI workloads including CPU, GPU setup, and thermal management configuration.
Reddit thread comparing DS4-Flash and Qwen3.6; lacks substantive analysis or benchmark data.
Anthropic outlines safeguards for Claude during US midterms and global elections to mitigate disinformation and manipulation risks.
Chinese AI company DeepSeek released a preview of its hotly anticipated next-generation AI model V4 on Friday, saying that the open-source model can compete with leading closed-source systems from US rivals including Anthropic, Google, and OpenAI. DeepSeek says V4 marks a major improvement over prior models, especially in coding, a capability that has become central to AI agents and helped drive the success of tools like ChatGPT Codex and Claude Code. The release is also a milestone for China's chip industry, with DeepSeek explicitly highlighting compatibility with domestic Huawei technology....
[Image: The three finalists for the World Press Photo of the year | World Press Photo] We love to muse over how "real" photography is defined here at The Verge now that generative AI is so prolific, and the World Press Photo competition might have the answer. The prestigious award celebrates the best of photojournalism, where capturing reality is paramount. The winning entry for 2026 - "Separated by ICE," captured by photojournalist Carol Guzy - was announced yesterday. The harrowing photograph shows children clinging to their father after an immigration hearing. The photo had to abide by spec...
Reddit discussion comparing OpenCode vs ClaudeCode inference tools for Qwen 3.5 27B on Linux.
Reddit discussion questioning why GPT 5.5 Pro underperforms GPT 5.4 Pro on the HLE benchmark with tools.
Qwen 3.6 35B-A3B MoE model achieves 250+ tok/s on AMD Radeon 780M iGPU via llama.cpp Vulkan.
DeepSeek-v4 demonstrates 384K token context window by generating 100KB single-file HTML application on user request.
Reddit discussion speculating on ICML 2026 acceptance score thresholds before notification on April 30.
Reddit user argues GPT 5.5 feels more intuitive despite lower-than-expected benchmark gains, citing improved argument coverage.
I have a 20x Claude account and have been using Opus 4.7 exclusively for all my code. I noticed that even after asking multiple times for a code review, Opus still wouldn't get there 100%. Here is what I did: 1. Installed the Codex CLI and ran it in a tmux session. 2. Claude created a PR for Codex to review. 3. Claude pinged Codex via the shell so I could see Codex's thinking and approve any file permissions; Claude set a wake-up window. 4. Codex reviewed the PR and left comments. 5. Claude woke up and validated the comments before editing the code. Surprisingly, Claude had missed a lot of things...
Reddit discussion thread about Deepseek v4; lacks substantive detail or official announcement.
DeepSeek releases V4-Pro (1.6T params, 49B active) and V4-Flash (284B/13B) with 1M context, largest open-weights models, MIT licensed.
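The reported specs imply some useful back-of-envelope numbers. Assuming 4-bit weights (0.5 bytes per parameter) — an assumption, since the serving precision isn't stated here — the arithmetic looks like this; real deployments add KV cache, activations, and runtime overhead on top:

```python
def moe_summary(total_params_b, active_params_b, bytes_per_param=0.5):
    """Rough MoE sizing: total_params_b / active_params_b are in billions.
    Billions of params * bytes/param gives weight memory directly in GB."""
    return {
        "active_fraction": active_params_b / total_params_b,
        "weight_memory_gb": total_params_b * bytes_per_param,
    }

pro = moe_summary(1600, 49)    # V4-Pro: 1.6T total, 49B active
flash = moe_summary(284, 13)   # V4-Flash: 284B total, 13B active
# pro: ~3.1% of params active per token, ~800 GB of 4-bit weights
# flash: ~4.6% active, ~142 GB
```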
Reddit user reports Claude refusing game dev prompts, suspects safety filter over-blocking benign 'self-destruct' game mechanic naming.
Latent Space newsletter item referencing GPT 5.5 and OpenAI Codex Superapp with minimal detail; unclear if announcement or speculation.
DeepSeek-V4 does not include multimodal capabilities; user speculates on future roadmap.
DeepSeek v4 Flash offers competitive pricing for its model size on the official API.
DeepSeek plans to scale up V4 inference on Huawei hardware to 950 supernodes in H2 2026, targeting a price reduction for the Pro tier.
Simon Willison releases a small utility that converts millisecond durations to human-readable time formats.
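The output format of Willison's actual utility isn't shown here; a minimal sketch of the idea — decompose a millisecond count into hours, minutes, seconds, and leftover milliseconds, dropping zero units:

```python
def human_duration(ms):
    """Render a millisecond count as e.g. '1h 2m 5s' or '1s 500ms'.
    (Illustrative only; the real tool's formatting may differ.)"""
    seconds, ms = divmod(int(ms), 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if seconds:
        parts.append(f"{seconds}s")
    if ms or not parts:  # always emit something, even for 0
        parts.append(f"{ms}ms")
    return " ".join(parts)

human_duration(3_725_000)  # "1h 2m 5s"
human_duration(1500)       # "1s 500ms"
```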
DeepSeek V4 benchmark results released, with comparative performance data against other frontier models.
Reddit user reports subjective quality regression in Claude Opus 4.7 compared to 4.5, citing reduced intuition and increased need for explicit guidance.
Simon Willison's newsletter includes a new chapter on Agentic Engineering Patterns plus curated links and blog posts.