The State of Fable, The Jailbreak Problem, SpaceX Acquires Cursor
The administration is very likely wrong about Fable, but that is ultimately Anthropic's responsibility.
Separately, neither could compete. Now they hope they can.
The deal is supposed to help SpaceX's struggling AI division. The company told IPO investors it sees a $26 trillion addressable market in AI.
AI coding agents are increasingly used to generate pull requests (PRs) that propose code fixes in software projects. From a first exploration of the AIDev dataset, we find that 46.41% of the fixes proposed by the agents Copilot, Devin, Cursor, and Claude are rejected. This represents a significant amount of wasted resources that require human reviews, verifications, and running tests and validations for fixes that are merely discarded. Our goal in this paper is to understand the failure modes of AI-agents, an understanding that is crucial for better integrating AI-agents as efficient teammat...
Reward hacking is usually studied after it becomes visible, once a model earns high proxy reward while failing the intended task. We instead study what proxy RL teaches before that failure appears. We introduce Proxy Reward Internalization and Mechanistic Exploitation (PRIME), a learned capability to assess task correctness, predict proxy acceptance, and reason about exploitable proxy--gold gaps. In coding RL environments with exploitable pytest rewards, we measure PRIME through chain-of-thought monitoring, direct probes, and activation-level concept vectors. We find that PRIME emerges in a s...
We introduce an explainable machine-learning approach that forecasts the structural precursors of scientific breakthroughs -- the emergence and intensification of links between research concepts -- by modelling how OpenAlex concept networks evolve over time. Using 59 semantic and topological features, a two-stage LightGBM model jointly predicts the formation and the future weight of concept pairs, adding a regression stage that quantifies expected intensity to prior link-existence forecasts. Relative to the state of the art, the approach improves accuracy and explainability at once: comparati...
Unscheduled trips of high-power pulsed converters are a leading source of downtime at large accelerator facilities. At the Spallation Neutron Source (SNS), the High Voltage Converter Modulators (HVCMs) are consistently the second-largest contributor to lost beam time. Each HVCM pulse is recorded across sensor channels spanning currents, voltages, and magnetic fluxes, whose mutual interactions encode the operating state of the system. Fault precursors do not manifest uniformly across these channels: depending on fault type, they may alter the temporal structure of individual signals, change th...
Going through the credit card statement, here's what I had active: Claude Pro (40), ChatGPT Plus (20), Cursor (20), Perplexity Pro (20), Notion AI (10), Granola (20), ElevenLabs Starter (5), Midjourney Basic (10), Gamma Pro (10), Beautiful.ai (12), Otter Pro (17), Loom Business (15), Zapier Pro (30), Make Core (10), Tactiq Pro (8), Descript Creator (15), Reclaim.ai Pro (8), Motion (19), Superhuman (30), one I can't remember the name of (10), some ai-something for instagram captions (11). Then I sat down and wrote next to each one the last time I'd actually used it. Not opened it, used it for...
Hi all, Sorry for going missing — we’ve been collecting a larger, higher-quality set of more complex tasks. We’re excited to share a major leaderboard update covering the past three months. We’ve updated the **SWE-rebench leaderboard** with **110 fresh Python tasks** from GitHub PRs created in **March, April, and part of May**. The setup follows the standard SWE-bench format: models read real PR issues, edit code, run tests, and must make the full test suite pass. This time, instead of our usual monthly updates with a smaller number of tasks, we collected a larger batch so we could evalua...
I hate recording demo videos, so I made an open source skill for it: [https://github.com/MobAI-App/desktop-recorder-skill](https://github.com/MobAI-App/desktop-recorder-skill) Now I can give Claude a prompt like "Record a short demo of this app flow", and it handles the annoying parts for me: preparing the app state, clicking through the flow, recording, adding cursor/click effects and captions, then exporting the video. So instead of spending time setting everything up and recording the same demo manually, I can let Claude do it while I work on something else. It also has Remotion integr...
HarnessAPI unifies LLM tool and HTTP API definitions from single Python source; eliminates duplication across Claude, Cursor agent runtimes.
Cursor evals show Gemini 3.5 Flash underperforms on coding tasks vs. competitors.
For a quarter century, the Google search box has been one of the most recognizable interfaces in computing: a thin white rectangle, a blinking cursor, a few typed words, and a list of blue links. On Tuesday, Google will formally retire that paradigm. At its annual I/O developer conference, Google announced a sweeping redesign of the search box itself — the literal text field where billions of queries begin every day — transforming it from a simple keyword input into a dynamic, AI-driven conversation starter that can accept text, images, PDFs, videos, and even open Chrome tabs as inputs. The c...
Codegraph tool uses pre-indexed knowledge graphs to reduce Claude API tool calls by 94% and latency by 82% for code analysis tasks.
Solo freelancer tracked 60-day AI coding tool spend and productivity ROI across Cursor, Claude, and other services.
Anyone know of a solution for tying in multiple IDE sessions with a multi-repo project so that they work cooperatively with a single shared inbox/memory? Here is my use case (whether it’s with or without the use of Storybloq):
- all sessions are running Storybloq, which saves root-level /.story tickets and issues, or if I have multiple projects I store each of them in /projects/<project_name>/.story
- have three repos open in Cursor with 1-2 sessions each
- have a master Cursor session open at the root level with /Sites/.story

I use the master session for any multi-repo or...
Developer built 3 browser games with Claude/Cursor in 3 months (no prior coding), reaching 25M+ plays; documents rapid prototyping and user adoption.
User reports high API costs for Claude Opus and GPT-5.5 on Cursor, predicts open-source models will displace proprietary tools by end of 2024.
Developer reports local Qwen 27B setup with llama-server now competitive with Claude Code and Cursor for coding tasks, driven by cloud provider cost increases.
At TechCrunch's sold-out StrictlyVC event in San Francisco on Thursday night, we covered a lot of ground in a short time, beginning with the question everyone in the industry is asking right now: in a world where rival Cursor is reportedly in talks to be acquired by SpaceX for $60 billion, is Replit also bound to sell?
I did a deep dive on Claude Design and below are my thoughts. What it does extremely well:
* **Improves your prompt** - similar to "ask me questions" when chatting to an LLM. Can make the difference between slop and actually useful.
* **Invokes agent skills for you** - a game changer for people who don't live in the terminal
* **Claude Code handoff** - easily get Claude Code to build it for real with a simple link share. Genius.
* **Comment feature** - spatial editing (similar to Cursor and a few others), but selection is very accurate and I like how you can queue up edits and select wh...
A new era is on the way for Apple as Tim Cook plans to step down from his CEO role in September, handing the reins to hardware chief John Ternus. Ternus may be inheriting one of the most durable businesses in tech, but he’s also stepping into a very different ecosystem than the one Cook spent decades shaping. The App […]
Stratechery weekly digest covering Tim Cook's Apple departure, Cursor IDE, SpaceX developments, and geopolitical competition.
Cursor was on track to close a $2 billion funding round this week but chose to halt discussions after SpaceX offered a $10 billion "collaboration fee" and a path to a $60 billion acquisition.
Commentary on Apple's John Ternus appointment and its implications for hardware-AI strategy, with tangential reference to SpaceX-Cursor partnership.
OpenAI launches GPT-Image-2; Cursor secures $10B contract with xAI and $60B acquisition option.
Only Elon would do this before an IPO.
With an IPO looming for Elon Musk's SpaceX / xAI / X combo platter of companies, SpaceX has announced an odd arrangement to either acquire the automated programming platform Cursor for $60 billion or pay a fee of $10 billion. Buying this startup that's focused on AI coding could help xAI's tools compete with market leader Anthropic, as well as the other competitors. A report by The Information this week said Sergey Brin has directed Google's "strike team" to help its agentic AI tools catch up, while Sam Altman reportedly declared a "code red" at OpenAI last year before shutting down Sora to f...
I’ve been building plugins for Claude Code, and the first version of the idea was very Claude-focused. That made sense at the start. Claude Code has a real plugin model, hooks are useful, and it is one of the few agent tools where plugins can actually become part of a daily workflow. But after building a few integrations, I kept running into the same uncomfortable question: If I write the useful part of a plugin once, why should I rewrite or repackage the same thing again for Codex, Gemini, Cursor, OpenCode, and whatever comes next? The actual plugin logic is often not Claude-specific. Th...