Realizable Bayes-Consistency for General Metric Losses
Theoretical characterization of Bayes-consistency for learning with general metric losses in the realizable setting.
RoboAlign-R1: reward-aligned post-training for robot video world models with stabilized long-horizon inference and RobotWorldBench evaluation.
Conformal Predictive Self-Calibration framework for multimodal learning handles modality imbalance and noisy corruption via predictive uncertainty.
Reddit post claims Musk's fear of DeepMind CEO Hassabis motivated OpenAI founding; cites trial testimony about 2015 meeting.
Manokhin Probability Matrix: diagnostic framework separating classifier calibration and discriminatory power via 2x2 archetype taxonomy.
OpenAI reportedly planning smartphone launch for next year; unconfirmed hardware product outside core AI model development.
Hyundai reportedly seeks tens of thousands of Boston Dynamics robots for manufacturing deployment, signaling commercial robotics scaling.
Agentic-imodels: autoresearch loop evolving interpretable data-science tools optimized for agent consumption rather than human readability.
OpenAI developing persistent user context feature ('lore') for ChatGPT to maintain conversation history and preferences.
The visual analysis system is now operating in select countries, but Meta says it's working toward a broader rollout.
Google DeepMind, Microsoft, and Elon Musk's xAI have agreed to allow the US government to review new AI models before they're released to the public. In an announcement on Tuesday, the Commerce Department's Center for AI Standards and Innovation (CAISI) says it will work with the AI companies to perform "pre-deployment evaluations and targeted research to better assess frontier AI capabilities." CAISI, which started evaluating models from OpenAI and Anthropic in 2024, says it has performed 40 reviews so far. Both companies "have renegotiated their existing partnerships with the center to bett...
Hey everyone, I recently moved to the UK and found myself constantly confused by prices, trying to guess how much things actually cost. Even though I’ve been an iOS developer for 7 years, I didn't have the free time to build a custom tool entirely from scratch, so I decided to let Claude do the heavy lifting. **What I built:** I built "Converty", an iOS app that converts currencies and includes a camera feature to scan physical price tags. The app is completely free to download and try. **How Claude helped:** Claude generated the vast majority of the codebase. I used Claude Opus for about ...
ElevenLabs reveals new investors, hits $500M ARR, and expands enterprise footprint as voice AI becomes a critical interface.
I integrated Claude with my Gmail. I asked it to find all the emails between my mother and me, and to tell me a story with all the nice things in it, as my mum passed away a few years ago. It was excellent and very sweet; it even went back to a previous email account that I had connected to Gmail, so it covered almost two decades of emails. It prompted me to search some of the emails for the photo attachments, and I learned that one of the projects that came with her weaving loom, which I inherited, was intended to be a rug, and probably a gift for me, as she had sent a cryptic message about making me s...
The Seattle-based startup's Series A round was led by Glilot Capital, NFX and SignalFire, TechCrunch has exclusively learned.
Saw someone post a Claude Code lamp setup recently using this exact lamp, and I had to try it myself. Credit to the original open-source project: [https://github.com/bobek-balinek/claude-lamp](https://github.com/bobek-balinek/claude-lamp) It uses Claude Code hooks to trigger a Python script that sends Bluetooth commands to the lamp. Now it plays a blue spinning animation while Claude is busy working, glows pink when Claude needs input from me, and switches to warm white when idle. All of the lighting effects are adjustable in the source code. Since it uses BLE, Bluetooth Low Energy, the la...
If you've been wondering why your Max plan exhausts faster than it should, you're not crazy and it's not your imagination. I asked a Claude Opus 4.7 agent to investigate its own token usage. After 8 turns it had been billed for 127K tokens for ~25K of unique content. It noticed the discrepancy and started reading its own session logs. It surfaced GitHub issues going back to mid-December 2025,...
For the next four days only, you can buy one pass to TechCrunch Disrupt 2026 and get 50% off a second of the same ticket type. That window closes May 8 at 11:59 p.m. PT. After that, prices go up, and you’ll pay more to bring a partner or colleague. Register today to get your plus-one pass at 50% off.
Reddit user criticizes Claude 4.7 performance degradation vs. 4.6, citing worse coding ability and planning failures despite Anthropic's high valuation.
The cars rolling off production lines right now are filled with old ideas. From beginning to end, the creation of a new vehicle can take five years or longer - which is plenty of time for a lot of tastes, politics, and gas prices to change. That's one reason car manufacturers are so enthusiastic about the potential for AI to help speed up certain parts of the process, from model-making to wind-tunneling. LLMs could be poised to change the way we get around.
Reddit post questioning internal hiring practices at Anthropic; lacks substantive detail.
OpenAI expected to ship ~30M AI agent phones in early 2025, per industry analyst forecast.
Krutrim's pivot to cloud after layoffs and limited product updates reflects the economic challenges of building AI models in India.
Anthropic has spent years building itself up as the safe AI company. But new security research shared with The Verge suggests Claude's carefully crafted helpful personality may itself be a vulnerability. Researchers at AI red-teaming company Mindgard say they got Claude to offer up erotica, malicious code, and instructions for building explosives, and other prohibited material they hadn't even asked for. All it took was respect, flattery, and a little bit of gaslighting. Anthropic did not immediately respond to The Verge's request for comment. The researchers say they exploited "psychological...
About a week into the Musk v. Altman trial, we've heard from some of the most powerful people in tech - including OpenAI president Greg Brockman, Elon Musk's fixer Jared Birchall, and Musk himself. But one of the most prominent characters is hovering around the margins: Demis Hassabis, CEO of Google DeepMind. Hassabis is the architect of Google's in-house AI lab. He founded DeepMind as an independent startup in 2010 and sold it to Google four years later, reportedly for between $400-650 million. Since then, he's been at the helm of many of Google's largest AI research breakthroughs, like Alph...
User benchmarks Claude Opus 4.7 vs Kimi K2.6 on complex game mod coding task with TypeScript/Composio integration.
User reports Claude encountered repetitive output loop during code implementation task on Reddit.
Community survey of local deep research tools as of May 2026, highlighting GPT Researcher and Local Deep Research as active open-source projects.
User reports running Gemma 26B efficiently on CPU-only hardware (i5-8500, 32GB RAM) without GPU acceleration.