GPT-5.5 matches heavily hyped Mythos Preview in new cybersecurity tests
New results suggest Mythos' cyber threat isn't "a breakthrough specific to one model."
DRSA: decoupled relation alignment method for extending graph foundation models to multi-domain heterogeneous graphs.
Companies are taking control of their own data to tailor AI for their needs. The challenge lies in balancing ownership with the safe, trusted flow of high‑quality data needed to power reliable insights. This conversation from MIT Technology Review’s EmTech AI conference examines how AI factories unlock new levels of scale, sustainability, and governance—positioning data…
Anthropic gates new Claude capabilities (Ultraplan, Ultrareview, Cloud Security) behind paid Cloud plans rather than open releases, fragmenting the skill ecosystem and limiting composability.
Combinatorial Complex Weisfeiler-Lehman test extending WL expressiveness framework to unified topological structures.
Apple's support app includes claude.md files, indicating internal Claude integration or documentation.
Decentralized MCMC algorithm for constrained sampling via proximal stochastic gradient Langevin dynamics with convergence guarantees.
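The core primitive behind constrained Langevin samplers of this kind is a gradient-plus-noise step followed by a proximal (projection) step that keeps iterates in the feasible set. A minimal single-node sketch, assuming a standard Gaussian target and a box constraint (the functions `grad_log_density` and `prox_box` are illustrative stand-ins, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_log_density(x):
    # Toy target: standard Gaussian, whose log-density gradient is -x.
    return -x

def prox_box(x, lo=-1.0, hi=1.0):
    # Proximal/projection step enforcing a box constraint on the sample.
    return np.clip(x, lo, hi)

def proximal_sgld(x0, steps=5000, eta=1e-2):
    x = x0.copy()
    for _ in range(steps):
        noise = np.sqrt(2 * eta) * rng.standard_normal(x.shape)
        x = x + eta * grad_log_density(x) + noise  # unconstrained Langevin move
        x = prox_box(x)                            # pull back into the constraint set
    return x

sample = proximal_sgld(np.zeros(3))
```

The decentralized version replaces the single gradient with stochastic gradients exchanged across nodes; the prox step is what carries the constraint guarantee.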
ICASSP 2025 challenge entry using diffuse RIR generation and quality filtering to improve speaker distance estimation models.
Compositional graph embedding framework using Aitchison geometry for interpretable node representations as simplex-valued mixtures.
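Aitchison geometry treats a node representation as a point on the probability simplex and works in log-ratio coordinates. A minimal sketch of the standard centered log-ratio (CLR) map and its inverse, as generic background rather than the paper's specific embedding method:

```python
import numpy as np

def clr(p, eps=1e-12):
    # Centered log-ratio: maps a composition on the simplex to R^D,
    # where Aitchison geometry becomes ordinary Euclidean geometry.
    logp = np.log(p + eps)
    return logp - logp.mean()

def clr_inv(z):
    # Inverse CLR: exponentiate and renormalize (closure) back onto the simplex.
    e = np.exp(z - z.max())
    return e / e.sum()

p = np.array([0.5, 0.3, 0.2])   # a mixture-style node representation
z = clr(p)                      # Euclidean coordinates for learning
p_back = clr_inv(z)             # recover the interpretable mixture
```

CLR coordinates sum to zero by construction, which is what lets simplex-valued mixtures be manipulated with ordinary linear algebra.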
Reddit speculation on status of rumored OpenAI-Jony Ive hardware/product collaboration; no new information.
PFlash: speculative prefill technique achieves 10x speedup on 128K context with quantized 27B models on RTX 3090, open-source C++/CUDA implementation.
Deep kernel learning with transformer embeddings stratifies glaucoma patient risk from sparse EHR data; medical ML application without LLM/frontier AI component.
FinSafetyBench: bilingual red-teaming benchmark (14 subcategories) for evaluating LLM refusal of financial crimes and ethics violations grounded in real cases.
MemCoE: cognition-inspired two-stage memory optimization for LLM agents to learn personalized long-term user preferences within context windows.
FedKPer addresses generalization/personalization in medical federated learning via knowledge personalization; healthcare ML infrastructure without LLM focus.
Persona-induced latent variable model for adaptive user querying under budget constraints; ML methodology tangential to frontier LLM research.
ML-Bench&Guard: policy-grounded multilingual safety benchmark (14 languages) aligning LLMs with region-specific regulations and cultural context.
Reddit user reports severe hallucinations and task non-compliance in Claude Opus 4.7 on May 1st; anecdotal complaint without reproduction details.
Developer demo of generative game engine using Gemini 3 for spell generation with 6-player multiplayer physics simulation.
The Pentagon has struck deals with OpenAI, Google, Microsoft, Amazon, Nvidia, Elon Musk's xAI, and the startup Reflection, allowing the department to use their AI tools in classified settings, according to an announcement on Friday. At the same time, the Defense Department has excluded Anthropic - which it previously used for classified work - after declaring it a supply-chain risk. OpenAI and xAI had already reached agreements with the Pentagon for the "lawful" use of their AI systems. A report from The Information suggests Google has struck a similar a...
Intel releases AutoRound, a low-bit quantization algorithm optimized for CPU/XPU/CUDA with vLLM and Transformers compatibility.
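AutoRound improves on plain round-to-nearest (RTN) quantization by tuning rounding offsets and clipping ranges with signed gradients; the details are in Intel's release. As background, the RTN baseline it competes against can be sketched as (a generic illustration, not AutoRound itself):

```python
import numpy as np

def rtn_quantize(w, bits=4):
    # Per-tensor asymmetric round-to-nearest quantization baseline.
    qmax = 2 ** bits - 1
    scale = (w.max() - w.min()) / qmax          # step size between levels
    zero = np.round(-w.min() / scale)           # integer zero-point
    q = np.clip(np.round(w / scale + zero), 0, qmax)
    return (q - zero) * scale                   # dequantized weights

w = np.linspace(-1.0, 1.0, 16)
w_hat = rtn_quantize(w, bits=4)                 # error bounded by ~one step
```

Learned-rounding methods like AutoRound keep this storage format but choose the rounding direction per weight to minimize layer output error rather than per-weight error.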
Obfuscated Natural Number Game benchmarks LLM prover architectural reasoning vs. pattern matching; evaluates formal theorem-proving capabilities beyond saturation.
Elon Musk spent the better part of three days on the witness stand this week in his lawsuit against OpenAI, and it’s already getting messy. Emails, texts, and his own tweets are surfacing in court, and there are plenty more witnesses to come. Musk’s argument against OpenAI? By converting the company to a for-profit model, Sam Altman betrayed the “nonprofit for the […]
MathArena: continuously-maintained evaluation platform aggregating mathematics benchmarks to track LLM progress; successor to static math benchmarks.
Augmented Lagrangian Multiplier Network stabilizes state-wise constraint enforcement in RL; safety optimization methodology without LLM specificity.
InpaintSLat: training-free 3D inpainting via initial noise optimization in latent diffusion; computer vision task orthogonal to LLM/frontier AI focus.
Formalizes Phase-Latency Isomorphism showing spiking sparse distributed memory and transformers share five functional operations with cosine similarity retrieval.
Introduces mini-batch Markov risk measures and multipattern Q-learning with regret bounds for risk-averse finite-horizon MDPs.
Elon Musk is the one who wanted this trial. He has spent months claiming OpenAI "stole a nonprofit," and saying he was the actual driving force behind one of the most important companies currently in tech. All indications are that he won't win his case against the company, but he's fighting it anyway. So you'd think he'd have done better when it was his time to take the stand. Instead, Musk spent much of the week arguing with lawyers ...
AdaMeZO enables Adam-style zeroth-order LLM fine-tuning without storing moment estimates, reducing GPU memory while maintaining convergence.
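The memory saving in MeZO-style methods comes from regenerating the random perturbation from a seed instead of storing it, so a gradient estimate costs only two forward passes. A minimal sketch of that underlying trick on a toy quadratic loss (this shows the plain MeZO/SPSA estimate only, not AdaMeZO's Adam-style adaptive update, and `loss` stands in for an LLM forward pass):

```python
import numpy as np

def loss(w):
    # Toy quadratic standing in for an LLM forward pass; minimum at w = 1.
    return np.sum((w - 1.0) ** 2)

def mezo_step(w, lr=0.01, eps=1e-3, seed=0):
    # Perturbation is regenerated from the seed for both probes,
    # so nothing beyond the weights themselves needs to be stored.
    z = np.random.default_rng(seed).standard_normal(w.shape)
    g = (loss(w + eps * z) - loss(w - eps * z)) / (2 * eps)  # directional derivative
    return w - lr * g * z                                    # update along z

w = np.zeros(4)
for step in range(500):
    w = mezo_step(w, seed=step)   # fresh direction per step
```

AdaMeZO's contribution, per the summary, is recovering Adam-style step-size adaptation on top of this estimate without materializing the first and second moment buffers.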