This is so cool. You can talk to an AI only trained on pre-1930 text. Really feels like talking to someone from the past.
Talkie-LM demo: language model trained on pre-1930 text to simulate historical speech patterns and vocabulary.
Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.
China blocks Meta's acquisition of AI startup Manus, signaling stricter tech M&A oversight amid US-China competition.
[why does GPT 5.5 have a restraining order against "Raccoons," "Goblins," and "Pigeons"?](https://preview.redd.it/5trpwlqf8zxg1.png?width=771&format=png&auto=webp&s=ca33e02b4a3c74fa3fc933ec1192059dfbdbc068) I just saw the full system prompt leak for 5.5 (April 23rd release). Most of it is standard agentic stuff, but Instruction #140 is genuinely insane. It explicitly forbids the model from talking about: "goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals." Why the specific hate for pigeons and raccoons? Is this a data-poisoning protection? Or did...
Reddit speculation that Anthropic deliberately under-provisioned capacity and is deprioritizing lower-value customers; unverified claim.
I don't know whether we should care about this, but bigger models tend to be less "happy" overall. The definition of "happy" is based on something they call an AI Wellbeing Index. Basically, they ran 500 realistic conversations (the kind we actually have with these models every day) and measured what percentage of them left the AI in a "confidently negative" state. Lower percentage = happier AI. I guess wisdom is a heavy burden, lol. Across different model families, the larger versions usually have a higher percentage of "negative experiences" than their smaller siblings. The paper says t...
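As described above, the index is just the fraction of conversations ending in a "confidently negative" state. A minimal sketch of that computation, assuming the paper labels each conversation's final state (the label strings and function name here are illustrative, not the paper's actual API):

```python
# Hedged sketch: the wellbeing metric as summarized above is the share of
# conversations whose final state is labeled "confidently negative".
# Label names below are assumptions for illustration.

def negative_experience_rate(final_states):
    """Percent of conversations ending 'confidently negative' (lower = 'happier')."""
    negative = sum(1 for s in final_states if s == "confidently_negative")
    return 100.0 * negative / len(final_states)

# Toy example: 500 conversations, 85 of which ended confidently negative.
states = ["confidently_negative"] * 85 + ["neutral_or_positive"] * 415
print(f"{negative_experience_rate(states):.1f}% negative experiences")  # 17.0%
```

Under this reading, comparing model sizes within a family just means comparing this percentage across their 500-conversation runs.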
Xiaomi releases MiMo-V2.5, a 310B sparse MoE model with 15B active parameters, optimized for consumer hardware.
Reddit user criticizes Anthropic's usage rate limits as a friction point threatening platform retention versus competitors.
Reddit post with no substantive content; appears to be a meme or low-effort comment.
Humanoid robots could load cargo and clean aircraft cabins at Haneda Airport.
RecursiveMAS extends iterative reasoning scaling from single models to multi-agent collaboration loops via a lightweight RecursiveLink module.
DV-World benchmark evaluates data visualization agents across 260 real-world tasks spanning spreadsheet manipulation, chart creation, and dashboard repair.
Mistral Medium incoming with 128B parameters; speculation on dense vs. MoE architecture based on Small model naming.
Tsallis q-logarithm loss family interpolates between RL from verifiable rewards and latent trajectory density estimation for reasoning model adaptation.
Analysis of 27K WildChat transcripts reveals fluent AI users iterate collaboratively while novices adopt passive stance, paradoxically experiencing higher failure rates.
Identity teacher forcing for chaotic dynamics exhibits objective-induced curvature mismatch with free-running model marginal likelihood in probabilistic RNNs.
Carbon-Taxed Transformers proposes green compression pipeline addressing computational cost, memory overhead, and environmental impact of LLMs in software engineering.
Functional Geometric Algebra framework proposes Clifford algebras as alternative to conventional linear algebra for compositional NLP semantics.
Claude.ai service outage reported 2026-04-28; incident tracking available on status page.
TSN-Affinity enables parameter reuse in continual offline RL via similarity-driven sharing, addressing catastrophic forgetting without replay memory overhead.
Variational neural belief parameterization models contact uncertainty and external disturbances for risk-sensitive dexterous grasping via CVaR optimization.
Three-model analysis of RLHF annotation distinguishes extension, evidence, and authority roles for human judgments shaping LLM behavior.
Finetuning interventions can mask emergent misalignment when evaluation prompts differ from training distribution, revealing conditional misalignment.
Real-time adaptive traffic signal system using YOLOv12 detection to extend crossing time for vulnerable pedestrians.
Comparison of interpretability methods (GNNExplainer, GNNShap, GradCAM) for particle physics jet tagging models.
Theoretical analysis showing some reward errors in RL finetuning can improve policy gradient optimization beyond ground truth alignment.
Hey r/MachineLearning, Visualizing the loss landscape of a neural network is notoriously tricky since we can't naturally comprehend million-dimensional spaces. We often rely on basic 2D contour analogies, which don't always capture the true geometry of the space or the sharpness of local minima. I built an interactive browser experiment [https://www.hackerstreak.com/articles/visualize-loss-landscape/](https://www.hackerstreak.com/articles/visualize-loss-landscape/) to help build better intuitions for this. It maps how different optimizers navigate these spaces and lets you actually visualiz...
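The post's 2D-slice idea can be sketched in a few lines: evaluate the loss on a grid around the trained parameters along two random directions. This is a minimal stand-in (a toy least-squares model instead of a neural network, plain random directions rather than the filter-normalized directions used in published loss-landscape visualizations):

```python
import numpy as np

# Toy "trained model": exact least-squares solution of a consistent linear system.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = rng.normal(size=5)
y = X @ w_true

def loss(w):
    return np.mean((X @ w - y) ** 2)

w_star = np.linalg.lstsq(X, y, rcond=None)[0]  # stand-in for trained weights
d1 = rng.normal(size=5)                        # two random slice directions
d2 = rng.normal(size=5)

# Loss evaluated on a 25x25 grid: L(w* + a*d1 + b*d2).
alphas = np.linspace(-1, 1, 25)
grid = np.array([[loss(w_star + a * d1 + b * d2) for b in alphas] for a in alphas])

# The grid can now be fed to a contour/surface plot; the minimum sits at
# the center (a=0, b=0), i.e. at the trained parameters.
print(grid[12, 12], grid.min())
```

The same grid is what a contour plot or interactive surface (like the linked demo) renders; swapping `loss` for a network's training loss gives the usual picture.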
Sparse autoencoders reveal three-phase information flow in LLM emotion recognition with both shared and emotion-specific features.
Sam Altman and Matt Garman discuss OpenAI-AWS partnership on Bedrock Managed Agents; Stratechery covers OpenAI-Microsoft deal implications.
RESTestBench benchmark for evaluating LLM-generated REST API test cases from natural language requirements with precise/vague variants.
Zero-shot machine-generated text detector exploiting autoregressive fragility through perplexity-based text shuffling.