Skill Neologisms: Towards Skill-based Continual Learning
Skill neologisms—soft tokens optimized for new capabilities—enable selective LLM skill extension without catastrophic forgetting or context limits.
ReshapeOT improves optimal transport for distribution shifts by reshaping ground metrics using observed sample displacements.
Simon Willison observes convergence between vibe coding and agentic engineering in practical AI-assisted development workflows.
TabEmbed introduces first generalist embedding model for tabular data and TabBench, a comprehensive benchmark for tabular understanding evaluation.
EP-GRPO fixes credit assignment failures in GRPO-based LLM reasoning via token-level entropy, polarity-aware rewards, and zero-variance collapse mitigation.
Conformal prediction applied to graph-structured time series; addresses non-exchangeability via spectral graph theory for rigorous uncertainty quantification.
KernelBench-X evaluates LLM-generated Triton GPU kernels across 176 tasks; finds task structure explains 3x more correctness variance than method design.
First order-based rehearsal learning method for avoiding undesired futures; uses ordinal structures instead of graph estimation.
Study determines optimal feature computation budget fraction for per-instance algorithm selection in black-box optimization.
AIR-MoE uses vector quantization for efficient routing in granular mixture-of-experts, reducing computational overhead of token-to-expert assignment.
Comparative study of LoRA and QLoRA fine-tuning on Bashkir, a low-resource Turkic language, using models from DistilGPT2 to Qwen2.5-7B.
Theoretical analysis of batch normalization's effect on geometry of piecewise-affine networks during training via hyperplane switching.
DART, a vision-language foundation model for synthetic fiber rope condition monitoring, provides severity estimates, maintenance recommendations, and automated reports.
Neuro-symbolic system combining LLM parser with automated theorem prover for syllogistic reasoning in SemEval-2026 Task 11.
User demonstrates Qwen3.6 27B running 200k context on single RTX 5090 with NVFP4 quantization in vLLM, sharing exact configuration and parameters.
Modular multi-agent reinforcement learning approach for cooperative robot swarms with limited communication and local interaction.
South Korean humanoid robot programmed with Buddhist practices; novelty claim lacks technical substance or robotics advancement details.
Drift-aligned tangent regularization (DTR) bounds deployment risk under covariate shift using Jacobian-velocity theorem and Poincaré inequalities.
Controlled benchmark study diagnosing when causal vs. correlational methods fail for gene regulatory network inference from single-cell RNA-seq.
Samsung crossed the $1 trillion valuation mark after shares surged on AI-driven chip demand, making it only the second Asian company after TSMC to hit the milestone.
Large-scale USPTO study finds promotional language in patents negatively correlates with approval probability, contrary to science communication norms.
Evolving Idea Graphs (EIG), a multi-agent LLM framework using learnable graph edits for scientific ideation with novelty, feasibility, clarity metrics.
Reddit user reports Claude Opus struggles to distinguish word obscurity via corpus frequency vs. human recognition familiarity.
Reddit user expresses frustration with detectability and stylistic uniformity of AI-generated text across news and government documents.
Mechanic argues blue-collar work faces AI displacement risk through task simplification rather than machine capability escalation, challenging consensus on trade job resilience.
Want real human feedback related to your search results? Google’s AI now fetches it for you. Google is updating its AI Search features to make it easier for users to find information from sources they know and trust. One of the more notable changes introduces "a preview of perspectives" from firsthand sources like social media, Reddit, and other web forums, effectively linking your search queries with online conversations around similar topics. Google says this update aims to address that "people are increasingly seeking out advice from others" when searching for...
EnterpriseRAG-Bench: 500k-document synthetic dataset benchmarking RAG systems on realistic internal company data (Slack, email, tickets, PRs) vs. public corpora.
Developer describes workflow using Claude voice for brainstorming during walks, then Claude Code for implementation.
I'm a fast typer, but I find my projects go a lot better when I'm able to really dictate with Claude. I appreciate this won't be the case for all of you. At the moment I'm much more productive if I'm working from home or in a quiet space. There is a sensitivity setting on FluidVoice so I try to whisper, but so far it just ends up feeling too awkward and I go immediately back to typing. Also someone inevitably starts talking louder somewhere else in the office and the acoustics can impact what I'm saying. You can't express your questions and theories as freely as you'd like, because you'...