The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Tuning Flash Attention for Peak Performance in NVIDIA CUDA Tile

In this post, we dive into one of the most critical workloads in modern AI: Flash Attention, where you’ll learn how to implement Flash Attention using NVIDIA... Environment requirements: see the quickstart doc for more information on installing cuTile Python. The attention mechanism is the computational heart of transformer models. Given a sequence of tokens, attention enables each token to “look at” every other…

Alessandro Morari·4 months ago

NVIDIA Dev Blog· INFRA

Controlling Floating-Point Determinism in NVIDIA CCCL

A computation is considered deterministic if multiple runs with the same input data produce the same bitwise result. While this may seem like a simple property...

Nader Al Awar·4 months ago

Hugging Face· INFRA

Bringing Robotics AI to Embedded Platforms: Dataset Recording, VLA Fine‑Tuning, and On‑Device Optimizations

Hugging Face·4 months ago

OpenAI· FRONTIER

Introducing GPT-5.4

OpenAI releases GPT-5.4, a frontier model with 1M-token context, state-of-the-art coding, computer use, and tool search capabilities.

OpenAI·4 months ago

OpenAI· FRONTIER

GPT-5.4 Thinking System Card

OpenAI·4 months ago

OpenAI· FRONTIER

Reasoning models struggle to control their chains of thought, and that’s good

OpenAI introduces CoT-Control, finding that reasoning models struggle to control their chains of thought and highlighting monitorability as a safeguard.

OpenAI·4 months ago

OpenAI· FRONTIER

Ensuring AI use in education leads to opportunity

OpenAI introduces AI education tools, certifications, and measurement resources to help schools address AI capability gaps.

OpenAI·4 months ago

Hugging Face· INFRA

Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines

Hugging Face·4 months ago

OpenAI· FRONTIER

The five AI value models driving business reinvention

OpenAI outlines five sequential AI value models for business transformation from workforce fluency to process reinvention.

OpenAI·4 months ago

OpenAI· FRONTIER

Introducing the Adoption news channel

OpenAI launches Adoption news channel offering frameworks and insights for AI business deployment.

OpenAI·4 months ago

OpenAI· FRONTIER

VfL Wolfsburg turns ChatGPT into a club-wide capability

VfL Wolfsburg scales ChatGPT across operations by prioritizing people and organizational capability over isolated pilots.

OpenAI·4 months ago

OpenAI· FRONTIER

Introducing ChatGPT for Excel and new financial data integrations

OpenAI ships ChatGPT for Excel with GPT-5.4 and financial data integrations for modeling and regulated environments.

OpenAI·4 months ago

OpenAI· FRONTIER

Extending single-minus amplitudes to gravitons

GPT-5.2 Pro assists in deriving and verifying graviton tree amplitudes, extending single-minus amplitudes to quantum gravity.

OpenAI·4 months ago

OpenAI· FRONTIER

Understanding AI and learning outcomes

OpenAI introduces Learning Outcomes Measurement Suite to assess AI's impact on student learning across educational contexts.

OpenAI·4 months ago

OpenAI· FRONTIER

How Axios uses AI to help deliver high-impact local journalism

Axios uses AI to augment local journalism workflows and reporter productivity while maintaining editorial standards at scale.

OpenAI·4 months ago

NVIDIA Dev Blog· INFRA

How to Minimize Game Runtime Inference Costs with Coding Agents

NVIDIA ACE is a suite of technologies for building AI agents for gaming. ACE provides ready-to-integrate cloud and on-device AI models for every part of in-game characters, from speech to intelligence to animation. To run these models alongside the game engine efficiently, the NVIDIA In-Game Inferencing (NVIGI) SDK includes a set of performant libraries that developers can integrate into C++…

Brandon Rowlett·4 months ago

NVIDIA Dev Blog· INFRA

cuTile.jl Brings NVIDIA CUDA Tile-Based Programming to Julia

NVIDIA CUDA Tile is one of the most significant additions to NVIDIA CUDA programming and unlocks automatic access to tensor cores and other specialized hardware. Earlier this year, NVIDIA released cuTile for Python, giving Python developers a natural way to write high-performance GPU kernels. Now, the same programming model is available in Julia through cuTile.jl. In this blog post…

Tim Besard·4 months ago

Hugging Face· INFRA

PRX Part 3 — Training a Text-to-Image Model in 24h!

Hugging Face·4 months ago

Google DeepMind· FRONTIER

Gemini 3.1 Flash-Lite: Built for intelligence at scale

Google DeepMind releases Gemini 3.1 Flash-Lite, the fastest and most cost-efficient model in the Gemini 3 series.

Google DeepMind·4 months ago

OpenAI· FRONTIER

GPT-5.3 Instant System Card

OpenAI publishes system card documenting GPT-5.3 Instant capabilities, limitations, and safety properties.

OpenAI·4 months ago·+ covered by others

OpenAI· FRONTIER

GPT-5.3 Instant: Smoother, more useful everyday conversations

OpenAI releases GPT-5.3 Instant, a faster variant optimized for everyday conversational use cases.

OpenAI·4 months ago

Cohere· FRONTIER

Advantage of AI in Business: How Enterprises Win with AI in 2026

Cohere C-suite guide on enterprise AI advantages: productivity, competitive advantage, and 2026 adoption strategies.

Cohere·4 months ago

Import AI· ANALYST

Import AI 447: The AGI economy; testing AIs with generated games; and agent ecologies

Import AI 447 explores AGI economic models, game-based AI evaluation, and multi-agent ecosystem dynamics.

Jack Clark·4 months ago

NVIDIA Dev Blog· INFRA

Building Telco Reasoning Models for Autonomous Networks with NVIDIA NeMo

Autonomous networks are quickly becoming one of the top priorities in telecommunications. According to the latest NVIDIA State of AI in Telecommunications report, 65% of operators said AI is driving network automation, and 50% named autonomous networks as the top AI use case for ROI. Yet many telcos still report gaps in AI and data science expertise. This makes it difficult to scale safe…

Aiden Chang·4 months ago

NVIDIA Dev Blog· INFRA

5 New Digital Twin Products Developers Can Use to Build 6G Networks

To make 6G a reality, the telecom industry must overcome a fundamental challenge: how to design, train, and validate AI-native networks that are too complex to be tested in the physical world. The NVIDIA Aerial Omniverse Digital Twin (AODT) solves this by enabling a continuous integration/continuous development (CI/CD)-style workflow where Radio Access Network (RAN) software is trained…

Cindy Goh·4 months ago

OpenAI· FRONTIER

Our agreement with the Department of War

OpenAI signs agreement with Department of War establishing safety red lines and deployment protocols for classified AI systems.

OpenAI·4 months ago

Anthropic· FRONTIER

Statement on the comments from Secretary of War Pete Hegseth

Anthropic responds to Secretary of War Pete Hegseth's comments on AI policy and provides guidance to customers.

Anthropic·4 months ago

NVIDIA Dev Blog· INFRA

Develop Native Multimodal Agents with Qwen3.5 VLM Using NVIDIA GPU-Accelerated Endpoints

Alibaba has introduced the new open source Qwen3.5 series built for native multimodal agents. The first model in this series is a ~400B parameter native vision-language model (VLM) with reasoning built with a hybrid architecture of mixture of experts (MoE) and Gated Delta Networks. Qwen3.5 can understand and navigate user interfaces, which improves on the previous generation of VLMs. Qwen3.5…

Anu Srivastava·4 months ago

NVIDIA Dev Blog· INFRA

Maximizing GPU Utilization with NVIDIA Run:ai and NVIDIA NIM

Organizations deploying LLMs are challenged by inference workloads with different resource requirements. A small embedding model might use only a few gigabytes of GPU memory, while a 70B+ parameter LLM could require multiple GPUs. This diversity often leads to low average GPU utilization, high compute costs, and unpredictable latency. The problem isn’t just about packing more workloads onto…

Shwetha Krishnamurthy·4 months ago

OpenAI· FRONTIER

OpenAI and Amazon announce strategic partnership

OpenAI and Amazon expand partnership: Frontier platform, custom models, and enterprise agents on AWS.

OpenAI·4 months ago
