The Archive

Search the full wire by company, model, lab, or keyword. Every story we have ever aggregated.

Hugging Face· INFRA

Introducing Agents.js: Give tools to your LLMs using JavaScript

Hugging Face·3 years ago

Hugging Face· INFRA

Introducing ⚔️ AI vs. AI ⚔️ a deep reinforcement learning multi-agents competition system

Hugging Face·3 years ago

OpenAI· FRONTIER

Learning to play Minecraft with Video PreTraining

We trained a neural network to play Minecraft by Video PreTraining (VPT) on a massive unlabeled video dataset of human Minecraft play, while using only a small amount of labeled contractor data. With fine-tuning, our model can learn to craft diamond tools, a task that usually takes proficient humans over 20 minutes (24,000 actions). Our model uses the native human interface of keypresses and mouse movements, making it quite general, and represents a step towards general computer-using agents.

OpenAI·4 years ago

Hugging Face· INFRA

Introducing Snowball Fight ☃️, our first ML-Agents environment

Hugging Face·4 years ago

OpenAI· FRONTIER

Safety Gym

We’re releasing Safety Gym, a suite of environments and tools for measuring progress towards reinforcement learning agents that respect safety constraints while training.

OpenAI·6 years ago

OpenAI· FRONTIER

Emergent tool use from multi-agent interaction

We’ve observed agents discovering progressively more complex tool use while playing a simple game of hide-and-seek. Through training in our new simulated hide-and-seek environment, agents build a series of six distinct strategies and counterstrategies, some of which we did not know our environment supported. The self-supervised emergent complexity in this simple environment further suggests that multi-agent co-adaptation may one day produce extremely complex and intelligent behavior.

OpenAI·7 years ago

OpenAI· FRONTIER

Neural MMO: A massively multiagent game environment

We’re releasing Neural MMO, a massively multiagent game environment for reinforcement learning agents. Our platform supports a large, variable number of agents within a persistent and open-ended task. The inclusion of many agents and species leads to better exploration, divergent niche formation, and greater overall competence.

OpenAI·7 years ago

OpenAI· FRONTIER

Reinforcement learning with prediction-based rewards

We’ve developed Random Network Distillation (RND), a prediction-based method for encouraging reinforcement learning agents to explore their environments through curiosity, which for the first time exceeds average human performance on Montezuma’s Revenge.

OpenAI·8 years ago

OpenAI· FRONTIER

AI safety via debate

We’re proposing an AI safety technique which trains agents to debate topics with one another, using a human to judge who wins.

OpenAI·8 years ago

OpenAI· FRONTIER

Evolved Policy Gradients

We’re releasing an experimental metalearning approach called Evolved Policy Gradients, a method that evolves the loss function of learning agents, which can enable fast training on novel tasks. Agents trained with EPG can succeed at basic tasks at test time that were outside their training regime, like learning to navigate to an object on a different side of the room from where it was placed during training.

OpenAI·8 years ago

OpenAI· FRONTIER

Learning to model other minds

We’re releasing an algorithm which accounts for the fact that other agents are learning too, and discovers self-interested yet collaborative strategies like tit-for-tat in the iterated prisoner’s dilemma. This algorithm, Learning with Opponent-Learning Awareness (LOLA), is a small step towards agents that model other minds.

OpenAI·9 years ago

OpenAI· FRONTIER

Learning to cooperate, compete, and communicate

Multiagent environments where agents compete for resources are stepping stones on the path to AGI. Multiagent environments have two useful properties: first, there is a natural curriculum—the difficulty of the environment is determined by the skill of your competitors (and if you’re competing against clones of yourself, the environment exactly matches your skill level). Second, a multiagent environment has no stable equilibrium: no matter how smart an agent is, there’s always pressure to get smarter. These environments have a very different feel from traditional environments, and it’ll take a...

OpenAI·9 years ago

OpenAI· FRONTIER

Learning to communicate

In this post we’ll outline new OpenAI research in which agents develop their own language.

OpenAI·9 years ago

13 matches
