This week at CES, Rabbit took the spotlight with the introduction of the R1, its new AI-powered companion device.
It's not quite a smartphone, not a large language model, and not an AI agent... as the company describes it, it's:
...a personalized operating system through a natural language interface.
In this article, we'll look at what exactly the Rabbit R1 is, what's under the hood, and more.
What is the Rabbit R1?
Retailing for $199, the Rabbit R1 is an AI-powered handheld device, similar to a smartphone, but it operates through natural language and can learn actions by imitation.
Here are a few key features of the device:
- The device features a small 2.88-inch touchscreen, an analog scroll wheel for navigation, and a 360° rotating camera.
- The Rabbit R1 has a push-to-talk button and runs Rabbit OS, a natural language interface built around a "Large Action Model" (discussed below).
- This model offers support for a number of commonly used apps (like ride sharing, maps, etc.).
- You can also teach it new skills and capabilities based on your own interactions. This allows the LAM to learn the nuances of a new action and create a "rabbit" you can reuse in the future.
- The company also highlighted plans to launch a rabbit store where you can monetize and distribute your trained actions.
What is a Large Action Model (LAM)?
What makes the Rabbit R1 unique is that it doesn't rely on large language models (LLMs) to convert natural language into standard API calls.
Instead, the operating system uses what they call a "Large Action Model", which:
...is a new type of foundation model that understands human intentions on computers.
The Large Action Model uses neuro-symbolic programming, which is a combination of neural networks and symbolic AI architecture. This approach allows for the direct modeling of application structures and user actions without relying on intermediate text representations.
We characterize our system as a (Large) Action Model, or LAM, emphasizing our commitment to better understand actions, specifically human intentions expressed through actions on computers, and, by extension, in the physical world.
The LAM represents a significant shift from traditional AI models, enabling a more intuitive, language-based user interface. Specifically, this technology allows the Rabbit R1 to bypass the limitations of traditional API-driven interactions, offering a more fluid and natural user experience.
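To make the neuro-symbolic idea more concrete, here's a minimal Python sketch of how such a pipeline could be wired together: a neural component (stubbed here with a keyword heuristic) maps an utterance to a structured action, and a symbolic registry executes that action against an app's interface. Rabbit hasn't published LAM's internals, so every name and structure below is an assumption for illustration only.

```python
# Hypothetical sketch of a neuro-symbolic intent-to-action pipeline.
# None of this reflects Rabbit's actual implementation.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    """A symbolic, structured representation of a user intention."""
    name: str
    slots: dict[str, str]

def neural_intent_parser(utterance: str) -> Action:
    """Stand-in for the neural half: in a real system this would be a
    learned model; a keyword heuristic keeps the sketch self-contained."""
    words = utterance.lower().split()
    if "ride" in words or "car" in words:
        # Naive slot filling: take the word after "to" as the destination.
        if "to" in words and words.index("to") + 1 < len(words):
            dest = words[words.index("to") + 1]
        else:
            dest = "home"
        return Action(name="book_ride", slots={"destination": dest})
    return Action(name="unknown", slots={})

# The symbolic half: a registry mapping action names to concrete routines,
# analogous to a "conceptual blueprint" of an app's interface.
ACTION_REGISTRY: dict[str, Callable[[dict[str, str]], str]] = {
    "book_ride": lambda slots: f"Booked a ride to {slots['destination']}",
}

def execute(utterance: str) -> str:
    action = neural_intent_parser(utterance)
    handler = ACTION_REGISTRY.get(action.name)
    if handler is None:
        return f"No handler for intent: {action.name!r}"
    return handler(action.slots)

if __name__ == "__main__":
    print(execute("get me a ride to downtown"))  # Booked a ride to downtown
```

The appeal of this split is that the neural half can be retrained or swapped out while the symbolic half stays structured and deterministic, which is presumably part of how a LAM can act on apps without routing everything through text and API calls.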
Learning Actions by Demonstration
One of the key features of the Rabbit R1 is its ability to learn actions by demonstration. This involves observing human interactions with various websites or tools and replicating these processes, even if the interface changes or is presented differently.
As their research highlights:
LAM's modeling approach is rooted in imitation, or learning by demonstration: it observes a human using the interface and aims to reliably replicate the process, even if the interface is presented differently or slightly changed.
This technique allows LAM to gain an in-depth understanding of application interfaces, essentially creating a "conceptual blueprint" of the services offered by these apps.
Over time, LAM could become increasingly helpful in solving complex problems spanning multiple apps that require professional skills to operate.
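To illustrate what learning by demonstration could look like in code, here's a hedged sketch: user actions are recorded as semantic steps (what was acted on, not where it sat on screen) and then replayed against a redesigned interface. The `Recorder`, `Step`, and toy UI below are hypothetical; Rabbit hasn't detailed how LAM actually encodes demonstrations.

```python
# Hypothetical sketch of learning by demonstration: record semantic steps,
# then replay them even after the interface changes.
from dataclasses import dataclass

@dataclass
class Step:
    action: str   # e.g. "click", "type"
    target: str   # a semantic role, e.g. "search_box", not pixel coordinates
    value: str = ""

class Recorder:
    """Observes a human demonstration and stores it as semantic steps."""
    def __init__(self) -> None:
        self.steps: list[Step] = []

    def observe(self, action: str, target: str, value: str = "") -> None:
        self.steps.append(Step(action, target, value))

def replay(steps: list[Step], ui: dict[str, str]) -> list[str]:
    """Replay recorded steps against a UI whose layout may have changed.
    Matching by semantic role is what lets the demo survive a redesign."""
    log = []
    for step in steps:
        if step.target not in ui:
            raise KeyError(f"UI element {step.target!r} not found")
        log.append(f"{step.action} {ui[step.target]} {step.value}".strip())
    return log

# A human demonstrates ordering food once...
rec = Recorder()
rec.observe("click", "search_box")
rec.observe("type", "search_box", "margherita")
rec.observe("click", "order_button")

# ...then the app ships a redesign (labels and positions change), and the
# learned "rabbit" still replays, because it targets roles, not pixels.
redesigned_ui = {"search_box": "<input #q>", "order_button": "<button 'Buy'>"}
for line in replay(rec.steps, redesigned_ui):
    print(line)
```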
Is the Rabbit R1 the iPhone of AI?
There's certainly been a lot of buzz around the Rabbit R1; the company highlighted that they'd already sold 10,000 units on day one.
There have been a few other AI consumer hardware devices gaining popularity lately, like the AI Pin by Humane and Tab AI, but with the Rabbit R1's unique foundation model, it appears to have the most potential and an accessible price point.
So while 2023 was the year of the large language model, will 2024 be the year of the AI-powered consumer device?