Earlier this week, big data processor Databricks made a splash in the generative AI industry with a major acquisition. The deal, which is reportedly for up to $1.3 billion, will see Databricks acquire MosaicML, a large-scale AI model deployment platform. The deal is just the latest shoe to drop in an industry that is aiming to revolutionize the way in which we send, receive and process information.
The AI industry has taken center stage in 2023 and has single-handedly inflated company valuations on Wall Street. Led by tech giants like NVIDIA, Microsoft, and Meta Platforms, the generative AI sector is being treated by investors like it is the 21st-century reinvention of the wheel.
Valuations have become frothy, and many are pointing to the dotcom bubble of 2000 as a comparable episode of market fervour. Is AI the most important secular evolution since the introduction of the internet? Some of the world’s largest companies are making a pretty sizable bet that it is.
Did Databricks Overpay for MosaicML?
It is common knowledge that when valuations get frothy, so too does the cost of doing business. Databricks agreed to pay up to $1.3 billion for MosaicML, a significant leap over recent investor-round valuations. Previously, MosaicML had "only" raised about $63 million from private investors that included Frontline, Atlas, and Samsung Next. At its most recent round of capital raising, MosaicML was valued at only about $222 million, which means Databricks is paying nearly six times that figure.
This shows either how frothy AI is right now or that Databricks considers MosaicML a much more valuable company than its investors did. The answer, as it usually does, likely lies somewhere in the middle of those two options. But this could be the first expensive step in Databricks's attempt to establish itself as the industry leader in the foundational layer of large language models.
What exactly is a large language model, or LLM? It is an AI model trained on large amounts of text to generate language in response to a prompt. The best-known example is the model behind ChatGPT, the globally known chatbot from OpenAI, where a user inputs a prompt and receives an output from the AI. While the hardware market for AI chips is already dominated by the likes of NVIDIA and Alphabet, the software market is still largely up for grabs. Databricks has its fair share of stiff competition, though. It is competing against cloud data solutions provider Snowflake and the aforementioned OpenAI. Snowflake is already a $57 billion company and OpenAI was valued at about $40 billion when Microsoft took its $10 billion stake.
The point here is that Databricks is looking for a first-mover advantage against its competitors. It also speaks to a broader consolidation of the AI industry as larger companies absorb these startups.
Databricks and MosaicML Integration
It was reported that MosaicML will be integrated into the Databricks Lakehouse Platform. MosaicML will provide generative AI tools that complement the existing AI services that Databricks already offers. In a nutshell, MosaicML is like OpenAI in that it creates and scales its own LLMs that are entirely built upon its own data. MosaicML already has some major partners that use its LLMs, including Hippocratic AI, Replit, and the Allen Institute for AI.
But it isn’t just the software that is being integrated with Databricks. MosaicML CEO Naveen Rao is ensuring that he can bring the whole team over to Databricks. The cost of salaries and service integration is another likely reason why the cost for MosaicML rose to the level that it did.
MosaicML is a unique case study in its own right. One way in which MosaicML stands apart from its competition is that its platforms are completely open-source. This community approach to building generative AI LLMs was a major boost for the upstart sector. As of the time of this writing, it has not been revealed whether Databricks will allow MosaicML’s code to remain available or will choose to close it off. Closing it off could be beneficial for Databricks, as proprietary source code could become much more valuable. The downside is that it could be damaging to the open-source AI community.
What does this mean for LLMs and the AI industry?
The deal is one of the first declarations from Databricks that it is gunning for the lead position in the global LLM race. The combination of Databricks and MosaicML is expected to produce strong, scalable LLMs that will be in high demand from corporations around the world. As we continue to move towards AI replacing mundane tasks and menial labour, it appears that Databricks is gaining the upper hand.
The deal also comes on the heels of Snowflake’s acquisition of Neeva. It is not an apples-to-apples comparison, as Neeva provides an intelligent search platform for cloud data management. Still, it reiterates the consolidation of the industry and cements Snowflake, Databricks, and OpenAI as the current frontrunners.
Overall, this competitive one-upmanship is likely to benefit humanity over the long run. It will likely cause more dominoes to fall in the industry and set up a period of further mergers and acquisitions. Ultimately, the demand for large language models will continue to rise as corporations around the world attempt to cut costs and improve efficiencies through the integration of generative AI.
Is AI a bubble? It certainly could be, although as of now it seems to have more real-world use than previous bubbles like cryptocurrencies or the Metaverse. Valuations are definitely frothy, but we could be on the cusp of a new paradigm that would alter the current path that humanity is on.
Databricks’s acquisition of MosaicML does not establish it as the leader in LLMs yet. But it is the first step of many more to come for the company. Corporations will be seeking out more cost-effective, pre-trained LLMs that come pre-programmed with specific industry knowledge. This saves valuable time and resources for enterprises that would otherwise need to train an LLM from scratch. Herein lies the primary advantage that Databricks saw when acquiring MosaicML and why the combined company could be ready to disrupt an industry that Pitchbook data estimates will be worth nearly $100 billion by 2026.