How Hugging Face Positions Itself in the Open LLM Stack

What role does Hugging Face play in the generative AI developer ecosystem? We take a look at the company's savvy open source branding.

Jun 20th, 2023 9:01am by Richard MacManus

Forget the LAMP stack, it’s now all about the LLM stack. Tools such as LangChain and Anyscale’s Aviary have launched over the past year to help developers build apps based on, or connected to, large language models (LLMs). Although it’s still early days, Hugging Face has quickly become a key part of this emerging stack. It’s already the repository of choice for LLMs and other machine learning models and datasets.

In a recent presentation at PyCon Sweden, Hugging Face Chief Evangelist Julien Simon explained the role Hugging Face plays in the generative AI developer ecosystem, and its plans for the near future.

How Did Hugging Face Become an Open Source Champion?

Ironically, perhaps, Hugging Face is a commercial company and its repository is not actually an open source platform. But then, neither is its closest “Web 2.0” equivalent, GitHub (which is of course owned by Microsoft). What matters, in both cases, is that the files being hosted are open source.

Perhaps even more importantly in its branding as an open platform, Hugging Face started out as a provider of open source transformer libraries.

“So Hugging Face is a company started in 2016, and we started building open source libraries for transformers around 2018,” said Julien Simon in his PyCon Sweden keynote. “And as it turns out, we are one of the fastest growing open source projects ever.”

[Image: Hugging Face growth]

So why did Hugging Face become so popular, so fast? Simon laid out several factors, including the difficulty of dealing with early neural networks and the expense of GPUs to run them on. But the biggest problem, he said, was a lack of “expert tools.”

“So if you want to get the accuracy that you want from neural networks and deep learning models, you need to go heavy into PyTorch code, TensorFlow code […] and you need a background in computer science and statistics, and machine learning — and not everybody has that, right.”

What Hugging Face is trying to do, he continued, is to make AI development “faster, simpler, more efficient.” He compared this effort to how Agile usurped Waterfall as the process of choice in software engineering project management.

No Fate

The key to this new process, which he called (of course) Deep Learning 2.0, is to use transformers — the technology that OpenAI’s GPT, and pretty much everything that came after, was built on.

“The main thing is, instead of having to work with that collection of crazy deep learning architectures, we tend to work more and more with transformer models,” he said.

Also key is developer tools that are simpler than the aforementioned “expert tools.” As Simon put it, “if you can write a few lines of Python, you’re good to go.”

It wouldn’t be 2023 without a re-working of Marc Andreessen’s famous 2011 quip, “software is eating the world.” In the world of Hugging Face, that saying has become “transformers are eating deep learning.”

The Hugging Face Hub

As well as its transformer libraries, Hugging Face is well known for its “Hub”, which is a platform “with over 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available.” In his presentation, Simon called this “the GitHub of machine learning.” He also said that the Hub has over 100,000 “active users” with over 1 million downloads per day.

Returning to his Agile comparison, Simon then presented a flow chart that a developer might follow on Hugging Face.

[Image: Hugging Face flow]

“So you start from existing datasets on the Hub [and] pre-trained models on the Hub. Then you can use them as is — […] a few lines of code in the Transformers library and test the models on your data. If they’re good, if you get the accuracy that you want, you are done […] and you can call yourself a machine learning engineer.”
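To give a sense of what “a few lines of code” might look like, here is a minimal sketch that wraps a pre-trained model from the Hub in the Transformers `pipeline` helper and scores it against a handful of labeled examples. It is an illustration rather than anything Simon showed: the helper names `load_classifier` and `accuracy` are hypothetical, the sentiment-analysis task is an arbitrary choice, and the first call to `load_classifier` downloads a default model, so it needs network access.

```python
from typing import Callable, Dict, List, Sequence


def load_classifier(task: str = "sentiment-analysis"):
    """Build a Transformers pipeline for `task` (hypothetical helper).

    The import is deferred so the scoring helper below can be used
    without the library installed. On first use, this downloads a
    default pre-trained model for the task from the Hub.
    """
    from transformers import pipeline

    return pipeline(task)


def accuracy(
    classifier: Callable[[List[str]], List[Dict[str, object]]],
    texts: Sequence[str],
    labels: Sequence[str],
) -> float:
    """Return the fraction of texts whose predicted label matches the expected one."""
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    if not texts:
        raise ValueError("need at least one example")
    # A text-classification pipeline returns one dict per input,
    # each with a "label" and a "score" key.
    predictions = classifier(list(texts))
    hits = sum(pred["label"] == gold for pred, gold in zip(predictions, labels))
    return hits / len(texts)
```

In practice, something like `accuracy(load_classifier(), texts, labels)` would run the model over your own sentences, with `labels` written in whatever label strings the chosen model emits. If the score is good enough, that is the “you are done” branch of the flow chart.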

That’s just the start of what developers can do. He noted that you might want to fine-tune a model on your own data, or use Optimum for hardware acceleration.

He added that Hugging Face has integrations with both Amazon SageMaker and Microsoft Azure, so developers can use that tooling too. As yet, there is no Google integration.

Mix of Open and Closed

I was being a little flippant in the opening line of this article. The new LLM stack isn’t directly comparable to the LAMP stack of the late 1990s and early 2000s — for a start, there’s no operating system component in the LLM stack. But there is a set of tools, including excellent open source versions, that developers are starting to favor when working with LLMs. For vector databases, for instance, there are both commercial (e.g. Pinecone) and open source versions (e.g. Chroma) to choose from.

Hugging Face is an interesting mix of open source offerings and typical SaaS commercial products. On the open source side, in 2022 it released an LLM called BLOOM, and this year it released a ChatGPT competitor called HuggingChat. On the SaaS side, one of its many products is Inference Endpoints, a “fully managed infrastructure” for deploying models, which starts at $0.06 per hour. Given the commercial setup and venture funding, it’s possible (maybe even probable) that a big tech company will acquire Hugging Face — just as Microsoft acquired GitHub. But for now, there’s little for developers to complain about.

Simon told Intel in a recent interview, “I tell customers that if they believe AI is transformative — and it’s probably even more transformative than the cloud was — how could you not own it? You don’t want someone else to be in control of your future. You want to be in control of your future.”

Ultimately, it’s clever positioning to take the opposite side of the proprietary LLM camp, which is led by OpenAI. Also, Hugging Face calling itself the “GitHub of machine learning” is pure catnip to developers in the age of AI.

Richard MacManus is a Senior Editor at The New Stack and writes about web and application development trends. Previously he founded ReadWriteWeb in 2003 and built it into one of the world’s most influential technology news sites. From the early...