In the race for generative intelligence, whoever knows how to cultivate their own data and has the organizational vision to turn it into value will win.

In the era of ubiquitous large language models (LLMs), competitive advantage does not come from owning the model, but from holding the distinctive data that only a company’s own relationships can generate, and from the organizational discipline to leverage it faster than the competition.

The commoditization of LLMs is advancing like an unstoppable tsunami. The latest Generative AI Snapshot survey, commissioned by Salesforce, indicates that 49% of respondents have already tried LLM tools. Reinforcing this trend, Microsoft and LinkedIn’s Work Trend Index 2024 reveals that three out of four knowledge workers (75%) already use generative AI tools at work.

Many companies are rushing to catch this wave and monetize it. Instead of building models from scratch, they build on and refine established foundation models – ChatGPT, Llama, Claude, DeepSeek, among others – to create differentiated products and services. We call them LLM-based businesses. Vartheo is one such case: founded by alumni of CATÓLICA-LISBON, it accelerates qualitative marketing research through LLM-powered “personas” that simulate real interviewees – compressing weeks of fieldwork into hours. However, the same accessibility that fuels this boom also threatens it. If anyone can call on the same underlying model, raw capacity ceases to be a competitive barrier. In a world where LLMs become as ubiquitous as the internet itself, the question is clear: what will keep companies ahead when the underlying technology is a commodity?

Where does sustainable competitive advantage come from?

The fundamentals of strategic management offer a perspective focused on the company’s resources and competencies: the VRIO model, which assesses whether a resource is valuable, rare, and costly to imitate, and whether the organization can fully exploit it. When all four criteria are met, the resource can sustain long-term success and differentiation from the competition.
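The VRIO test is commonly read as four questions asked in order: is the resource valuable, is it rare, is it costly to imitate, and is the organization set up to exploit it? The sketch below is one way to express that sequence in Python. The `Resource` class, the `vrio_outcome` function, the outcome labels (in particular "unused advantage") and the example ratings are illustrative assumptions, not part of any standard library or official formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A resource rated against the four VRIO questions (hypothetical type)."""
    name: str
    valuable: bool
    rare: bool
    costly_to_imitate: bool
    organized_to_exploit: bool


def vrio_outcome(resource: Resource) -> str:
    """Walk the four questions in order and stop at the first 'no'."""
    if not resource.valuable:
        return "competitive disadvantage"
    if not resource.rare:
        return "competitive parity"
    if not resource.costly_to_imitate:
        return "temporary advantage"
    if not resource.organized_to_exploit:
        return "unused advantage"  # label chosen for this sketch
    return "sustained advantage"


# Illustrative ratings only; a real assessment needs case-by-case judgment.
examples = [
    Resource("rented cloud infrastructure", True, False, False, True),
    Resource("scarce specialist skills", True, True, False, True),
    Resource("proprietary relationship data", True, True, True, True),
]

for r in examples:
    print(f"{r.name}: {vrio_outcome(r)}")
```

The ordering matters: a resource that is not rare cannot do better than parity, however hard it might be to copy, which is why each check returns as soon as a criterion fails.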

Developing an LLM-based product or service requires several critical resources and competencies: first and foremost, a value proposition, realized through a combination of human capital, infrastructure, and customer interface. Some of these elements are merely necessary to operate; others have the potential to generate sustainable competitive advantage.

Easily rented: infrastructure and user interface

Operating an LLM-supported business requires digital infrastructure. Today, however, this infrastructure is viewed as an upstream resource, outsourced to third parties. For example, an LLM-based service sends a request to ChatGPT, which is processed on OpenAI’s infrastructure. As long as suppliers do not sign exclusivity agreements, the infrastructure is easily “rented” and changing suppliers is also simple, making it no longer rare. Similarly, the user interface and interaction are downstream resources treated in the same way. Consequently, they are only a source of competitive parity.

Short-term scarcity: human capital

Professionals with LLM skills are essential to the success of these businesses. They need to understand how an LLM works, its strengths and limitations, and how to integrate it into a product or service. Currently, this know-how, in an emerging and rapidly evolving technology, is rare. However, as with other highly skilled profiles, it is not immune to imitation. LLM skills therefore only confer a temporary competitive advantage.

Proprietary data as a differentiator: the value proposition

What problem does your LLM-based business solve – and how? The source of a sustainable competitive advantage lies precisely in how the problem is solved; after all, any competitor can try to solve the same challenge. As we have seen, none of the components that simply incorporate an existing LLM into a product or service passes the imitability test.

It is in the customization of the model, made with unique data, that the sustainable competitive advantage lies.

A primary way to customize an LLM with data is to create a “library” for Retrieval-Augmented Generation (RAG). This adds a search mechanism that allows the LLM to access additional data – web search, legal codes, emails, for example – without changing the intrinsic behavior of the model. An LLM can also be profoundly transformed through fine-tuning, a method that “reprograms” its brain: the model is retrained to express itself in a manner consistent with a specialized body of text. For example, LLMs that support email writing, such as Apple Intelligence, benefit from fine-tuning with email data, not books or magazines.
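To make the retrieval idea concrete, here is a minimal sketch in Python of the search step that sits in front of the model. It assumes a tiny in-memory library and ranks passages by simple word overlap; production systems typically use embedding-based search instead. The names `LIBRARY`, `retrieve` and `build_prompt`, and the sample passages, are hypothetical. No LLM is called: the output is the prompt that would be sent to an unmodified model.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, passage: str) -> int:
    """Count how many query words also appear in the passage."""
    q, p = tokenize(query), tokenize(passage)
    return sum(min(q[word], p[word]) for word in q)


def retrieve(query: str, library: list[str], k: int = 2) -> list[str]:
    """Return the k passages that overlap most with the query."""
    ranked = sorted(library, key=lambda passage: score(query, passage), reverse=True)
    return ranked[:k]


def build_prompt(query: str, library: list[str], k: int = 2) -> str:
    """Attach the retrieved passages to the question; the model is untouched."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query, library, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical proprietary "library" of company text.
LIBRARY = [
    "Refunds are issued within 14 days of a return request.",
    "Premium customers get a dedicated support line.",
    "Our offices are closed on public holidays.",
]

prompt = build_prompt("How long do refunds take?", LIBRARY, k=1)
print(prompt)
```

Everything specific to the company lives in the library, not in the model, so the quality of the answer depends on the data the business has collected.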

Sustainable competitive advantage in the virtuous circle of data: the data flywheel

Without specialization, an LLM will lose quality when attempting to offer unique value. Regardless of how you specialize it, the sustainability of the competitive advantage of an LLM-based business depends entirely on the value, rarity, and difficulty of imitation of the data used.

The business value of data has been recognized for decades, but LLMs bring relevant differences.

One of them is that the data that gained importance with the popularization of machine learning is not the same as the data that became valuable with the advent of LLMs. Machine learning has made it possible to extract complex patterns from quantitative data; LLMs work with text. Organizations naturally record numerical data, but text is rarely or never stored because it is often incidental to the core business (except, for example, on social media). Thus, most will have to create structured methods to collect these texts and put them to work.

To gather rare and non-replicable data, it is essential to draw on your customer relationships: who your customers are and how they interact with you. Each point of contact is an opportunity to get to know them better, tailor your LLM to their needs, and deliver a higher-quality service; it is a virtuous circle, the so-called data flywheel.

Echoing classical economist David Ricardo, financial and human capital are not enough to sustain competitive advantage; the key is to discover the right productive (digital) “land” and be the one who harvests the right set of data. It is crucial to understand where the data comes from and organize yourself to exploit it.

In the race for generative intelligence, the winner will be whoever knows how to cultivate their own data – and has the organizational vision to transform it into value.

Ekin Ilseven, Professor at CATÓLICA-LISBON | Nicolò Bertani, Professor at CATÓLICA-LISBON