Technical debt is a problem at the best of times, but during periods of rapid innovation, it can become overwhelming.
Innovation, after all, is never linear. It comes in fits and starts, with dead ends and sudden turns aplenty. Every twist in this tale of innovation leaves something behind – some tool or experiment that turns out to have been a bad idea in retrospect.
All these deviations can add to technical debt, because they are far easier to deploy than to decommission.
Today, AI – in particular, generative AI (GenAI) – is in such a period of rapid innovation. Enterprises, software vendors, and born-in-the-cloud companies alike are jumping into the GenAI pool with both feet.
It won’t be long, therefore, until many such organizations run into burgeoning technical debt challenges.
Bob Quillin, Chief Ecosystem Officer at vFunction, discussed this problem in a recent article, explaining how the rapid accumulation of AI tooling can lead to technical debt as requirements evolve and tool vendors go out of business, leaving unsupported packages and tools behind.
While the rapid obsolescence of tooling is problematic, there is an even more ominous form of technical debt that the rapid innovation in GenAI exacerbates: architectural technical debt (ATD).
Not only do older architectures fall short of the requirements of GenAI, but even modern architectures struggle to keep pace with such rapid innovation.
Getting a handle on ATD, therefore, is essential for the successful deployment and operationalization of GenAI-based applications.
Understanding GenAI’s Architectural Technical Debt
As I explained in an article from 2023, architectural debt is a special kind of technical debt that indicates expedient, poorly constructed, or obsolete architecture.
I went on to explain that not all architectural debt is bad. Excessive efforts to limit architectural debt can lead to overdesign, thus counterintuitively raising the debt.
Instead, taking an iterative, ‘just enough, just in time’ approach to architecture can mitigate an organization’s exposure to architectural debt long term.
In the case of GenAI, the ATD challenge divides roughly into two areas: dealing with legacy architectures and iterating modern architectures to support rapid innovation.
GenAI in Legacy Architectures
When I say ‘legacy architecture’ in this context, I mean ‘whatever architecture you had in place before GenAI.’
Such architecture might indeed be old, say, a three-tier architecture for supporting websites and web applications. Or it might be relatively recent, typically a cloud-native architecture that leverages Kubernetes to provide dynamic software at scale.
If your architecture falls into the first camp, then it won’t take long until you realize that it’s simply not up to the performance, scale, and specialized hardware requirements of GenAI. In fact, it’s unlikely to support the modeling and data preparation phases of a GenAI project, let alone the operationalization of your GenAI applications.
Even though I’m focusing on the application architecture, part of the problem is with the hardware. GenAI requires graphics processing units (GPUs). GPUs aren’t simply souped-up central processing units (CPUs); they work in a fundamentally different way, running many operations in parallel, to address the unique processing requirements of AI modeling and inferencing. As a result, obsolete application architectures are unlikely to support GPUs – if you can even get such chips to work in a legacy environment in the first place.
Legacy data architectures are also a major source of ATD for GenAI. The belief that you can pour whatever enterprise data you have lying around into your large language model (LLM) of choice and get useful results is a myth.
In reality, discovering, organizing, cleaning, and preparing data for GenAI is essential to the initiative’s success – and in many cases, real-time access to up-to-the-minute data is also essential. Legacy data architectures possess neither the data models nor the real-time characteristics that modern GenAI data architectures require.
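To make the data preparation point concrete, here is a minimal Python sketch of the kind of cleaning, deduplication, and freshness filtering that typically has to happen before enterprise data reaches an LLM. Everything in it is an assumption for illustration: the `Record` type, its field names, the `prepare_records` helper, and the one-day freshness window are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Record:
    """Hypothetical enterprise record; the field names are illustrative."""
    source: str
    text: str
    updated_at: datetime


def prepare_records(records, now, max_age=timedelta(days=1)):
    """Return cleaned, deduplicated, recent records in their input order."""
    seen = set()
    prepared = []
    for record in records:
        # Cleaning: collapse runs of whitespace and trim the ends.
        text = " ".join(record.text.split())
        if not text:
            continue  # drop records with no usable content
        # Freshness: skip records older than the allowed window.
        if now - record.updated_at > max_age:
            continue
        # Deduplication: compare case-insensitively on the cleaned text.
        key = text.lower()
        if key in seen:
            continue
        seen.add(key)
        prepared.append(Record(record.source, text, record.updated_at))
    return prepared


# Illustrative input: one duplicate, one stale record, one empty record.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
raw = [
    Record("crm", "  Customer  renewed contract ", now - timedelta(hours=1)),
    Record("wiki", "customer renewed contract", now - timedelta(hours=2)),
    Record("erp", "Old pricing sheet", now - timedelta(days=30)),
    Record("crm", "   ", now),
]
clean = prepare_records(raw, now)
```

Even this toy version has to know when each record was last updated and where it came from, which is exactly the metadata that legacy data architectures often fail to expose. A real pipeline would add discovery, schema mapping, and access controls on top.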
Rubber Meets Road: Operationalizing GenAI
Preparing data and tweaking models are important, to be sure – but unless you can operationalize your GenAI applications at scale, you’ll never see any business value from them.
Due to its performance, scale, and hardware limitations, any architecture that predates cloud native is simply not up to the task of operationalization and thus forms a massive hairball of architectural debt.
Just because you’ve transitioned to a cloud-native architecture, however, doesn’t necessarily mean you’re ready for such operationalization.
Instead, the current best practice is to tweak what is now ‘traditional’ cloud-native architecture to support the operationalization of GenAI. These tweaks should include:
- Support for rapidly processing massive data sets that include both structured and unstructured data
- Full lifecycle considerations that include ingestion, data preparation, and training as well as deployment, management, and ongoing updates
- Incorporation of AI-based capabilities for the continued management and updating of the GenAI applications themselves
The more power and sophistication organizations require from their AI-based applications, the more important it will become to leverage AI to support the ongoing requirements of those applications. This ‘apply AI to AI’ capability is unique to AI and thus represents a new architectural requirement that no existing architecture is likely to meet.
The Intellyx Take
The fact that even a fully modern cloud-native architecture may not be up to the task of supporting GenAI operationalization should be a wake-up call for anyone dealing with ATD.
The lesson here isn’t that you simply need to update cloud-native architectures for GenAI. The most important takeaway is that it will be necessary to update your GenAI production architecture on a regular basis as innovation continues and technologies evolve.
ATD, therefore, will keep accruing. The old saying that today’s new technology is tomorrow’s legacy is especially true with GenAI.
Developing a proactive architectural observability strategy for dealing with such debt is essential for maintaining both a competitive pace of innovation and the ability to operationalize GenAI at scale.
Copyright © Intellyx BV. vFunction is an Intellyx customer. Intellyx retains final editorial control of this article. No AI was used to write this article.
Image by Craiyon