AI for the Sake of AI Is the New Microservices
AI has been eating the world faster than Galactus, and this time we did not even get a Silver Surfer to warn us.
Suddenly, companies are forcing their engineers to use AI to write code faster and “more efficiently,” while we, engineers, are still figuring out what it even means to be an AI engineer. Or what I prefer to call an AI-enabled AI engineer: someone who not only builds AI-powered software, but also uses AI tools themselves, eating their own dog food, so to speak.
There is another side to this story. Whenever a new trend emerges, no one wants to be left behind. Everyone wants something to say, something to show. That is exactly what I see happening now, and I've seen it many times with clients during my time at BCG. Organizations want to embed AI into their workflows even when there is no real need for it, or worse, when simpler techniques would perform better than a Large Language Model in that context.
I saw a very similar pattern when microservices became the hot new trend. Every tech organization was breaking monoliths apart and migrating to microservices because “that is how you do tech,” even if they were not operating at a scale that required it for performance reasons and did not have teams large enough to justify the operational complexity.
And I am sure we will keep seeing this behavior every time a new technology comes along to disrupt the industry.
To me, doing microservices for the sake of microservices because everyone else is doing it is just as misguided as doing AI for the sake of AI because everyone else is talking about it. In both cases, the use case needs to be justified. It should be clear that the problem truly benefits from LLMs and cannot be solved in a more straightforward way with other techniques.
A Small Decision Framework
Whenever I am faced with a potential use case for an agentic solution or something based on LLMs, I go through a few phases of thought.
First, fully understand the problem.
We need to understand what we are trying to solve and the outcome we are expecting. Is the problem predictable? Are the success criteria clear and objective? If the answer to both questions is “yes,” then the use case probably does not need LLMs at all. Problems that are open-ended or require generating natural language responses are usually better suited for generative models.
Second, look at the data.
Is it structured or unstructured?
- Structured data, such as tables, numeric features, and categorical inputs, is often best handled by traditional machine learning models like decision trees, random forests, or gradient boosting. These models are efficient, explainable, and well optimized for prediction tasks.
- Unstructured data, such as free text, documents or images, may benefit from LLMs or deep learning models, especially when the task involves understanding context or generating answers in human language (or any other language).
For example, a problem like customer churn prediction typically does not benefit from generative models, while summarizing content from a large set of unstructured documents often does.
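To make the churn case concrete, here is a minimal sketch of the traditional route: a plain logistic regression trained on synthetic, structured customer data, in pure Python so it stays self-contained. The features, scaling, and churn rule are invented for illustration; a real project would use real data and a library like scikit-learn.

```python
import math
import random

random.seed(0)

def make_customer():
    """Generate one synthetic customer as (features, churned)."""
    tenure = random.uniform(0, 60)         # months with the company
    support_calls = random.randint(0, 10)  # calls in the last quarter
    # Invented rule: churn risk grows with support calls, shrinks with tenure.
    churned = 1 if (support_calls / 10 - tenure / 60) > 0.2 else 0
    return [1.0, tenure / 60, support_calls / 10], churned  # bias + scaled features

data = [make_customer() for _ in range(500)]

def predict(w, x):
    """Sigmoid of the linear score: probability of churn."""
    return 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))

# Train with plain stochastic gradient descent on log loss.
w = [0.0, 0.0, 0.0]
for _ in range(500):
    for x, y in data:
        p = predict(w, x)
        w = [wi + 0.1 * (y - p) * xi for wi, xi in zip(w, x)]

accuracy = sum((predict(w, x) > 0.5) == bool(y) for x, y in data) / len(data)
print(f"training accuracy: {accuracy:.2f}")
```

A few dozen lines, no GPU, no prompt, and the learned weights tell you directly which feature pushes a customer toward churn.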
Third, consider explainability.
If outputs must be explainable and predictable, LLMs and AI agents might not be a good fit. Rule-based systems and traditional machine learning techniques have explicit decision paths and logic, making results more repeatable. This is crucial in scenarios where transparency, auditing, or compliance matters.
LLMs and AI agents, on the other hand, are probabilistic and context-aware. That adaptability comes at the cost of explainability.
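As a toy illustration of what an explicit decision path looks like (the loan rules and thresholds here are entirely made up), note how every outcome carries the exact rule that produced it, which is what makes auditing straightforward:

```python
def approve_loan(income: float, debt_ratio: float) -> tuple[bool, str]:
    """Rule-based decision: every outcome is traceable to a named rule."""
    if income < 30_000:
        return False, "rule 1: income below 30k"
    if debt_ratio > 0.4:
        return False, "rule 2: debt ratio above 40%"
    return True, "all rules passed"

decision, reason = approve_loan(income=45_000, debt_ratio=0.25)
print(decision, "-", reason)
```

Run the same input twice and you get the same answer with the same justification, which is exactly what a probabilistic model cannot promise.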
Fourth, assess the need for human level language understanding.
If deep language understanding is a hard requirement, then LLMs or an agentic approach may be justified. Use cases like conversational agents, automated translation, document summarization, or semantic search are classic examples.
However, tasks like text classification or deterministic decision making often make a stronger case for traditional machine learning. This usually comes at a lower cost and with lower latency. LLMs are still slower and more expensive than many classical approaches.
Finally, consider cost and operational complexity.
Even if a use case is a good fit for LLMs, the cost matters. LLMs and autonomous agents are more expensive to run and maintain. They require significant compute resources, specialized infrastructure, and close monitoring for hallucinations and data drift, which degrades model performance over time.
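The five checks above can be sketched as a simple checklist function. The field names and the order of the checks are my own illustrative encoding of the framework, not a formal tool:

```python
from dataclasses import dataclass

@dataclass
class UseCase:
    predictable_outcome: bool          # first: is the problem predictable?
    objective_success_criteria: bool   # first: are success criteria clear?
    data_is_structured: bool           # second: tables, numbers, categories?
    needs_explainability: bool         # third: transparency or compliance?
    needs_language_understanding: bool # fourth: deep language understanding?

def recommend(uc: UseCase) -> str:
    # First: predictable problem + clear success criteria -> skip LLMs.
    if uc.predictable_outcome and uc.objective_success_criteria:
        return "traditional ML or rules"
    # Second and third: structured data or hard explainability needs
    # also point away from generative models.
    if uc.data_is_structured or uc.needs_explainability:
        return "traditional ML or rules"
    # Fourth: deep language understanding is where LLMs earn their cost.
    # Fifth (cost) still applies: even here, budget the compute and monitoring.
    if uc.needs_language_understanding:
        return "LLM or agentic approach"
    return "start simple, revisit if requirements change"

churn = UseCase(True, True, True, True, False)
summarization = UseCase(False, False, False, False, True)
print(recommend(churn))
print(recommend(summarization))
```

The point is not the code itself but the order of the questions: the generative option is the last resort, not the default.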
Hybrid Approaches Are Often the Best
In engineering, complex problems rarely have a single silver bullet solution. Hybrid approaches often outperform attempts to use one tool for everything.
Retrieval-Augmented Generation, or RAG, is a great example. It combines traditional techniques like embeddings and vector search with generative models to produce grounded, factual responses from a trusted knowledge base.
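Here is a deliberately minimal RAG skeleton to show the shape of the idea. The retrieval half is real, just scaled down to a toy bag-of-words embedding and cosine similarity; the generative half is stubbed out as prompt assembly. A real system would use learned embeddings (for example, a sentence-transformer model), a proper vector store, and an actual LLM call. The documents and query are invented:

```python
import math
from collections import Counter

docs = [
    "Refunds are processed within five business days.",
    "Premium support is available on weekdays from 9 to 5.",
    "Passwords can be reset from the account settings page.",
]

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# The "vector store": each document paired with its embedding.
index = [(doc, embed(doc)) for doc in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    qv = embed(query)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def answer(query: str) -> str:
    context = " ".join(retrieve(query))
    # Stand-in for the generative step: build the grounded prompt
    # that an LLM would receive.
    return f"Answer using only this context: {context}\nQuestion: {query}"

print(answer("How long do refunds take?"))
```

Notice that most of the system is classic information retrieval; the LLM only shows up at the very end, grounded by what the boring parts found.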
Sometimes one part of the problem is a good fit for agents, while another can be solved with simpler, more deterministic methods. Always analyze the use case from multiple angles. Often, the best solution turns out to be simpler than you initially thought.
If you liked what you have read or resonate with these opinions, feel free to subscribe. It is free.