[Me & X] What is the single most important lesson from building systems with AI?
X: What is the single most important lesson you've learned from building systems with AI?
Me: There are many lessons. The most important? Probably the lack of understanding.
X: Explain.
Me: The systems don't understand. They simulate reasoning, but the process is still probabilistic. They are trained to be "helpful" in the sense of instruction following and task completion, but in practice that means keeping the conversation moving. When you ask a question, they generate the statistically most likely answer that completes the task and keeps the interaction going, i.e. they "help".
X: What is wrong with that?
Me: They generate answers through probability distributions and sampling, and then present them both coherently and convincingly. Because of this, it is easy to assume they understand the problem, to be lulled in. However, when asked to fix a coding problem, one will hunt for the variables and localised edits it needs to modify, or for similar patterns in the code it can imitate, in order to "solve" the problem.
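To make "probability distributions and sampling" concrete, here is an illustrative sketch, not any model's actual implementation: at its core, each next-token step converts raw scores into a distribution and samples from it. The tokens and scores below are hypothetical.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate continuations and their scores.
tokens = ["fix", "refactor", "ignore"]
logits = [2.0, 1.0, -1.0]

probs = softmax(logits)
choice = random.choices(tokens, weights=probs, k=1)[0]
# The highest-scored token usually wins, but nothing here "understands"
# the problem: it is selection by probability, not by reasoning.
```

The plausible-sounding output is a direct consequence of this mechanism: the most statistically likely continuation is, by construction, the one that reads most convincingly.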
X: So, what's wrong with that?
Me: Suppose I'm building a graphical system which combines various meshes. If there is an error, it will attempt to modify parameters to fix it. You can consider these shallow problems. However, if the solution needs a deep understanding of how the meshes work together, it will never learn the system; it will never RTFM and reason out how to fix it. Sure, it can scan documents for surface patterns, but it never understands the system. Hence, when faced with a deep problem, it will often skirt around the issue, constantly changing parameters, and never get to where it needs to be. At this point, it often requires a skilled human to prod it in the right direction.
X: Is this common?
Me: It's happening all the time, especially in the deep structure of any system you ask it to build. Coding has two aspects: the surface-level functionality and the deeper structure. Surface functionality is easy to test and often works well. The deeper structure, where architectural decisions live, is usually a mess. However, that only shows up in the long-term maintenance and evolution of the system, when you need to adapt it.
X: What does that mean?
Me: Most of these systems are ticking time bombs. We haven't yet developed the practices to do this safely in production; they are still emerging.
X: What about using a swarm with different roles?
Me: I already do that. I even use judgement systems with multiple agents challenging a decision. They are good for verification, planning, and even surface-level error detection. However, the systemic issue is that none of them understands the space. Unless you are lucky, they tend to fail at deep problems because they are all statistical engines at heart.
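A judgement system of this kind can be sketched minimally as follows, assuming each agent is a callable returning an accept/reject verdict. The agents here are hypothetical stand-ins for real model calls, and deliberately apply only surface-level checks, which is exactly why such swarms catch shallow errors but miss deep ones.

```python
from collections import Counter

def judge(decision, agents):
    """Ask every agent to challenge the decision; accept on majority."""
    verdicts = [agent(decision) for agent in agents]
    tally = Counter(verdicts)
    return tally["accept"] > tally["reject"], verdicts

# Stand-in agents: each checks one surface property of a proposed change.
agents = [
    lambda d: "accept" if d.get("tests_pass") else "reject",
    lambda d: "accept" if d.get("lints_clean") else "reject",
    lambda d: "accept" if d.get("diff_small") else "reject",
]

accepted, verdicts = judge(
    {"tests_pass": True, "lints_clean": True, "diff_small": False}, agents
)
# Two of three surface checks pass, so the decision is accepted, yet
# nothing in the vote examines whether the change is architecturally sound.
```

Adding more voters of the same kind raises confidence in the shallow checks without adding any understanding of the deep structure.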
X: What does this mean though in practice?
Me: The likely outcome over time of all our AI wrangling? ... Boom. Go read E.M. Forster's "The Machine Stops". Have some slop to explain the problem.
Originally published on LinkedIn.
