# When an AI Project Needs a Rules Engine

Incentiviiz started with an LLM thesis. The real product needed a hands-on fractional CTO, deterministic software, and AI used in the right place.

Published: December 12, 2025
Tags: Case Study, AI, Product Engineering

Canonical URL: https://michaelrispoli.com/blog/conduiit-incentiviiz-ai-tax-incentives/

Some AI projects become more interesting when AI turns out not to be the answer.

That is what happened with Conduiit and Incentiviiz.

It is also a good example of what a fractional CTO role can look like when it is actually useful.

This was not a few advisory calls and a diagram. It was hands-on-keyboard work beside a domain expert in film production accounting, trying to turn a hard professional workflow into software that could survive real data. I had to learn enough of the domain to understand what the expert was seeing, then translate that judgment into product structure, technical architecture, and working code.

The original thesis was reasonable. Film and television tax incentive work involves dense laws, changing jurisdictions, complicated ledgers, and accountants trying to understand which expenses may qualify across countries, states, counties, provinces, and regions. It is the kind of domain where an LLM sounds like a natural fit.

Let the model read the law. Let it inspect rows in an accounting ledger. Let it decide what qualifies. Turn weeks of manual review into an intelligent workflow.

At small scale, it looked impressive.

Give the system a few rows of ledger data and a narrow incentive rule, and the model could produce a useful explanation. It could reason through a transaction. It could point at possible eligibility. It could make the demo feel like the future.

Then production reality showed up.

The model did not guarantee that it would find the same rows on every run, and it did not always apply the same reasoning to them. Sometimes the law itself was not perfectly clear about whether a particular expense qualified. That ambiguity is already hard for humans. Feeding it into a nondeterministic system made the problem worse, not better.

We did not arrive at that conclusion from theory. We arrived there in the lab, testing the product against real ledger data and watching where the AI approach bent, slowed down, or changed its mind. That is the kind of evidence a founder needs from a technical leader: not generic AI optimism, and not reflexive skepticism, but a working diagnosis from contact with the actual workflow.

That problem compounds fast.

A multimillion-dollar production does not have five ledger rows. It can have thousands upon thousands of transactions spread across pages of accounting data. If every row requires an LLM call, the system becomes slow and expensive. Worse, the result is not guaranteed to be repeatable.

That is unacceptable for accountants. It is unacceptable for state tax offices. It is unacceptable for anyone making financial decisions where the system needs to explain what it did and produce the same answer under the same conditions.

The demo said AI could do the work. The product said something more honest: AI could help us understand the work, but the final computation needed good old-fashioned programming.

The real solution was a deterministic rules engine.

Instead of asking an LLM to evaluate every ledger row at runtime, the product needed rules that could run quickly, consistently, and explainably against structured ledger data. The system had to parse ledgers, classify transactions, apply incentive rules, compute totals, and produce repeatable results at computer speed.
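To make that shape concrete, here is a minimal Python sketch of a first-match rules engine over structured ledger rows. It is not the Incentiviiz implementation: the `Transaction` fields, the account labels, the `STATE_A` and `STATE_B` codes, and both rules are invented for illustration and do not reflect any real incentive law.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass(frozen=True)
class Transaction:
    row_id: int
    account: str        # hypothetical ledger account label
    jurisdiction: str   # hypothetical code for where the spend occurred
    amount: Decimal


@dataclass(frozen=True)
class Rule:
    rule_id: str
    description: str
    matches: Callable[[Transaction], bool]
    qualifies: bool


@dataclass(frozen=True)
class Decision:
    row_id: int
    rule_id: str
    qualifies: bool
    reason: str


# Illustrative rules only. Order matters: the first matching rule wins.
RULES = [
    Rule("R1", "Spend outside STATE_A does not qualify",
         lambda t: t.jurisdiction != "STATE_A", False),
    Rule("R2", "Crew labor inside STATE_A qualifies",
         lambda t: t.account == "crew_labor", True),
]


def evaluate(tx: Transaction, rules: list[Rule]) -> Decision:
    """Return the decision of the first rule that matches the transaction."""
    for rule in rules:
        if rule.matches(tx):
            return Decision(tx.row_id, rule.rule_id, rule.qualifies, rule.description)
    # No rule matched: do not guess, record that explicitly.
    return Decision(tx.row_id, "NONE", False, "No rule matched; flag for review")


def run(ledger: list[Transaction], rules: list[Rule]) -> tuple[list[Decision], Decimal]:
    """Evaluate every row and total the qualifying spend."""
    decisions = [evaluate(tx, rules) for tx in ledger]
    amounts = {tx.row_id: tx.amount for tx in ledger}
    total = sum((amounts[d.row_id] for d in decisions if d.qualifies), Decimal("0"))
    return decisions, total


LEDGER = [
    Transaction(1, "crew_labor", "STATE_A", Decimal("1200.00")),
    Transaction(2, "crew_labor", "STATE_B", Decimal("800.00")),
    Transaction(3, "catering", "STATE_A", Decimal("150.00")),
]

decisions, qualifying_total = run(LEDGER, RULES)
for d in decisions:
    print(d.row_id, d.rule_id, d.qualifies, d.reason)
print("Qualifying total:", qualifying_total)
```

Every decision carries the rule that produced it, so the output explains itself, and there is no model call anywhere in the loop. Amounts use `Decimal` rather than floats because this is accounting data.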

That is a different kind of engineering problem. Less magical in the demo. Much more useful in production.

The rules engine became the spine of the product. AI still had a role, but the role changed. It was useful for reading law, exploring edge cases, summarizing dense material, comparing provisions, and helping author or refine the deterministic rules that would later run the same way every time.

That required a tight loop between domain expertise and engineering. The expert could explain why a category of spend mattered, where the law was ambiguous, how accountants would review the output, and what kind of answer would be defensible. My job was to turn that into software primitives: rules, calculations, review states, data structures, and interfaces that made the expert workflow repeatable.

That distinction matters.

AI is excellent at helping humans move through ambiguity. It is not always the right tool for final authority. In Incentiviiz, the LLM could assist the process of turning messy legal and accounting context into structured rules. Once those rules existed, the runtime system could apply them consistently across large ledgers without the latency, cost, and inconsistency of model calls for every transaction.
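One way to picture that split is to treat rules as reviewed data rather than as model output. The sketch below assumes a hypothetical workflow in which rules are drafted (possibly with LLM help), carry a `status` field, and only run once a human has marked them `approved`. The JSON schema, field names, and example rules are all assumptions made for illustration, not the product's actual format.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical rule file. In this sketch an LLM may help draft entries,
# but the runtime only ever loads rules a human has approved.
RULES_JSON = """
[
  {"rule_id": "R1", "account": "crew_labor", "qualifies": true,
   "status": "approved", "source_note": "Reviewed by domain expert"},
  {"rule_id": "R2", "account": "equipment_rental", "qualifies": true,
   "status": "draft", "source_note": "LLM-assisted draft, statute unclear"}
]
"""


@dataclass(frozen=True)
class RuleSpec:
    rule_id: str
    account: str
    qualifies: bool
    status: str        # "draft" or "approved" (assumed lifecycle)
    source_note: str   # where the rule came from, kept for auditability


def load_approved(raw: str) -> list[RuleSpec]:
    """Parse the rule file and keep only approved rules."""
    specs = [RuleSpec(**item) for item in json.loads(raw)]
    return [s for s in specs if s.status == "approved"]


def classify(account: str, specs: list[RuleSpec]) -> Optional[bool]:
    """Return True/False from the first matching rule, or None if no rule applies."""
    for spec in specs:
        if spec.account == account:
            return spec.qualifies
    return None


approved = load_approved(RULES_JSON)
print(classify("crew_labor", approved))        # matched by approved rule R1
print(classify("equipment_rental", approved))  # R2 is still a draft, so no answer
```

The design choice is that the model's contribution stops at authoring time. At runtime the draft rule simply does not exist, so an unreviewed guess cannot leak into a computed total.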

This is the kind of lesson AI prototypes hide.

The impressive prototype is the model reasoning over a few examples. The serious product is the system that can run over an entire production ledger, finish quickly, produce the same result twice, preserve assumptions, expose review points, and give accountants something they can defend.
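"Produce the same result twice" is a property that can be checked mechanically. The sketch below is one possible way to do that, under stated assumptions: it classifies rows with a pure function, records an assumption and a review flag on each result, and fingerprints the canonical output with SHA-256 so two runs can be compared. The account names and the wording of the assumptions are hypothetical.

```python
import hashlib
import json

QUALIFYING_ACCOUNTS = {"crew_labor"}        # hypothetical
AMBIGUOUS_ACCOUNTS = {"equipment_rental"}   # hypothetical: the law is unclear here


def classify(row: dict) -> dict:
    """Pure function: the same row always yields the same result."""
    account = row["account"]
    if account in AMBIGUOUS_ACCOUNTS:
        return {"row_id": row["row_id"], "qualifies": None, "needs_review": True,
                "assumption": "Eligibility unclear; held for accountant review"}
    if account in QUALIFYING_ACCOUNTS:
        return {"row_id": row["row_id"], "qualifies": True, "needs_review": False,
                "assumption": "Account treated as qualifying spend"}
    return {"row_id": row["row_id"], "qualifies": False, "needs_review": False,
            "assumption": "Account not covered by any qualifying rule"}


def run(ledger: list[dict]) -> list[dict]:
    """Classify every row and return results in a stable order."""
    return sorted((classify(row) for row in ledger), key=lambda r: r["row_id"])


def fingerprint(results: list[dict]) -> str:
    """Hash a canonical JSON encoding so two runs can be compared byte for byte."""
    canonical = json.dumps(results, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


LEDGER = [
    {"row_id": 1, "account": "crew_labor", "amount": "1200.00"},
    {"row_id": 2, "account": "equipment_rental", "amount": "500.00"},
    {"row_id": 3, "account": "catering", "amount": "150.00"},
]

first = run(LEDGER)
second = run(list(reversed(LEDGER)))  # same rows, different input order
review_queue = [r["row_id"] for r in first if r["needs_review"]]

print("Repeatable:", fingerprint(first) == fingerprint(second))
print("Rows needing review:", review_queue)
```

The ambiguous row is not forced into a yes or no. It is surfaced as a review point with its assumption attached, which is the part an accountant can actually defend.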

That does not make the product less AI-native. It makes it more mature.

The strongest AI products will not blindly throw models at every workflow. They will use all the power of computers: deterministic code where repeatability matters, databases where truth has to persist, rules engines where logic has to run quickly, and LLMs where language, ambiguity, and authoring support create real leverage.

That is the lesson from Incentiviiz. The job was never to make the machine sound smart. The job was to build a system that works.