Inference is not Verification — fleeting.computer

When you use the stochastic-parrot-genie that is gen-ai, you need to be careful what you wish for.

Asking your coding agent to generate a report on whether the code works does not prove that you are done.

Good verification is repeatable, fast and understandable. LLMs are none of these things (but they're still cool).

We can use gen-ai to generate code because we don't really care how the AI reached its conclusion about what the right code was for the given situation.

Inference is about the conclusion; verification is about how you reach the conclusion.

To sign off on system correctness we need to care deeply about how we arrived at the conclusion that the system works.

What to do instead

Don't get me wrong. I like AI and think it has a lot to offer software development. But this (in my opinion - hey, it's my blog after all) is a case of the wrong tool for the job.

Do not use AI to verify that the system works.

Instead, use AI to build (or speed up building) a system that verifies whether the system works.

The second option is clearer, deterministic, repeatable after something changes and usually faster - all the things we want from verification.
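
To make that concrete, here is a minimal sketch of what "a system that verifies if the system works" could look like. Everything in it is made up for illustration: `slugify` stands in for whatever code your agent wrote, and `CASES` and `verify` are hypothetical names, not part of any real library. The point is only that the checks are plain data you can read, and running them twice gives the same answer.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical system under test: turn a title into a URL slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Fixed inputs and expected outputs. This table is the "how" behind the
# sign-off: anyone can read it and see exactly what was checked.
CASES = [
    ("Inference is not Verification", "inference-is-not-verification"),
    ("  Hello,   World!  ", "hello-world"),
    ("", ""),
]


def verify() -> list[str]:
    """Run every case and return one message per failure (empty means pass)."""
    failures = []
    for given, expected in CASES:
        actual = slugify(given)
        if actual != expected:
            failures.append(
                f"slugify({given!r}) = {actual!r}, expected {expected!r}"
            )
    return failures
```

An AI can happily write both `slugify` and the cases for you. What you sign off on is the result of calling `verify()`, which you can rerun after every change, rather than a generated report that says things look fine.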