Deterministic agents
Behavioral consistency, not stochastic suppression
At CodeDroid, we have been building toward a core idea: deterministic agents, AI systems that are intelligent, consistent, and reliable. This post introduces the framework and, through a case study, shows how it holds up when put to an empirical test.
We use the term deterministic agents carefully. It does not mean removing the probabilistic nature of the underlying LLM. The stochasticity of language models is a feature, not a bug: it enables generalization, creativity, and the ability to handle novel inputs. We are not trying to eliminate that.
What we mean is behavioral determinism: the model might phrase its reasoning differently across runs, but the decision path it takes and the actions it executes remain consistent. Same environment, same correct outcome. When something goes wrong, the failure is informative rather than opaque.
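One way to make this definition concrete is to compare runs by what the agent did rather than by what it said. The sketch below is only an illustration under stated assumptions: the `Step` record, the `decision_path` projection, and the sample traces are hypothetical and not part of any CodeDroid API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    reasoning: str  # free-form text; allowed to vary between runs
    action: str     # hypothetical tool name; expected to be stable
    target: str     # hypothetical argument, such as a file path


def decision_path(trace):
    """Project a trace onto its actions, discarding the wording."""
    return [(step.action, step.target) for step in trace]


def behaviorally_consistent(traces):
    """Return True when every run took the same sequence of actions."""
    paths = [decision_path(trace) for trace in traces]
    return all(path == paths[0] for path in paths)


# Two runs that phrase their reasoning differently but act identically.
run_a = [
    Step("The build file looks outdated.", "read_file", "build.gradle"),
    Step("Bumping the plugin version should help.", "edit_file", "build.gradle"),
]
run_b = [
    Step("Let me check the Gradle config first.", "read_file", "build.gradle"),
    Step("The plugin version is stale; updating it.", "edit_file", "build.gradle"),
]
# A run that diverges in its second action.
run_c = [
    Step("Let me check the Gradle config first.", "read_file", "build.gradle"),
    Step("Maybe clearing the cache helps.", "run_command", "gradle clean"),
]

print(behaviorally_consistent([run_a, run_b]))  # True
print(behaviorally_consistent([run_a, run_c]))  # False
```

Under this reading, the `reasoning` field is where the model's stochasticity is allowed to show, while the projected action sequence is what has to stay fixed.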
The Design
The core design is a two-phase loop:
Phase 1: Explore and Plan. The agent investigates the environment before acting. It inspects files, gathers relevant state, and builds a set of testable expectations about what a correct fix should produce. A plan is a set of verifiable outcomes.
Phase 2: Execute and Validate. The agent acts, then checks whether reality matches its expectations. If the result diverges unexpectedly, it re-evaluates based on new evidence or stops with an informative failure. It does not guess, retry blindly, or silently substitute actions.
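The two phases can be sketched as a small control loop. This is a minimal illustration, not the actual implementation: `Plan`, `Result`, `run_agent`, the `max_rounds` limit, and the toy planners are all assumed names, and the environment is reduced to a plain dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Env = Dict[str, str]


@dataclass
class Plan:
    action: Callable[[Env], None]
    # Each expectation is a named, testable predicate over the environment.
    expectations: Dict[str, Callable[[Env], bool]]


@dataclass
class Result:
    status: str  # "success" or "failed"
    evidence: List[str] = field(default_factory=list)


def run_agent(env, explore_and_plan, max_rounds=2):
    evidence = []
    for round_no in range(1, max_rounds + 1):
        # Phase 1: Explore and Plan. The planner sees the environment
        # and any evidence gathered in earlier rounds.
        plan = explore_and_plan(env, evidence)
        # Phase 2: Execute and Validate.
        plan.action(env)
        unmet = [name for name, check in plan.expectations.items()
                 if not check(env)]
        if not unmet:
            return Result("success", evidence)
        # Record what diverged so the next round re-plans on evidence.
        evidence.append(f"round {round_no}: unmet expectations {unmet}")
    # Stop with an informative failure instead of retrying blindly.
    return Result("failed", evidence)


def good_planner(env, evidence):
    def bump_version(e):
        e["plugin_version"] = "2.0"
    return Plan(bump_version,
                {"plugin_is_current": lambda e: e["plugin_version"] == "2.0"})


def bad_planner(env, evidence):
    # The action does nothing, so the expectation can never be met.
    return Plan(lambda e: None,
                {"plugin_is_current": lambda e: e["plugin_version"] == "2.0"})


ok = run_agent({"plugin_version": "1.0"}, good_planner)
bad = run_agent({"plugin_version": "1.0"}, bad_planner)
print(ok.status)     # success
print(bad.status)    # failed
print(bad.evidence)
```

The point of the sketch is that the plan carries its own checks: a run ends either with every expectation met or with a record of which expectations were not.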
The contrast with conventional agentic systems is significant:
A deterministic agent may not guarantee success, but it guarantees repeatable success or understandable failure.
Critically, a deterministic agent is adaptive. It updates its understanding when observations justify it. What changes is the standard for adaptation: evidence, not speculation.
We consider this a design pattern and a technical contribution in its own right. The two-phase loop, behavioral determinism, and evidence-driven adaptation are not tied to any single domain or application. They are reusable principles for building LLM agents that are intelligent, consistent, and reliable. In a series of blog posts, we will show what this looks like when applied to Android build repair, mobile app testing, gameplay testing, usability analysis, and other domains.