Safety

Kaval.AI’s mission is to make agentic AI pipelines predictable, observable, and safe. Safety is not a single feature but a set of guarantees baked into the runtime: data is typed at every boundary, control-flow expressions are evaluated without eval, agents are explicitly bounded, and any pipeline can be run fully offline for deterministic testing.

For a hands-on walkthrough, see Building agents with Workflows.

Typed I/O everywhere

Every value that crosses a boundary is validated. Workflow data_types are JSON-schema fragments compiled to Pydantic models, so each node input and output is checked (see Workflows). Likewise, every tool in the FunctionKernel has a Pydantic input and output model, and the kernel validates and coerces arguments before a call and the return value after (see Tools). A malformed value is caught at the boundary, not deep inside a downstream step.

A safe expression language

Branching nodes (if / switch) and context lookups are powered by a small expression language in kavalai.workflow.expressions: evaluate_expression(), evaluate_bool, evaluate_value, and ExpressionError.

Expressions are evaluated safely via an AST whitelist — never Python eval. Only comparisons, and / or / not, in, arithmetic, and dotted/indexed access into the context are permitted. Unknown names resolve to None, so guard checks degrade gracefully rather than crashing.

Bounded agents, explicit tools

Agentic loops are constrained on two fronts. An Agent (and the agent workflow node) is bounded by an explicit max_steps so it cannot run away (see Agents). And it can only act through tools addressed explicitly by URI — there is no implicit capability, only the tools you have registered with the kernel.

Deterministic testing

Because LLMs are non-deterministic, Kaval.AI lets you replace them entirely for testing. Inject a client_factory into the engine to run a workflow fully offline with canned model output:

from kavalai import WorkflowEngine

engine = WorkflowEngine.from_yaml(
    yaml,
    client_factory=lambda model, parameters, stats_receiver: StubClient(),
)

A StubClient subclasses BaseLlmClient and implements chat_completions to return canned response_model instances. This makes workflow logic — branching, data flow, node wiring — testable and repeatable without a single live model call, which complements the run-level auditing described in Observability.