Safety¶
Kaval.AI’s mission is to make agentic AI pipelines predictable, observable,
and safe. Safety is not a single feature but a set of guarantees baked into
the runtime: data is typed at every boundary, control-flow expressions are
evaluated without eval, agents are explicitly bounded, and any pipeline can
be run fully offline for deterministic testing.
For a hands-on walkthrough, see Building agents with Workflows.
Typed I/O everywhere¶
Every value that crosses a boundary is validated. Workflow data_types are
JSON-schema fragments compiled to Pydantic models, so each node input and output
is checked (see Workflows). Likewise, every tool in the
FunctionKernel has a Pydantic input and output model, and the
kernel validates and coerces arguments before a call and the return value after
(see Tools). A malformed value is caught at the boundary, not deep inside
a downstream step.
A safe expression language¶
Branching nodes (if / switch) and context lookups are powered by a small
expression language in kavalai.workflow.expressions:
evaluate_expression(), evaluate_bool, evaluate_value, and
ExpressionError.
Expressions are evaluated safely via an AST whitelist — never Python eval.
Only comparisons, and / or / not, in, arithmetic, and
dotted/indexed access into the context are permitted. Unknown names resolve to
None, so guard checks degrade gracefully rather than crashing.
Bounded agents, explicit tools¶
Agentic loops are constrained on two fronts. An Agent (and the
agent workflow node) is bounded by an explicit max_steps so it cannot
run away (see Agents). And it can only act through tools addressed
explicitly by URI — there is no implicit capability, only the tools you have
registered with the kernel.
Deterministic testing¶
Because LLMs are non-deterministic, Kaval.AI lets you replace them entirely for
testing. Inject a client_factory into the engine to run a workflow fully
offline with canned model output:
from kavalai import WorkflowEngine
engine = WorkflowEngine.from_yaml(
yaml,
client_factory=lambda model, parameters, stats_receiver: StubClient(),
)
A StubClient subclasses BaseLlmClient and implements
chat_completions to return canned response_model instances. This makes
workflow logic — branching, data flow, node wiring — testable and repeatable
without a single live model call, which complements the run-level auditing
described in Observability.