Performance Benchmarks

Measured framework overhead for selectools v0.26.0. These numbers answer one question: how much time does selectools add on top of the LLM call?

All benchmarks use LocalProvider (a zero-latency mock), so the timings below are pure framework overhead. In production, LLM API latency (100–2000ms per call) dominates; everything on this page is noise by comparison.

Environment


selectools	v0.26.0
Python	3.9 (CPython)
Machine	Apple M4, 24 GB RAM, macOS 26.3
Method	100 iterations per case, fresh agent/graph instances per iteration
Date	2026-06-12
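
The Method row can be illustrated with a minimal timing loop. This is a sketch, not the actual harness: `bench`, `percentile`, and `make_subject` are hypothetical names, and the nearest-rank percentile is an assumption (the real script may interpolate differently).

```python
import statistics
import time


def percentile(sorted_samples, q):
    """Nearest-rank percentile (q in 0..100) of an already sorted list."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, -(-len(sorted_samples) * q // 100))  # ceil(n * q / 100)
    return sorted_samples[int(rank) - 1]


def bench(make_subject, iterations=100):
    """Time subject() once per iteration; each subject is built untimed."""
    samples_ms = []
    for _ in range(iterations):
        subject = make_subject()        # fresh instance, outside the timed window
        start = time.perf_counter_ns()
        subject()                       # the only timed work
        samples_ms.append((time.perf_counter_ns() - start) / 1e6)
    samples_ms.sort()
    return {
        "mean": statistics.fmean(samples_ms),
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
    }
```

For example, `bench(lambda: (lambda: sum(range(100))))` returns a dict with the same four columns reported on this page, in milliseconds.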

Framework overhead

Operation	mean	p50	p95	p99
`agent.run()` single iteration	0.04ms	0.03ms	0.04ms	0.25ms
`agent.run()` with tool call	0.03ms	0.03ms	0.04ms	0.04ms
`graph.run()` 1 callable node	0.32ms	0.31ms	0.37ms	0.72ms
`graph.run()` 3 callable nodes	0.43ms	0.43ms	0.48ms	0.55ms
`graph.run()` 1 agent node	0.27ms	0.26ms	0.30ms	0.34ms
`graph.run()` 3 agent nodes	0.48ms	0.47ms	0.52ms	0.54ms
`graph.run()` 3 parallel nodes	0.51ms	0.51ms	0.54ms	0.57ms
`pipeline.run()` 1 step	<0.01ms	<0.01ms	<0.01ms	0.01ms
`pipeline.run()` 3 steps	<0.01ms	<0.01ms	<0.01ms	0.01ms
`pipeline.run()` 10 steps	0.01ms	0.01ms	0.01ms	0.01ms
checkpoint save (InMemory)	0.01ms	0.01ms	0.01ms	0.37ms
checkpoint load (InMemory)	<0.01ms	<0.01ms	<0.01ms	0.01ms
trace store save (InMemory)	<0.01ms	<0.01ms	<0.01ms	<0.01ms
trace store load (InMemory)	<0.01ms	<0.01ms	<0.01ms	<0.01ms

Takeaways:

An agent turn costs ~0.04ms of framework time. At a typical 500ms LLM round trip, selectools overhead is below 0.01% of wall clock.
Graph orchestration costs ~0.3ms for a single-node run, plus roughly 0.05–0.1ms for each additional node (about 0.05ms for callable nodes, about 0.1ms for agent nodes).
Pipelines, checkpoints, and trace stores are effectively free.
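
The arithmetic behind these takeaways can be checked directly from the table. The sketch below copies the mean values from the overhead table; the helper names are illustrative, and the per-node figure assumes cost grows linearly between the 1-node and 3-node cases.

```python
# Mean values copied from the overhead table (milliseconds).
AGENT_RUN_MS = 0.04
GRAPH_CALLABLE_MS = {1: 0.32, 3: 0.43}
GRAPH_AGENT_MS = {1: 0.27, 3: 0.48}
LLM_ROUND_TRIP_MS = 500.0  # the "typical" round trip used in the takeaway


def overhead_percent(framework_ms, llm_ms):
    """Framework time as a percentage of the LLM round trip."""
    return framework_ms / llm_ms * 100


def per_extra_node_ms(means_by_node_count):
    """Marginal cost per node, from the 1-node and 3-node means."""
    return (means_by_node_count[3] - means_by_node_count[1]) / 2


agent_share = overhead_percent(AGENT_RUN_MS, LLM_ROUND_TRIP_MS)
callable_step = per_extra_node_ms(GRAPH_CALLABLE_MS)
agent_step = per_extra_node_ms(GRAPH_AGENT_MS)

print(f"agent overhead: {agent_share:.3f}% of a 500ms call")
print(f"callable node:  {callable_step:.3f}ms per extra node")
print(f"agent node:     {agent_step:.3f}ms per extra node")
```

The agent share works out to 0.008%, which is where the "below 0.01%" figure comes from, and the two per-node values bracket the 0.05–0.1ms range.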

Comparison: selectools vs LangGraph

Same tasks, same zero-latency mock providers, 200 iterations each (LangGraph 1.x, langchain-core current as of 2026-06-12).

Task	selectools (mean)	LangGraph (mean)	delta
3-node linear pipeline	0.43ms	0.33ms	LangGraph 0.10ms faster
Conditional routing	0.37ms	0.28ms	LangGraph 0.09ms faster
3-step pipeline composition	<0.01ms	N/A (LCEL, not compared)	—

Honest reading: LangGraph's compiled Pregel runtime is about 0.1ms faster per run on graph micro-tasks. Both frameworks finish these tasks in under 0.5ms, which is under 0.1% of a typical 500ms LLM call, so neither framework's orchestration overhead is likely to be the bottleneck in an application that makes real LLM calls. selectools does not trade performance for its smaller API; it trades a compile step for a simpler execution model at a cost of ~0.1ms per graph run.

Reproduce

# Framework overhead (no extra deps)
python tests/benchmarks/bench_overhead.py

# Comparison (needs the competitor installed)
pip install langgraph langchain-core
python tests/benchmarks/bench_vs_langchain.py

The harness builds fresh agent/graph instances per iteration outside the timed window. Reusing one Agent across iterations accumulates conversation history and inflates later timings. An earlier revision of this harness had exactly that bug and reported 6.5ms for the 3-agent-node graph, which actually costs 0.48ms.
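
A toy model shows why reuse inflates the numbers. `ToyAgent` below is a hypothetical stand-in, not the selectools `Agent`: it assumes each run re-reads the full history, and it counts messages processed as a deterministic proxy for time.

```python
class ToyAgent:
    """Hypothetical agent: each run re-reads the whole history, then extends it."""

    def __init__(self):
        self.history = []
        self.messages_processed = 0  # deterministic proxy for time spent

    def run(self, prompt):
        self.history.append(prompt)
        self.messages_processed += len(self.history)  # cost grows with history
        self.history.append("reply")


def work_per_run_fresh(iterations):
    """Build a new agent for every iteration, as the current harness does."""
    costs = []
    for _ in range(iterations):
        agent = ToyAgent()
        agent.run("hi")
        costs.append(agent.messages_processed)
    return costs


def work_per_run_reused(iterations):
    """Reuse one agent, as in the buggy earlier revision."""
    agent = ToyAgent()
    costs = []
    for _ in range(iterations):
        before = agent.messages_processed
        agent.run("hi")
        costs.append(agent.messages_processed - before)
    return costs


print(work_per_run_fresh(5))   # [1, 1, 1, 1, 1]
print(work_per_run_reused(5))  # [1, 3, 5, 7, 9]
```

With fresh instances the per-run cost stays flat; with a reused instance it grows with every iteration, so the mean over many iterations overstates the cost of a single run.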