Test Coverage Analysis — Timmy Time Dashboard
- Date: 2026-03-06
- Overall coverage: 63.6% (7,996 statements, 2,910 missed)
- Threshold: 60% (passes, but barely)
- Test suite: 914 passed, 4 failed, 39 skipped, 5 errors in 35 seconds
Current Coverage by Package
| Package | Approx. Coverage | Notes |
|---|---|---|
| `spark/` | 90–98% | Best-covered package |
| `timmy_serve/` | 80–100% | Small package, well tested |
| `infrastructure/models/` | 42–97% | Registry well covered, multimodal weak |
| `dashboard/middleware/` | 79–100% | Solid |
| `dashboard/routes/` | 36–100% | Highly uneven; some routes untested |
| `integrations/` | 51–100% | Paperclip well covered; Discord weak |
| `timmy/` | 0–100% | Several core modules at 0% |
| `brain/` | 0–75% | Client and worker very low |
| `infrastructure/events/` | 0% | Completely untested |
| `infrastructure/error_capture.py` | 0% | Completely untested |
Priority 1 — Zero-Coverage Modules (0%)
These modules have no test coverage at all and represent the biggest risk:
| Module | Stmts | Purpose |
|---|---|---|
| `src/timmy/semantic_memory.py` | 187 | Semantic memory system (core agent feature) |
| `src/timmy/agents/timmy.py` | 165 | Main Timmy agent class |
| `src/timmy/agents/base.py` | 57 | Base agent class |
| `src/timmy/interview.py` | 46 | Interview flow |
| `src/infrastructure/error_capture.py` | 91 | Error capture/reporting |
| `src/infrastructure/events/broadcaster.py` | 67 | Event broadcasting |
| `src/infrastructure/events/bus.py` | 74 | Event bus |
| `src/infrastructure/openfang/tools.py` | 41 | OpenFang tool definitions |
| `src/brain/schema.py` | 14 | Brain schema definitions |
Recommendation: timmy/agents/timmy.py (165 stmts) and semantic_memory.py (187 stmts) are the highest-value targets. The events subsystem (broadcaster.py + bus.py = 141 stmts) is critical infrastructure with zero tests.
Priority 2 — Under-Tested Modules (<50%)
| Module | Cover | Stmts Missed | Purpose |
|---|---|---|---|
| `brain/client.py` | 14.8% | 127 | Brain client (primary brain interface) |
| `brain/worker.py` | 16.1% | 156 | Background brain worker |
| `brain/embeddings.py` | 35.0% | 26 | Embedding generation |
| `timmy/approvals.py` | 39.1% | 42 | Approval workflow |
| `dashboard/routes/marketplace.py` | 36.4% | 21 | Marketplace routes |
| `dashboard/routes/paperclip.py` | 41.1% | 96 | Paperclip dashboard routes |
| `infrastructure/hands/tools.py` | 41.3% | 27 | Tool execution |
| `infrastructure/models/multimodal.py` | 42.6% | 81 | Multimodal model support |
| `dashboard/routes/router.py` | 42.9% | 12 | Route registration |
| `dashboard/routes/swarm.py` | 43.3% | 17 | Swarm routes |
| `timmy/cascade_adapter.py` | 43.2% | 25 | Cascade LLM adapter |
| `timmy/tools_intro/__init__.py` | 44.7% | 84 | Tool introduction system |
| `timmy/tools.py` | 46.4% | 147 | Agent tool definitions |
| `timmy/cli.py` | 47.4% | 30 | CLI entry point |
| `timmy/conversation.py` | 48.5% | 34 | Conversation management |
Recommendation: brain/client.py + brain/worker.py together miss 283 statements and are the core of the brain/memory system. timmy/tools.py misses 147 statements and is the agent's tool registry — high impact.
Priority 3 — Test Infrastructure Issues
3a. Broken Tests (4 failures)
All in tests/test_setup_script.py — tests reference /home/ubuntu/setup_timmy.sh which doesn't exist. These tests are environment-specific and should either:
- Be marked `@pytest.mark.skip_ci` or `@pytest.mark.functional`
- Use a fixture to locate the script relative to the project
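
One way to locate the script relative to the project, instead of hard-coding `/home/ubuntu/`, is sketched below. The helper name `locate_script` is hypothetical and not part of the existing test suite. It assumes the script sits in the test's directory or one of its ancestors.

```python
from pathlib import Path
from typing import Optional


def locate_script(name: str, start: Path) -> Optional[Path]:
    """Return the first file called `name` in `start` or any parent directory.

    Hypothetical helper: walks upward from `start` so the lookup does not
    depend on where the repository happens to be checked out.
    """
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None  # Caller decides whether a missing script means "skip".
```

In a test module the result could feed a skip guard, for example `pytest.mark.skipif(locate_script("setup_timmy.sh", Path(__file__).parent) is None, reason="setup script not found")`, so a missing script produces a skip rather than a failure or collection error.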
3b. Collection Errors (5 errors)
tests/functional/test_setup_prod.py — same issue, references a non-existent script path. Should be guarded with a skip condition.
3c. pytest-xdist Conflicts with Coverage
The pyproject.toml addopts includes -n auto --dist worksteal (xdist), but make test-cov also passes --cov flags. This causes a conflict:
pytest: error: unrecognized arguments: -n --dist worksteal
Fix: Either:
- Remove `-n auto --dist worksteal` from `addopts` and add it only in the `make test` target
- Or use `-p no:xdist` in the coverage targets (current workaround)
3d. Tox Configuration
tox.ini has unit and integration environments that run the exact same command — they're aliases. This is misleading:
- `unit` should run `-m unit` (fast, no I/O)
- `integration` should run `-m integration` (may use SQLite)
- Consider adding a `coverage` tox env
3e. CI Workflow (tests.yml)
- CI uses `pip install -e ".[dev]"` but the project uses Poetry, so dependency resolution may differ
- CI doesn't pass marker filters, so it runs all tests including those that may need Docker/Ollama
- No coverage enforcement in CI (the `fail_under=60` in pyproject.toml only works with `--cov-fail-under`)
- No caching of Poetry virtualenvs
Priority 4 — Test Quality Gaps
4a. Missing Error-Path Testing
Many modules have happy-path tests but lack coverage for:
- Graceful degradation paths: The architecture mandates graceful degradation when Ollama/Redis/AirLLM are unavailable, but most fallback paths are untested (e.g., `cascade.py` lines 563–655)
- `brain/client.py`: Only 14.8% covered; connection failures, retries, and error handling are untested
- `infrastructure/error_capture.py`: 0%; the error capture system itself has no tests
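
A fallback-path test might take roughly the following shape. `generate_with_fallback` is a hypothetical stand-in written only to keep the sketch self-contained. It does not reflect the actual code in `cascade.py` or `brain/client.py`, whose interfaces are not shown in this report.

```python
from typing import Callable
from unittest.mock import Mock


# Hypothetical stand-in for a degradation path; the real logic lives in the project.
def generate_with_fallback(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    prompt: str,
) -> str:
    try:
        return primary(prompt)
    except ConnectionError:
        # Degrade gracefully instead of propagating the outage.
        return fallback(prompt)


def test_falls_back_when_primary_is_unreachable() -> None:
    # Simulate an unavailable backend by raising on every call.
    primary = Mock(side_effect=ConnectionError("backend unavailable"))
    fallback = Mock(return_value="fallback reply")

    assert generate_with_fallback(primary, fallback, "hi") == "fallback reply"
    primary.assert_called_once_with("hi")
    fallback.assert_called_once_with("hi")


def test_primary_result_is_used_when_available() -> None:
    primary = Mock(return_value="primary reply")
    fallback = Mock()

    assert generate_with_fallback(primary, fallback, "hi") == "primary reply"
    fallback.assert_not_called()
```

Injecting the failure through `Mock(side_effect=...)` keeps the test independent of whether Ollama, Redis, or AirLLM is installed, which is the condition the fallback paths are meant to handle.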
4b. No Integration Tests for Events System
The infrastructure/events/ package (broadcaster.py + bus.py) is 0% covered. This is the pub/sub backbone for the application. Tests should cover:
- Event subscription and dispatch
- Multiple subscribers
- Error handling in event handlers
- Async event broadcasting
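
The cases listed above could be exercised along these lines. The `EventBus` class here is a hypothetical minimal bus included only so the tests run on their own. The real API in `bus.py` and `broadcaster.py` may differ, and the tests would need to be adapted to it.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Handler = Callable[[Any], Awaitable[None]]


class EventBus:
    """Hypothetical minimal async bus; not the project's implementation."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    async def publish(self, topic: str, payload: Any) -> List[BaseException]:
        # Run every handler concurrently; collect failures instead of raising
        # so one broken subscriber cannot block the others.
        results = await asyncio.gather(
            *(handler(payload) for handler in self._handlers.get(topic, [])),
            return_exceptions=True,
        )
        return [r for r in results if isinstance(r, BaseException)]


def test_all_subscribers_receive_event() -> None:
    received: List[str] = []

    async def first(payload: Any) -> None:
        received.append(f"first:{payload}")

    async def second(payload: Any) -> None:
        received.append(f"second:{payload}")

    bus = EventBus()
    bus.subscribe("task.created", first)
    bus.subscribe("task.created", second)

    errors = asyncio.run(bus.publish("task.created", "t1"))
    assert errors == []
    assert sorted(received) == ["first:t1", "second:t1"]


def test_failing_handler_does_not_block_others() -> None:
    received: List[str] = []

    async def broken(payload: Any) -> None:
        raise ValueError("handler failed")

    async def healthy(payload: Any) -> None:
        received.append(payload)

    bus = EventBus()
    bus.subscribe("task.created", broken)
    bus.subscribe("task.created", healthy)

    errors = asyncio.run(bus.publish("task.created", "t1"))
    assert received == ["t1"]
    assert len(errors) == 1 and isinstance(errors[0], ValueError)
```

Driving the coroutine with `asyncio.run` avoids a dependency on an async pytest plugin. If the project already uses one, the tests could be written as `async def` instead.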
4c. Security Tests Are Thin
- `tests/security/` has only 3 files totaling ~140 lines
- `src/timmy_serve/l402_proxy.py` (payment gating, listed as security-sensitive) has no dedicated test file
- CSRF tests exist but bypass/traversal tests are minimal
- No tests for the `approvals.py` authorization workflow (39.1% covered)
4d. Missing WebSocket Tests
WebSocket handler (ws_manager/handler.py) has 81.2% coverage, but the disconnect/reconnect and error paths (lines 132–147) aren't tested. For a real-time dashboard, WebSocket reliability is critical.
4e. No Tests for timmy/agents/ Subpackage
The Agno-based agent classes (base.py, timmy.py) are at 0% coverage (222 statements). These are stubbed in conftest but never actually exercised. Even with the Agno stub, the control flow and prompt construction logic should be tested.
Priority 5 — Test Speed & Parallelism
| Metric | Value |
|---|---|
| Total wall time | ~35s (sequential) |
| Parallel (`-n auto`) | Would be ~10-15s |
| Slowest category | Functional tests (HTTP, Docker) |
Observations:
- 30-second timeout per test is generous — consider 10s for unit, 30s for integration
- The `--dist worksteal` strategy is good for uneven test durations
- 39 tests are skipped (mostly due to missing markers/env); this is expected
- No test duration profiling is configured (consider `--durations=10`)
Recommended Action Plan
Quick Wins (High ROI, Low Effort)
- Fix the 4 broken tests in `test_setup_script.py` (add skip guards)
- Fix xdist/coverage conflict in `pyproject.toml` addopts
- Differentiate tox `unit` vs `integration` environments
- Add `--durations=10` to default addopts for profiling slow tests
- Add `--cov-fail-under=60` to CI workflow to enforce the threshold
Medium Effort, High Impact
- Test the events system (`broadcaster.py` + `bus.py`): 141 uncovered statements, critical infrastructure
- Test `timmy/agents/timmy.py`: 165 uncovered statements, core agent
- Test `brain/client.py` and `brain/worker.py`: 283 uncovered statements, core memory
- Test `timmy/tools.py` error paths: 147 uncovered statements
- Test `error_capture.py`: 91 uncovered statements, observability blind spot
Longer Term
- Add graceful-degradation tests — verify fallback behavior for all optional services
- Expand security test suite — approvals, L402 proxy, input sanitization
- Add coverage tox environment and enforce in CI
- Align CI with Poetry: use `poetry install` instead of pip for consistent resolution
- Target 75% coverage as the next threshold milestone (currently 63.6%)
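
To put the 75% milestone in context, the arithmetic below uses only the statement counts quoted in this report. It makes the optimistic assumption that every uncovered statement in the "Medium Effort, High Impact" list becomes fully covered, so the result is an upper bound rather than a forecast.

```python
import math

TOTAL_STATEMENTS = 7_996
MISSED_STATEMENTS = 2_910

# Uncovered statements named in the "Medium Effort, High Impact" list.
MEDIUM_EFFORT_TARGETS = {
    "events (broadcaster.py + bus.py)": 141,
    "timmy/agents/timmy.py": 165,
    "brain/client.py + brain/worker.py": 283,
    "timmy/tools.py": 147,
    "error_capture.py": 91,
}


def coverage_pct(covered: int, total: int) -> float:
    return 100.0 * covered / total


def statements_needed(target_pct: float, covered: int, total: int) -> int:
    """Additional covered statements required to reach target_pct."""
    return max(0, math.ceil(total * target_pct / 100.0) - covered)


covered_now = TOTAL_STATEMENTS - MISSED_STATEMENTS              # 5086
gap_to_75 = statements_needed(75.0, covered_now, TOTAL_STATEMENTS)  # 911
recoverable = sum(MEDIUM_EFFORT_TARGETS.values())               # 827
best_case = coverage_pct(covered_now + recoverable, TOTAL_STATEMENTS)

print(f"current: {coverage_pct(covered_now, TOTAL_STATEMENTS):.1f}%")
print(f"needed for 75%: {gap_to_75} statements")
print(f"after medium-effort items: {best_case:.1f}%")
```

Under that assumption the medium-effort items alone reach just under 74%, leaving roughly 84 statements to find elsewhere, for example in `semantic_memory.py` (187 statements at 0%).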
Coverage Floor Modules (Already Well-Tested)
These modules are at 95%+ and serve as good examples of testing patterns:
- `spark/eidos.py`: 98.3%
- `spark/memory.py`: 98.3%
- `infrastructure/models/registry.py`: 97.1%
- `timmy/agent_core/ollama_adapter.py`: 97.8%
- `timmy/agent_core/interface.py`: 100%
- `dashboard/middleware/security_headers.py`: 100%
- `dashboard/routes/agents.py`: 100%
- `timmy_serve/inter_agent.py`: 100%