352 Commits

Author SHA1 Message Date
b30b5c6b57 [loop-cycle-6] Break thinking rumination loop — semantic dedup (#38)
Some checks failed
Tests / lint (push) Successful in 3s
Tests / test (push) Failing after 25s
Add post-generation similarity check to ThinkingEngine.think_once().

Problem: Timmy's thinking engine generates repetitive thoughts because
small local models ignore 'don't repeat' instructions in the prompt.
The same observation ('still no chat messages', 'Alexander's name is in
profile') would appear 14+ times in a single day's journal.

Fix: After generating a thought, compare it against the last 5 thoughts
using SequenceMatcher. If similarity >= 0.6, retry with a new seed up to
2 times. If all retries produce repetitive content, discard rather than
store. Uses stdlib difflib — no new dependencies.
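
The check described above can be sketched roughly as follows. This is an illustration only, not the code in thinking.py: the function names, the lowercasing step, and the `generate` callable are assumptions; the threshold (0.6), window (last 5 thoughts), and retry count (2) come from the commit message.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.6  # from the commit message
RECENT_WINDOW = 5           # compare against the last 5 thoughts
MAX_RETRIES = 2             # retries allowed after the first attempt


def is_too_similar(candidate, recent, threshold=SIMILARITY_THRESHOLD):
    """Return True if candidate is a near-duplicate of any recent thought."""
    needle = candidate.strip().lower()  # normalisation is an assumption
    for previous in recent[-RECENT_WINDOW:]:
        ratio = SequenceMatcher(None, needle, previous.strip().lower()).ratio()
        if ratio >= threshold:
            return True
    return False


def think_with_dedup(generate, recent):
    """Call generate() up to 1 + MAX_RETRIES times; None means 'discard'."""
    for _attempt in range(1 + MAX_RETRIES):
        thought = generate()  # hypothetical callable that picks a new seed
        if not is_too_similar(thought, recent):
            return thought
    return None  # every attempt was repetitive, so nothing is stored
```

Returning None instead of the last attempt is what keeps a repetitive thought out of the journal when all retries fail.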

Changes:
- thinking.py: Add _is_too_similar() method with SequenceMatcher
- thinking.py: Wrap generation in retry loop with dedup check
- test_thinking.py: 7 new tests covering exact match, near match,
  different thoughts, retry behavior, and max-retry discard

+96/-20 lines in thinking.py, +87 lines in tests.
2026-03-14 16:21:16 -04:00
0d61b709da Merge pull request '[loop-cycle-5] Persist chat history in SQLite (#46)' (#63) from fix/issue-46-chat-persistence into main
Some checks failed
Tests / lint (push) Successful in 3s
Tests / test (push) Failing after 14s
2026-03-14 16:10:55 -04:00
79edfd1106 feat: persist chat history in SQLite — survives server restarts
Some checks failed
Tests / lint (pull_request) Successful in 2s
Tests / test (pull_request) Failing after 13s
Replace in-memory MessageLog with SQLite-backed implementation.
Same API surface (append/all/clear/len) so zero caller changes needed.

- data/chat.db stores messages with role, content, timestamp, source
- Lazy DB connection (opened on first use, not at import time)
- Retention policy: oldest messages pruned when count > 500
- New .recent(limit) method for efficient last-N queries
- Thread-safe with explicit locking
- WAL mode for concurrent read performance
- Test isolation: conftest redirects DB to tmp_path per test
- 8 new tests: persistence, retention, concurrency, source field
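
A minimal sketch of how the points above might fit together, assuming a single `messages` table and a default `source` value that are not taken from the actual implementation. Only the method names (append/all/clear/len/recent), the stored columns, the 500-message cap, lazy connection, locking, and WAL mode come from the commit message.

```python
import sqlite3
import threading
import time


class MessageLog:
    """Sketch of a SQLite-backed log with append/all/clear/len/recent."""

    def __init__(self, path, max_messages=500):
        self._path = str(path)
        self._max = max_messages
        self._lock = threading.Lock()
        self._conn = None  # opened lazily on first use, not at import time

    def _db(self):
        # Callers must already hold self._lock.
        if self._conn is None:
            conn = sqlite3.connect(self._path, check_same_thread=False)
            conn.execute("PRAGMA journal_mode=WAL")
            conn.execute(
                "CREATE TABLE IF NOT EXISTS messages ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, role TEXT, "
                "content TEXT, timestamp REAL, source TEXT)"
            )
            conn.commit()
            self._conn = conn
        return self._conn

    def append(self, role, content, source="browser"):
        with self._lock:
            db = self._db()
            db.execute(
                "INSERT INTO messages (role, content, timestamp, source) "
                "VALUES (?, ?, ?, ?)",
                (role, content, time.time(), source),
            )
            # Retention: keep only the newest max_messages rows.
            db.execute(
                "DELETE FROM messages WHERE id NOT IN "
                "(SELECT id FROM messages ORDER BY id DESC LIMIT ?)",
                (self._max,),
            )
            db.commit()

    def recent(self, limit):
        with self._lock:
            rows = self._db().execute(
                "SELECT role, content, timestamp, source FROM messages "
                "ORDER BY id DESC LIMIT ?",
                (limit,),
            ).fetchall()
        return rows[::-1]  # return oldest first

    def all(self):
        with self._lock:
            return self._db().execute(
                "SELECT role, content, timestamp, source FROM messages ORDER BY id"
            ).fetchall()

    def clear(self):
        with self._lock:
            db = self._db()
            db.execute("DELETE FROM messages")
            db.commit()

    def __len__(self):
        with self._lock:
            return self._db().execute("SELECT COUNT(*) FROM messages").fetchone()[0]
```

Pointing `path` at a per-test temporary directory gives the kind of test isolation the conftest change describes.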

Closes #46
2026-03-14 16:09:26 -04:00
013a2cc330 Merge pull request 'feat: add --session-id to timmy chat CLI' (#62) from fix/cli-session-id into main
Some checks failed
Tests / lint (push) Successful in 3s
Tests / test (push) Failing after 14s
2026-03-14 16:06:16 -04:00
f426df5b42 feat: add --session-id option to timmy chat CLI
Some checks failed
Tests / lint (pull_request) Successful in 3s
Tests / test (pull_request) Failing after 15s
Allows specifying a named session for conversation persistence.
Use cases:
- Autonomous loops can have their own session (e.g. --session-id loop)
- Multiple users/agents can maintain separate conversations
- Testing different conversation threads without polluting the default

Precedence: --session-id > --new > default 'cli' session
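
The precedence rule can be illustrated with a small helper. The function name and the way a fresh session ID is generated for --new are assumptions made for this sketch; only the ordering and the default 'cli' session come from the commit message.

```python
import uuid


def resolve_session_id(session_id=None, new=False, default="cli"):
    """Apply the precedence: --session-id > --new > default session."""
    if session_id:
        return session_id
    if new:
        # Hypothetical naming scheme for a fresh conversation.
        return f"{default}-{uuid.uuid4().hex[:8]}"
    return default
```

With this ordering, passing both --session-id loop and --new still resolves to the named session.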
2026-03-14 16:05:00 -04:00
bef4fc1024 Merge pull request '[loop-cycle-4] Push event system coverage to ≥80% on all modules' (#61) from fix/issue-45-event-coverage into main
Some checks failed
Tests / lint (push) Successful in 3s
Tests / test (push) Failing after 14s
2026-03-14 16:02:27 -04:00
9535dd86de test: push event system coverage to ≥80% on all three modules
Some checks failed
Tests / lint (pull_request) Successful in 4s
Tests / test (pull_request) Failing after 16s
Add 3 targeted tests for infrastructure/error_capture.py:
- test_stale_entries_pruned: exercises dedup cache pruning (line 61)
- test_git_context_fallback_on_failure: exercises exception path (lines 90-91)
- test_returns_none_when_feedback_disabled: exercises early return (line 112)

Coverage results (63 tests, all passing):
- error_capture.py: 75.6% → 80.0%
- broadcaster.py: 93.9% (unchanged)
- bus.py: 92.9% (unchanged)
- Total: 88.1% → 89.4%

Closes #45
2026-03-14 16:01:05 -04:00
70d5dc5ce1 fix: replace eval() with AST-walking safe evaluator in calculator
Some checks failed
Tests / lint (push) Successful in 3s
Tests / test (push) Failing after 14s
Fixes #52

- Replace eval() in calculator() with _safe_eval() that walks the AST
  and only permits: numeric constants, arithmetic ops (+,-,*,/,//,%,**),
  unary +/-, math module access, and whitelisted builtins (abs, round,
  min, max)
- Reject all other syntax: imports, attribute access on non-math objects,
  lambdas, comprehensions, string literals, etc.
- Add 39 tests covering arithmetic, precedence, math functions,
  allowed builtins, error handling, and 14 injection prevention cases
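
For readers unfamiliar with the technique, an AST-walking evaluator along these lines might look like the sketch below. It is not the project's _safe_eval(); the helper names and error messages are assumptions, and it does not guard against very expensive expressions such as huge exponents.

```python
import ast
import math
import operator

_BIN_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod, ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_BUILTINS = {"abs": abs, "round": round, "min": min, "max": max}


def _math_attr(node):
    """Resolve math.<public name>; reject every other attribute access."""
    if (isinstance(node.value, ast.Name) and node.value.id == "math"
            and not node.attr.startswith("_") and hasattr(math, node.attr)):
        return getattr(math, node.attr)
    raise ValueError("attribute access is only allowed on math")


def _eval(node):
    if isinstance(node, ast.Constant):
        # bool is a subclass of int; reject it along with strings, bytes, None.
        if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
            raise ValueError("only numeric constants are allowed")
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval(node.operand))
    if isinstance(node, ast.Attribute):
        value = _math_attr(node)
        if callable(value):
            raise ValueError("math functions must be called")
        return value  # a constant such as math.pi
    if isinstance(node, ast.Call) and not node.keywords:
        if isinstance(node.func, ast.Name) and node.func.id in _BUILTINS:
            func = _BUILTINS[node.func.id]
        elif isinstance(node.func, ast.Attribute):
            func = _math_attr(node.func)
            if not callable(func):
                raise ValueError("not a callable math attribute")
        else:
            raise ValueError("function not allowed")
        return func(*[_eval(arg) for arg in node.args])
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def safe_eval(expression):
    """Evaluate an arithmetic expression without calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)
```

Because every node type is matched explicitly, anything not listed (names, lambdas, comprehensions, subscripts) falls through to the final ValueError.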
2026-03-14 15:51:35 -04:00
122d07471e Merge pull request 'fix: sanitize dynamic innerHTML in HTML templates (#47)' (#58) from fix/xss-sanitize into main
Some checks failed
Tests / lint (push) Successful in 4s
Tests / test (push) Failing after 12s
2026-03-14 15:45:11 -04:00
3d110098d1 Merge pull request 'feat: Add Kimi agent workspace with development scaffolding' (#44) from kimi/agent-workspace-init into main
Some checks failed
Tests / lint (push) Successful in 2s
Tests / test (push) Failing after 14s
Reviewed-on: #44
2026-03-14 15:09:04 -04:00
591954891a fix: sanitize dynamic innerHTML in templates (#47)
Some checks failed
Tests / lint (pull_request) Successful in 2s
Tests / test (pull_request) Failing after 15s
2026-03-14 15:07:00 -04:00
bb287b2c73 fix: sanitize WebSocket data in HTML templates (XSS #47)
2026-03-14 15:01:48 -04:00
6233a8ccd6 feat: Add Kimi agent workspace with development scaffolding
Some checks failed
Tests / lint (pull_request) Successful in 3s
Tests / test (pull_request) Failing after 13s
Create the Kimi (Moonshot AI) agent workspace per AGENTS.md conventions:

Workspace Structure:
- .kimi/AGENTS.md - Workspace guide and conventions
- .kimi/README.md - Quick reference documentation
- .kimi/CHECKPOINT.md - Session state tracking
- .kimi/TODO.md - Task list for upcoming work
- .kimi/notes/ - Working notes directory
- .kimi/plans/ - Plan documents
- .kimi/worktrees/ - Git worktrees (reserved)

Development Scripts:
- scripts/bootstrap.sh - One-time workspace setup (venv, deps, .env)
- scripts/resume.sh - Quick status check + resume prompt
- scripts/dev.sh - Development helpers (status, test, lint, format, clean, nuke)

Features:
- Validates Python 3.11+, venv, deps, .env, git config
- Provides quick status on git, tests, Ollama, dashboard
- Commands for testing, linting, formatting, cleaning

Per AGENTS.md:
- Kimi is Build Tier for large-context feature drops
- Follows existing project patterns
- No changes to source code - workspace only
2026-03-14 14:30:38 -04:00
fa838b0063 fix: clean shutdown — silence MCP async-generator teardown noise
Some checks failed
Tests / lint (push) Successful in 2s
Tests / test (push) Failing after 13s
Swallow anyio cancel-scope RuntimeError and BaseExceptionGroup
from MCP stdio_client generators during GC on voice loop exit.
Custom unraisablehook + loop exception handler + warnings filter.
2026-03-14 14:12:05 -04:00
782218aa2c fix: voice loop — persistent event loop, markdown stripping, MCP noise
Some checks failed
Tests / lint (push) Successful in 3s
Tests / test (push) Failing after 12s
Three fixes from real-world testing:

1. Event loop: replaced asyncio.run() with a persistent loop so
   Agno's MCP sessions survive across conversation turns. No more
   'Event loop is closed' errors on turn 2+.

2. Markdown stripping: voice preamble tells Timmy to respond in
   natural spoken language, plus _strip_markdown() as a safety net
   removes **bold**, *italic*, bullets, headers, code fences, etc.
   TTS no longer reads 'asterisk asterisk'.

3. MCP noise: _suppress_mcp_noise() quiets mcp/agno/httpx loggers
   during voice mode so the terminal shows clean transcript only.

32 tests (12 new for markdown stripping + persistent loop).
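
As an illustration of the markdown safety net in fix 2, a regex-based stripper could look like this. The patterns are assumptions for the sketch, not the contents of _strip_markdown(); in particular, it drops fence lines but keeps the code inside them.

```python
import re


def strip_markdown(text):
    """Remove common markdown markers so TTS does not read them aloud."""
    # Drop code-fence lines (``` or ```lang), keeping the fenced content.
    text = re.sub(r"^[ \t]*```[^\n]*\n?", "", text, flags=re.MULTILINE)
    # Inline code: keep the content, drop the backticks.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Headers: leading '#' characters.
    text = re.sub(r"^[ \t]{0,3}#{1,6}[ \t]+", "", text, flags=re.MULTILINE)
    # Bullets: '-', '*', or '+' followed by whitespace at line start.
    text = re.sub(r"^[ \t]*[-*+][ \t]+", "", text, flags=re.MULTILINE)
    # Bold before italic, so '**' is not consumed as two italic markers.
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"\*(.+?)\*", r"\1", text)
    return text.strip()
```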
2026-03-14 14:05:24 -04:00
dbadfc425d feat: sovereign voice loop — timmy voice command
Some checks failed
Tests / lint (push) Successful in 4s
Tests / test (push) Failing after 14s
Adds fully local listen-think-speak voice interface.
STT: Whisper, LLM: Ollama, TTS: Piper. No cloud, no network.

- src/timmy/voice_loop.py: VoiceLoop with VAD, Whisper, Piper
- src/timmy/cli.py: new voice command
- pyproject.toml: voice extras updated
- 20 new tests
2026-03-14 13:58:56 -04:00
d770d66150 Merge pull request 'fix: fact distillation — block garbage and secrets, improve dedup' (#43) from fix/fact-distillation into main
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 33s
hermes/v0.1
2026-03-14 13:00:59 -04:00
8ecc0b1780 fix: fact distillation — block garbage and secrets, improve dedup
All checks were successful
Tests / lint (pull_request) Successful in 5s
Tests / test (pull_request) Successful in 33s
- Rewrite distillation prompt with explicit GOOD/BAD examples
  Good: user preferences, project decisions, learned knowledge
  Bad: meta-observations, internal state, credentials
- Add security filter: block facts containing token/password/secret/key patterns
- Add meta-observation filter: block self-referential 'my thinking' facts
- Lower dedup threshold 0.9 -> 0.75 to catch paraphrased duplicates
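
A rough sketch of how such filters might be combined. The regular expressions, the function name, and the use of difflib for the similarity score are assumptions; only the blocked keyword families and the 0.75 threshold come from the commit message.

```python
import re
from difflib import SequenceMatcher

# Hypothetical patterns; the real filter may match differently.
_SECRET_RE = re.compile(r"\b(token|password|secret|key)\b", re.IGNORECASE)
_META_RE = re.compile(r"\bmy (thinking|thoughts?)\b", re.IGNORECASE)
DEDUP_THRESHOLD = 0.75  # lowered from 0.9 per the commit message


def should_store_fact(fact, known_facts):
    """Return False for secrets, meta-observations, and near-duplicates."""
    if _SECRET_RE.search(fact) or _META_RE.search(fact):
        return False
    for known in known_facts:
        ratio = SequenceMatcher(None, fact.lower(), known.lower()).ratio()
        if ratio >= DEDUP_THRESHOLD:
            return False
    return True
```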

Ref #40
2026-03-14 13:00:30 -04:00
60631a7ad1 Merge pull request 'fix: persistent event loop in CLI interview — no more Event loop is closed' (#42) from fix/cli-event-loop into main
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 33s
2026-03-14 12:58:46 -04:00
b222b28856 fix: use persistent event loop in interview command
All checks were successful
Tests / lint (pull_request) Successful in 3s
Tests / test (pull_request) Successful in 32s
Replace repeated asyncio.run() calls with a single event loop that
persists across all interview questions. The old approach created and
destroyed loops per question, orphaning MCP stdio transports and
causing 'Event loop is closed' errors on ~50% of questions.

Also adds clean shutdown: closes MCP sessions before closing the loop.
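
The pattern can be sketched with the standard library alone. Here `ask` is a hypothetical stand-in for the real per-question coroutine, and the MCP cleanup is only indicated by a comment.

```python
import asyncio


async def ask(question):
    """Hypothetical stand-in for one interview turn."""
    await asyncio.sleep(0)
    return f"answer to {question!r}"


def run_interview(questions):
    """Run every question on one loop instead of asyncio.run() per question."""
    loop = asyncio.new_event_loop()
    try:
        return [loop.run_until_complete(ask(q)) for q in questions]
    finally:
        # Long-lived resources (e.g. MCP sessions) would be closed here,
        # while the loop they were created on is still open.
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()
```

Anything created during the first question stays bound to a loop that is still alive for the later questions, which is what avoids the 'Event loop is closed' error.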

Ref #36
2026-03-14 12:58:11 -04:00
f19b52a4dc Merge pull request 'fix: corrupted memory state + regex bug in update_user_profile' (#41) from fix/corrupted-memory-state into main
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 33s
2026-03-14 12:56:52 -04:00
58ddf55282 fix: regex corruption in update_user_profile + hot memory write guards
All checks were successful
Tests / lint (pull_request) Successful in 5s
Tests / test (pull_request) Successful in 36s
- memory_system.py: fix regex replacement in update_user_profile()
  Use a lambda instead of a raw replacement string to prevent corruption
- memory_system.py: add guards to update_section() for empty/oversized writes

Ref #39
2026-03-14 12:55:02 -04:00
d062b0a890 Merge pull request 'cleanup: delete ~8,000 lines of dead code + sovereignty fix' (#33) from cleanup/code-review-issues into main
All checks were successful
Tests / lint (push) Successful in 15s
Tests / test (push) Successful in 1m19s
Reviewed-on: #33
2026-03-14 09:54:17 -04:00
2f623826bd cleanup: delete dead modules — ~7,900 lines removed
All checks were successful
Tests / lint (pull_request) Successful in 5s
Tests / test (pull_request) Successful in 1m30s
Closes #22, Closes #23

Deleted: brain/, swarm/, openfang/, paperclip/, cascade_adapter,
memory_migrate, agents/timmy.py, dead routes + all corresponding tests.

Updated pyproject.toml, app.py, loop_qa.py for removed imports.
2026-03-14 09:49:24 -04:00
c7221e27cc Merge pull request 'refactor: YAML-driven agent config — kill hardcoded personas' (#21) from refactor/yaml-driven-agents into main
All checks were successful
Tests / lint (push) Successful in 7s
Tests / test (push) Successful in 49s
Reviewed-on: #21
2026-03-14 08:44:04 -04:00
Trip T
0e89caa830 test: update delegation tests for YAML-driven agent IDs
All checks were successful
Tests / lint (pull_request) Successful in 8s
Tests / test (pull_request) Successful in 1m7s
Old hardcoded IDs (seer, forge, echo, helm, quill) replaced with
YAML-defined IDs (orchestrator, researcher, coder, writer, memory,
experimenter). Added test that old names are explicitly rejected.
2026-03-14 08:40:24 -04:00
dc380860ba Merge pull request 'fix: MCP integration — StdioServerParameters + smoke-tested' (#20) from claude/sharp-mcnulty into main
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 35s
Reviewed-on: #20
2026-03-12 22:06:55 -04:00
Trip T
bd1aa55904 fix: use StdioServerParameters to bypass Agno executable whitelist
All checks were successful
Tests / lint (pull_request) Successful in 23s
Tests / test (pull_request) Successful in 32s
Agno's MCPTools has an undocumented executable whitelist that blocks
gitea-mcp (Go binary). Switch to server_params=StdioServerParameters()
which bypasses this restriction. Also fixes:

- Use tools.session.call_tool() for standalone invocation (MCPTools
  doesn't expose call_tool() directly)
- Use close() instead of disconnect() for cleanup
- Resolve gitea-mcp path via ~/go/bin fallback when not on PATH
- Stub mcp.client.stdio in test conftest
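
The PATH-then-fallback lookup for the gitea-mcp binary could be sketched as below. The function name and the `fallback_dir` parameter are assumptions added so the example can be exercised without a real ~/go/bin.

```python
import shutil
from pathlib import Path


def resolve_binary(name="gitea-mcp", fallback_dir=None):
    """Return the binary's path from PATH, else from ~/go/bin, else None."""
    found = shutil.which(name)
    if found:
        return found
    directory = Path(fallback_dir) if fallback_dir else Path.home() / "go" / "bin"
    candidate = directory / name
    if candidate.is_file():
        return str(candidate)
    return None
```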

Smoke-tested end-to-end against real Gitea: connect, list_issues,
create issue, close issue, create_gitea_issue_via_mcp — all pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:03:45 -04:00
Trip T
8aef55ac07 fix: correct MCP tool names, timeout kwarg, and make mcp a core dep
- Fix tool names to match gitea-mcp server: issue_write, issue_read,
  list_issues, pull_request_write, etc. (old names didn't exist)
- Fix timeout → timeout_seconds (MCPTools API)
- Move mcp from optional to core dependency (required for agent)
- Add PR tools (pull_request_write/read, list_pull_requests)
- Fix create_gitea_issue_via_mcp to use issue_write with method="create"
- Update tool_safety.py and tests for corrected names
- Regenerate poetry.lock

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:52:28 -04:00
dc13b368a5 Merge pull request 'feat: replace custom Gitea with MCP servers' (#14) from claude/sharp-mcnulty into main
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 30s
Reviewed-on: #14
2026-03-12 21:45:55 -04:00
Trip T
78167675f2 feat: replace custom Gitea client with MCP servers
All checks were successful
Tests / lint (pull_request) Successful in 3s
Tests / test (pull_request) Successful in 29s
Replace the bespoke GiteaHand httpx client and tools_gitea.py wrappers
with official MCP tool servers (gitea-mcp + filesystem MCP), wired into
Agno via MCPTools. Switch all session functions to async (arun/acontinue_run)
so MCP tools auto-connect. Delete ~1070 lines of custom Gitea code.

- Create src/timmy/mcp_tools.py with MCP factories + standalone issue bridge
- Wire MCPTools into agent.py tool list (Gitea + filesystem)
- Switch session.py chat/chat_with_tools/continue_chat to async
- Update all callers (dashboard routes, Discord vendor, CLI, thinking engine)
- Add gitea_token fallback from ~/.config/gitea/token
- Add MCP session cleanup to app shutdown hook
- Update tool_safety.py for MCP tool names
- 11 new tests, all 1417 passing, coverage 74.2%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:40:32 -04:00
e1c6fdc3fd Merge pull request 'claude/sharp-mcnulty' (#13) from claude/sharp-mcnulty into main
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 30s
Reviewed-on: #13
2026-03-12 20:57:46 -04:00
Trip T
41d6ebaf6a feat: CLI session persistence + tool confirmation gate
All checks were successful
Tests / lint (pull_request) Successful in 5s
Tests / test (pull_request) Successful in 31s
- Chat sessions persist across `timmy chat` invocations via Agno SQLite
  (session_id="cli"), fixing context amnesia between turns
- Dangerous tools (shell, write_file, etc.) now prompt for approval in CLI
  instead of silently exiting — uses typer.confirm() + Agno continue_run
- --new flag starts a fresh conversation when needed
- Improved _maybe_file_issues prompt for engineer-quality issue bodies
  (what's happening, expected behavior, suggested fix, acceptance criteria)
- think/status commands also pass session_id for continuity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:55:56 -04:00
Trip T
350e6f54ff fix: prevent "Event loop is closed" on repeated Gitea API calls
The httpx AsyncClient was cached across asyncio.run() boundaries.
Each asyncio.run() creates and closes a new event loop, leaving the
cached client's connections on a dead loop. The second and later calls
would fail with "Event loop is closed".

Fix: create a fresh client per request and close it in a finally block.
No more cross-loop client reuse.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:40:39 -04:00
4ca15de1e7 Merge pull request 'feat: add Gitea issue creation — Timmy's self-improvement channel' (#9) from claude/sharp-mcnulty into main
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 34s
Reviewed-on: #9
2026-03-12 18:39:46 -04:00
Trip T
7163b15300 feat: add Gitea issue creation — Timmy's self-improvement channel
All checks were successful
Tests / lint (pull_request) Successful in 4s
Tests / test (pull_request) Successful in 31s
Give Timmy the ability to file Gitea issues when he notices bugs,
stale state, or improvement opportunities in his own codebase.

Components:
- GiteaHand async API client (infrastructure/hands/gitea.py)
  - Token auth with ~/.config/gitea/token fallback
  - Create/list/close issues, dedup by title similarity
  - Graceful degradation when Gitea unreachable
- Tool functions (timmy/tools_gitea.py)
  - create_gitea_issue: file issues with dedup + work order bridge
  - list_gitea_issues: check existing backlog
  - Classified as SAFE (no confirmation needed)
- Thinking post-hook (_maybe_file_issues in thinking.py)
  - Every 20 thoughts, LLM classifies recent thoughts for actionable items
  - Auto-files bugs/improvements to Gitea with dedup
  - Bridges to local work order system for dashboard tracking
- Config: gitea_url, gitea_token, gitea_repo, gitea_enabled,
  gitea_timeout, thinking_issue_every

All 1426 tests pass, 74.17% coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 18:36:06 -04:00
faa743131f Merge pull request 'feat: consolidate memory into unified memory.db with 4-type model' (#8) from claude/sharp-mcnulty into main
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 37s
Reviewed-on: #8
2026-03-12 11:28:51 -04:00
Trip T
b2f12ca97c feat: consolidate memory into unified memory.db with 4-type model
All checks were successful
Tests / lint (pull_request) Successful in 3s
Tests / test (pull_request) Successful in 30s
Consolidates 3 separate memory databases (semantic_memory.db, the
memory_entries table in swarm.db, and brain.db) into a single
data/memory.db with facts, chunks, and episodes tables.

Key changes:
- Add unified schema (timmy/memory/unified.py) with 3 core tables
- Redirect vector_store.py and semantic_memory.py to memory.db
- Add thought distillation: every Nth thought extracts lasting facts
- Enrich agent context with known facts in system prompt
- Add memory_forget tool for removing outdated memories
- Unify embeddings: vector_store delegates to semantic_memory.embed_text
- Bridge spark events to unified event log
- Add pruning for thoughts and events with configurable retention
- Add data migration script (timmy/memory_migrate.py)
- Deprecate brain.memory in favor of unified system

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:23:18 -04:00
046e0055c5 Merge pull request 'feat: add DB Explorer for SQLite inspection' (#7) from claude/sharp-mcnulty into main
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 33s
Reviewed-on: #7
2026-03-12 10:47:50 -04:00
Trip T
bc38fee817 feat: add DB Explorer for read-only SQLite inspection
All checks were successful
Tests / lint (pull_request) Successful in 3s
Tests / test (pull_request) Successful in 29s
Adds /db-explorer page and JSON API to browse all 15 SQLite databases
in data/. Sidebar lists databases with sizes, clicking one renders all
tables as scrollable data tables with row truncation at 200.
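
One way to read a database without risking writes is SQLite's read-only URI mode. The sketch below assumes a helper name and return shape that are not taken from the actual route code; the 200-row cap comes from the commit message.

```python
import sqlite3
from pathlib import Path

ROW_LIMIT = 200  # row truncation from the commit message


def dump_tables(db_path, row_limit=ROW_LIMIT):
    """Return {table: {"columns": [...], "rows": [...]}} from a read-only open."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )]
        result = {}
        for name in names:
            # Table names cannot be bound as parameters, so quote them.
            quoted = '"' + name.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (row_limit,))
            columns = [col[0] for col in cursor.description]
            result[name] = {"columns": columns, "rows": cursor.fetchall()}
        return result
    finally:
        conn.close()
```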

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 10:41:13 -04:00
765e0f79c7 Merge pull request 'feat: add Loop QA self-testing framework' (#6) from claude/suspicious-poincare into main
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 29s
Reviewed-on: #6
2026-03-11 22:38:50 -04:00
Trip T
d42c574d26 feat: add Loop QA self-testing framework
All checks were successful
Tests / lint (pull_request) Successful in 3s
Tests / test (pull_request) Successful in 29s
Structured self-test framework that probes 6 capabilities (tool use,
multistep planning, memory read, memory write, self-coding, lightning
econ) in round-robin. Reuses existing infra: event_log for persistence,
create_task() for upgrade proposals, capture_error() for crash handling,
and in-memory circuit breaker for failure tracking.

- src/timmy/loop_qa.py: Capability enum, 6 async probes, orchestrator
- src/dashboard/routes/loop_qa.py: JSON + HTMX health endpoints
- HTMX partial polls every 30s on the health panel
- Background scheduler in app.py lifespan
- 25 tests covering probes, orchestrator, health snapshot, routes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 22:33:16 -04:00
Trip T
c7f92f6d7b docs: add error handling patterns and module dependencies to CLAUDE.md
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 30s
- Document 3 graceful degradation patterns with code examples
- Add Service Fallback Matrix for optional services
- Add module dependency tree with change impact guide

chore: fix typecheck environment

- Add mypy to dev dependencies in pyproject.toml
- Fix tox.ini typecheck environment to install mypy explicitly
2026-03-11 22:21:07 -04:00
05bd7f03f4 Merge pull request 'feat: enrich thinking engine — anti-loop, anti-confabulation, grounding' (#5) from claude/suspicious-poincare into main
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 30s
Reviewed-on: #5
2026-03-11 21:50:52 -04:00
Trip T
f1e909b1e3 feat: enrich thinking engine — anti-loop, anti-confabulation, grounding
All checks were successful
Tests / lint (pull_request) Successful in 3s
Tests / test (pull_request) Successful in 30s
Rewrite _THINKING_PROMPT with strict rules: 2-3 sentence limit,
anti-confabulation (only reference real data), anti-repetition.

- Add _pick_seed_type() with recent-type dedup (excludes last 3)
- Add _gather_system_snapshot() for real-time grounding (time, thought
  count, chat activity, task queue)
- Improve _build_continuity_context() with anti-repetition header and
  100-char truncation
- Fix journal + memory timestamps to include local timezone
- 12 new TDD tests covering all improvements
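
The recent-type exclusion in _pick_seed_type() might work along these lines. The seed type names and the fallback when everything is excluded are assumptions for this sketch; only the "exclude the last 3" rule comes from the commit message.

```python
import random

# Hypothetical seed types; the real list lives in thinking.py.
SEED_TYPES = ["existential", "memory", "creative", "observation", "freeform", "sovereignty"]


def pick_seed_type(recent_types, rng=random):
    """Pick a seed type that was not used in the last 3 thoughts."""
    excluded = set(recent_types[-3:])
    candidates = [t for t in SEED_TYPES if t not in excluded]
    # Fall back to the full list if the exclusion leaves nothing to choose.
    return rng.choice(candidates or SEED_TYPES)
```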

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 21:47:28 -04:00
22589375e1 Merge pull request 'feat: tick prompt arg + fix name extraction' (#4) from claude/suspicious-poincare into main
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 33s
Reviewed-on: #4
2026-03-11 21:18:05 -04:00
Trip T
f8dadeec59 feat: tick prompt arg + fix name extraction learning verbs as names
All checks were successful
Tests / lint (pull_request) Successful in 4s
Tests / test (pull_request) Successful in 36s
Add optional prompt argument to `timmy tick` so custom journal
prompts can be passed from the CLI (seed_type="prompted").

Fix extract_user_name() mistaking verbs for names (e.g. "Serving").
It now requires the candidate word to start with a capital letter in
the original message and rejects common verb suffixes (-ing, -tion,
etc.). The duplicate naive regex in TimmyWithMemory is replaced with a
call to the fixed ConversationManager.extract_user_name().
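
The two checks can be sketched as follows. The trigger phrases and the exact suffix list are assumptions, not the patterns used in ConversationManager; a suffix filter this simple would also reject real names ending in "ing".

```python
import re

# Hypothetical trigger phrases for this illustration.
_NAME_RE = re.compile(r"\b(?:my name is|i am|i'm|call me)\s+([A-Za-z]+)", re.IGNORECASE)
_VERB_SUFFIXES = ("ing", "tion")


def extract_user_name(message):
    """Return a probable name, or None if the candidate looks like a verb."""
    match = _NAME_RE.search(message)
    if not match:
        return None
    candidate = match.group(1)  # keeps the casing of the original message
    if not candidate[0].isupper():
        return None
    if candidate.lower().endswith(_VERB_SUFFIXES):
        return None
    return candidate
```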

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 21:11:53 -04:00
31fe150cb3 Merge pull request 'fix: test DB isolation, Discord recovery, and over-mocked tests' (#3) from claude/suspicious-poincare into main
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 29s
Reviewed-on: #3
2026-03-11 20:57:37 -04:00
Trip T
6a7875e05f feat: heartbeat memory hooks — pre-recall and post-update
All checks were successful
Tests / lint (pull_request) Successful in 4s
Tests / test (pull_request) Successful in 32s
Wire MEMORY.md + soul.md into the thinking loop so each heartbeat
is grounded in identity and recent context, breaking repetitive loops.

Pre-hook: _load_memory_context() reads hot memory first (changes each
cycle) then soul.md (stable identity), truncated to 1500 chars.

Post-hook: _update_memory() writes a "Last Reflection" section to
MEMORY.md after each thought so the next cycle has fresh context.

soul.md is read-only from the heartbeat — never modified by it.
All hooks degrade gracefully and never crash the heartbeat.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:54:13 -04:00
Trip T
ea2dbdb4b5 fix: test DB isolation, Discord recovery, and over-mocked tests
All checks were successful
Tests / lint (pull_request) Successful in 8s
Tests / test (pull_request) Successful in 33s
Test data was bleeding into production tasks.db because
swarm.task_queue.models.DB_PATH (relative path) was never patched in
conftest.clean_database. Fixed by switching to absolute paths via
settings.repo_root and adding the missing module to the patching list.

Discord bot could leak orphaned clients on retry after ERROR state.
Added _cleanup_stale() to close stale client/task before each start()
attempt, with improved logging in the token watcher.

Rewrote test_paperclip_client.py to use httpx.MockTransport instead of
patching _get/_post/_delete — tests now exercise real HTTP status codes,
error handling, and JSON parsing. Added end-to-end test for
capture_error → create_task DB isolation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:33:59 -04:00