Timmy-time-dashboard

Author	SHA1	Message	Date
Trip T	c7f92f6d7b	docs: add error handling patterns and module dependencies to CLAUDE.md - Document 3 graceful degradation patterns with code examples - Add Service Fallback Matrix for optional services - Add module dependency tree with change impact guide chore: fix typecheck environment - Add mypy to dev dependencies in pyproject.toml - Fix tox.ini typecheck environment to install mypy explicitly	2026-03-11 22:21:07 -04:00
rockachopa	05bd7f03f4	Merge pull request 'feat: enrich thinking engine — anti-loop, anti-confabulation, grounding' (#5) from claude/suspicious-poincare into main Reviewed-on: #5	2026-03-11 21:50:52 -04:00
Trip T	f1e909b1e3	feat: enrich thinking engine — anti-loop, anti-confabulation, grounding Rewrite _THINKING_PROMPT with strict rules: 2-3 sentence limit, anti-confabulation (only reference real data), anti-repetition. - Add _pick_seed_type() with recent-type dedup (excludes last 3) - Add _gather_system_snapshot() for real-time grounding (time, thought count, chat activity, task queue) - Improve _build_continuity_context() with anti-repetition header and 100-char truncation - Fix journal + memory timestamps to include local timezone - 12 new TDD tests covering all improvements Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 21:47:28 -04:00
rockachopa	22589375e1	Merge pull request 'feat: tick prompt arg + fix name extraction' (#4) from claude/suspicious-poincare into main Reviewed-on: #4	2026-03-11 21:18:05 -04:00
Trip T	f8dadeec59	feat: tick prompt arg + fix name extraction learning verbs as names Add optional prompt argument to `timmy tick` so custom journal prompts can be passed from the CLI (seed_type="prompted"). Fix extract_user_name() learning verbs as names (e.g. "Serving"). Now requires the candidate word to start with a capital letter in the original message, rejects common verb suffixes (-ing, -tion, etc.), and deduplicates the naive regex in TimmyWithMemory to use the fixed ConversationManager.extract_user_name() instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 21:11:53 -04:00
rockachopa	31fe150cb3	Merge pull request 'fix: test DB isolation, Discord recovery, and over-mocked tests' (#3) from claude/suspicious-poincare into main Reviewed-on: #3	2026-03-11 20:57:37 -04:00
Trip T	6a7875e05f	feat: heartbeat memory hooks — pre-recall and post-update Wire MEMORY.md + soul.md into the thinking loop so each heartbeat is grounded in identity and recent context, breaking repetitive loops. Pre-hook: _load_memory_context() reads hot memory first (changes each cycle) then soul.md (stable identity), truncated to 1500 chars. Post-hook: _update_memory() writes a "Last Reflection" section to MEMORY.md after each thought so the next cycle has fresh context. soul.md is read-only from the heartbeat — never modified by it. All hooks degrade gracefully and never crash the heartbeat. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 20:54:13 -04:00
Trip T	ea2dbdb4b5	fix: test DB isolation, Discord recovery, and over-mocked tests Test data was bleeding into production tasks.db because swarm.task_queue.models.DB_PATH (relative path) was never patched in conftest.clean_database. Fixed by switching to absolute paths via settings.repo_root and adding the missing module to the patching list. Discord bot could leak orphaned clients on retry after ERROR state. Added _cleanup_stale() to close stale client/task before each start() attempt, with improved logging in the token watcher. Rewrote test_paperclip_client.py to use httpx.MockTransport instead of patching _get/_post/_delete — tests now exercise real HTTP status codes, error handling, and JSON parsing. Added end-to-end test for capture_error → create_task DB isolation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 20:33:59 -04:00
rockachopa	9d9449cdcf	Merge pull request 'fix: WebSocket crash from websockets 16.0 + branch pruning' (#2) from claude/consolidated-cherry-picks into main Reviewed-on: #2	2026-03-11 19:06:46 -04:00
Trip T	ffdfa53259	fix: Discord token priority — settings before state file load_token() was checking the state file before settings.discord_token, so a stale fake token in discord_state.json would block the real token from .env/DISCORD_TOKEN. Flipped the priority: env config first, state file as fallback for tokens set via /discord/setup UI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 19:03:24 -04:00
Alexander Payne	0bc4f55e1a	fix: resolve WebSocket crashes from websockets 16.0 incompatibility The /ws redirect handler crashed with AttributeError because websockets 16.0 removed the legacy transfer_data_task attribute. The /swarm/live endpoint could also error on early client disconnects during accept. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 18:52:37 -04:00
rockachopa	5bfc389fee	Merge pull request 'feat: upgrade to qwen3.5, self-hosted Gitea CI, optimize Docker image' (#1) from claude/upbeat-jennings into main Reviewed-on: #1	2026-03-11 18:38:21 -04:00
Trip T	f6a6c0f62e	feat: upgrade to qwen3.5, self-hosted Gitea CI, optimize Docker image Model upgrade: - qwen2.5:14b → qwen3.5:latest across config, tools, and docs - Added qwen3.5 to multimodal model registry Self-hosted Gitea CI: - .gitea/workflows/tests.yml: lint + test jobs via act_runner - Unified Dockerfile: pre-baked deps from poetry.lock for fast CI - sitepackages=true in tox for ~2s dep resolution (was ~40s) - OLLAMA_URL set to dead port in CI to prevent real LLM calls Test isolation fixes: - Smoke test fixture mocks create_timmy (was hitting real Ollama) - WebSocket sends initial_state before joining broadcast pool (race fix) - Tests use settings.ollama_model/url instead of hardcoded values - skip_ci marker for Ollama-dependent tests, excluded in CI tox envs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 18:36:42 -04:00
Alexander Whitestone	36fc10097f	Claude/angry cerf (#173) * feat: set qwen3.5:latest as default model - Make qwen3.5:latest the primary default model for faster inference - Move llama3.1:8b-instruct to fallback chain - Update text fallback chain to prioritize qwen3.5:latest Retains full backward compatibility via cascade fallback. * test: remove ~55 brittle, duplicate, and useless tests Audit of all 100 test files identified tests that provided no real regression protection. Removed: - 4 files deleted entirely: test_setup_script (always skipped), test_csrf_bypass (tautological assertions), test_input_validation (accepts 200-500 status codes), test_security_regression (fragile source-pattern checks redundant with rendering tests) - Duplicate test classes (TestToolTracking, TestCalculatorExtended) - Mock-only tests that just verify mock wiring, not behavior - Structurally broken tests (TestCreateToolFunctions patches after import) - Empty/pass-body tests and meaningless assertions (len > 20) - Flaky subprocess tests (aider tool calling real binary) All 1328 remaining tests pass. Net: -699 lines, zero coverage loss. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent test pollution from autoresearch_enabled mutation test_autoresearch_perplexity.py was setting settings.autoresearch_enabled = True but never restoring it in the finally block — polluting subsequent tests. When pytest-randomly ordered it before test_experiments_page_shows_disabled_when_off, the victim test saw enabled=True and failed to find "Disabled" in the page. Fix both sides: - Restore autoresearch_enabled in the finally block (root cause) - Mock settings explicitly in the victim test (defense in depth) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 16:55:27 -04:00
Alexander Whitestone	0b91e45d90	Polish UI design with sleeker components and vivid magical animations (#172)	2026-03-11 15:16:04 -04:00
Alexander Whitestone	9e56fad342	Fix iPhone responsiveness: layout stacking and Memory Browser styling (#171 ) - Fix critical mobile layout bug: override flex-wrap: nowrap on .mc-content > .row at 768px breakpoint so Bootstrap columns stack vertically on iPhone instead of being crammed side-by-side (causing content bleed/cutoff) - Add complete Memory Browser CSS: stats grid, search form, results, facts list with proper mobile breakpoints (2-col stats, stacked search form, touch-friendly fact buttons) - Move Grok button and fact list inline styles to CSS classes per project convention - Add shared .mc-btn, .mc-btn-primary, .mc-btn-small, .page-title, .mc-text-secondary classes used across templates https://claude.ai/code/session_01VRjXp6wxBrgawsKB92LEaT Co-authored-by: Claude <noreply@anthropic.com>	2026-03-11 13:08:19 -04:00
Alexander Whitestone	68115fe477	fix: update agno to v2 and fix airllm availability tests (#170 ) The agno dependency was pinned to <2.0 but the code uses agno.db.sqlite (a 2.x API), breaking all tests in CI. Also fix airllm provider tests to patch importlib.util.find_spec (what the production code uses) instead of builtins.__import__. Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 12:40:45 -04:00
Alexander Whitestone	9d78eb31d1	ruff (#169 ) * polish: streamline nav, extract inline styles, improve tablet UX - Restructure desktop nav from 8+ flat links + overflow dropdown into 5 grouped dropdowns (Core, Agents, Intel, System, More) matching the mobile menu structure to reduce decision fatigue - Extract all inline styles from mission_control.html and base.html notification elements into mission-control.css with semantic classes - Replace JS-built innerHTML with secure DOM construction in notification loader and chat history - Add CONNECTING state to connection indicator (amber) instead of showing OFFLINE before WebSocket connects - Add tablet breakpoint (1024px) with larger touch targets for Apple Pencil / stylus use and safe-area padding for iPad toolbar - Add active-link highlighting in desktop dropdown menus - Rename "Mission Control" page title to "System Overview" to disambiguate from the chat home page - Add "Home — Timmy Time" page title to index.html https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h * fix(security): move auth-gate credentials to environment variables Hardcoded username, password, and HMAC secret in auth-gate.py replaced with os.environ lookups. Startup now refuses to run if any variable is unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example. https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h * refactor(tooling): migrate from black+isort+bandit to ruff Replace three separate linting/formatting tools with a single ruff invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs), .pre-commit-config.yaml, and CI workflow. Fixes all ruff errors including unused imports, missing raise-from, and undefined names. Ruff config maps existing bandit skips to equivalent S-rules. https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-03-11 12:23:35 -04:00
Alexander Whitestone	708c8a2477	polish: streamline nav, extract inline styles, improve tablet UX (#168)	2026-03-11 11:32:56 -04:00
Alexander Whitestone	b028b768c9	enhance: diversify and deepen thinking engine prompts (#167 ) Add sovereignty and observation seed types, expand creative metaphors, improve swarm seeds with reflective prompts, and update the thinking prompt to encourage grounded, specific, varied inner thoughts. Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 11:08:41 -04:00
Alexander Whitestone	c41e3e1e15	fix: clean up logging colors, reduce noise, enable Tailscale access (#166 ) * fix: reserve red for real errors, reduce log noise, allow Tailscale access - Add _ColorFormatter: red = ERROR/CRITICAL only, yellow = WARNING, green = INFO - Override uvicorn's default colors to use our scheme - Downgrade discord "not installed" from ERROR to WARNING (optional dep) - Downgrade DuckDuckGo unavailable from INFO to DEBUG - Stop discord token watcher retry loop when discord.py not installed - Add configurable trusted_hosts setting; dev mode allows all hosts - Exclude .claude/ from uvicorn reload watcher (worktree isolation) - Fix pre-commit hook: use tox -e unit, bump timeout to 60s Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: auto-format with black Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: pre-commit hook auto-formats with black+isort before testing Formatting should never block a commit — just fix it automatically. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 10:37:20 -04:00
Alexander Whitestone	1191ea2f9a	Claude/fix tick engine v16wt (#165 ) * fix: wire up tick engine scheduler + add journal + systemd timer The ThinkingEngine was fully implemented but never called — the background scheduler was lost during the Celery removal in #133. This commit: - Add _thinking_scheduler() to dashboard lifespan (5-min cycle) - Add _write_journal() that appends thoughts to data/journal/YYYY-MM-DD.md - Add `timmy tick` CLI command for one-shot thinking (systemd-friendly) - Add deploy/timmy-tick.{service,timer} systemd units https://claude.ai/code/session_013e7upfJ6negFzu5YNJikge * Add macOS launchd plist for Timmy tick timer Equivalent of the existing systemd service/timer for Linux. Runs `timmy tick` every 5 minutes via launchd on macOS. https://claude.ai/code/session_013e7upfJ6negFzu5YNJikge * fix: make macOS launchd timer work with user-local paths The plist had hardcoded /opt/timmy paths that don't exist on Mac. Now uses a template with __PROJECT_DIR__ placeholders, a wrapper script for PATH setup, and an install script that wires it all up. Usage: ./deploy/install-mac-timer.sh https://claude.ai/code/session_013e7upfJ6negFzu5YNJikge * fix: add missing tox pre-commit env + pre-push hook to prevent broken builds RCA: the pre-commit hook referenced `tox -e pre-commit` which didn't exist in tox.ini, so commits went unchecked. There was also no pre-push hook, so broken code could reach GitHub without running the CI-mirror suite. - Add [testenv:pre-commit] to tox.ini (format check + unit tests) - Add .githooks/pre-push that runs `tox -e pre-push` (full CI mirror) https://claude.ai/code/session_013e7upfJ6negFzu5YNJikge --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-03-11 10:19:46 -04:00
Alexander Whitestone	622a6a9204	polish: extract inline CSS, add connection status, panel macro, favicon, ollama cache, toast system (#164 ) Major: - Extract all inline <style> blocks from 22 Jinja2 templates into static/css/mission-control.css — single cacheable stylesheet - Add tox lint check that fails on inline <style> in templates Minor: 1. Connection status indicator in topbar (green/amber/red dot) reflecting WebSocket + Ollama reachability, with auto-reconnect 2. Jinja2 {% macro panel(title) %} in macros.html — eliminates repeated .card.mc-panel markup; index.html converted as example 3. SVG favicon (purple T + orange dot) 4. 30-second TTL cache on _check_ollama() to avoid blocking the event loop on every health poll (asyncio.to_thread was already in place) 5. Toast notification system (McToast.show) for transient status messages — wired into connection status for Ollama/WebSocket state changes Enforcement: - CLAUDE.md updated with conventions 11-14 (no inline CSS, use panel macro, use toasts, never block the event loop) - tox lint + pre-push environments now fail on inline <style> blocks https://claude.ai/code/session_014FQ785MQdyJQ4BAXrRSo9w Co-authored-by: Claude <noreply@anthropic.com>	2026-03-11 09:52:57 -04:00
Alexander Whitestone	07f2c1b41e	fix: wire up tick engine scheduler + add journal + systemd timer (#163)	2026-03-11 08:47:57 -04:00
Alexander Whitestone	a927241dbe	polish: make repo presentable for employer review (#162)	2026-03-11 08:11:26 -04:00
Alexander Whitestone	1de97619e8	fix: restore ollama as default backend to fix broken build (#161)	2026-03-10 18:17:47 -04:00
Manus AI	755b7e7658	feat: update default backend to AirLLM and optimize for Mac M3 36GB	2026-03-10 18:04:04 -04:00
Alexander Whitestone	6303a77f6e	Consolidate test & dev workflows into tox as single source of truth (#160) * Centralize all Python environments on tox tox.ini is now the single source of truth for how every Python environment runs — tests, linting, formatting, dev server, and CI. No more bare `poetry run` outside of tox. - Expand tox.ini from 4 to 15 environments (lint, format, typecheck, unit, integration, functional, e2e, fast, ollama, ci, coverage, coverage-html, pre-commit, dev, all) - Rewire all Makefile test/lint/format/dev targets to delegate to tox - Update .githooks/pre-commit to run `tox -e pre-commit` - Update .pre-commit-config.yaml to use tox instead of poetry run - Update CI workflow (lint + test jobs) to use `tox -e lint` and `tox -e ci` instead of ad-hoc pytest/black/isort invocations - Update CLAUDE.md to mandate tox usage and document all environments https://claude.ai/code/session_01MTUpqms1fgezZFrodGA8H5 * refactor: modernize tox.ini for tox 4.x conventions - Replace `skipsdist = true` (tox 3 alias) with `no_package = true` - Use `poetry install --no-root --sync` for faster, cleaner dep installs https://claude.ai/code/session_01MTUpqms1fgezZFrodGA8H5 * fix(ci): drop poetry install from lint/format tox envs Lint and format only need black, isort, and bandit — not the full project dependency tree. Override commands_pre to empty and use tox deps instead. Fixes CI failure where poetry is not on PATH. https://claude.ai/code/session_01MTUpqms1fgezZFrodGA8H5 * fix(ci): remove poetry run wrapper from all tox commands Since commands_pre runs poetry install into the tox-managed venv, all tools (pytest, mypy, black, etc.) are already on the venv PATH. The poetry run wrapper is redundant and fails in CI where poetry may not be installed globally. https://claude.ai/code/session_01MTUpqms1fgezZFrodGA8H5 * fix(ci): remove poetry dependency, align local and CI processes - Replace `poetry install` with `pip install -e ".[dev]"` in tox commands_pre so all envs work without poetry installed - Remove Poetry cache from GitHub Actions (only pip cache needed) - Rename pre-commit env to pre-push: runs lint + full CI suite (same checks as GitHub Actions, reports generated locally) - Update CLAUDE.md to reflect new pre-push workflow The local `tox -e pre-push` now runs the exact same lint + test + coverage checks as CI, so failures are caught before pushing. https://claude.ai/code/session_01MTUpqms1fgezZFrodGA8H5 --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-03-10 15:54:09 -04:00
Alexander Whitestone	2a5f317a12	fix: implement @csrf_exempt decorator support in CSRFMiddleware (#159)	2026-03-10 15:26:40 -04:00
Alexander Whitestone	4a4c9be1eb	fix: repair broken test patch target and add interview transcript (#156 ) - Fix test_autoresearch_perplexity: patch target was dashboard.routes.experiments.get_experiment_history but the function is imported locally inside the route handler, so patch the source module timmy.autoresearch.get_experiment_history instead. - Add tests for src/timmy/interview.py (previously 0% coverage): question structure, run_interview flow, error handling, formatting. - Produce interview transcript document from structured Timmy interview. https://claude.ai/code/session_01EXDzXqgsC2ohS8qreF1fBo Co-authored-by: Claude <noreply@anthropic.com>	2026-03-10 15:26:29 -04:00
Alexander Whitestone	904a7c564e	feat: migrate to Agno native HITL tool confirmation flow (#158 ) Replace the homebrew regex-based tool extraction and manual dispatch (tool_executor.py) with Agno's built-in Human-In-The-Loop confirmation: - Toolkit(requires_confirmation_tools=...) marks dangerous tools - agent.run() returns RunOutput with status=paused when confirmation needed - RunRequirement.confirm()/reject() + agent.continue_run() resumes execution Dashboard and Discord vendor both use the native flow. DuckDuckGo import isolated so its absence doesn't kill all tools. Test stubs cleaned up (agno is a real dependency, only truly optional packages stubbed). 1384 tests pass in parallel (~14s). Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 21:54:04 -04:00
Alexander Whitestone	574031a55c	fix: remove invalid show_tool_calls kwarg crashing Agent init (#157 ) * fix: remove invalid show_tool_calls kwarg crashing Agent init (regression) show_tool_calls was removed in `f95c960` (Feb 26) because agno 2.5.x doesn't accept it, then reintroduced in `fd0ede0` (Mar 8) without runtime testing — mocked tests hid the breakage. Replace the bogus assertion with a regression guard and an allowlist test that catches unknown kwargs before they reach production. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: auto-install git hooks, add black/isort to dev deps - Add .githooks/ with portable pre-commit hook (macOS + Linux) - make install now auto-activates hooks via core.hooksPath - Add black and isort to poetry dev group (were only in CI via raw pip) - Fix black formatting on 2 files flagged by CI - Fix test_autoresearch_perplexity patching wrong module path Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:01:00 -04:00
Alexander Whitestone	21e2ae427a	Add test plan for autoresearch with perplexity metric (#154)	2026-03-09 09:36:26 -04:00
Alexander Whitestone	fe484ad7b6	Fix input validation for chat and memory routes (#155)	2026-03-09 09:36:16 -04:00
Alexander Whitestone	82fb2417e3	feat: enable SQLite WAL mode for all databases (AGI ticket #1) (#153)	2026-03-08 16:07:02 -04:00
Alexander Whitestone	11ba21418a	docs: sovereign AGI research — architecture analysis + Ghost Core integration (#152)	2026-03-08 15:00:10 -04:00
Alexander Whitestone	8dbce25183	fix: handle concurrent table creation race in SQLite (#151)	2026-03-08 13:27:11 -04:00
Alexander Whitestone	ae3bb1cc21	feat: code quality audit + autoresearch integration + infra hardening (#150)	2026-03-08 12:50:44 -04:00
Alexander Whitestone	fd0ede0d51	feat: auto-escalation system + agentic loop fixes (#149 ) (#149 ) Wire up automatic error-to-task escalation and fix the agentic loop stopping after the first tool call. Auto-escalation: - Add swarm.task_queue.models with create_task() bridge to existing task queue SQLite DB - Add swarm.event_log with EventType enum, log_event(), and SQLite persistence + WebSocket broadcast - Wire capture_error() into request logging middleware so unhandled HTTP exceptions auto-create [BUG] tasks with stack traces, git context, and push notifications (5-min dedup window) Agentic loop (Round 11 Bug #1): - Wrap agent_chat() in asyncio.to_thread() to stop blocking the event loop (fixes Discord heartbeat warnings) - Enable Agno's native multi-turn tool chaining via show_tool_calls and tool_call_limit on the Agent config - Strengthen multi-step continuation prompts with explicit examples Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 03:11:14 -04:00
Alexander Whitestone	7792ae745f	feat: agentic loop for multi-step tasks + regression fixes (#148) * fix: name extraction blocklist, memory preview escaping, and gitignore cleanup - Add _NAME_BLOCKLIST to extract_user_name() to reject gerunds and UI-state words like "Sending" that were incorrectly captured as user names - Collapse whitespace in get_memory_status() preview so newlines survive JSON serialization without showing raw \n escape sequences - Broaden .gitignore from specific memory/self/user_profile.md to memory/self/ and untrack memory/self/methodology.md (runtime-edited file) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: catch Ollama connection errors in session.py + add 71 smoke tests - Wrap agent.run() in session.py with try/except so Ollama connection failures return a graceful fallback message instead of dumping raw tracebacks to Docker logs - Add tests/test_smoke.py with 71 tests covering every GET route: core pages, feature pages, JSON APIs, and a parametrized no-500 sweep — catches import errors, template failures, and schema mismatches that unit tests miss Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: agentic loop for multi-step tasks + Round 10 regression fixes Agentic loop (Parts 1-4): - Add multi-step chaining instructions to system prompt - New agentic_loop.py with plan→execute→adapt→summarize flow - Register plan_and_execute tool for background task execution - Add max_agent_steps config setting (default: 10) - Discord fix: 300s timeout, typing indicator, send error handling - 16 new unit + e2e tests for agentic loop Round 10 regressions (R1-R5, P1): - R1: Fix literal \n escape sequences in tool responses - R2: Chat timeout/error feedback in agent panel - R3: /hands infinite spinner → static empty states - R4: /self-coding infinite spinner → static stats + journal - R5: /grok/status raw JSON → HTML dashboard template - P1: VETO confirmation dialog on task cards Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: briefing route 500 in CI when agno is MagicMock stub _call_agent() returned a MagicMock instead of a string when agno is stubbed in tests, causing SQLite "Error binding parameter 4" on save. Ensure the return value is always an actual string. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: briefing route 500 in CI — graceful degradation at route level When agno is stubbed with MagicMock in CI, agent.run() returns a MagicMock instead of raising — so the exception handler never fires and a MagicMock propagates as the summary to SQLite, which can't bind it. Fix: catch at the route level and return a fallback Briefing object. This follows the project's graceful degradation pattern — the briefing page always renders, even when the backend is completely unavailable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 01:46:29 -05:00
Alexander Whitestone	b8e0f4539f	fix: Discord memory bug — add session continuity + 6 memory system fixes (#147 ) Discord created a new agent per message with no conversation history, causing Timmy to lose context between messages (the "yes" bug). Now uses a singleton agent with per-channel/thread session_id, matching the dashboard's session.py pattern. Also applies _clean_response() to strip hallucinated tool-call JSON from Discord output. Additional fixes: - get_system_context() no longer clears the handoff file (was destroying session context on every agent creation) - Orchestrator uses HotMemory.read() to auto-create MEMORY.md if missing - vector_store DB_PATH anchored to __file__ instead of relative CWD - brain/schema.py: removed invalid .load dot-commands from INIT_SQL - tools_intro: fixed wrong table name 'vectors' → 'chunks' in tier3 check Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 00:20:38 -05:00
Alexander Whitestone	4bc53a43f9	fix: Round 4 bug fixes — 8 dashboard bugs + git blocker + Discord regression (#146 ) * chore: stop tracking runtime-generated self-modify reports These 65 files in data/self_modify_reports/ are auto-generated at runtime and already listed in .gitignore. Tracking them caused conflicts when pulling from main. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve 8 dashboard bugs from Round 4 testing report - Fix Ollama timeout regression: request_timeout → timeout (agno API) - Add Bootstrap JS to base.html (fixes creative UI tab switching) - Send initial_state on Swarm Live WebSocket connect - Add /api/queue/status endpoint (stops 404 log spam from chat panel) - Populate agent tools from registry on /tools page - Add notification bell dropdown with /api/notifications endpoint - All 1157 tests pass Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add 17 e2e tests covering all Round 4 bug fixes Covers: /calm 200, /api/queue/status JSON, Bootstrap JS presence, Swarm Live WebSocket initial_state, agent tools populated on /tools, /api/notifications endpoint, Ollama timeout param, full task lifecycle, and smoke test for all 15 dashboard pages. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 23:48:20 -05:00
Alexander Whitestone	248af9ed03	fix: dashboard bugs and clean up build artifacts (#145 ) * chore: stop tracking runtime-generated self-modify reports These 65 files in data/self_modify_reports/ are auto-generated at runtime and already listed in .gitignore. Tracking them caused conflicts when pulling from main. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve 8 dashboard bugs from Round 4 testing report - Fix Ollama timeout regression: request_timeout → timeout (agno API) - Add Bootstrap JS to base.html (fixes creative UI tab switching) - Send initial_state on Swarm Live WebSocket connect - Add /api/queue/status endpoint (stops 404 log spam from chat panel) - Populate agent tools from registry on /tools page - Add notification bell dropdown with /api/notifications endpoint - All 1157 tests pass Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 23:44:56 -05:00
Alexander Whitestone	e36a1dc939	fix: resolve 6 dashboard bugs and rebuild Task Queue + Work Orders (#144 ) (#144 ) Round 2+3 bug fix batch: 1. Ollama timeout: Add request_timeout=300 to prevent socket read errors on complex 30-60s prompts (production crash fix) 2. Memory API: Create missing HTMX partial templates (memory_facts.html, memory_results.html) so Save/Search buttons work 3. CALM page: Add create_tables() call so SQLAlchemy tables exist on first request (was returning HTTP 500) 4. Task Queue: Full SQLite-backed rebuild with CRUD endpoints, HTMX partials, and action buttons (approve/veto/pause/cancel/retry) 5. Work Orders: Full SQLite-backed rebuild with submit/approve/reject/ execute pipeline and HTMX polling partials 6. Memory READ tool: Add memory_read function so Timmy stops calling read_file when trying to recall stored facts Also: Close GitHub issues #115, #114, #112, #110 as won't-fix. Comment on #107 confirming prune_memories() already wired to startup. Tests: 33 new tests across 4 test files, all passing. Full suite: 1155 passed, 2 pre-existing failures (hands_shell). Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 23:21:30 -05:00
Alexander Whitestone	b8164e46b0	fix: remove dead swarm imports, add memory_write tool, and auto-prune on startup (#143 ) - Replace dead `from swarm` imports in tools_delegation and tools_intro with working implementations sourced from _PERSONAS - Add `memory_write` tool so the agent can actually persist memories when users ask it to remember something - Enhance `memory_search` to search both vault files AND the runtime vector store for cross-channel recall (Discord/web/Telegram) - Add memory management config: memory_prune_days, memory_prune_keep_facts, memory_vault_max_mb - Auto-prune old vector store entries and warn on vault size at startup - Update tests for new delegation agent list (mace removed) Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 22:34:30 -05:00
Alexander Whitestone	3bf7187482	Clean up generated files and fix 6 dashboard bugs (#142 ) * chore: gitignore local/generated files and remove from tracking Remove user-specific files (MEMORY.md, user_profile.md, prompts.py) from source control. Add patterns for credentials, backups, and generated content to .gitignore. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve 6 dashboard bugs — chat, /bugs, /swarm/events, WebSocket, marketplace, sidebar 1. Chat non-functional: CSRF middleware silently blocked HTMX POSTs. Added CSRF token transmission via hx-headers in base.html. 2. /bugs → 500: Route missing template vars (total, stats, filter_status). 3. /swarm/events → 500: Called .event_type.value on a plain str (SparkEvent.event_type is str, not enum). Also fixed timestamp and source field mismatches in the template. 4. WebSocket reconnect loop: No WS endpoint existed at /swarm/live, only an HTTP GET. Added @router.websocket("/live") using ws_manager. 5. Marketplace "Agent not found": Nav links /marketplace/ui matched the /{agent_id} catch-all. Added explicit /marketplace/ui route with enriched template context. 6. Agents sidebar "LOADING...": /swarm/agents/sidebar endpoint was missing. Added route returning the existing sidebar partial. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: restore src/timmy/prompts.py to source control prompts.py is imported by timmy.agent and is production code, not a user-local file. Re-add to tracking and remove from .gitignore. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 21:26:41 -05:00
Alexander Whitestone	b615595100	refactor: centralize config & harden security (#141 ) * feat: upgrade primary model from llama3.1:8b to qwen2.5:14b - Swap OLLAMA_MODEL_PRIMARY to qwen2.5:14b for better reasoning - llama3.1:8b-instruct becomes fallback - Update .env default and README quick start - Fix hardcoded model assertions in tests qwen2.5:14b provides significantly better multi-step reasoning and tool calling reliability while still running locally on modest hardware. The 8B model remains as automatic fallback. * security: centralize config, harden uploads, fix silent exceptions - Add 9 pydantic Settings fields (skip_embeddings, disable_csrf, rqlite_url, brain_source, brain_db_path, csrf_cookie_secure, chat_api_max_body_bytes, timmy_test_mode) to centralize env-var access - Migrate 8 os.environ.get() calls across 5 source files to use `from config import settings` per project convention - Add path traversal defense-in-depth to file upload endpoint - Add 1MB request body size limit to chat API - Make CSRF cookie secure flag configurable via settings - Replace 2 silent `except: pass` blocks with debug logging in session.py - Remove unused `import os` from brain/memory.py and csrf.py - Update 5 CSRF test fixtures to patch settings instead of os.environ Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 18:49:37 -05:00
Alexander Whitestone	cdd3e1a90b	feat: upgrade primary model from llama3.1:8b to qwen2.5:14b (#140 ) - Swap OLLAMA_MODEL_PRIMARY to qwen2.5:14b for better reasoning - llama3.1:8b-instruct becomes fallback - Update .env default and README quick start - Fix hardcoded model assertions in tests qwen2.5:14b provides significantly better multi-step reasoning and tool calling reliability while still running locally on modest hardware. The 8B model remains as automatic fallback. Co-authored-by: Trip T <trip@local>	2026-03-07 18:20:34 -05:00
Alexander Whitestone	39f2eb418a	Remove stale references from documentation across 9 files (#139 )	2026-03-07 07:28:14 -05:00
Alexander Whitestone	480b8d324e	security: fix CSRF bypass vulnerabilities via strict path matching and normalization (#138 )	2026-03-07 06:45:32 -05:00

310 Commits