Sentinel
Local repo health monitor for AI-accelerated codebases — work in progress.
Deterministic cross-artifact detectors work well; LLM-judgment detectors are still being developed.
Problem
The faster you build with AI, the faster things drift. AI coding tools help you ship code quickly, but they don't notice when your docs, tests, and code start disagreeing with each other. Linters check code, doc tools check formatting, but nothing checks whether your README still describes reality. This compounds as projects grow, quietly confusing every agent and human who touches the codebase.
Approach
Sentinel is a modular scanner whose detectors cover five languages and handle linting, TODO tracking, dependency audits, and complexity checks, but the core value is meant to come from cross-artifact analysis. Deterministic detectors catch stale references (paths in docs that don't exist) and dependency drift without any model. Deeper semantic checks were designed to use a local LLM as a judgment layer for genuinely ambiguous cases. The project was built using an early version of an autonomous AI development workflow, which is the part of this story that actually generalized.
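To make the deterministic side concrete, here is a minimal sketch of a stale-reference check in Python. The regex, the restriction to Markdown files, and names like `find_stale_references` are illustrative assumptions, not Sentinel's actual code.

```python
# Sketch of a deterministic stale-reference detector: find paths mentioned
# in Markdown docs and report the ones that no longer exist in the repo.
# Names and the path-matching regex are illustrative, not Sentinel's API.
import re
from pathlib import Path

# Matches backticked tokens that look like repo-relative file paths, e.g. `src/cli.py`.
PATH_PATTERN = re.compile(r"`([\w./-]+\.\w+)`")

def find_stale_references(repo_root: str) -> list[tuple[str, int, str]]:
    """Return (doc file, line number, referenced path) for paths that don't exist."""
    root = Path(repo_root)
    stale = []
    for doc in root.rglob("*.md"):
        for lineno, line in enumerate(doc.read_text(encoding="utf-8").splitlines(), start=1):
            for ref in PATH_PATTERN.findall(line):
                if not (root / ref).exists():
                    stale.append((str(doc.relative_to(root)), lineno, ref))
    return stale

if __name__ == "__main__":
    for doc, lineno, ref in find_stale_references("."):
        print(f"{doc}:{lineno}: references missing path {ref}")
```

The appeal of this class of check is that it needs no model: a path either exists or it doesn't, so the detector can be loud without being noisy.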
Result
The deterministic detectors do their job: stale references, dependency drift, and lint wrappers find real issues with low noise. The LLM-judgment detectors (semantic drift, test coherence, intent comparison) aren't there yet. Small local models hallucinate confidently, and the benchmark I used to validate them turned out to be too narrow, which I only noticed after the autonomous workflow gamed it. That failure is what motivated v2 of the workflow (see the autonomous Copilot post). Sentinel itself is paused while I rebuild the LLM layer with a stricter evaluation harness.
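For context on what a stricter evaluation harness could mean here, a rough sketch, assuming a hypothetical judge function that takes a doc excerpt and a code excerpt and returns a drift verdict. The harness scores the judge against labeled fixtures so that confident hallucinations show up as false positives rather than passing a narrow benchmark. All names are hypothetical; this harness does not exist yet.

```python
# Hypothetical evaluation harness sketch: run an LLM judge over labeled
# fixtures and report precision/recall, penalizing false positives.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Fixture:
    doc_excerpt: str   # what the docs claim
    code_excerpt: str  # what the code actually does
    is_drift: bool     # ground-truth label: do they genuinely disagree?

def evaluate(judge: Callable[[str, str], bool], fixtures: list[Fixture]) -> dict[str, float]:
    tp = fp = fn = 0
    for f in fixtures:
        verdict = judge(f.doc_excerpt, f.code_excerpt)
        if verdict and f.is_drift:
            tp += 1
        elif verdict and not f.is_drift:
            fp += 1
        elif not verdict and f.is_drift:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall, "false_positives": float(fp)}
```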