Current State
This project currently runs on personal hardware, personal time, and existing software subscriptions. That's sustainable for finishing the initial 365 analyses. It is not sustainable for what comes after — and what comes after is where the real value lives.
The tiers below describe what becomes possible at different resource levels. This isn't a promise tied to specific dollar amounts — it's a transparent look at the actual constraints and what relaxing them enables.
Resource Tiers
Foundation
Current operating level. Where we are now: one person, nights and weekends, personal hardware.
- Local embedding models for semantic search (running on personal GPU)
- Local Qdrant instance for the knowledge graph
- Claude inference via personal subscription
- Source archival to local disk
- Static site hosting (near-zero cost)
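The Foundation stack reduces to a small core: embed each source passage, store the vectors, and answer queries by nearest-neighbor similarity. A minimal, dependency-free sketch of that core follows — the three-dimensional vectors are placeholders standing in for real embedding-model output, and a plain Python list stands in for the Qdrant instance; none of this is the project's actual code.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "archive": (passage, embedding) pairs. Real embeddings would come
# from a local embedding model; these 3-d vectors are stand-ins.
archive = [
    ("Tariff claim traced to a 2019 USTR filing.", [0.9, 0.1, 0.0]),
    ("Budget figure cross-referenced against CBO data.", [0.1, 0.9, 0.2]),
    ("Quote verified against the hearing transcript.", [0.0, 0.2, 0.9]),
]

def semantic_search(query_vec, k=2):
    # Rank archived passages by similarity to the query embedding
    # and return the top-k passage texts.
    ranked = sorted(archive, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

print(semantic_search([0.85, 0.15, 0.05], k=1))
# → ['Tariff claim traced to a 2019 USTR filing.']
```

In the running system, Qdrant replaces the list and handles indexing at scale; the query path is conceptually identical.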
Constraint: Time. Every hour spent on infrastructure is an hour not spent on analysis. Research throughput is limited by one person's schedule.
Sustained Operations
Covering ongoing costs. Keeping the lights on reliably and removing fragility from the stack.
- Inference budget: API-level Claude access for research assistance (context-heavy queries against source material, batch verification, cross-reference discovery). Assisted research runs at roughly 3x the throughput of manual work.
- Server hosting: Moving Qdrant and the source archive off personal hardware onto a small dedicated server. Eliminates the single point of failure and enables persistent availability for future public-facing tools.
- Software tools: Subscriptions and API access for data sources, monitoring, and development tools that accelerate the work.
Unlocks: Reliable infrastructure, faster research cycles, the ability to maintain and update analyses after initial publication.
Expanded Capability
Hardware and local AI. Moving from rented inference to owned infrastructure.
- Hardware upgrades: Dedicated GPU compute for running local models — embeddings, classification, entity extraction — without per-query costs. Upfront investment that reduces long-term operating costs and eliminates dependency on external API availability.
- Local AI pipeline: Designing and running a local inference stack for the repetitive, high-volume parts of the research process — source ingestion, initial claim decomposition, entity recognition, cross-reference suggestions. Human judgment stays on the analytical decisions; the machine handles the mechanical ones.
- Expanded source access: PACER fees, subscription databases, FOIA request processing, and other primary-source access that costs real money.
Unlocks: Autonomous research pipelines, dramatically faster source processing, reduced per-analysis marginal cost, and the foundation for scaling the methodology to new domains.
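The pipeline split described above (mechanical stages automated, analytical decisions deferred to a human) can be sketched as a staged function with a review hook. The stage implementations here are deliberately naive placeholders, not the project's actual pipeline: real claim decomposition and entity recognition would run on local models.

```python
def decompose_claims(text):
    # Mechanical stage: naive split into candidate claims, one per sentence.
    return [s.strip() for s in text.split(".") if s.strip()]

def extract_entities(claim):
    # Mechanical stage: placeholder entity recognition. A local model would
    # do this in the real pipeline; here, any capitalized token counts.
    return [tok for tok in claim.split() if tok[0].isupper()]

def run_pipeline(source_text, review):
    # The machine handles the mechanical stages; the `review` callback is
    # the human decision point that accepts or rejects each candidate.
    results = []
    for claim in decompose_claims(source_text):
        record = {"claim": claim, "entities": extract_entities(claim)}
        if review(record):          # human judgment stays here
            results.append(record)
    return results

accepted = run_pipeline(
    "The Tariff Act was cited by Smith. Revenue rose sharply",
    review=lambda rec: len(rec["entities"]) >= 2,  # stand-in for a reviewer
)
print([r["claim"] for r in accepted])
# → ['The Tariff Act was cited by Smith']
```

The design point is the callback boundary: everything before it can run unattended for hours; everything after it waits for a person.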
Scale
From project to platform. The infrastructure exists to do this once; at scale, it becomes a system others can use.
- Local AI factory: Purpose-built local compute for running sustained, autonomous research sessions — multi-hour analysis runs against large document sets, with human review at decision points. The economics of local inference make this viable in a way that API costs do not.
- Public knowledge graph: Opening the entity registry, relationship map, and source archive as queryable public resources for journalists and researchers.
- Comparative analysis infrastructure: Applying the methodology to additional claim sets — prior administrations, state-level, corporate — with the tooling to support parallel investigations.
- Methodology toolkit: Packaging the research framework, provenance system, and validation tools so other organizations can stand up their own forensic analysis projects.
Unlocks: A replicable system for forensic journalism that doesn't depend on a single person's nights and weekends.
How I Work
I stay engaged with the tools that make this kind of work possible — not just the ones I'm using today, but the ones emerging in AI research, data journalism, and investigative tooling. The advancement plan isn't a fixed roadmap; it's a description of what becomes possible as constraints relax. I build what feels relevant and urgent, and I adapt as the tools and the needs evolve.
What stays constant: the evidence standard, the transparency, and the commitment to showing the work. The tools change. The principles don't.