Methodology

How the data gets here, and why you can trust it.

01 Sources

Three primary feeds

We pull from three federal sources: the Federal Register API for NOIs, NOAs, and procedural notices; the EPA EIS database for full DEISs and FEISs; and Regulations.gov for the public comments tied to each docket.

Every record keeps a stable source ID. Re-running an adapter merges fresh metadata in place. Nothing is invented.
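As a rough illustration of what "merges fresh metadata in place" can look like, here is a minimal sketch. It assumes records are keyed by the pair (source, source ID) and that a missing value from the feed should not erase a value we already hold. The store, the function name `upsert`, the field names, and the ID "FR-0001" are all hypothetical, not the production schema.

```python
from typing import Any

# Hypothetical in-memory store keyed by (source, source_id).
Store = dict[tuple[str, str], dict[str, Any]]


def upsert(store: Store, source: str, source_id: str, metadata: dict[str, Any]) -> dict[str, Any]:
    """Merge fresh metadata into the existing record, or create the record."""
    key = (source, source_id)
    record = store.setdefault(key, {"source": source, "source_id": source_id})
    # Assumption: None means "the feed did not supply this field", so keep the old value.
    record.update({k: v for k, v in metadata.items() if v is not None})
    return record


store: Store = {}
upsert(store, "federal_register", "FR-0001", {"title": "Notice of Intent", "type": "NOI"})
# Re-running the adapter updates the same record instead of creating a second one.
upsert(store, "federal_register", "FR-0001", {"title": "Notice of Intent (corrected)", "type": None})
```

Because the key never changes, re-ingesting the same notice is idempotent: the record count stays the same and only supplied fields move.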

02 Freshness

6-hour cadence

All three feeds re-ingest every six hours. New NOIs appear in the home table within hours of publication, not weeks. The footer shows the last successful ingest.
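The six-hour cadence reduces to simple timestamp arithmetic. The sketch below assumes the last successful ingest is stored as a timezone-aware UTC timestamp; the function names are illustrative and say nothing about how the actual scheduler is built.

```python
from datetime import datetime, timedelta, timezone

CADENCE = timedelta(hours=6)  # from the text: every feed re-ingests every six hours


def next_ingest_due(last_success: datetime) -> datetime:
    """Time at which the next ingest should have started."""
    return last_success + CADENCE


def is_overdue(last_success: datetime, now: datetime) -> bool:
    """True when more than one full cadence has passed since the last success."""
    return now > next_ingest_due(last_success)


last = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)  # example timestamp only
```

A reader can apply the same check to the timestamp in the footer: anything older than six hours suggests a feed missed its run.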

03 Deduplication

Embedding kNN + adjudication

A single project rarely appears only once. The same NEPA action surfaces as an NOI in the Federal Register, then a DEIS in the EPA database, then an NOA in the Federal Register again, plus a docket on Regulations.gov. We merge them.

New documents are embedded and matched by cosine distance against the centroids of existing projects. Clean matches fold in. Ambiguous distances run through a Sonnet adjudication step. Genuinely unclear cases land in a manual review queue; we do not guess.
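The routing described above can be sketched in a few lines. Everything numeric here is an assumption: the two thresholds are placeholders, not the tuned production values, and treating a distant document as a new project is our reading of the flow rather than something stated above. The `adjudicate` callable stands in for the model step and is expected to return "same", "different", or "unclear".

```python
import math
from typing import Callable, Optional

MATCH_MAX = 0.15      # assumed: at or below this distance, merge automatically
AMBIGUOUS_MAX = 0.35  # assumed: above this distance, treat as a new project


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 minus cosine similarity; zero vectors are treated as maximally unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def route(
    embedding: list[float],
    centroids: dict[str, list[float]],
    adjudicate: Callable[[str], str],
) -> tuple[str, Optional[str]]:
    """Return (decision, project_id) for a newly embedded document."""
    if not centroids:
        return ("new_project", None)
    # Find the nearest existing project centroid.
    project_id, dist = min(
        ((pid, cosine_distance(embedding, c)) for pid, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    if dist <= MATCH_MAX:
        return ("merge", project_id)
    if dist > AMBIGUOUS_MAX:
        return ("new_project", None)
    # Ambiguous band: ask the adjudicator, and queue anything it cannot settle.
    verdict = adjudicate(project_id)
    if verdict == "same":
        return ("merge", project_id)
    if verdict == "different":
        return ("new_project", None)
    return ("manual_review", project_id)


centroids = {"p1": [1.0, 0.0], "p2": [0.0, 1.0]}  # toy two-dimensional centroids
```

The point of the middle band is that the expensive step only runs when distance alone cannot decide, and an "unclear" verdict is never silently resolved either way.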

04 AI grounding

Every claim has a citation

Briefs and comment clusters are generated from the source PDFs, not from training data. Each line has a footnote pointing back to a specific section, page range, or comment paragraph.

If we can't ground a claim, we don't write it.

Every AI summary on envirodocket links its claims to a verbatim source quote. Click a footnote in any brief to see the literal text from the source filing, with a deep link to the page in the original PDF. The AI does the synthesis, but nothing is paraphrased without an audit trail you can verify in one click.
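One way to enforce "if we can't ground a claim, we don't write it" is a mechanical check that each quoted passage really appears on the cited page. The sketch below assumes page text is already extracted into a dictionary keyed by page number and that whitespace differences from PDF extraction should be ignored. The `Claim` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str   # the sentence that appears in the brief
    quote: str  # the verbatim source passage it rests on
    page: int   # page number in the source PDF


def normalize(s: str) -> str:
    """Collapse runs of whitespace so line breaks in the PDF do not cause misses."""
    return " ".join(s.split())


def is_grounded(claim: Claim, pages: dict[int, str]) -> bool:
    """True only if the quote is non-empty and found verbatim on the cited page."""
    quote = normalize(claim.quote)
    return bool(quote) and quote in normalize(pages.get(claim.page, ""))


def keep_grounded(claims: list[Claim], pages: dict[int, str]) -> list[Claim]:
    """Drop every claim whose quote cannot be located in the source."""
    return [c for c in claims if is_grounded(c, pages)]


# Toy example data, not taken from any real filing.
pages = {12: "The preferred alternative would\ndisturb 40 acres of wetland."}
claims = [
    Claim("The project disturbs 40 acres of wetland.", "disturb 40 acres of wetland", 12),
    Claim("The project has no wetland impact.", "no wetland impact", 12),
]
```

Here only the first claim survives, because the second quote does not occur on page 12.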

05 Anti-slop prose

Practitioner-edited

A NEPA practitioner reviewed the brief template. We stripped the AI tells: filler intros, "stands as a testament," symmetric three-item lists, and em-dashes. Briefs use a fixed five-paragraph structure so the same project gets the same shape every time.
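Some of these rules are mechanical enough to lint. The sketch below checks three of them: the five-paragraph shape, em-dashes, and a banned-phrase list. It assumes paragraphs are separated by blank lines, and the phrase list holds only the one example named above; a real list would be longer. Symmetric three-item lists and filler intros are harder to detect and are left out.

```python
BANNED_PHRASES = ("stands as a testament",)  # only the example from the text
EM_DASH = "\u2014"
PARAGRAPHS = 5  # from the text: briefs use a fixed five-paragraph structure


def lint_brief(brief: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the brief passes."""
    problems: list[str] = []
    # Assumption: paragraphs are separated by one blank line.
    paragraphs = [p for p in brief.split("\n\n") if p.strip()]
    if len(paragraphs) != PARAGRAPHS:
        problems.append(f"expected {PARAGRAPHS} paragraphs, found {len(paragraphs)}")
    if EM_DASH in brief:
        problems.append("contains an em-dash")
    lowered = brief.lower()
    for phrase in BANNED_PHRASES:
        if phrase in lowered:
            problems.append(f"contains banned phrase: {phrase!r}")
    return problems


good_brief = "\n\n".join(f"Paragraph {i}." for i in range(1, 6))
bad_brief = "This project stands as a testament \u2014 truly."
```

A lint like this catches regressions in generated text, but it complements practitioner review of the template and does not replace it.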

06 What we don't do

Yet

State NEPA equivalents (CEQA, SEQRA, MEPA) are not covered. Tribal consultations and FOIA filings are not indexed. Litigation tracking and sub-NEPA categorical exclusions are out of scope for v0.