An NLP-powered research tool analyzing publicly released Epstein case documents to surface patterns of evidence density versus legal accountability.
Plenty of incredible work has been done mapping who shows up in the Epstein files. Journalists, researchers, and open-source communities have built searchable archives, entity graphs, and document indexes. But nobody has built a way to actually measure the gap between evidence and accountability. That is what the Impunity Index does.
It takes the documentary footprint of every named individual in the corpus and cross-references it against whether they ever faced real consequences. The result is a single, corpus-derived metric that quantifies impunity: high evidence plus low accountability equals a high impunity score. We built this because the data was public but the pattern was not visible. We wanted to make it visible.
The Impunity Index is an NLP-powered tool that analyzes publicly available Epstein court documents and government records to surface patterns of evidence density versus legal accountability.
Each individual in the dataset receives an Evidence Index (0–10) based on how frequently and severely they appear across the document corpus. This is multiplied by a Consequence Modifier based on whether they faced legal consequences, producing a final Impunity Index score.
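The arithmetic can be sketched directly. A minimal illustration, where the 0.7 conviction modifier is the value cited for the Epstein example and the evidence score of 8.0 is purely hypothetical:

```python
# Minimal sketch of the scoring formula above. The 8.0 evidence score is
# illustrative; 0.7 is the conviction modifier cited in the Epstein example.

def impunity_score(evidence_index: float, consequence_modifier: float) -> float:
    """Impunity Index = Evidence Index (0-10) x Consequence Modifier."""
    if not 0.0 <= evidence_index <= 10.0:
        raise ValueError("Evidence Index must be in [0, 10]")
    return evidence_index * consequence_modifier

# Same evidence footprint, different outcomes:
no_consequences = impunity_score(8.0, 1.0)  # never charged -> 8.0
convicted = impunity_score(8.0, 0.7)        # conviction modifier -> 5.6
```

The multiplicative form is what makes the metric a gap measure: identical evidence yields different impunity depending only on whether consequences followed.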
The goal is simple: make it easier to see who had the most evidence against them and whether anything happened as a result. The gap between evidence and consequences is what we call impunity.
The Impunity Index does not measure guilt. It measures the gap between documentary evidence and legal accountability. A high score means the evidence outpaced the consequences. Here is how that works in practice.
Epstein has one of the highest evidence footprints in the entire corpus. But he was arrested, charged, and died in federal custody. The conviction modifier (×0.7) pulls his score down significantly. His score is not zero because the evidence density is extremely high, but the consequence meaningfully reduces his impunity.
Clinton appears frequently in the documents with significant evidence signals: 26+ documented flight legs, a black book entry, and high keyword co-occurrence. He has faced no criminal charges or legal consequences related to the Epstein case. With no consequence-modifier reduction, his raw evidence score translates almost directly into impunity.
It may seem counterintuitive that Epstein's score is lower than those of some of his associates. That is exactly the point: the Impunity Index measures the gap, not the guilt.
We process over 1.4 million pages from DOJ/EFTA releases, court filings, depositions, and publicly available Epstein case documents. Text is extracted programmatically from PDFs and organized into a searchable corpus.
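The "organized into a searchable corpus" step can be sketched as a toy inverted index. The record fields (`doc_id`, `text`) and the EFTA-style identifiers below are illustrative assumptions about the pipeline's internal layout, not its actual schema:

```python
import re
from collections import defaultdict

def build_index(corpus):
    """Map each token to the set of documents containing it.

    `corpus` is a list of {"doc_id": ..., "text": ...} records, the kind
    of output a PDF text-extraction step might produce."""
    index = defaultdict(set)
    for doc in corpus:
        for token in re.findall(r"[a-z0-9']+", doc["text"].lower()):
            index[token].add(doc["doc_id"])
    return index

corpus = [
    {"doc_id": "EFTA-0001", "text": "Deposition mentioning flight logs."},
    {"doc_id": "EFTA-0002", "text": "Court filing; no flight records."},
]
index = build_index(corpus)
# index["flight"] -> {"EFTA-0001", "EFTA-0002"}
```

A real corpus of 1.4 million pages would use a proper search engine rather than an in-memory dict, but the document-to-token mapping is the same idea.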
Named Entity Recognition (spaCy) identifies mentions of individuals across the document corpus. Each person is linked to the specific documents where they appear, along with contextual information about the nature of each mention.
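The linking step can be sketched independently of the NER model itself. A minimal version, assuming entity tuples in the shape a spaCy PERSON pass would yield (the field names and example rows are illustrative):

```python
from collections import defaultdict

def link_persons(entity_rows):
    """Group PERSON mentions by individual.

    `entity_rows` are (doc_id, entity_text, entity_label, context) tuples,
    the kind of output a spaCy NER pass over each document would produce."""
    mentions = defaultdict(list)
    for doc_id, text, label, context in entity_rows:
        if label == "PERSON":  # keep people, drop orgs, places, etc.
            mentions[text].append({"doc_id": doc_id, "context": context})
    return dict(mentions)

rows = [
    ("EFTA-0001", "Jane Doe", "PERSON", "deposition witness"),
    ("EFTA-0001", "Acme Corp", "ORG", "employer of record"),
    ("EFTA-0002", "Jane Doe", "PERSON", "named in flight log"),
]
profiles = link_persons(rows)
# profiles["Jane Doe"] holds two mentions across two documents
```

In practice this step also needs alias resolution ("W. Clinton" vs "William Clinton"), which the sketch omits.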
Three model families produce independent evidence signals: Logistic Regression on tabular NLP features, Random Forest with TF-IDF text features, and Sentence Transformers (all-MiniLM-L6-v2) combined with Support Vector Classification for semantic analysis. Legal-BERT was evaluated but failed due to insufficient training data — that failure is documented as a finding.
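The consensus across families can be sketched as a majority vote. This is a toy illustration with synthetic data, assuming scikit-learn; in the real pipeline the SVC consumes all-MiniLM-L6-v2 sentence embeddings, for which a TF-IDF matrix stands in here to keep the example self-contained:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny synthetic stand-in data: two evidence-bearing and two routine snippets.
texts = [
    "deposition describes repeated flights and payments",
    "flight log lists the individual on multiple legs",
    "press release about an unrelated charity gala",
    "routine scheduling email with a passing mention",
]
tabular = np.array([[5, 3], [4, 2], [0, 0], [1, 0]])  # e.g. doc count, flight legs
y = np.array([1, 1, 0, 0])

X_text = TfidfVectorizer().fit_transform(texts).toarray()

votes = np.stack([
    LogisticRegression().fit(tabular, y).predict(tabular),                  # family 1: tabular features
    RandomForestClassifier(random_state=0).fit(X_text, y).predict(X_text),  # family 2: TF-IDF text
    SVC().fit(X_text, y).predict(X_text),                                   # family 3: semantic (embeddings in the real pipeline)
])
consensus = (votes.sum(axis=0) >= 2).astype(int)  # signal only when >= 2 of 3 families agree
```

Requiring agreement across families is what lets one family's failure mode (like the documented Legal-BERT data shortage) be observed rather than silently propagated.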
The Evidence Index combines six features using log-scaled, percentile-capped normalization: email/EFTA document count, DOJ corpus mentions, keyword co-occurrence with incriminating terms, flight log entries, person-to-person connections, and black book presence. The ML Evidence Signals shown on each profile represent consensus across three model families, not a single model’s output.
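The normalization itself is easy to illustrate. A sketch assuming numpy; the text specifies log scaling and percentile capping but not the constants, so the 95th-percentile cap here is an illustrative assumption:

```python
import numpy as np

def normalize_feature(values, cap_percentile=95):
    """Log-scale raw counts, cap at a high percentile, rescale to 0-10."""
    v = np.log1p(np.asarray(values, dtype=float))  # log scaling tames heavy-tailed counts
    cap = np.percentile(v, cap_percentile)
    if cap <= 0:
        return np.zeros_like(v)
    return 10 * np.minimum(v, cap) / cap           # percentile cap, then map to 0-10

# One feature column (e.g. corpus mentions) across five hypothetical individuals:
mentions = [0, 3, 40, 400, 12000]
scores = normalize_feature(mentions)  # the extreme outlier is capped, not dominant
```

Without the log and the cap, a single individual with tens of thousands of mentions would compress everyone else's scores toward zero; this is the "arbitrary score inflation" the methodology is designed to avoid.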
Documents are scored for relevance using semantic similarity (sentence-transformer embeddings), not just keyword matching. A document containing detailed allegations scores higher than a passing name mention. Document summaries shown on profiles are extractive (first sentences of the source document) and should be verified against original source documents.
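Both pieces of this paragraph have simple cores. A sketch assuming numpy: in the real pipeline the vectors come from a sentence-transformer model, so plain vectors stand in here, and the summary function is the first-sentences extraction described above:

```python
import numpy as np

def relevance(doc_emb, query_emb):
    """Cosine similarity between two embeddings. In the real pipeline these
    are sentence-transformer vectors; here they are plain stand-in arrays."""
    doc_emb = np.asarray(doc_emb, dtype=float)
    query_emb = np.asarray(query_emb, dtype=float)
    return float(doc_emb @ query_emb /
                 (np.linalg.norm(doc_emb) * np.linalg.norm(query_emb)))

def extractive_summary(text, n_sentences=2):
    """First-sentences extractive summary, as described above."""
    sentences = [s.strip() for s in text.split(". ") if s.strip()]
    return ". ".join(sentences[:n_sentences]).rstrip(".") + "."
```

Because relevance is computed in embedding space, a document discussing allegations in different words can still outrank one that merely contains the name as a keyword match.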
Some document links in the app may not resolve. Our DOJ document reference numbers (EFTA numbers) are accurate and can be used to look up the original documents on the DOJ Epstein files page.
Many individuals referenced in the Epstein files appeared as witnesses, were mentioned in passing, or had legitimate professional relationships. Being named in a document does not indicate involvement in criminal activity.
Summaries, scores, and classifications are produced by machine learning models and may contain errors or misinterpretations. Always refer to the original source documents for authoritative information. The Impunity Index is a research tool, not a factual determination.
This site is for informational and academic purposes only and does not constitute legal advice or official legal analysis.
All text was extracted programmatically from PDFs. Handwritten documents, image-embedded text, and certain file formats may not be fully captured. The dataset is not comprehensive — it represents the publicly released portion of Epstein case files.
Multiple models were used across the pipeline, each with its own biases and failure modes. While consensus scoring mitigates model-specific error, scores are not perfect and should be interpreted as directional indicators, not precise measurements.
This project was built for Duke AIPI by Lindsay Gross, Shreya Mendi, and Andrew Jin.
We built on the shoulders of others who believed this information should be accessible to the public:
Our goal is transparency about accountability, not accusation.
We believe the public has a right to understand patterns in publicly released legal documents. These are court records, government filings, and depositions that have already been made public through legal proceedings and FOIA requests. We are not revealing private information — we are making existing public information more accessible and analyzable.
We designed the scoring methodology to be evidence-based and reproducible, not sensationalized. Every score traces back to specific document features that can be independently verified. We chose log-scaled, percentile-capped normalization specifically to avoid arbitrary score inflation.
We welcome scrutiny of our methodology and corrections to our data. If you find an error in our analysis or believe a score is miscalibrated, we want to know. The code is open-source and the methodology is documented.