NLP-Driven Investigation

THE IMPUNITY INDEX

Analyzing 1.4 million Epstein case documents with machine learning to measure the gap between evidence and justice across 1,264 individuals.

Connection Network

Co-occurrence relationships between individuals across case documents. Node size reflects impunity index. Click any node for details.
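
As a rough illustration of how co-occurrence edges like these could be counted, the sketch below treats two people as connected once for every document that mentions both. The function name, the input shape (one list of names per document), and the placeholder names are assumptions made for the example, not details taken from the project.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(doc_mentions):
    """Count, for each pair of people, how many documents mention both."""
    edges = Counter()
    for people in doc_mentions:
        # Deduplicate and sort so each pair has one canonical (a, b) order.
        for a, b in combinations(sorted(set(people)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical placeholder names, not drawn from the case documents.
docs = [
    ["Person A", "Person B", "Person C"],
    ["Person B", "Person A"],
    ["Person C"],
]
edges = cooccurrence_edges(docs)
```

Each key in `edges` is an undirected pair and each value is a candidate edge weight for the network graph.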

The Individuals

Geographic Distribution

Individuals mapped by country of origin. Node size reflects average impunity index. Click a country to filter individuals.

Semantic Space

t-SNE projection of 384-dimensional person embeddings built from 1.4M case documents. Each point is a person; proximity reflects similar document contexts, and color encodes impunity level.
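
A projection of this kind can be sketched with scikit-learn's `TSNE`. The embeddings below are random stand-ins, and the perplexity, initialization, and seed are assumptions chosen only so the example runs; the settings used for the chart are not stated here.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-in for person embeddings: 50 people, 384 dimensions each.
embeddings = rng.normal(size=(50, 384))

# Perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
coords = tsne.fit_transform(embeddings)  # one (x, y) point per person
```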

Models & Evaluation

Three complementary ML approaches evaluated with multiple metrics and stress tests to understand model limitations and guide improvement.

ML Pipeline

01

Feature Extraction

7 tabular features per person: Epstein email/EFTA document count, DOJ corpus mentions, keyword co-occurrence with abuse-related terms, Epstein flight legs, person-to-person connections, black book inclusion, and a raw evidence index.
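
One way to hold these seven features per person is a small record type that flattens to a numeric vector. This is only a sketch: the class name, the field names, and the example values are hypothetical, and how the raw evidence index is computed is not specified above.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class PersonFeatures:
    # Field names are hypothetical; the order follows the feature list above.
    email_doc_count: int
    doj_mentions: int
    keyword_cooccurrence: int
    flight_legs: int
    connections: int
    in_black_book: bool
    raw_evidence_index: float

    def to_vector(self):
        """Return the 7 features as a flat list of floats."""
        return [float(v) for v in astuple(self)]


# Made-up values for illustration only.
example = PersonFeatures(12, 340, 5, 3, 18, True, 0.42)
vector = example.to_vector()
```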

02

Semantic Embeddings

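all-MiniLM-L6-v2 (SentenceTransformer) encodes each person's full document evidence text into a single 384-dimensional vector. Because the evidence text can be longer than the model accepts in one pass, it is split into sliding windows; each window is encoded and the window vectors are mean-pooled, capturing contextual meaning across 1.4 million documents.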
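
Sliding-window mean pooling can be sketched as follows. The window size, the stride, and the choice to split on whitespace are assumptions; `encode` stands for any callable that maps a string to a fixed-length vector, and a toy encoder is used so the example runs without downloading a model.

```python
def sliding_windows(words, size, stride):
    """Split a word list into overlapping windows of at most `size` words."""
    if len(words) <= size:
        return [words]
    starts = range(0, len(words) - size + stride, stride)
    return [words[s:s + size] for s in starts]


def embed_person(text, encode, size=200, stride=100):
    """Encode each window of `text`, then average the window vectors."""
    chunks = [" ".join(w) for w in sliding_windows(text.split(), size, stride)]
    vectors = [encode(chunk) for chunk in chunks]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def toy_encode(chunk):
    # Stand-in encoder for the demo: (word count, constant 1.0).
    return [float(len(chunk.split())), 1.0]


text = " ".join(f"w{i}" for i in range(10))
pooled = embed_person(text, toy_encode, size=4, stride=2)
```

With the real model, the `encode` method of a loaded `SentenceTransformer` could be passed in place of `toy_encode`, which would make the pooled vector 384-dimensional.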

03

Classification

Two models are trained on labeled individuals. The first is a Logistic Regression on the 7 tabular features (class-balanced, with StandardScaler). The second, ST + SVC, concatenates the 384-dim embeddings with the 7 tabular features into a 391-dim space, which is classified by a calibrated LinearSVC.
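
The two classifiers described above might look roughly like this in scikit-learn. The data is synthetic, and the hyperparameters (`max_iter`, `cv=3`, default sigmoid calibration) as well as the decision to leave the combined features unscaled are assumptions for the sketch, not the project's confirmed configuration.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic stand-in data: 60 labeled people, 7 tabular features, 384-dim embeddings.
n = 60
y = np.array([0, 1] * (n // 2))
X_tab = rng.normal(size=(n, 7)) + y[:, None]  # shift class 1 so there is signal
X_emb = rng.normal(size=(n, 384))

# Model 1: class-balanced logistic regression on standardized tabular features.
logreg = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
logreg.fit(X_tab, y)

# Model 2: embeddings + tabular features (384 + 7 = 391 dims), calibrated LinearSVC.
X_combined = np.hstack([X_emb, X_tab])
svc = CalibratedClassifierCV(LinearSVC(), cv=3)
svc.fit(X_combined, y)

# Calibrated probability of the positive class from each model.
p_logreg = logreg.predict_proba(X_tab)[:, 1]
p_svc = svc.predict_proba(X_combined)[:, 1]
```

`LinearSVC` has no `predict_proba` of its own, which is why it is wrapped in `CalibratedClassifierCV` here.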

04

Consensus

Final probability is the mean of both models' calibrated scores. Inference runs over all 1,264 individuals, producing per-person consequence probabilities that surface patterns invisible to simple rule-based approaches.
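
The consensus step is a plain average of the two calibrated scores. A minimal sketch, with hypothetical names and made-up scores:

```python
def consensus_probability(p_logreg, p_svc):
    """Average two calibrated probabilities for one person."""
    for p in (p_logreg, p_svc):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return (p_logreg + p_svc) / 2.0


# Made-up scores keyed by placeholder names: (logistic regression, ST + SVC).
scores = {"Person A": (0.25, 0.75), "Person B": (0.5, 1.0)}
consensus = {
    name: consensus_probability(a, b) for name, (a, b) in scores.items()
}
```

An unweighted mean gives both models equal say, so a person scores high only when neither model strongly disagrees.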

Feature Ablation Study

How much does each feature contribute? Each bar shows the F1 score after one feature is removed and the model is retrained. The dashed line marks the full-model baseline.
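
A leave-one-feature-out ablation can be sketched as a loop that drops one column at a time and re-scores the model. The use of logistic regression, 5-fold cross-validation, and the synthetic data below are assumptions; the study's actual model and evaluation protocol are not stated here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def mean_f1(X, y):
    """Cross-validated F1 of a class-balanced logistic regression."""
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )
    return cross_val_score(model, X, y, cv=5, scoring="f1").mean()


def ablation(X, y, names):
    """Return the full-model F1 and the F1 with each feature removed."""
    baseline = mean_f1(X, y)
    dropped = {
        name: mean_f1(np.delete(X, i, axis=1), y)
        for i, name in enumerate(names)
    }
    return baseline, dropped


# Synthetic data in which only feature_0 carries signal.
rng = np.random.default_rng(0)
y = np.array([0, 1] * 40)
X = rng.normal(size=(80, 7))
X[:, 0] += 2.0 * y
names = [f"feature_{i}" for i in range(7)]
baseline, dropped = ablation(X, y, names)
```

A feature whose removal pulls F1 well below `baseline` is one the model depends on; removing a redundant feature leaves the score roughly unchanged.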

The Gap

Does evidence of involvement predict real-world consequences?

Impunity Gap: Evidence vs. Consequences

Content Warning

This site contains AI-generated analysis of publicly released court documents related to the Jeffrey Epstein case. Some content describes allegations of sexual abuse, trafficking, and exploitation of minors. This material may be disturbing.

This tool is for research and informational purposes only. It does not constitute legal advice or imply guilt of any named individual.
