Analyzing 1.4 million Epstein case documents with machine learning to measure the gap between evidence and justice across 1,264 individuals.
Co-occurrence relationships between individuals across case documents. Node size reflects impunity index. Click any node for details.
Individuals mapped by country of origin. Node size reflects average impunity index. Click a country to filter individuals.
t-SNE projection of 384-dimensional person embeddings built from 1.4M case documents. Each point is a person; proximity reflects similar document contexts. Color indicates impunity level.
Three complementary ML approaches evaluated with multiple metrics and stress tests to understand model limitations and guide improvement.
7 tabular features per person: Epstein email/EFTA document count, DOJ corpus mentions, keyword co-occurrence with abuse-related terms, Epstein flight legs, person-to-person connections, black book inclusion, and a raw evidence index.
all-MiniLM-L6-v2 (SentenceTransformer) encodes each person's full document evidence text into a 384-dimensional vector via sliding-window mean pooling, capturing contextual meaning across 1.4 million documents.
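A minimal sketch of sliding-window mean pooling, to make the step above concrete. The window size, stride, and whitespace tokenization are assumptions (the site does not state them), and `encode` is a hypothetical callable standing in for the sentence encoder. With the real model it could be something like `lambda s: SentenceTransformer("all-MiniLM-L6-v2").encode(s).tolist()`.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def sliding_windows(tokens: Sequence[str], window: int, stride: int) -> List[List[str]]:
    """Split tokens into chunks of up to `window` tokens, advancing by `stride`."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if not tokens:
        return []
    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + window]))
        if start + window >= len(tokens):
            break  # the last window reached the end of the text
        start += stride
    return windows


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a list of equal-length vectors element by element."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_person(
    text: str,
    encode: Callable[[str], Vector],
    window: int = 256,   # assumed value, not taken from the site
    stride: int = 128,   # assumed value, not taken from the site
) -> Vector:
    """Embed each window of a person's evidence text, then average the results."""
    chunks = [" ".join(w) for w in sliding_windows(text.split(), window, stride)]
    if not chunks:
        raise ValueError("no text to embed")
    return mean_pool([encode(chunk) for chunk in chunks])
```

Averaging over windows lets a fixed-length encoder cover text far longer than its input limit, at the cost of weighting every window equally.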
Two models are trained on labeled individuals. The first is a Logistic Regression on the 7 tabular features (class-balanced, with StandardScaler). The second, ST + SVC, concatenates the 384-dim embeddings with the tabular features into a 391-dim space, which is classified by a calibrated LinearSVC.
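The 391-dim input for the second model can be sketched as a plain concatenation of the two feature sets. This is an illustration only: whether the tabular features are standardized before being joined to the embeddings is an assumption here (the site mentions StandardScaler only for the Logistic Regression), and the helper names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def standardize_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Z-score each column (population std); constant columns are only centered."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def combine_features(
    embeddings: Sequence[Sequence[float]],  # one 384-dim vector per person
    tabular: Sequence[Sequence[float]],     # one 7-feature row per person
) -> List[List[float]]:
    """Concatenate each embedding with its standardized tabular row (384 + 7 = 391)."""
    if len(embeddings) != len(tabular):
        raise ValueError("embeddings and tabular rows must align per person")
    scaled = standardize_columns(tabular)  # assumption: scaling happens before joining
    return [list(e) + t for e, t in zip(embeddings, scaled)]
```

Putting the tabular columns on a comparable scale keeps large raw counts from dominating a linear classifier's margin.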
The final probability is the mean of both models' calibrated scores. Inference runs over all 1,264 individuals, producing per-person consequence probabilities that can surface patterns simple rule-based approaches miss.
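The averaging step can be written in a few lines. The function names and the dictionary layout (person name mapped to calibrated probability) are hypothetical; only the unweighted mean of the two scores comes from the description above.

```python
from typing import Dict


def ensemble_probability(p_logreg: float, p_svc: float) -> float:
    """Unweighted mean of two calibrated probabilities."""
    for p in (p_logreg, p_svc):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return (p_logreg + p_svc) / 2.0


def score_all(
    logreg_scores: Dict[str, float],
    svc_scores: Dict[str, float],
) -> Dict[str, float]:
    """Combine per-person scores from both models; both dicts must share keys."""
    if logreg_scores.keys() != svc_scores.keys():
        raise ValueError("both models must score the same individuals")
    return {
        name: ensemble_probability(logreg_scores[name], svc_scores[name])
        for name in logreg_scores
    }
```

Averaging is only meaningful because both inputs are calibrated probabilities; raw SVC decision values would not be on the same scale as Logistic Regression outputs.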
How much does each feature contribute? Each bar shows the F1 score with that feature removed; the dashed line marks the full-model baseline.
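A leave-one-feature-out ablation of this kind can be sketched as follows. The `evaluate` argument is a hypothetical callable that trains a model on the given columns and returns an F1 score (ideally on held-out data or via cross-validation); the site does not describe its evaluation protocol, so that part is left to the caller.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ablation_f1(
    feature_names: Sequence[str],
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    evaluate: Callable[[List[List[float]], Sequence[int]], float],
) -> Tuple[float, Dict[str, float]]:
    """Return the full-model F1 and the F1 obtained with each feature dropped."""
    full = [list(row) for row in rows]
    baseline = evaluate(full, labels)  # the dashed line in the chart
    without = {}
    for i, name in enumerate(feature_names):
        reduced = [row[:i] + row[i + 1:] for row in full]  # drop column i
        without[name] = evaluate(reduced, labels)          # one bar per feature
    return baseline, without
```

A bar well below the baseline suggests the model leans on that feature; a bar at or above it suggests the feature is redundant or noisy given the others.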
Does evidence of involvement predict real-world consequences?
This site contains AI-generated analysis of publicly released court documents related to the Jeffrey Epstein case. Some content describes allegations of sexual abuse, trafficking, and exploitation of minors. This material may be disturbing.
This tool is for research and informational purposes only. It does not constitute legal advice or imply guilt of any named individual.