Project-Eyrie/redtrail

REDTRAIL

Reddit archive search and OSINT tool that aggregates user history from multiple archive sources into a single searchable interface with built-in forensic extractors.


Overview

REDTRAIL is a Reddit user intelligence tool built for researchers and analysts. It solves the problem of fragmented Reddit archives by querying multiple data sources simultaneously, deduplicating results, and presenting a unified view with analytics and pattern-based forensic extractors.


Features

  • Multi-Source Aggregation: queries the PullPush and Arctic Shift APIs in parallel, merges results by Reddit fullname ID, and tags each item with its origin
  • Exhaustive Auto-Fetch: drains every available page from every source on lookup, with per-source rate limiting; a full-screen loader with a stop control blocks the UI until the fetch finishes
  • Full-Text Search & Filtering: filter by keyword (regex optional; scans title / body / subreddit / url / flair / author), multi-select subreddit chips, date range, score range, NSFW / edited / post-type toggles, source toggles, and controversial-only; sort by date / score / subreddit / comments
  • Click-to-Open Item View: Reddit-styled modal with vote column, inline image / video / link preview, full body, copy-text button, and live + archived deep links
  • Analytics Dashboard: activity heatmap, subreddit breakdown, domain analysis, sentiment timeline, word frequency, activity gaps, and subreddit migration tracking
  • Forensic Extractors: PII / OPSEC scanner with weighted risk score, external link extractor grouped by domain, mentioned-users panel, and biographical-fragment extractor, all driven by user-editable regex patterns persisted to localStorage
  • Data Export: download all (or just filtered) items as JSON, CSV, Markdown, or plain URLs, or copy any of those to the clipboard

How to Use

About the App

REDTRAIL is a single-page web app. Enter a Reddit username to fetch their archived post and comment history from PullPush and Arctic Shift. The app drains every available page from each source automatically; a full-screen loader shows live counts and stays visible until the fetch completes (or you stop it). Once loaded, results appear in a fixed-height two-pane layout: filtered content list on the left, compact analytics sidebar on the right. The full analytics + forensics suite opens in a modal.
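
The drain-until-done behaviour described above can be sketched roughly as follows. This is a minimal illustration, not REDTRAIL's actual code: the `ArchivePage` shape, the `FetchPage` callback, and the `drainSource` name are all hypothetical, and the real PullPush and Arctic Shift responses use their own formats and cursors.

```typescript
// Hypothetical page shape; real archive responses differ per source.
interface ArchivePage {
  items: { id: string; createdUtc: number }[];
  // Cursor for the next request; null once the source is exhausted.
  nextCursor: number | null;
}

// Assumed callback that performs one HTTP request for one source.
type FetchPage = (cursor: number | null) => Promise<ArchivePage>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Page through one source until it runs dry or the caller asks to stop.
async function drainSource(
  fetchPage: FetchPage,
  opts: {
    delayMs: number; // pause between requests (per-source rate limiting)
    shouldStop: () => boolean; // wired to the loader's stop button
    onProgress?: (count: number) => void; // feeds the live counters
  },
): Promise<ArchivePage["items"]> {
  const collected: ArchivePage["items"] = [];
  let cursor: number | null = null;
  while (!opts.shouldStop()) {
    const page = await fetchPage(cursor);
    collected.push(...page.items);
    opts.onProgress?.(collected.length);
    if (page.nextCursor === null || page.items.length === 0) break;
    cursor = page.nextCursor;
    await sleep(opts.delayMs);
  }
  return collected;
}
```

Under these assumptions, querying both archives in parallel would amount to passing one `drainSource` call per source to `Promise.all`, each with its own delay.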

Interface

| Area | Description |
| --- | --- |
| Search bar | Enter a Reddit username and hit lookup to begin fetching |
| Fetch loader | Full-screen blocker shown while drain is in flight; per-source counters and stop button |
| Summary strip | Always-visible row above the filter panel: post / comment counts, date range, NSFW %, edited count, top subs |
| Filter panel | Text / regex search, sort, source toggles, plus advanced section with date range, score range, multi-select subreddit chips, NSFW / edited / post-type toggles, controversial-only |
| Tabs | Switch between posts, comments, or combined view |
| Content list | Scrollable feed of post and comment cards in a fixed-height container; click any card to open the full-view modal |
| Item modal | Reddit-styled full view of a single post or comment with score, body, source badges, and live + archived links |
| Analytics sidebar | Inline summary: account stats, forensics teaser, activity heatmap, top subreddits, link to full analysis |
| Analytics modal | Two-tab popup containing the full stats suite and the forensic extractors |
| Export | Download all or filtered data as JSON, CSV, Markdown, URLs.txt, or copy any of those to the clipboard |

Source Badges

Each post and comment displays coloured badges indicating which archive sources returned it. Items found in multiple sources show multiple badges, confirming cross-source availability.

Forensics Tab

Opened from the [full analysis] button or the sidebar teaser. Each panel scans every post / comment body for OSINT signals and links matches back to the source post.

| Panel | Description |
| --- | --- |
| OPSEC scan | PII pattern matches grouped by category (email, phone, name, age, location, work, school, family, money, vehicle, pet, birthday, IP, SSN, postcode, credit card) with a weighted risk score (none / low / medium / high) |
| External links | Every https://… URL found in body text, grouped by domain, expandable to see individual URLs and their source posts |
| Mentions | Other Reddit users this account references (u/handle and @handle); click a name to filter, expand to see context snippets |
| Biographical fragments | Sentences that start with self-statement patterns (I'm a…, My job…, When I was…); click to jump to the source |

Each panel has an inline [ patterns ] editor that lists the active regex patterns, lets you toggle built-ins on / off, add custom patterns, and edit blacklists. All edits persist to localStorage.


Theory and Background

Core Concept

Reddit's official API limits history retrieval to roughly the most recent 1,000 items per user. Archive services like PullPush (successor to Pushshift) and Arctic Shift independently index Reddit's full public firehose, preserving content even after deletion or removal.

REDTRAIL queries these archives in parallel by author name and exhaustively pages through every result. Each source returns items in its own format — REDTRAIL normalises them into a unified schema and deduplicates by Reddit's fullname ID system (t3_ prefix for posts, t1_ for comments). When the same item appears in multiple sources, the sources are merged into a single record tagged with all origins.
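
As an illustration of the normalise-then-deduplicate step, here is a minimal sketch. The `RawItem` and `UnifiedItem` shapes, the source labels, and the function names are assumptions made for the example; only the `t3_` / `t1_` fullname prefixes and the merge-by-fullname idea come from the description above.

```typescript
type SourceName = "pullpush" | "arcticshift"; // illustrative labels

// Hypothetical per-source item after field mapping.
interface RawItem {
  kind: "post" | "comment";
  id: string; // may or may not already carry the fullname prefix
  author: string;
  body: string;
  createdUtc: number;
}

// Hypothetical unified record; one per Reddit fullname.
interface UnifiedItem extends RawItem {
  fullname: string; // "t3_<id>" for posts, "t1_<id>" for comments
  sources: SourceName[];
}

function toFullname(kind: RawItem["kind"], id: string): string {
  const prefix = kind === "post" ? "t3_" : "t1_";
  return id.indexOf(prefix) === 0 ? id : prefix + id;
}

// Merge batches from several sources, keeping one record per fullname
// and recording every source that returned it.
function mergeSources(
  batches: { source: SourceName; items: RawItem[] }[],
): UnifiedItem[] {
  const byFullname = new Map<string, UnifiedItem>();
  for (const batch of batches) {
    for (const item of batch.items) {
      const fullname = toFullname(item.kind, item.id);
      const existing = byFullname.get(fullname);
      if (existing) {
        if (existing.sources.indexOf(batch.source) === -1) {
          existing.sources.push(batch.source);
        }
      } else {
        byFullname.set(fullname, {
          kind: item.kind,
          id: item.id,
          author: item.author,
          body: item.body,
          createdUtc: item.createdUtc,
          fullname,
          sources: [batch.source],
        });
      }
    }
  }
  return Array.from(byFullname.values());
}
```

In this sketch the first source to return an item wins for the field values, and later sources only add their tag. Whether REDTRAIL prefers one source's fields over another's is not stated here.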

Analytics

All analytics are computed client-side from the fetched data:

  • Activity heatmap — 7-day x 24-hour UTC grid showing posting frequency by day and hour, with the busiest 4-hour window highlighted
  • Subreddit stats — top communities by post / comment count, toggleable to sort by total score, click to filter
  • Domain analysis — most frequently linked external domains from non-self posts
  • Activity gaps — periods of inactivity exceeding 30 days
  • Subreddit migration — quarterly breakdown of top 5 communities to show how participation shifts over time
  • Sentiment timeline — monthly sentiment scores using positive / negative keyword counting
  • Word frequency — most common non-stopwords across all content, click any word to filter
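
Two of the analytics above, the activity heatmap and the activity gaps, can be sketched from nothing more than a list of UTC timestamps. The function names are hypothetical, and two details are assumptions: that the busiest 4-hour window is computed on hours summed across all days, and that it may wrap past midnight.

```typescript
// Build a 7 x 24 grid of counts: rows are UTC weekdays (0 = Sunday),
// columns are UTC hours. Timestamps are Unix seconds.
function buildHeatmap(timestamps: number[]): number[][] {
  const grid: number[][] = [];
  for (let day = 0; day < 7; day++) {
    grid.push(new Array<number>(24).fill(0));
  }
  for (const ts of timestamps) {
    const date = new Date(ts * 1000);
    grid[date.getUTCDay()][date.getUTCHours()] += 1;
  }
  return grid;
}

// Start hour of the busiest window of `width` consecutive hours,
// summed across all weekdays (assumption) and wrapping at midnight.
function busiestWindow(grid: number[][], width: number = 4): number {
  const hourly = new Array<number>(24).fill(0);
  for (const row of grid) {
    for (let hour = 0; hour < 24; hour++) hourly[hour] += row[hour];
  }
  let bestStart = 0;
  let bestTotal = -1;
  for (let start = 0; start < 24; start++) {
    let total = 0;
    for (let k = 0; k < width; k++) total += hourly[(start + k) % 24];
    if (total > bestTotal) {
      bestTotal = total;
      bestStart = start;
    }
  }
  return bestStart;
}

// Periods of inactivity longer than `minDays` between consecutive items.
function findGaps(
  timestamps: number[],
  minDays: number = 30,
): { from: number; to: number; days: number }[] {
  const sorted = timestamps.slice().sort((a, b) => a - b);
  const gaps: { from: number; to: number; days: number }[] = [];
  for (let i = 1; i < sorted.length; i++) {
    const days = (sorted[i] - sorted[i - 1]) / 86400;
    if (days > minDays) {
      gaps.push({ from: sorted[i - 1], to: sorted[i], days: Math.floor(days) });
    }
  }
  return gaps;
}
```

When several windows tie, this sketch returns the earliest start hour.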

Forensics

All forensic extractors are pattern-based, run client-side over body text, and surface matches back to their source items:

  • PII / OPSEC: 26 built-in regex patterns across 16 categories with per-pattern weights summed into a doxxing-risk score
  • External links: extracts every URL outside the blacklist (Reddit + image hosts), groups by domain
  • Mentions: u/handle, /u/handle, /user/handle, and @handle patterns, ranked by frequency, with self and bot blacklist filtering
  • Biographical fragments: 12 sentence-prefix patterns matched against split body text, scored by length / proper-noun / digit presence
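
To make the mentions extractor concrete, the sketch below counts handle references across body text and drops blacklisted names. The regex is an illustrative guess rather than one of REDTRAIL's built-in patterns; it assumes handles are 3 to 20 characters of letters, digits, underscores, or hyphens, and it skips matches glued to a preceding word character so that email addresses are not counted.

```typescript
// Illustrative pattern: u/handle, /u/handle, /user/handle, or @handle,
// not preceded by a word character or slash.
const MENTION_RE = /(?:^|[^\w\/])(?:\/?u\/|\/user\/|@)([A-Za-z0-9_-]{3,20})/g;

// Count mentions across all bodies, skip blacklisted handles (for example
// the account itself and known bots), and rank by frequency.
function rankMentions(
  bodies: string[],
  blacklist: string[],
): { handle: string; count: number }[] {
  const blocked = blacklist.map((h) => h.toLowerCase());
  const counts = new Map<string, number>();
  for (const body of bodies) {
    MENTION_RE.lastIndex = 0; // reset the shared global regex per body
    let match: RegExpExecArray | null;
    while ((match = MENTION_RE.exec(body)) !== null) {
      const handle = match[1].toLowerCase();
      if (blocked.indexOf(handle) !== -1) continue;
      counts.set(handle, (counts.get(handle) ?? 0) + 1);
    }
  }
  const ranked: { handle: string; count: number }[] = [];
  counts.forEach((count, handle) => ranked.push({ handle, count }));
  // Most frequent first; ties broken alphabetically for stable output.
  return ranked.sort(
    (a, b) => b.count - a.count || a.handle.localeCompare(b.handle),
  );
}
```

Handles are lower-cased before counting, which treats `u/Alice` and `@alice` as the same user.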

Pattern definitions are exported as data; the inline editor toggles built-ins, adds custom regex patterns with capture-group selection, and persists every edit to localStorage.
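
A rough sketch of what "patterns as data" plus a weighted score might look like follows. Everything specific in it is an assumption for illustration: the `PiiPattern` fields, the three sample patterns and their weights, the choice to count each matching pattern once, and the thresholds that map a score to none / low / medium / high.

```typescript
// Hypothetical pattern record. The regex is stored as a string so the
// whole list survives a JSON round trip.
interface PiiPattern {
  id: string;
  category: string;
  source: string;
  flags: string;
  weight: number;
  enabled: boolean;
}

// Sample patterns only; not REDTRAIL's built-in set.
const SAMPLE_PATTERNS: PiiPattern[] = [
  { id: "email", category: "email", source: "[\\w.+-]+@[\\w-]+\\.[\\w.-]+", flags: "gi", weight: 5, enabled: true },
  { id: "age", category: "age", source: "\\bI(?:'m| am) (\\d{2})\\b", flags: "gi", weight: 2, enabled: true },
  { id: "ipv4", category: "IP", source: "\\b(?:\\d{1,3}\\.){3}\\d{1,3}\\b", flags: "g", weight: 4, enabled: true },
];

type RiskLevel = "none" | "low" | "medium" | "high";

interface ScanResult {
  score: number;
  level: RiskLevel;
  hits: { patternId: string; text: string; bodyIndex: number }[];
}

function scanBodies(bodies: string[], patterns: PiiPattern[]): ScanResult {
  const hits: ScanResult["hits"] = [];
  let score = 0;
  for (const pattern of patterns) {
    if (!pattern.enabled) continue;
    // Force the global flag so exec() advances through each body.
    const flags = pattern.flags.indexOf("g") === -1 ? pattern.flags + "g" : pattern.flags;
    const regex = new RegExp(pattern.source, flags);
    let matched = false;
    bodies.forEach((body, bodyIndex) => {
      regex.lastIndex = 0;
      let match: RegExpExecArray | null;
      while ((match = regex.exec(body)) !== null) {
        if (match[0] === "") { regex.lastIndex++; continue; } // avoid empty-match loops
        hits.push({ patternId: pattern.id, text: match[0], bodyIndex });
        matched = true;
      }
    });
    if (matched) score += pattern.weight; // assumption: each pattern counts once
  }
  // Assumed thresholds, chosen only for this example.
  const level: RiskLevel =
    score === 0 ? "none" : score < 5 ? "low" : score < 10 ? "medium" : "high";
  return { score, level, hits };
}
```

Because each pattern is a plain object, an edited list can be serialised with `JSON.stringify` and restored with `JSON.parse`, which is the kind of representation that makes localStorage persistence straightforward.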


Notes

  • Rate limiting — the server enforces a 30-second cooldown between searches per IP address
  • No Reddit API — REDTRAIL does not use Reddit's official API; all data comes from third-party archives
  • Archive coverage — results depend on what PullPush and Arctic Shift have indexed; very recent content (last few hours) may not appear
  • Client-side processing — all filtering, deduplication, analytics, and forensic extraction run in the browser; large accounts with thousands of items may take a moment to process
  • Timestamps — all times are displayed in UTC
  • Browser support — built with modern CSS and JS; works best in Chromium-based browsers and Firefox
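
For the rate-limiting note above, a per-IP cooldown can be sketched as a map from address to last-search time. This is an assumed design shown only to illustrate the behaviour; the class and method names are hypothetical, and how REDTRAIL's server actually tracks clients is not described here.

```typescript
const COOLDOWN_MS = 30_000; // 30-second cooldown, as stated in the notes

class SearchCooldown {
  private lastSearch = new Map<string, number>();
  private cooldownMs: number;

  constructor(cooldownMs: number = COOLDOWN_MS) {
    this.cooldownMs = cooldownMs;
  }

  // Returns 0 and records the search when the IP may proceed;
  // otherwise returns the remaining wait in milliseconds.
  check(ip: string, now: number = Date.now()): number {
    const last = this.lastSearch.get(ip);
    if (last !== undefined && now - last < this.cooldownMs) {
      return this.cooldownMs - (now - last);
    }
    this.lastSearch.set(ip, now);
    return 0;
  }
}
```

A request handler built on this sketch would reject the search when `check` returns a non-zero value and could report the remaining wait to the client. A long-running server would also need to evict old entries, which is omitted here.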

Part of Project Eyrie — by notalex.sh
