This tool extracts and processes forum/thread HTML into structured post data (Date, Name, Post) and exports it to CSV.
Open the webpage you want to extract data from and:
- Right-click → View Page Source
- Or press Ctrl + U
In the application:
- Click Copy HTML
- This will capture the page HTML for parsing
- Open the HTML input window
- Paste the copied HTML into the text box
If you have more than one page to process:
- Click Save and Add
- This will store the current page’s HTML and allow you to add more pages
When you are done adding all pages:
- Click Save and Close
- This will process all stored HTML pages into structured data
- Review parsed results in the DataGrid
- Click Save
- This will generate a .csv file on your Desktop with a timestamp
The exported CSV will include: Date, Name, Post
Each row represents a single parsed forum post.
All exported files are saved to: Desktop/output_yyyy-MM-dd_HH-mm-ss.csv
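For readers who want to produce or consume files in the same shape from a script, here is a minimal Python sketch of the naming pattern and column layout described above. It is not the application's own code: the function names `output_path` and `write_posts` are hypothetical, and it assumes `yyyy-MM-dd_HH-mm-ss` is a .NET-style format string, which maps to `%Y-%m-%d_%H-%M-%S` in `strftime`.

```python
import csv
from datetime import datetime
from pathlib import Path


def output_path(now: datetime, desktop: Path) -> Path:
    """Build a file name of the form output_yyyy-MM-dd_HH-mm-ss.csv (hypothetical helper)."""
    # Assumption: the documented pattern is a .NET-style format string.
    return desktop / f"output_{now.strftime('%Y-%m-%d_%H-%M-%S')}.csv"


def write_posts(path: Path, posts: list[dict[str, str]]) -> None:
    """Write one row per post with the columns Date, Name, Post (hypothetical helper)."""
    # newline="" lets the csv module handle line endings, so multi-line
    # posts are quoted rather than split across rows.
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["Date", "Name", "Post"])
        writer.writeheader()
        writer.writerows(posts)
```

A call such as `write_posts(output_path(datetime.now(), Path.home() / "Desktop"), posts)` would mimic the documented location, assuming the Desktop folder sits directly under the user's home directory.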
- Ensure full page HTML is copied for best results
- Multiple pages are merged automatically
- Duplicate posts are removed during processing
- Long posts with quotes are preserved and cleaned automatically
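The merge and de-duplication behaviour noted above can be pictured with a short Python sketch. This is an illustration only, not the application's implementation: the `Post` type and `merge_pages` function are hypothetical, and it assumes two posts count as duplicates when their date, name, and text all match exactly. The tool's actual rule is not documented here.

```python
from typing import NamedTuple


class Post(NamedTuple):
    """One parsed forum post (hypothetical type mirroring the CSV columns)."""
    date: str
    name: str
    post: str


def merge_pages(pages: list[list[Post]]) -> list[Post]:
    """Concatenate pages in order, keeping only the first copy of each post."""
    seen: set[Post] = set()
    merged: list[Post] = []
    for page in pages:
        for post in page:
            # Assumption: a duplicate is a post equal in all three fields.
            if post not in seen:
                seen.add(post)
                merged.append(post)
    return merged
```

Keeping the first occurrence preserves the order in which pages were added, which matters when the same post appears at the end of one page and the start of the next.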
To download and run the application without building from source:
Navigate to the GitHub repository and click:
- Releases (right-hand side or top navigation)
- Find the latest release (highest version number or most recent date)
- Download the .zip file attached under Assets
- Right-click the downloaded .zip
- Select Extract All...
- Open the extracted folder
- Double-click the .exe file to launch the program
When running the application for the first time, Windows may display a warning such as:
"Windows protected your PC"
This happens because the application is not signed with a trusted certificate.
To proceed:
- Click More info
- Click Run anyway
- Always download the latest release for new features and bug fixes
- Do not move the .exe out of the extracted folder (it needs the included files to run properly, unless the release was published as a single file)