Mashed Mice - Dataset Prepper
version 1.0
A powerful, lightweight, and user-friendly tool for organizing square-format image datasets for AI training, especially well suited to prompt-based systems like Stable Diffusion. It can auto-align resolutions to a preset target format, crop images quickly with the built-in tool, randomize filenames for prompt variety, check spelling, and auto-cleanse low-resolution content from the dataset, all with safe file handling and automatic export to lossless PNG after every intervention. There is also a tool to find and remove duplicate or very similar images.
____________________________________________________________
This tool was initially made for this locally run Stable Diffusion trainer:
https://mashed-mice.itch.io/mashed-mice-ai-trainer-for-stable-diffusion
____________________________________________________________
Features:
- Smart spell checker with rename suggestions
- Corrupt file detection
- Duplicate image detection using perceptual hashes
- Resolution alignment with batch resizing
- Manual cropping via built-in crop tool
- Theme support (light/dark mode)
- Progress tracking with `processed.json`
- Recent folder memory
- Filename randomizer for training diversity
- 100% local use; you can unplug your network cable
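
To illustrate the idea behind duplicate detection with perceptual hashes, here is a minimal sketch of an average hash in plain Python. It is not the tool's actual code: the app is built with the imagehash library, while this example assumes each image has already been shrunk to a small grayscale grid (for instance 8x8 with Pillow). The function names and the `max_distance` threshold are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def average_hash(gray_pixels):
    """Build an integer hash from a small grayscale grid (list of rows).

    Each pixel contributes one bit: 1 if it is brighter than the mean.
    """
    flat = [value for row in gray_pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count the bits that differ between two hashes."""
    return bin(hash_a ^ hash_b).count("1")


def find_near_duplicates(hashes, max_distance=5):
    """Return (name_a, name_b, distance) for every pair within max_distance.

    `hashes` maps a filename to its integer hash. The threshold is an
    assumed value, not one taken from the tool.
    """
    pairs = []
    for (name_a, hash_a), (name_b, hash_b) in combinations(sorted(hashes.items()), 2):
        distance = hamming_distance(hash_a, hash_b)
        if distance <= max_distance:
            pairs.append((name_a, name_b, distance))
    return pairs
```

Two images that differ only by small changes in brightness or compression tend to produce hashes that are a few bits apart, which is why a distance threshold catches "very similar" images as well as exact copies.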
____________________________________________________________
How to Use:
1. Launch the app — no installation required.
2. Click `File → Open Dataset Folder` to select your image folder.
3. Browse and clean your dataset:
- Use `Dataset → Align Resolution` to resize all images
- Use `Crop Tool` to manually prepare single images
- Use `Spell Check`, `Cleanse`, or `Find Duplicates` for batch sanity checks
4. Processed images are saved to the `Prepared` folder inside your dataset folder.
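
The workflow above, with output going to `Prepared` and progress recorded in `processed.json`, could be sketched roughly as follows. This is a guess at the general pattern, not the tool's implementation: the location and layout of `processed.json` (here a JSON list of filenames in the dataset folder), the list of image extensions, and all function names are assumptions. The copy step stands in for the real processing, which in the app also converts to PNG.

```python
import json
import shutil
from pathlib import Path

# Assumed set of extensions; the tool's real list is not documented here.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def load_processed(dataset_dir):
    """Read the set of already processed filenames (empty if no file yet)."""
    path = Path(dataset_dir) / "processed.json"
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_processed(dataset_dir, names):
    """Write the processed filenames back as a sorted JSON list."""
    path = Path(dataset_dir) / "processed.json"
    path.write_text(json.dumps(sorted(names), indent=2), encoding="utf-8")


def prepare_dataset(dataset_dir):
    """Copy unprocessed images into `Prepared` and record them as done.

    Originals are never modified or moved. Returns the names handled
    during this run.
    """
    dataset_dir = Path(dataset_dir)
    prepared_dir = dataset_dir / "Prepared"
    prepared_dir.mkdir(exist_ok=True)

    done = load_processed(dataset_dir)
    newly_done = []
    for source in sorted(dataset_dir.iterdir()):
        if not source.is_file() or source.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if source.name in done:
            continue  # already handled in an earlier session
        # Placeholder for the real work (resize, crop, PNG export).
        shutil.copy2(source, prepared_dir / source.name)
        done.add(source.name)
        newly_done.append(source.name)

    save_processed(dataset_dir, done)
    return newly_done
```

Running `prepare_dataset` a second time on the same folder skips everything already listed, which is the point of keeping a progress file between sessions.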
____________________________________________________________
Requirements
- A PC with WinRAR installed, to unpack the download.
- The app runs as a standalone `.exe` file once the `.rar` archive is unpacked.
- Do not move any files out of the original folder.
____________________________________________________________
License
This software is free for personal and commercial use.
Attribution is appreciated but not required.
Developed by Joel Sandström.
Built with: Python, Tkinter, Pillow, and imagehash.
Development log
- A Brand New Version Is Here (1.1.0)! (42 days ago)
- Have some free stuff - Dataset Prepper! (47 days ago)