Czkawka/Krokiet 9.0 — Find duplicates faster than ever before
The current version primarily focuses on refining existing features and improving performance rather than introducing any spectacular new additions.
With each new release, it seems that I am slowly reaching the limits — of my patience, Rust’s performance, and the possibilities for further optimization.
Czkawka is now at a stage where, at first glance, it’s hard to see what exactly can still be optimized, though, of course, it’s not impossible.
Changes and Optimizations
For example, below is a performance graph from Hotspot, illustrating duplicate scanning in the current version on a 24-thread processor with an NVMe drive. It shows the application utilizing all available threads while searching for duplicates among 1.5 million files and computing hashes for over 70 GB of data.
As you can see, the application is highly parallelized. The view from the Hotspot program shows that it utilizes all available threads. However, in my case, this does not mean that each thread is fully utilized at 100%. Despite having a fast NVMe drive, the application spends most of its time waiting for the appropriate batch of data to be retrieved from the disk before it can process it.
Figure: CPU usage during the scan, roughly 10–40%.
Still, in the left part of the Hotspot graph, there is a visible 250 ms bar where only a single thread is active, handling memory operations (memory deallocation, vector operations). At the moment, I’m not entirely sure what causes this, but I believe it can be at least partially improved.
One of the more unusual performance issues I encountered was that while scanning around a million files, the algorithm initially distributed tasks evenly across all available threads. However, in the second half of the process, it used only one thread.
An initial analysis indicated that Rayon, using its workload distribution algorithm, eventually moves all the “leftover” tasks into a single thread, overloading it with work.

    files.into_par_iter().map(|e|…) // files is BTreeMap<String, FileEntry>

Since these tasks are time-consuming (reading data from disk), I figured that capping each group of elements at a maximum of 3 might help.

    let new_files = files.into_par_iter().collect::<Vec<_>>(); // Conversion to Vec<>, because with_max_len is not implemented for BTreeMap
    new_files.into_par_iter().with_max_len(3).map(|e|…) // Limiting group to max len 3

And indeed, this solution eliminated most of the issues, though I’m still not entirely sure why they occurred in the first place or why the problem wasn’t always present.
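The idea of small work units can be illustrated without Rayon. The following is a minimal sketch using only the standard library; it is not Rayon's scheduling algorithm, and `process_in_small_batches` is a hypothetical helper name. Each worker repeatedly claims a batch of at most `max_len` items from a shared counter, so one slow file can hold back only a couple of neighbours instead of a large leftover chunk.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical helper: runs `work` on every item using `threads` workers,
/// handing out batches of at most `max_len` items at a time.
fn process_in_small_batches<T, R, F>(items: &[T], threads: usize, max_len: usize, work: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    assert!(threads > 0 && max_len > 0);
    let counter = AtomicUsize::new(0); // index of the first unclaimed item
    let (next, work) = (&counter, &work); // shared references copied into each worker

    let mut results: Vec<(usize, R)> = thread::scope(|s| {
        let handles: Vec<_> = (0..threads)
            .map(|_| {
                s.spawn(move || {
                    let mut local = Vec::new();
                    loop {
                        // Claim the next small batch; stop when nothing is left.
                        let start = next.fetch_add(max_len, Ordering::Relaxed);
                        if start >= items.len() {
                            break;
                        }
                        let end = (start + max_len).min(items.len());
                        for i in start..end {
                            local.push((i, work(&items[i])));
                        }
                    }
                    local
                })
            })
            .collect();
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    });

    results.sort_by_key(|(i, _)| *i); // restore the input order
    results.into_iter().map(|(_, r)| r).collect()
}
```

Smaller batches mean more synchronization on the shared counter, which is negligible when each item involves disk I/O, as in the scan described above.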
Increasing the buffer size for file I/O operations
Since the very beginning, file hashing has been performed by reading 16 KB chunks of a file’s content and then applying the hashing operation. This generally didn’t cause major issues (aside from stack overflow on some systems when using a 128 KB buffer).
However, some users reported that on HDDs under heavy load, the drive head jumped frequently between scanned files, negatively impacting performance and making the disk noisier.
The simplest solution turned out to be increasing the buffer size to 2 MB — the downside is that each thread searching for duplicates will consume more memory. However, I find this trade-off fully acceptable, considering the performance boost it provides.
Tests showed that when scanning multiple files ranging from 50 KB to 50 MB on an HDD, scan time could be reduced by up to 50%, while on an SSD, the improvement ranged from a few to several percent.
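A rough sketch of chunked hashing with a large, reusable buffer is shown below. It rests on several assumptions: `hash_reader` and `new_buffer` are hypothetical names, and FNV-1a is only a simple stand-in for whatever hash function the application actually uses. The buffer is allocated on the heap and passed in by the caller, so it can be reused across files and cannot overflow the stack.

```rust
use std::io::{self, Read};

/// Buffer size mentioned in the text (2 MB).
const BUFFER_SIZE: usize = 2 * 1024 * 1024;

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Allocates the read buffer on the heap; one buffer per thread can be reused
/// for every file that thread hashes.
fn new_buffer() -> Vec<u8> {
    vec![0u8; BUFFER_SIZE]
}

/// Hypothetical helper: reads `reader` to the end in buffer-sized chunks and
/// returns an FNV-1a hash of the content (a stand-in for the real hash).
fn hash_reader<R: Read>(mut reader: R, buffer: &mut [u8]) -> io::Result<u64> {
    let mut hash = FNV_OFFSET;
    loop {
        let n = match reader.read(buffer) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        for &byte in &buffer[..n] {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(FNV_PRIME);
        }
    }
    Ok(hash)
}
```

The result does not depend on the buffer size; a larger buffer only means fewer, larger read requests per file.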
To further enhance performance, I considered using semaphores to ensure that only one reading process would operate per disk at a time, minimizing head movement on HDDs.
However, tests showed no noticeable performance gains, so I abandoned the idea. That said, if you manage to convince me with a new implementation, I’m open to adding it in the future.
Interrupting Hash Calculations
In version 1.0, the entire GUI would freeze during hash calculations, making it impossible to monitor scan progress or stop the process.
By version 8.0, a scanned file counter already existed, and clicking Stop would cause the application to wait for all ongoing hash calculations to complete. If scanning of a 4 GB file had already started, the user had to wait until it was fully processed.
In the latest version, I managed to implement near-instant scan interruption without any noticeable performance loss.
Additionally, in some modes, scan progress is now displayed in greater detail: not just the number of scanned files but also the total number of bytes processed.
This provides a much clearer indication of how long the scan will take. Previously, the GUI sometimes displayed that nearly all files had been scanned, yet the process was still running, and the progress bar remained unchanged for a while.
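One common way to get both behaviours is to check a shared stop flag between chunk reads and to add each chunk's size to a shared byte counter. The sketch below assumes that approach; it is not taken from the Czkawka source, `hash_with_stop` is a hypothetical name, and FNV-1a again stands in for the real hash.

```rust
use std::io::{self, Read};
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Hypothetical helper: hashes `reader` chunk by chunk.
/// Returns `Ok(None)` if `stop` was set before the end of the input was reached.
fn hash_with_stop<R: Read>(
    mut reader: R,
    buffer: &mut [u8],
    stop: &AtomicBool,
    bytes_done: &AtomicU64,
) -> io::Result<Option<u64>> {
    let mut hash = FNV_OFFSET;
    loop {
        // Checked once per chunk, so cancellation waits for at most one read.
        if stop.load(Ordering::Relaxed) {
            return Ok(None);
        }
        let n = reader.read(buffer)?;
        if n == 0 {
            return Ok(Some(hash));
        }
        for &byte in &buffer[..n] {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(FNV_PRIME);
        }
        // A GUI thread can poll this counter to show bytes processed so far.
        bytes_done.fetch_add(n as u64, Ordering::Relaxed);
    }
}
```

Because the flag is read only once per chunk, the cost of the check is negligible next to the disk read, while a stop request takes effect after at most one buffer's worth of work.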
Multi-threading
Probably the simplest way to improve performance is to process tasks in parallel across multiple threads.
One area where I applied this was in checking whether files within a given group are linked by hard links.
This operation requires access to disk resources, so in my case, checking 500,000 files initially took 4 seconds. After implementing the changes, the time dropped to around 1 second, suggesting that the task was no longer CPU-bound but instead limited by disk performance.
In practice, this modification came down to changing just a few lines of code, mainly replacing into_iter with its parallel equivalent into_par_iter. Since this is Rust, I was able to make this change without the slightest concern that it would introduce difficult-to-debug threading issues.
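To show the shape of such a change without depending on Rayon, here is a standard-library sketch of a parallel hard-link filter. Everything in it is an assumption for illustration: `filter_hard_links` is a hypothetical name, and the caller supplies `file_id`, a function returning some identity key for a file (on Unix that could be the device and inode numbers from `std::os::unix::fs::MetadataExt`). The slow lookups run in parallel; the deduplication afterwards is sequential.

```rust
use std::collections::HashSet;
use std::hash::Hash;
use std::thread;

/// Hypothetical helper: looks up an identity key for every file in parallel
/// (the disk-bound part), then keeps only the first file seen for each key.
fn filter_hard_links<P, K, F>(files: Vec<P>, threads: usize, file_id: F) -> Vec<P>
where
    P: Sync,
    K: Eq + Hash + Send,
    F: Fn(&P) -> Option<K> + Sync,
{
    let threads = threads.max(1);
    let chunk_len = ((files.len() + threads - 1) / threads).max(1);
    let file_id = &file_id; // shared reference copied into each worker

    // One key per file, in the same order as `files`.
    let keys: Vec<Option<K>> = thread::scope(|s| {
        let handles: Vec<_> = files
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(file_id).collect::<Vec<_>>()))
            .collect();
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    });

    let mut seen = HashSet::new();
    files
        .into_iter()
        .zip(keys)
        .filter_map(|(file, key)| match key {
            Some(k) => seen.insert(k).then_some(file), // drop later links to the same data
            None => Some(file),                        // identity unknown: keep the file
        })
        .collect()
}
```

With Rayon, the parallel part collapses to swapping the iterator type, which is why the real change was only a few lines.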
Faster Image Resizing
In the similar image search mode, the most demanding operations are reading the image from disk and subsequently resizing it (e.g., to 8x8, 16x16, or 32x32).
The larger the image, the slower these operations become.
There’s not much that can be done about the first issue — image-rs is already one of the most optimized libraries in this category (though it could likely be made even faster).
However, for image resizing, I discovered the fast-image-resize library, which — while introducing minimal differences in the appearance of the resized images compared to image-rs — can speed up the process by up to three times for large files.
Because of this, I decided to enable it by default, though users still have the option to disable it during compilation.
Portable mode
Czkawka/Krokiet were designed to be portable applications by utilizing Rust and minimizing the number of dynamically linked libraries (mainly in the case of Krokiet). The goal was to ensure they run identically across different systems without requiring the installation of additional dependencies.
However, some users wanted the ability to store cache and settings on an external drive. I initially found this request somewhat unusual since each system typically has its own configuration and cache.
Nevertheless, since this was a frequently requested feature, I added the option to change the directory where the application creates its configuration and cache files. This makes it easy to transfer settings between computers without issues. When launching the application from the terminal, these paths are displayed there.
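The directory override can be pictured roughly as follows. This is a sketch under assumptions, not the application's code: the variable names `CZKAWKA_CONFIG_PATH` and `CZKAWKA_CACHE_PATH` come from the changelog in this post, while `resolve_dir`, `resolve_dirs`, and the treatment of empty values are hypothetical.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Hypothetical helper: uses the override if it is set and non-empty,
/// otherwise falls back to the default directory supplied by the caller.
fn resolve_dir(env_override: Option<OsString>, default_dir: PathBuf) -> PathBuf {
    match env_override {
        Some(value) if !value.is_empty() => PathBuf::from(value),
        _ => default_dir, // unset or empty (assumption: empty means "not set")
    }
}

/// Reads both environment variables and returns (config_dir, cache_dir).
fn resolve_dirs(default_config: PathBuf, default_cache: PathBuf) -> (PathBuf, PathBuf) {
    (
        resolve_dir(std::env::var_os("CZKAWKA_CONFIG_PATH"), default_config),
        resolve_dir(std::env::var_os("CZKAWKA_CACHE_PATH"), default_cache),
    )
}
```

Keeping the environment lookup separate from the fallback logic makes the decision easy to test without touching the process environment.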
Changes in current version
Breaking changes
- Video, Duplicate (smaller prehash size), and Image cache (EXIF orientation + faster resize implementation) are incompatible with previous versions and need to be regenerated.
Core
- Automatically rotating all images based on their EXIF orientation
- Fixed a crash caused by negative time values on some operating systems
- Updated `vid_dup_finder`; it can now detect similar videos shorter than 30 seconds
- Added support for more JXL image formats (using a built-in JXL → image-rs converter)
- Improved duplicate file detection by using a larger, reusable buffer for file reading
- Added an option for significantly faster image resizing to speed up image hashing
- Logs now include information about the operating system and compiled app features (only x86_64 versions)
- Added size progress tracking in certain modes
- Ability to stop hash calculations for large files mid-process
- Implemented multithreading to speed up filtering of hard links
- Reduced prehash read file size to a maximum of 4 KB
- Fixed a slowdown at the end of scans when searching for duplicates on systems with a high number of CPU cores
- Improved scan cancellation speed when collecting files to check
- Added support for configuring config/cache paths using the `CZKAWKA_CONFIG_PATH` and `CZKAWKA_CACHE_PATH` environment variables
- Fixed a crash in debug mode when checking broken files named `.mp3`
- Catching panics from symphonia crashes in broken files mode
- Printing a warning when using `panic=abort` (which may speed up the app but can cause occasional crashes)
Krokiet
- Changed the default tab to “Duplicate Files”
GTK GUI
- Added a window icon in Wayland
- Disabled the broken sort button
CLI
- Added `-N` and `-M` flags to suppress printing results/warnings to the console
- Fixed an issue where messages were not cleared at the end of a scan
- Added the ability to disable the cache via the `-H` flag (useful for benchmarking)
Prebuilt binaries
- This is the last release that supports Ubuntu 20.04, because GitHub Actions is dropping this OS from its runners
- Linux and Mac binaries are now provided in two variants: x86_64 and arm64
- Arm Linux builds need at least Ubuntu 24.04
- GTK 4.12 is now used to build the Windows GTK GUI instead of GTK 4.10
- Dropped support for snap builds: they are too time-consuming to maintain and test (and they are currently broken)
- Removed the native Windows build of Krokiet; only the version cross-compiled from Linux is now available (there should be no difference)
Next version
In the next version, I will likely focus on implementing missing features in Krokiet that are already available in Czkawka, such as selecting multiple items using the mouse and keyboard or comparing images.
Although I generally view the transition from GTK to Slint positively, I still encounter certain issues that require additional effort, even though they worked seamlessly in GTK. This includes problems with popups and the need to create some widgets almost from scratch due to the lack of documentation and examples for what I consider basic components, such as an equivalent of GTK’s TreeView.
Price — free, so take it for yourself, your friends, and your family. Licensed under MIT/GPL
Repository — https://github.com/qarmin/czkawka
Files to download — https://github.com/qarmin/czkawka/releases
