Changelog#
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[0.1.0] - 2025-06-19#
Summary#
First stable release of iscc-sum, a high-performance ISCC Data-Code and Instance-Code hashing tool built in Rust with Python bindings. This release achieves 50-130x performance improvements over the pure python reference implementation while maintaining full compatibility with the ISCC standard.
Added#
Core Features#
- ISCC Data-Code generation: Content-defined chunking (CDC) algorithm for content similarity detection
- Wide format (128-bit, default) for enhanced security and similarity matching
- Narrow format (64-bit) for ISO 24138:2024 compliance via
--narrowflag
- ISCC Instance-Code generation: BLAKE3-based cryptographic hash for file integrity verification
- ISCC-SUM composite code: Combined Data-Code and Instance-Code with self-describing header
- High-performance processing: 950-1050 MB/s throughput using SIMD optimizations and parallel processing
Command-Line Interface#
- Full-featured CLI (
iscc-sum) with Unix-style interface- Generate checksums for files, directories, and stdin
- Verify checksums from files (
-c, --check) - Find similar files based on data (
--similar) - Process directories as single objects (
-t, --tree) - BSD-style output format (
--tag) - NUL-terminated output (
--zero) - Show individual code components (
--units)
Python API#
- High-level API for easy integration
code_iscc_sum()function supporting local files and fsspec URIs- Streaming processors for large file handling
- Dictionary-compatible result objects
- Universal path support via fsspec integration
- Process files from S3, HTTP/HTTPS, and other remote sources
- Transparent handling of different storage backends
Platform Support#
- Cross-platform compatibility: Linux, macOS, and Windows
- Python version support: 3.10, 3.11, 3.12, and 3.13
- Pre-built wheels for all major platforms
Developer Experience#
- 100% test coverage requirement with comprehensive test suite
- Integrated tooling via poethepoet task automation
- Type annotations with mypy type checking
- Security scanning with Bandit
- Automated CI/CD pipeline with Release Please
Performance#
- 50-130x faster than pure Python reference implementations
- 950-1050 MB/s processing speed on modern hardware
- Parallel processing using Rayon for multi-core utilization
Standards Compliance#
- ISO 24138:2024 compatible when using
--narrowflag - Reference implementation compatibility for all code formats
- Deterministic output across platforms
Known Limitations#
- This is an early release focused on core functionality
- Advanced ISCC features (Text-Code, Meta-Code) are out of scope for now
- Rust crate not published to crates.io in this release
Dependencies#
- blake3 >=1.0.5 - Cryptographic hashing
- click >=8.0.0 - CLI framework
- pathspec >=0.12.1 - Gitignore-style pattern matching
- universal-pathlib >=0.2.6 - Cross-platform path handling
- xxhash >=3.5.0 - High-speed hashing for CDC
Acknowledgments#
This project implements the ISCC (International Standard Content Code) as defined in ISO 24138:2024. The performance improvements are achieved through Rust's zero-cost abstractions and careful algorithm optimization while maintaining full compatibility with the standard.