Francesc Alted - Btune: Making Compression Better | PyData Global 2023
Learn how Btune optimizes Blosc2 compression, using genetic algorithms and neural networks to balance speed and compression ratio according to hardware and data patterns.
- Btune is a tool for finding optimal compression parameters in Blosc2, balancing compression speed, decompression speed, and compression ratio.
- The tool has two main operating modes:
    - Genetic algorithm for parameter optimization
    - Neural network inference for quick parameter prediction
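
To make the genetic-algorithm mode more concrete, here is a toy sketch of an evolutionary parameter search. It is not Btune's actual algorithm: the standard-library `zlib` stands in for Blosc2, and the search space (`LEVELS`, `CHUNK_KIB`), the fitness function, and the sample data are all assumptions chosen for illustration.

```python
import random
import zlib

LEVELS = tuple(range(1, 10))   # zlib levels, standing in for Blosc2's clevel
CHUNK_KIB = (4, 16, 64, 256)   # hypothetical chunk-size choices, in KiB


def fitness(data: bytes, genome: tuple) -> int:
    """Total compressed size in bytes for a (level, chunk_kib) genome; lower is better."""
    level, chunk_kib = genome
    step = chunk_kib * 1024
    return sum(len(zlib.compress(data[i:i + step], level))
               for i in range(0, len(data), step))


def crossover(a: tuple, b: tuple) -> tuple:
    """Child takes the level from one parent and the chunk size from the other."""
    return (a[0], b[1])


def mutate(genome: tuple, rng: random.Random) -> tuple:
    """Re-draw one of the two parameters at random."""
    level, chunk_kib = genome
    if rng.random() < 0.5:
        level = rng.choice(LEVELS)
    else:
        chunk_kib = rng.choice(CHUNK_KIB)
    return (level, chunk_kib)


def evolve(data: bytes, generations: int = 5, population_size: int = 6, seed: int = 0):
    """Return the best genome found and the best size seen at each generation."""
    rng = random.Random(seed)
    population = [(rng.choice(LEVELS), rng.choice(CHUNK_KIB))
                  for _ in range(population_size)]
    best_sizes = []
    for _ in range(generations):
        population.sort(key=lambda g: fitness(data, g))
        best_sizes.append(fitness(data, population[0]))
        parents = population[: population_size // 2]   # selection (parents survive)
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)), rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    population.sort(key=lambda g: fitness(data, g))
    return population[0], best_sizes


# Made-up sample data: repetitive text rows.
data = b"".join(f"row {i % 1000}: value={i * 7 % 113}\n".encode()
                for i in range(20_000))

best, best_sizes = evolve(data)
print("best (level, chunk KiB):", best)
print("best size per generation:", best_sizes)
```

Because the fittest half of each generation is carried over unchanged, the best size can only stay the same or shrink from one generation to the next. This sketch optimizes size alone; a search that also weighs speed would need timing measurements in its fitness function.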
 
- Key factors affecting compression performance:
    - Hardware characteristics (CPU cores, cache size, memory speed)
    - Data patterns and distribution
    - Selected codec (LZ4, Zstd, BloscLZ, etc.)
    - Filters (shuffle, bitshuffle)
    - Chunk sizes and splitting
    - Compression levels (0-9)
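
As a rough illustration of why filters matter, the sketch below applies a byte shuffle before compressing. It is a pure-Python reimplementation of the idea, not Blosc2's optimized filter, and it uses the standard-library `zlib` as a stand-in codec. The sample data (consecutive 32-bit integers) is an assumption chosen to make the effect visible.

```python
import struct
import zlib


def shuffle(data: bytes, typesize: int) -> bytes:
    """Group the 1st byte of every item, then the 2nd byte, and so on.

    Assumes len(data) is a multiple of typesize.
    """
    return b"".join(data[i::typesize] for i in range(typesize))


def unshuffle(data: bytes, typesize: int) -> bytes:
    """Invert shuffle(): interleave the byte planes back into items."""
    n_items = len(data) // typesize
    out = bytearray(len(data))
    for i in range(typesize):
        out[i::typesize] = data[i * n_items:(i + 1) * n_items]
    return bytes(out)


# Hypothetical sample data: 100,000 consecutive 32-bit little-endian integers.
n = 100_000
raw = struct.pack(f"<{n}I", *range(n))

plain = zlib.compress(raw, 6)
shuffled = zlib.compress(shuffle(raw, 4), 6)

print(f"no filter: ratio {len(raw) / len(plain):.1f}x")
print(f"shuffle:   ratio {len(raw) / len(shuffled):.1f}x")
```

Shuffling places the slowly changing high-order bytes next to each other, which gives the codec long runs of repeated values to work with. How much this helps depends on the data, which is one reason the choice of filter is worth measuring rather than assuming.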
 
- The tradeoff parameter (0-1) controls optimization priorities:
    - 0: Favor speed
    - 0.5: Balanced approach
    - 1: Favor compression ratio
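
One way to picture how a single tradeoff value can steer the choice is a weighted score over normalized speed and ratio. The sketch below is an assumption for illustration only: the scoring formula is not taken from Btune, and the `Candidate` names and measurements are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    speed_mb_s: float   # compression speed (made-up measurement)
    ratio: float        # compression ratio (made-up measurement)


def pick(candidates, tradeoff: float) -> Candidate:
    """Pick the candidate with the best blended score.

    tradeoff=0 scores on speed only, tradeoff=1 on ratio only.
    Hypothetical formula, not Btune's actual scoring.
    """
    if not 0.0 <= tradeoff <= 1.0:
        raise ValueError("tradeoff must be between 0 and 1")
    best_speed = max(c.speed_mb_s for c in candidates)
    best_ratio = max(c.ratio for c in candidates)

    def score(c: Candidate) -> float:
        # Normalize both metrics to [0, 1] so they can be blended.
        return ((1 - tradeoff) * c.speed_mb_s / best_speed
                + tradeoff * c.ratio / best_ratio)

    return max(candidates, key=score)


candidates = [
    Candidate("fast-codec", 2000.0, 2.0),
    Candidate("balanced-codec", 1400.0, 3.5),
    Candidate("dense-codec", 400.0, 4.0),
]

for tradeoff in (0.0, 0.5, 1.0):
    print(tradeoff, "->", pick(candidates, tradeoff).name)
```

With these invented numbers, the fastest candidate wins at 0, the densest at 1, and the middle one at 0.5.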
 
- Recommendations for optimal usage:
    - Train models on hardware similar to the production environment
    - Don't trust Btune blindly; validate its results experimentally
    - Consider chunk-by-chunk compression for better cache utilization
    - Use tracing to understand Btune's decision process
    - LZ4 typically wins for speed, Zstd for compression ratio
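
Validating a suggested configuration can be as simple as a small harness that compresses representative data chunk by chunk, checks the round trip, and records ratio and time. The sketch below uses the standard-library `zlib` as a stand-in for a Blosc2 codec. The `benchmark` helper, the sample data, and the chunk size are assumptions for illustration.

```python
import time
import zlib


def benchmark(data: bytes, level: int, chunk_size: int):
    """Compress data chunk by chunk and return (ratio, seconds).

    Raises RuntimeError if decompression does not restore the input.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    start = time.perf_counter()
    compressed = [zlib.compress(chunk, level) for chunk in chunks]
    elapsed = time.perf_counter() - start

    restored = b"".join(zlib.decompress(c) for c in compressed)
    if restored != data:
        raise RuntimeError("round trip failed")

    ratio = len(data) / sum(len(c) for c in compressed)
    return ratio, elapsed


# Made-up sample data; use a slice of real production data in practice.
data = b"sensor_id=42,temperature=21.5,status=OK\n" * 50_000

results = {}
for level in (1, 6, 9):
    results[level] = benchmark(data, level, chunk_size=256 * 1024)
    ratio, seconds = results[level]
    print(f"level {level}: ratio {ratio:.1f}x in {seconds * 1000:.1f} ms")
```

Timings vary between machines and runs, so it helps to repeat the measurement several times on hardware that resembles production before drawing conclusions.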
 
- Working with compressed data can sometimes be faster than working with uncompressed data, because less memory bandwidth is needed, particularly when the data fits in cache.
- Btune provides tools to assess optimal compression parameters instead of relying on guesswork, but its results should still be validated through testing.
- Performance depends heavily on the specific use case: hardware, data characteristics, and optimization priorities must be considered together.