Vincent D. Warmerdam - Scikit-Learn can do THAT?!

Python

Discover lesser-known but powerful features in scikit-learn including incremental learning, caching, sparse matrices, metadata routing, and semi-supervised learning capabilities.

Key takeaways

Scikit-learn supports incremental learning through the partial_fit method, which allows training on out-of-core datasets that don’t fit in memory
The library includes built-in caching functionality that can significantly speed up hyperparameter searches and pipeline operations
Sample weights can be applied throughout pipelines to give different importance to data points during training
Sparse matrix support is available across many components, allowing efficient handling of sparse data structures
Metadata routing enables passing custom arguments through pipelines to specific components
StandardScaler and other components are implemented to handle numerical stability issues and edge cases
Semi-supervised learning is available through the sklearn.semi_supervised module for scenarios with limited labels
Image classification and text processing can be handled through unified pipeline interfaces
The library maintains backward compatibility while continuously improving solvers and implementations
Documentation provides implementation details, mathematics behind algorithms, and references to original papers
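
The incremental learning takeaway can be illustrated with a minimal sketch. It assumes synthetic data and a batch size of 100, both chosen only for illustration; in an out-of-core setting each batch would instead be read from disk. SGDClassifier is one of the estimators that exposes partial_fit.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic data standing in for a dataset too large to load at once (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = SGDClassifier(random_state=0)
classes = np.array([0, 1])  # all classes must be declared on the first partial_fit call

# Feed the data in batches of 100 rows; the model is updated after each batch.
for start in range(0, len(X), 100):
    batch = slice(start, start + 100)
    clf.partial_fit(X[batch], y[batch], classes=classes)

accuracy = clf.score(X, y)
print(f"accuracy after one pass: {accuracy:.3f}")
```

Passing classes on every call is harmless; it is only required on the first one, because a single batch may not contain every label.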
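
The caching takeaway can be sketched with the memory argument of Pipeline, which stores fitted transformers so a grid search does not refit them for every candidate. The dataset, the PCA step, and the parameter grid below are assumptions made for illustration, not taken from the talk.

```python
from shutil import rmtree
from tempfile import mkdtemp

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Fitted transformers are cached in this temporary directory.
cache_dir = mkdtemp()
pipe = Pipeline(
    [("pca", PCA(n_components=5)), ("clf", LogisticRegression(max_iter=1000))],
    memory=cache_dir,
)

# Only the classifier's C changes, so the PCA fit for each CV fold can be reused.
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=3)
search.fit(X, y)
best_C = search.best_params_["clf__C"]
print("best C:", best_C)

rmtree(cache_dir)  # clean up the cache once the search is done
```

The benefit grows with the cost of the cached steps; for a cheap transformer the disk round trip may outweigh the saving.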
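
The sample weight takeaway can be sketched by passing a weight array to the final step of a pipeline with the step-name prefix convention. The data and the choice to weight the positive class ten times more heavily are assumptions for illustration; this form relies on the default configuration in which metadata routing is not enabled.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# Hypothetical weighting: samples of class 1 count ten times as much as class 0.
weights = np.where(y == 1, 10.0, 1.0)

unweighted = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
unweighted.fit(X, y)

weighted = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
# "<step name>__sample_weight" forwards the weights to that step's fit method.
weighted.fit(X, y, logisticregression__sample_weight=weights)

n_pos_unweighted = int(unweighted.predict(X).sum())
n_pos_weighted = int(weighted.predict(X).sum())
print("predicted positives:", n_pos_unweighted, "->", n_pos_weighted)
```

Note that in this sketch only the classifier receives the weights; the scaler is fitted without them.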
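
The semi-supervised takeaway can be sketched with SelfTrainingClassifier from sklearn.semi_supervised, which expects unlabeled samples to be marked with -1. The dataset and the decision to hide roughly 80% of the labels are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hide about 80% of the labels; -1 is the marker for "unlabeled".
rng = np.random.default_rng(0)
unlabeled = rng.random(len(y)) < 0.8
y_partial = y.copy()
y_partial[unlabeled] = -1

# The wrapped estimator must provide predict_proba; it is passed positionally
# because the keyword name differs between scikit-learn versions.
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_partial)

accuracy = model.score(X, y)
print(f"accuracy against the full labels: {accuracy:.3f}")
```

Self-training repeatedly adds confident predictions on unlabeled points to the training set, so the quality of the base estimator's probabilities matters.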
