Dewey Dunnington - Introducing nanoarrow: the world's tiniest Arrow Implementation | SciPy 2024
Learn about NanoArrow, a minimal Arrow implementation for efficient cross-language data transfer, featuring small footprint & easy C integration. By Dewey Dunnington at SciPy 2024.
- NanoArrow is a minimal Arrow implementation designed for efficient data transfer between different languages and systems. Its C library can be bundled as just two core files (a header and a source file).
- Key advantages of NanoArrow include:
    - Very small footprint compared to the full Arrow C++ implementation
    - No complex dependencies
    - Efficient handling of strings and null values
    - Easy integration into C libraries
 
- Primary use cases:
    - Wrapping C libraries that need Arrow functionality
    - Fast data transfer between different languages/runtimes
    - Efficient handling of large string arrays
    - Testing and development of Arrow-based functionality
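
To make the large-string-array use case concrete, the rough comparison below estimates the footprint of 100,000 short strings held as individual Python objects versus three Arrow-style buffers. It is plain Python using only the standard library, not nanoarrow's API; the helper name `arrow_string_nbytes` is hypothetical, and the `sys.getsizeof` figures are CPython-specific, so treat the numbers as illustrative rather than as a benchmark.

```python
import sys


def arrow_string_nbytes(values):
    """Bytes needed for a validity bitmap + int32 offsets + UTF-8 data.

    Hypothetical helper for illustration; it follows the Arrow columnar
    layout for string arrays and ignores buffer padding/alignment.
    """
    n = len(values)
    validity = (n + 7) // 8   # one bit per element
    offsets = 4 * (n + 1)     # int32, one more entry than there are elements
    data = sum(len(v.encode("utf-8")) for v in values if v is not None)
    return validity + offsets + data


values = [f"id-{i:06d}" for i in range(100_000)]  # 9 ASCII bytes each

# CPython cost: the list's pointer slots plus every individual str object.
python_nbytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
arrow_nbytes = arrow_string_nbytes(values)

print(f"Python objects: {python_nbytes:,} bytes")
print(f"Arrow buffers:  {arrow_nbytes:,} bytes")
```

The gap comes from per-object overhead: each Python string carries its own header, while the Arrow-style layout pays roughly four bytes of offset and one validity bit per element on top of the raw UTF-8 bytes.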
 
- NanoArrow handles data representation through:
    - Separate buffers for nullability
    - Efficient string encoding
    - Buffer protocol compatibility in Python
    - Support for Arrow IPC format
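
The representation listed above can be illustrated without any Arrow library. The sketch below is plain Python using only the standard library, not nanoarrow's API, and the helper names `encode_strings` and `decode_strings` are hypothetical. It follows the Arrow columnar layout for a UTF-8 string array: a validity bitmap with one bit per element (least-significant bit first), an offsets buffer with one more entry than there are elements, and a single contiguous data buffer.

```python
from array import array


def encode_strings(values):
    """Pack a list of str/None into (validity, offsets, data) buffers."""
    validity = bytearray((len(values) + 7) // 8)  # one bit per element
    offsets = array("i", [0])  # C int: 32-bit on mainstream platforms
    data = bytearray()
    for i, value in enumerate(values):
        if value is not None:
            validity[i // 8] |= 1 << (i % 8)  # LSB-first bit order
            data += value.encode("utf-8")
        offsets.append(len(data))  # a null simply repeats the previous offset
    return bytes(validity), offsets, bytes(data)


def decode_strings(validity, offsets, data):
    """Rebuild the Python list from the three buffers."""
    out = []
    for i in range(len(offsets) - 1):
        if (validity[i // 8] >> (i % 8)) & 1:
            out.append(data[offsets[i]:offsets[i + 1]].decode("utf-8"))
        else:
            out.append(None)
    return out


validity, offsets, data = encode_strings(["tiny", None, "", "arrow"])
print(list(offsets))  # [0, 4, 4, 4, 9]
print(data)           # b'tinyarrow'

# array.array implements the buffer protocol, so the offsets can be viewed
# as raw bytes without copying them.
raw_offsets = memoryview(offsets).cast("B")
print(raw_offsets.nbytes)
```

Note that the null entry and the empty string both occupy zero bytes in the data buffer; only the validity bit tells them apart.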
 
- Compared to the full Arrow implementation:
    - More limited in scope (no nested data structures)
    - Focuses on core data transfer functionality
    - Lighter-weight alternative for basic Arrow needs
    - Better suited for embedded systems or minimal dependencies
 
- Successfully used in projects like:
    - Snowflake Python connector
    - GeoArrow implementations
    - Testing frameworks
    - Language bindings for C libraries
 
- Particularly valuable when:
    - Working with cross-language data transfer
    - Dealing with large string datasets
    - Needing minimal dependency overhead
    - Building Arrow-compatible interfaces