Lee et al. - Echostack: A scalable open-source software suite for echosounder data processing

Echostack is an open-source software suite that streamlines echosounder data processing. This summary covers its key features, current challenges, and future plans for marine ecosystem monitoring.

Key takeaways
  • Echostack is an open-source software suite for processing echosounder data, with a focus on flexibility, scalability, and interoperability

  • Key components include:

    • Echopype: Handles data push/pull and organization
    • Echoregions: Interfaces acoustic data with analysis results
    • Echopop: Processes biological information and estimates
    • Echoshader: Provides interactive visualization
    • Echodataflow: Orchestrates workflow
  • Current challenges include:

    • Managing large data volumes (~300 TB of acoustic data)
    • Dealing with 30+ different data formats
    • Limited internet connectivity on ships
    • Need for manual analysis and verification
    • Complex dependencies between packages
  • The system enables:

    • Real-time monitoring of marine life
    • Analysis of fish and zooplankton populations
    • Tracking of daily vertical migration patterns
    • Biological estimates from acoustic data
    • Cloud-optimized data processing
  • Future goals include:

    • Scaling from a few ships to many autonomous platforms
    • Expanding analysis to more marine species
    • Improving community-wide collaboration
    • Enhancing integration with other oceanographic datasets
    • Developing better analysis methods
  • Built on existing tools and frameworks:

    • NetCDF
    • Xarray
    • Dask
    • HoloViz
    • Prefect
    • Pydantic
  • Emphasizes standardization through:

    • Standardized data processing levels
    • Cloud-optimized formats
    • Reproducible workflows
    • Consistent data organization
    • Interoperable analysis methods
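
As an illustration of the vertical migration tracking mentioned above, here is a minimal sketch in plain Python. It is not Echostack's API: the function names, the tiny synthetic echogram, and the choice of a backscatter-weighted mean depth as the tracking metric are all assumptions made for this example. It does follow the standard convention that volume backscattering strength (Sv, in dB) is converted to linear units before any weighting or averaging.

```python
from typing import List, Sequence


def db_to_linear(sv_db: float) -> float:
    """Convert a volume backscattering strength from dB to linear units."""
    return 10.0 ** (sv_db / 10.0)


def weighted_mean_depth(sv_db_profile: Sequence[float],
                        depths_m: Sequence[float]) -> float:
    """Backscatter-weighted mean depth of one vertical profile.

    Hypothetical helper: weights are taken in the linear domain, because
    averaging dB values directly would bias the result.
    """
    if len(sv_db_profile) != len(depths_m):
        raise ValueError("profile and depth axis must have the same length")
    weights = [db_to_linear(v) for v in sv_db_profile]
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, depths_m)) / total


def migration_track(echogram_db: Sequence[Sequence[float]],
                    depths_m: Sequence[float]) -> List[float]:
    """Weighted mean depth for each time step of a (time x depth) echogram."""
    return [weighted_mean_depth(profile, depths_m) for profile in echogram_db]


# Synthetic data (assumed for illustration): one strong scattering layer
# that moves from deep water toward the surface over three time steps.
depths_m = [50.0, 150.0, 250.0]
echogram_db = [
    [-90.0, -90.0, -60.0],  # midday: layer in the deepest bin
    [-90.0, -60.0, -90.0],  # dusk: layer in the middle bin
    [-60.0, -90.0, -90.0],  # night: layer in the shallowest bin
]

track = migration_track(echogram_db, depths_m)
for step, depth in enumerate(track):
    print(f"time step {step}: weighted mean depth = {depth:.1f} m")
```

On real data the same calculation would typically run over a labeled, chunked array rather than nested lists, so that it can scale to the data volumes described above.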