Fritz Lekschas - Interactive Exploration of Large-Scale Datasets with Jupyter-Scatter | SciPy 2023

An overview of interactive exploration of large-scale datasets with Jupyter-Scatter, a scalable and customizable scatter plot tool that offers an API for encoding categorical data and integrates with Jupyter and other libraries.

Key takeaways
  • Jscatter is a tool for interactive exploration of large-scale datasets via scatter plots.
  • It can handle datasets of up to several million points and provides a simple API for encoding categorical data.
  • The API can be customized to use different encoding methods, including colorblind-safe options.
  • Multiple Jscatter plots can be synchronized with one another, and each plot is zoomable and pannable.
  • The tool can be used in conjunction with other data visualization libraries, such as Seaborn and D3.js.
  • Jscatter is built on top of Jupyter and uses the ipywidgets ecosystem to provide a seamless interface for users.
  • The tool can be used to explore a wide range of datasets, including single-cell data, geospatial data, and more.
  • Jscatter provides a flexible and customizable interface for exploratory data analysis and visualization.
  • The tool can be used to perform faceted analysis and filtering of large datasets.
  • Jscatter can be integrated with other libraries and tools, such as Jupyter Notebooks and PyCharm.
  • The tool can be used to visualize and explore large datasets in a scalable and interactive way.
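
The takeaways on categorical encoding, colorblind-safe colors, and synchronized plots can be illustrated with a minimal sketch. It is not taken from the talk. The helper names (make_points, build_color_map, make_linked_scatters), the synthetic data, and the column names x, y, z, and group are all hypothetical. The palette is the Okabe-Ito set, a commonly used colorblind-safe choice that is an assumption here, not necessarily the one shown in the talk. The jscatter calls (Scatter, color with by and map, compose with sync_selection and sync_hover) reflect the library's documented API as best understood, and should be checked against the installed version.

```python
import random

# Okabe-Ito palette: a widely used colorblind-safe set of categorical colors.
OKABE_ITO = [
    "#E69F00", "#56B4E9", "#009E73", "#F0E442",
    "#0072B2", "#D55E00", "#CC79A7", "#000000",
]


def make_points(n, seed=0):
    """Generate n synthetic points with three numeric columns and one category."""
    rng = random.Random(seed)
    groups = ["A", "B", "C"]
    return [
        {
            "x": rng.gauss(0.0, 1.0),
            "y": rng.gauss(0.0, 1.0),
            "z": rng.gauss(0.0, 1.0),
            "group": rng.choice(groups),
        }
        for _ in range(n)
    ]


def build_color_map(categories, palette=OKABE_ITO):
    """Assign one palette color per distinct category, in sorted order."""
    unique = sorted(set(categories))
    if len(unique) > len(palette):
        raise ValueError("more categories than colors in the palette")
    return {cat: palette[i] for i, cat in enumerate(unique)}


def make_linked_scatters(points, color_map):
    """Build two scatter plots with shared selection and hover (hypothetical helper).

    Requires pandas and jupyter-scatter; intended to be called in a notebook.
    """
    import pandas as pd
    import jscatter

    df = pd.DataFrame(points)
    df["group"] = df["group"].astype("category")

    # Same data, two different projections, same categorical color encoding.
    left = jscatter.Scatter(data=df, x="x", y="y")
    left.color(by="group", map=color_map)
    right = jscatter.Scatter(data=df, x="x", y="z")
    right.color(by="group", map=color_map)

    # Assumed per the jscatter docs: compose lays plots out side by side
    # and the sync_* flags link selection and hover between them.
    return jscatter.compose([left, right], sync_selection=True, sync_hover=True)
```

In a notebook cell, something like the following would display the linked plots, assuming pandas and jupyter-scatter are installed:

```python
points = make_points(100_000)
colors = build_color_map(p["group"] for p in points)
make_linked_scatters(points, colors)
```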