William Dealtry - Data persistence with consistency and performance in a truly serverless system

How to persist data in a truly serverless system while achieving consistency, performance, and scalability through immutable storage, structured keys, and parallelization. The talk also explores use cases in finance, research, and data science applications.

Key takeaways
  • Consistency models (eventual consistency, linearizability, strong consistency) discussed in the context of data storage architectures.
  • Data persistence achieved with immutable storage, providing versioning and snapshot capabilities.
  • Large datasets stored and queried efficiently using structured keys and storage.
  • Kubernetes-based notebook environments provide on-demand compute and storage for data processing.
  • Performance optimization achieved through parallelization and vectorized execution.
  • Support for multi-dimensional data and time series data.
  • Columnar storage architecture for efficient data retrieval.
  • Complexity of data transformations reduced by using a query builder and lazy data frames.
  • Support for data schema evolution and versioning.
  • Data provenance tracking enabled through version keys and timestamping.
  • Scalability achieved through shared-nothing architectures and distributed processing.
  • Advantages of cloud storage and object stores for data storage and processing.
  • PostgreSQL and MySQL compared with ArcticDB for data storage and processing.
  • Use cases include finance, research, and data science applications.
  • Immutable data storage ensures data integrity and avoids data corruption.
  • Shared-nothing architectures provide fault tolerance and high availability.
  • Data storage and processing architecture is designed for ease of use and high performance.
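
Several of the takeaways above (immutable storage, structured keys, versioning, snapshots, and provenance via version keys and timestamps) can be illustrated with a small sketch. The code below is a toy in-memory model written for this summary, not ArcticDB's API and not the implementation described in the talk. The names VersionKey, ImmutableStore, write, read, and snapshot are all hypothetical, and a logical counter stands in for a real timestamp.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class VersionKey:
    """Hypothetical structured key: symbol, version number, write time."""
    symbol: str
    version: int
    timestamp: int  # logical clock (assumption); a real system would use wall-clock time


class ImmutableStore:
    """Toy stand-in for an object store in which written objects are never modified."""

    def __init__(self) -> None:
        self._objects: Dict[VersionKey, Tuple] = {}
        self._versions: Dict[str, List[VersionKey]] = {}  # list index == version number
        self._snapshots: Dict[str, Dict[str, VersionKey]] = {}
        self._clock = 0

    def write(self, symbol: str, rows) -> VersionKey:
        """Store rows under a brand-new key; earlier versions stay untouched."""
        self._clock += 1
        history = self._versions.setdefault(symbol, [])
        key = VersionKey(symbol, len(history), self._clock)
        self._objects[key] = tuple(rows)  # stored as an immutable tuple
        history.append(key)
        return key

    def snapshot(self, name: str) -> None:
        """Record which version of every symbol is current right now."""
        if name in self._snapshots:
            raise ValueError(f"snapshot {name!r} already exists")
        self._snapshots[name] = {s: h[-1] for s, h in self._versions.items()}

    def read(self, symbol: str, version: Optional[int] = None,
             snapshot: Optional[str] = None) -> Tuple[VersionKey, Tuple]:
        """Read the latest version, a specific version, or the version in a snapshot."""
        if snapshot is not None:
            key = self._snapshots[snapshot][symbol]
        elif version is not None:
            key = self._versions[symbol][version]
        else:
            key = self._versions[symbol][-1]
        return key, self._objects[key]


store = ImmutableStore()
store.write("prices", [100, 101])
store.snapshot("end_of_day")
store.write("prices", [100, 101, 102])

latest_key, latest_rows = store.read("prices")
old_key, old_rows = store.read("prices", snapshot="end_of_day")
```

Because every write creates a new key instead of overwriting an old one, a reader holding a snapshot keeps seeing the same data regardless of later writes, and the returned key records which version was read and when it was written.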