"Open-Sourcing Venice" by Felix GV (Strange Loop 2022)
Join Felix GV to discuss the open-sourced Venice data storage system, covering scalability, caching, and use cases.
- Data ingestion and storage, including techniques for writing data to Venice and the concept of hybrid workloads.
- The importance of considering scalability and hit rate when designing data storage systems.
- How Venice handles concurrent streams and incremental updates through its buffer replay mechanism.
- The concepts of eager caching and read-through caching, and how each can improve performance depending on the data set.
- The versatility of Venice data storage, supporting both offline and nearline data sources, and the ability to join and union data from different sources.
- The use cases for Venice, including data analytics, machine learning, and A/B testing, with examples from LinkedIn.
- The road ahead for the project, now that it is open-source, and the opportunities for the community to contribute and integrate with other projects.
- The advantages of Venice, including scalability, ease of use, and fault tolerance, with examples of its use in production environments at LinkedIn.
- The ability to support concurrent streaming writes and incremental updates, without compromising data consistency.
- The concept of optimistic locking, which lets multiple writers modify the same data concurrently without holding locks, with conflicts detected at write time.
- The concept of data lineage, where data is tracked from its origin to its consumption, ensuring data integrity and end-to-end delivery.
- The importance of considering the scope of the data set, including the number of users and the rate of data updates, when designing data storage systems.
- The flexibility of Venice, allowing users to choose the best approach for their data storage needs, and the ability to scale both horizontally and vertically.
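
The difference between the eager cache and the read-through cache mentioned in the list above can be sketched generically. This is a minimal illustration, not Venice's API: the class names `ReadThroughCache` and `EagerCache`, the `apply_update` method, and the `member:*` keys are all hypothetical, and a plain dictionary stands in for the remote store.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


class ReadThroughCache:
    """Fills itself lazily: a key is fetched from the backing store on its first read."""

    def __init__(self, fetch: Callable[[str], Optional[str]]) -> None:
        self._fetch = fetch  # hypothetical stand-in for a remote lookup
        self._entries: Dict[str, Optional[str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        # Cache miss: pay the cost of the remote lookup once, then keep the result.
        self.misses += 1
        value = self._fetch(key)
        self._entries[key] = value
        return value


class EagerCache:
    """Holds the whole data set locally; updates are pushed in before anyone reads them."""

    def __init__(self, initial: Iterable[Tuple[str, str]] = ()) -> None:
        self._entries: Dict[str, str] = dict(initial)

    def apply_update(self, key: str, value: str) -> None:
        # In a real system this would be driven by a stream of writes (assumption).
        self._entries[key] = value

    def get(self, key: str) -> Optional[str]:
        # Reads never leave the process, so there is no miss path.
        return self._entries.get(key)


backing_store = {"member:1": "profile-a", "member:2": "profile-b"}

read_through = ReadThroughCache(backing_store.get)
read_through.get("member:1")  # first read goes to the backing store
read_through.get("member:1")  # second read is served locally

eager = EagerCache(backing_store.items())
eager.apply_update("member:3", "profile-c")
```

A read-through cache only pays off when the hit rate is high, since every miss still costs a remote lookup. An eager cache trades memory and update traffic for reads that never miss, which tends to suit data sets small enough to hold in full.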