Expert Talk: Unlocking the Power of Real-Time Analytics • Tim Berglund & Adi Polak • GOTO 2023

Discover the power of real-time analytics with Tim Berglund and Adi Polak, exploring Apache Pinot's architecture, indexing, and applications in driving decisions and empowering users.

Key takeaways
  • Real-time analytics is not just about speed, but also about predictability: query latency should be bounded, staying at or below roughly 100ms.
  • Apache Pinot is a real-time analytics database that powers LinkedIn’s analytics capabilities.
  • Pinot’s architecture is pluggable, allowing for easy addition of new indexes and features.
  • Real-time analytics needs to be able to process events as they happen, with no delay, unlike traditional analytics which look back over historical data.
  • Indexing is crucial in real-time analytics, as it enables fast querying and reduces the need for scanning large datasets.
  • Coupling between storage and compute is important, as tightly coupled systems can process queries faster than loosely coupled ones.
  • The star-tree index, developed at LinkedIn as part of Pinot, is not a typical database index but a purpose-built structure for real-time analytics.
  • Pinot’s indexing is not just about speed: it also enables complex analytics queries, such as joins and aggregations.
  • Real-time analytics has applications beyond just querying data, such as driving decisions and enabling users to take action in real-time.
  • LinkedIn’s transformation from a static presence to a real-time analytics powerhouse was driven by the need to empower users to take action based on real-time data.
  • Data democratization matters: real-time analytics needs to be accessible to a wide range of users, not just a select few.
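
The takeaway about indexing reducing the need to scan large datasets can be illustrated with a small sketch. This is not Pinot's implementation or API; it is a toy inverted index in plain Python, and the event rows, field names, and function names are all hypothetical, chosen only to show the idea of looking up matching rows instead of visiting every row.

```python
from collections import defaultdict

# Hypothetical event rows; the field names are illustrative, not from the talk.
EVENTS = [
    {"country": "US", "views": 3},
    {"country": "DE", "views": 5},
    {"country": "US", "views": 7},
    {"country": "IN", "views": 2},
]


def build_inverted_index(rows, column):
    """Map each distinct value of `column` to the list of row ids that hold it."""
    index = defaultdict(list)
    for row_id, row in enumerate(rows):
        index[row[column]].append(row_id)
    return index


def sum_views_scan(rows, country):
    """Full scan: every row is inspected, so cost grows with the table size."""
    return sum(row["views"] for row in rows if row["country"] == country)


def sum_views_indexed(rows, index, country):
    """Index lookup: only the rows listed for `country` are touched."""
    return sum(rows[row_id]["views"] for row_id in index.get(country, []))


# Built once up front; queries then reuse it.
COUNTRY_INDEX = build_inverted_index(EVENTS, "country")

if __name__ == "__main__":
    print(sum_views_scan(EVENTS, "US"))
    print(sum_views_indexed(EVENTS, COUNTRY_INDEX, "US"))
```

Both functions return the same answer; the difference is how many rows each one has to read. The index costs extra work and memory when data arrives, which is the trade a real-time analytics system makes to keep query latency bounded.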