Build TikTok's Personalized Real-Time Recommendation System in Python with Hopsworks
Learn how to build TikTok-style video recommendations in Python with Hopsworks, covering real-time feature engineering, model training, and personalized content serving.
-
TikTok’s recommender system uses a two-tower model architecture: separate encoders for user queries and videos produce embeddings in the same vector space, so a user and a video can be compared directly by vector similarity.
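The two-tower idea can be illustrated with a minimal sketch in plain Python. The "towers" here are untrained random linear projections, and the feature vectors, dimensions, and function names (`make_tower`, `encode`, `similarity`) are all assumptions made for illustration. A real system would train both encoders (the summary mentions TensorFlow) on interaction data.

```python
import math
import random

EMBEDDING_DIM = 4  # illustrative size; production models use far more dimensions


def make_tower(input_dim, output_dim, seed):
    """Return a toy linear 'tower': a random weight matrix standing in for a trained encoder."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(input_dim)] for _ in range(output_dim)]


def encode(tower, features):
    """Project raw features into the shared space and L2-normalise the result."""
    raw = [sum(w * x for w, x in zip(row, features)) for row in tower]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0
    return [v / norm for v in raw]


def similarity(a, b):
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical inputs: 3 numeric user features and 5 numeric video features.
query_tower = make_tower(input_dim=3, output_dim=EMBEDDING_DIM, seed=0)
video_tower = make_tower(input_dim=5, output_dim=EMBEDDING_DIM, seed=1)

user_embedding = encode(query_tower, [0.3, 1.0, 0.0])
video_embedding = encode(video_tower, [0.1, 0.7, 0.2, 0.0, 1.0])

# Both embeddings have the same length, so they can be compared directly.
score = similarity(user_embedding, video_embedding)
```

The two towers take inputs of different shapes but emit vectors of the same length, which is what makes the similarity score well defined.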
-
System is composed of three main pipelines:
- Feature pipeline: processes user interactions and video metadata
- Training pipeline: creates and updates recommendation models
- Inference pipeline: handles real-time predictions and serving
-
Key features are stored in a feature store (Hopsworks) which maintains:
- User features (age, country, gender)
- Video features (category, views, likes, length)
- Interaction data between users and videos
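As a rough sketch of how the three groups of features listed above relate, the example below models each group as a dataclass and joins one interaction with its user and video features by key. It uses plain Python rather than the Hopsworks API. The field names (`length_seconds`, `watch_time_seconds`, and so on) and the helper `build_training_row` are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass
class UserFeatures:
    user_id: str
    age: int
    country: str
    gender: str


@dataclass
class VideoFeatures:
    video_id: str
    category: str
    views: int
    likes: int
    length_seconds: int


@dataclass
class Interaction:
    user_id: str
    video_id: str
    liked: bool
    watch_time_seconds: float


def build_training_row(interaction, users, videos):
    """Join one interaction with its user and video features on the shared keys."""
    row = asdict(interaction)
    row.update(asdict(users[interaction.user_id]))
    row.update(asdict(videos[interaction.video_id]))
    return row


# Toy data; the dictionaries stand in for keyed lookups in a feature store.
users = {"u1": UserFeatures("u1", 27, "SE", "F")}
videos = {"v1": VideoFeatures("v1", "cooking", 1200, 340, 45)}

row = build_training_row(Interaction("u1", "v1", True, 38.5), users, videos)
```

The resulting `row` holds the interaction outcome alongside the user and video attributes, which is the shape a training pipeline would typically consume.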
-
Recommendation process has two main phases:
- Retrieval: Uses vector similarity search to find hundreds of candidate videos
- Ranking: Personalizes and orders candidates based on user preferences
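The two phases above can be sketched as follows. This is a minimal illustration, assuming a brute-force dot-product search in place of a real vector index and a lookup table of made-up scores in place of a trained ranking model. All names and numbers are hypothetical.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve(user_embedding, video_embeddings, k):
    """Retrieval phase: return the ids of the k videos most similar to the user."""
    return heapq.nlargest(
        k,
        video_embeddings,
        key=lambda vid: dot(user_embedding, video_embeddings[vid]),
    )


def rank(candidates, score_fn):
    """Ranking phase: order the retrieved candidates by a per-user score."""
    return sorted(candidates, key=score_fn, reverse=True)


# Toy 2-dimensional embeddings; a real index would hold many more videos.
video_embeddings = {
    "v1": [0.9, 0.1],
    "v2": [0.1, 0.9],
    "v3": [0.7, 0.3],
    "v4": [-0.5, 0.2],
}
user_embedding = [1.0, 0.0]

# Stand-in for the ranking model's predictions for this user (assumed values).
predicted_engagement = {"v1": 0.20, "v2": 0.90, "v3": 0.65, "v4": 0.10}

candidates = retrieve(user_embedding, video_embeddings, k=2)
ranked = rank(candidates, lambda vid: predicted_engagement[vid])
```

Here retrieval narrows the catalogue to `v1` and `v3`, and ranking then places `v3` first because its predicted engagement is higher. Note that `v2` has the highest predicted engagement overall but is never ranked, since it did not survive retrieval.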
-
System maintains a fast feedback loop by:
- Quickly logging user interactions (views, likes, watch time)
- Updating feature values within seconds
- Using fresh features for next predictions
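The feedback loop above can be sketched with an in-memory stand-in for an online feature store. This assumes a single process and no streaming layer. The class `OnlineFeatureStore` and its methods are hypothetical and are not the Hopsworks API.

```python
import time
from collections import defaultdict


class OnlineFeatureStore:
    """In-memory stand-in for an online feature store (illustrative only)."""

    def __init__(self):
        self._video_stats = defaultdict(
            lambda: {"views": 0, "likes": 0, "watch_time": 0.0}
        )
        self._recent_by_user = defaultdict(list)

    def log_interaction(self, user_id, video_id, liked, watch_time, timestamp=None):
        """Record one event and update the affected aggregates immediately."""
        if timestamp is None:
            timestamp = time.time()
        stats = self._video_stats[video_id]
        stats["views"] += 1
        stats["likes"] += int(liked)
        stats["watch_time"] += watch_time
        self._recent_by_user[user_id].append((timestamp, video_id))

    def video_features(self, video_id):
        """Return engagement features computed from everything logged so far."""
        stats = self._video_stats[video_id]
        views = stats["views"]
        return {
            "views": views,
            "like_rate": stats["likes"] / views if views else 0.0,
            "avg_watch_time": stats["watch_time"] / views if views else 0.0,
        }

    def recent_videos(self, user_id, limit=10):
        """Return the user's most recently watched video ids, oldest first."""
        return [vid for _, vid in self._recent_by_user[user_id][-limit:]]


store = OnlineFeatureStore()
store.log_interaction("u1", "v1", liked=True, watch_time=30.0)
store.log_interaction("u2", "v1", liked=False, watch_time=10.0)

# The next prediction request sees the updated values straight away.
features = store.video_features("v1")
history = store.recent_videos("u1")
```

In a production setup the `log_interaction` call would be replaced by events flowing through a stream and a feature pipeline, but the principle is the same: each interaction changes the features that the next prediction reads.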
-
Models are implemented using:
- TensorFlow for embedding models
- CatBoost for ranking model
- Vector index for similarity search
-
Infrastructure uses:
- Kafka for event streaming
- KServe for model serving
- Feature store for data management
- Vector database for embeddings
-
System achieves personalization through:
- Recent user activity history
- User demographic features
- Video metadata and engagement metrics
- Interaction patterns
-
Model training happens on regular schedules:
- Embedding models updated periodically
- Ranking models retrained frequently
- Vector index updated with new videos
-
Data validation rules ensure data quality:
- Value range checks
- Data type validation
- Business logic constraints
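The three kinds of rule listed above can be sketched as a plain Python validator for a video record. The specific fields, the 1 to 600 second length limit, and the "likes cannot exceed views" constraint are assumptions chosen for illustration. They are not rules taken from the original system.

```python
EXPECTED_TYPES = {
    "video_id": str,
    "category": str,
    "views": int,
    "likes": int,
    "length_seconds": int,
}


def validate_video(record):
    """Return a list of rule violations; an empty list means the record is valid."""
    errors = []

    # Data type validation: every expected field must be present with the right type.
    for field, expected in EXPECTED_TYPES.items():
        if not isinstance(record.get(field), expected):
            errors.append(f"{field}: expected {expected.__name__}")
    if errors:
        return errors  # later checks assume the types are correct

    # Value range checks (limits are hypothetical).
    if record["views"] < 0:
        errors.append("views: must be non-negative")
    if record["likes"] < 0:
        errors.append("likes: must be non-negative")
    if not 1 <= record["length_seconds"] <= 600:
        errors.append("length_seconds: must be between 1 and 600")

    # Business logic constraint (hypothetical): a video cannot have more likes than views.
    if record["likes"] > record["views"]:
        errors.append("likes: cannot exceed views")

    return errors


good = {"video_id": "v1", "category": "cooking", "views": 1200, "likes": 340, "length_seconds": 45}
bad_logic = {"video_id": "v2", "category": "music", "views": 10, "likes": 25, "length_seconds": 30}
bad_type = {"video_id": "v3", "category": "sports", "views": "many", "likes": 5, "length_seconds": 30}

good_errors = validate_video(good)
logic_errors = validate_video(bad_logic)
type_errors = validate_video(bad_type)
```

Running the type checks first and returning early keeps the range and business-logic checks simple, since they can then compare values without guarding against missing or mistyped fields.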