Highly Scalable Image Storage with AWS Serverless • Vadym Kazulkin & Firdaws Aboulaye • GOTO 2024

Learn how a development team migrated from a monolithic application to an AWS serverless architecture for scalable image storage, handling millions of daily uploads with Lambda, S3, and DynamoDB.

Key takeaways
  • Migrated from an on-premises monolithic application to an AWS serverless architecture in 2018-2019 to handle millions of daily image uploads and downloads

  • Core services include API Gateway, Lambda, DynamoDB, S3, and SQS, used to build modular, decoupled services for file storage, project management, and ordering

  • Key architectural decisions included:

    • Using S3 for scalable image storage instead of NFS
    • Implementing asynchronous processing with SQS for handling spikes
    • Separating static assets from personal images
    • Using DynamoDB for metadata despite the learning curve
  • Critical AWS service quotas and limits to monitor:

    • Lambda concurrent executions (default 1,000 per account per Region)
    • API Gateway requests per second (default 10,000 per account per Region)
    • DynamoDB throughput and item size limits
    • S3 event notifications
  • Security implemented through:

    • Custom token-based authentication with public/private keys
    • Lambda authorizers for API Gateway
    • Fine-grained IAM permissions
    • QR code-based file upload authorization
  • Cultural transformation required:

    • Upskilling developers with AWS and distributed systems knowledge
    • Moving from centralized ops to developer-owned services
    • Building serverless operational expertise internally
    • Adopting infrastructure as code with Terraform
  • Key benefits achieved:

    • Improved scalability for seasonal spikes
    • Independent deployments and testing
    • Better observability and metrics
    • Reduced operational overhead
    • Easier integration with e-commerce platforms
  • Main challenges faced:

    • Learning curve for NoSQL/DynamoDB
    • Managing distributed system complexity
    • Understanding service quotas and limits
    • Data migration from legacy systems
    • Asynchronous processing patterns
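
As an illustration of the asynchronous SQS pattern mentioned in the takeaways, here is a minimal sketch of an SQS-triggered Lambda handler in Python. It assumes, purely for illustration, that S3 event notifications for uploaded images are delivered to an SQS queue and that the queue's event source mapping has ReportBatchItemFailures enabled; the summary does not confirm that the team wired things this way. The function process_image and its file-extension check are hypothetical placeholders for whatever per-image work a real service would do.

```python
import json
from urllib.parse import unquote_plus


def process_image(bucket: str, key: str) -> None:
    """Hypothetical placeholder for per-image work (e.g. storing metadata)."""
    if not key.lower().endswith((".jpg", ".jpeg", ".png")):
        raise ValueError(f"unsupported object: s3://{bucket}/{key}")


def handler(event: dict, context: object = None) -> dict:
    """Process a batch of SQS messages and report only the failed ones.

    Returning the IDs of failed messages lets SQS redeliver just those
    messages instead of the whole batch (this requires
    ReportBatchItemFailures on the event source mapping).
    """
    failures = []
    for record in event.get("Records", []):
        try:
            # Assumption: each SQS message body is an S3 event notification.
            body = json.loads(record["body"])
            for s3_record in body.get("Records", []):
                bucket = s3_record["s3"]["bucket"]["name"]
                # Object keys in S3 event notifications are URL-encoded.
                key = unquote_plus(s3_record["s3"]["object"]["key"])
                process_image(bucket, key)
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Because the queue absorbs bursts and the handler reports failures per message, a spike in uploads lengthens the queue rather than overloading downstream services, and one bad message does not force the rest of its batch to be reprocessed.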