Haystack 2.0: the story of a rewrite [PyCon DE & PyData Berlin 2024]

Learn how Haystack 2.0's rewrite improved modularity, reduced dependencies, and enhanced workflows while maintaining backward compatibility. See what makes a rewrite successful.

Key takeaways
  • Haystack 2.0 underwent a major rewrite to improve modularity and reduce dependencies, making installation faster and lighter

  • The pipeline architecture was transformed from a linear “tube” structure to a directed acyclic graph, enabling more complex workflows and decision paths

  • Components were made smaller and more focused, receiving only necessary inputs rather than bulky data structures, improving testability and customization

  • Dependencies were split between core and integrations repositories, making maintenance easier and preventing dependency conflicts

  • Lazy imports were implemented for optional dependencies, providing better user experience with clear error messages when additional packages are needed

  • The rewrite focused on reducing technical debt while maintaining backward compatibility through migration guides and documentation

  • Integration development became 4x easier, leading to many new integrations with services like Elasticsearch, Weaviate, and Pinecone

  • Setting firm deadlines for rewrites is crucial to avoid perfectionism and ensure actual releases

  • Community feedback and use cases should drive rewrite decisions, not just technical improvements

  • Documentation, DevRel support, and migration tools are as important as the code changes in a successful rewrite
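
Two of the technical ideas above, lazy imports for optional dependencies and a graph-shaped pipeline of small components, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Haystack's actual API: `LazyImport`, `run_graph`, and the four toy components are hypothetical names invented for this example, and the real library may well organize these pieces differently.

```python
import importlib
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class LazyImport:
    """Hypothetical helper: defer an optional import until it is needed."""

    def __init__(self, module_name, install_hint):
        self.module_name = module_name
        self.install_hint = install_hint
        self._module = None

    def load(self):
        # Import on first use, so the core package works without the extra.
        if self._module is None:
            try:
                self._module = importlib.import_module(self.module_name)
            except ImportError as exc:
                # Replace the bare error with one that says what to install.
                raise ImportError(
                    f"'{self.module_name}' is needed for this feature. "
                    f"{self.install_hint}"
                ) from exc
        return self._module


# Small components: each takes only the values it needs, not a bulky object.
def clean(text):
    return text.strip().lower()


def count_words(text):
    return len(text.split())


def count_chars(text):
    return len(text)


def summarize(words, chars):
    return f"{words} words, {chars} chars"


def run_graph(components, edges, initial):
    """Run components in dependency order.

    edges is a list of (source, target, parameter_name) tuples;
    initial maps a component name to keyword arguments supplied by the caller.
    """
    predecessors = {name: set() for name in components}
    for source, target, _ in edges:
        predecessors[target].add(source)

    results = {}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(predecessors).static_order():
        kwargs = dict(initial.get(name, {}))
        for source, target, param in edges:
            if target == name:
                kwargs[param] = results[source]
        results[name] = components[name](**kwargs)
    return results


components = {
    "clean": clean,
    "count_words": count_words,
    "count_chars": count_chars,
    "summarize": summarize,
}
# One output fans out to two branches, which merge again: not a linear tube.
edges = [
    ("clean", "count_words", "text"),
    ("clean", "count_chars", "text"),
    ("count_words", "summarize", "words"),
    ("count_chars", "summarize", "chars"),
]
results = run_graph(components, edges, {"clean": {"text": "  Hello Haystack World "}})
```

With the input shown, `results["summarize"]` is `"3 words, 20 chars"`. Because each component is an ordinary function with narrow inputs, it can be tested on its own, and calling `LazyImport("some_missing_package", "Run 'pip install some-missing-package'.").load()` fails only at the point of use, with the install hint in the message.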