Laszlo Sragner - Code Smells in Data Science: What can we do about them? | PyData London 2023

Improve code quality and reduce technical debt by recognizing and addressing common code smells in data science, with practical refactoring tips and strategies for success.

Key takeaways
  • Refactoring is essential to improve code quality, but it requires careful planning and testing.
  • Code smells are surface patterns that hint at deeper design or implementation problems; left unaddressed, they lead to code rot and maintenance nightmares.
  • Examples of code smells include:
    • Long parameter lists
    • Data clumps (groups of variables that are always passed around together)
    • Long, complicated code composition
    • Improper variable scoping
    • Primitive obsession (overuse of basic types instead of meaningful classes)
    • Feature envy (code that mostly works with another class’s data instead of its own)
  • Code review is essential to identify and fix code smells, but it should be a positive and collaborative process.
  • The happy path is the main flow of the code and should be easy to follow; error handling is secondary and should not obscure it.
  • Dependency injection is a design pattern that helps decouple code from specific implementations.
  • Code quality is crucial in high-velocity environments, where rapid changes require maintainable code.
  • Code smells can lead to technical debt, which is often difficult to pay back.
  • Writing readable code is essential for communication and collaboration, and it takes practice to develop good coding skills.
  • Code review should aim to improve the code, not merely to check for errors or criticize the author.
  • Improving code quality reduces technical debt and increases productivity.
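
As an illustration of the long parameter list, data clump, and primitive obsession smells listed above, the following is a minimal sketch of one common refactoring: grouping values that travel together into a small class. The names SplitConfig and split_sizes are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass

# Before (smell): a long parameter list of loose primitives that always
# travel together, for example
#     split_sizes(n_rows, test_fraction, seed, shuffle)


@dataclass(frozen=True)
class SplitConfig:
    """Hypothetical parameter object grouping the values that belong together."""

    test_fraction: float = 0.2
    seed: int = 0
    shuffle: bool = True

    def __post_init__(self) -> None:
        # Validation lives next to the data instead of being repeated
        # in every function that receives these values.
        if not 0.0 < self.test_fraction < 1.0:
            raise ValueError("test_fraction must be strictly between 0 and 1")


def split_sizes(n_rows: int, config: SplitConfig) -> tuple[int, int]:
    """Return (train_size, test_size) for a dataset with n_rows rows."""
    n_test = round(n_rows * config.test_fraction)
    return n_rows - n_test, n_test
```

With this shape, adding a new split option changes SplitConfig only, and callers that do not care about the new option stay untouched.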
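
The dependency injection takeaway can be sketched in a few lines, assuming a simple case where the data source is passed in as a function. The names mean_of, Loader, and load_from_memory are hypothetical and used only for illustration.

```python
from typing import Callable, Iterable

# Hypothetical type alias: any zero-argument function that yields numbers.
Loader = Callable[[], Iterable[float]]


def mean_of(load_values: Loader) -> float:
    """Compute a mean without knowing where the values come from.

    The loader is injected by the caller, so this function is not tied to
    a specific file, database, or API.
    """
    values = list(load_values())
    if not values:
        # Error handling is kept short so the happy path below stays visible.
        raise ValueError("no values to average")
    return sum(values) / len(values)


def load_from_memory() -> list[float]:
    """Stand-in loader; a real one might read a CSV file or query a database."""
    return [1.0, 2.0, 3.0]


result = mean_of(load_from_memory)
```

Because the loader is an argument, a test can pass a small in-memory function like load_from_memory, while production code passes a loader backed by real storage.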