Laszlo Sragner - Code Smells in Data Science: What can we do about them? | PyData London 2023

Laszlo Sragner

Improve code quality and reduce technical debt by recognizing and addressing common code smells in data science, with practical refactoring tips and strategies for success.

Key takeaways
  • Refactoring is essential to improve code quality, but it requires careful planning and testing.
  • Code smells are patterns that indicate poor design or implementation, leading to code rot and maintenance nightmares.
  • Examples of code smells include:
    • Long parameter lists
    • Data clumps (groups of variables that always travel together)
    • Long, complicated functions that are hard to follow
    • Improper variable scoping
    • Primitive obsession (overuse of basic types instead of meaningful classes)
    • Feature envy (a method that works mostly with another class’s data instead of its own)
  • Code review is essential to identify and fix code smells, but it should be a positive and collaborative process.
  • The happy path is the main flow of the code; error handling is secondary and should not obscure it.
  • Dependency injection is a design pattern that helps decouple code from specific implementations.
  • Code quality is crucial in high-velocity environments, where rapid changes require maintainable code.
  • Code smells can lead to technical debt, which is often difficult to pay back.
  • Writing readable code is essential for communication and collaboration, and it takes practice to develop good coding skills.
  • Code review should aim to make the code better, not merely to check for errors or to criticize the author.
  • Improving code quality reduces technical debt and increases productivity.
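
Several of the takeaways above (long parameter lists, data clumps, primitive obsession, feature envy, the happy path, and dependency injection) are easier to see in code. The following is a minimal sketch, not code from the talk: the names `SplitConfig`, `DataSource`, `InMemorySource`, and `MeanReport` are hypothetical and chosen only for illustration, and the example assumes Python 3.9 or later. It groups three fractions that always travel together into one small class, keeps the logic that uses them next to the data, and passes the data source into the report instead of hard-coding it.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SplitConfig:
    """Hypothetical parameter object: replaces three loose floats
    (a data clump) that would otherwise lengthen every signature."""

    train: float
    validation: float
    test: float

    def __post_init__(self) -> None:
        # Validate once, at construction, so callers stay on the happy path.
        total = self.train + self.validation + self.test
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"fractions must sum to 1, got {total}")

    def sizes(self, n_rows: int) -> tuple[int, int, int]:
        # The logic lives with the data it uses, which avoids feature envy.
        n_train = int(n_rows * self.train)
        n_validation = int(n_rows * self.validation)
        return n_train, n_validation, n_rows - n_train - n_validation


class DataSource(Protocol):
    """Hypothetical interface: anything with a load() method qualifies."""

    def load(self) -> list[float]: ...


class InMemorySource:
    """Stand-in source for tests; a real one might read a file or database."""

    def __init__(self, values: list[float]) -> None:
        self._values = values

    def load(self) -> list[float]:
        return list(self._values)


class MeanReport:
    """Receives its data source (dependency injection) instead of creating it."""

    def __init__(self, source: DataSource) -> None:
        self._source = source

    def run(self) -> float:
        values = self._source.load()
        # Guard clause first; the happy path below stays unindented.
        if not values:
            raise ValueError("no data to summarise")
        return sum(values) / len(values)


config = SplitConfig(train=0.5, validation=0.25, test=0.25)
print(config.sizes(100))  # (50, 25, 25)

report = MeanReport(InMemorySource([1.0, 2.0, 3.0]))
print(report.run())  # 2.0
```

Because `MeanReport` depends only on the `DataSource` interface, a test can hand it an in-memory source while production code supplies a different one, with no change to the report itself.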