"Clojure for Data Science in the Real World" by Kira McLean

Discover how Clojure can be used for data science in the real world: it is a natural language for processing data, it interoperates well with other tools, and it supports reproducible results, all of which make it an attractive option for data scientists.

Key takeaways
  • Clojure can be used for data science in the real world; it is a natural language for processing data, which makes it well suited to the field.
  • The ecosystem for Clojure data science is still in its early days and needs more development, but it has potential.
  • Interoperability with other data science tools and languages is critical, and Clojure’s interop story is already good.
  • Reproducibility is one of the main problems in the data science world, and Clojure can help with it.
  • The language is designed to minimize external mutable state, making it easier to reproduce results.
  • Clojure’s speed and accessibility make it an attractive option for data scientists.
  • The community is working on courses, books, and content to support data science use cases.
  • The language is well-suited for piecing together different tools and libraries, making it a “glue language” for data science.
  • The tidy data paper, published about 10 years ago, is cited as an example of the kind of real-world problem that Clojure’s design can help solve.
  • The community needs to work on smoothing out the rough edges of existing libraries and making them more practical for users.
  • The Clojure data science stack is built on dtype-next, which provides a foundation for fast and efficient column-wise operations.
  • Data science involves a lot of moving parts, and Clojure can help simplify this by providing a reproducible and stable environment.
  • Reproducibility is achieved by eliminating state and using explicit declarative configurations.
  • The community is working on bridging the gap between software development and data science, and Clojure’s design can facilitate this.
  • Managing state and parallelization is a significant problem in data science, and Clojure’s design can help address it.
  • The language’s ease of use makes it an attractive option for data scientists with a software development background.
  • The community is also building tools and other resources to support data science use cases.
  • Clojure’s design is based on the idea that code is data, which makes programs easier to work with.
  • The language is being used in industry and academia, and its potential is huge.
  • The community is working on making the language more accessible and user-friendly, and its design is inherently simple.