Clojure Where it Counts: Tidying Data Science Workflows - Pier Federico Gherardini & Ben Kamphaus

Discover how the Parker Institute for Cancer Immunotherapy leverages Clojure and Datomic to streamline data science workflows, enabling efficient and reproducible cancer research.

Key takeaways
  • The Parker Institute for Cancer Immunotherapy (PICI) is a non-profit organization focused on developing cancer treatments using the immune system.
  • PICI uses a data platform built on Clojure and Datomic to manage and analyze large amounts of molecular and clinical data.
  • The platform allows scientists to query the data using Datalog, a declarative query language that is similar to SQL but more expressive.
  • The platform also includes tools for importing data from various sources, cleaning and transforming the data, and visualizing the results.
  • PICI is open-sourcing some of the components of the platform, including a library for querying Datomic from R and a Datalog JSON parser.
  • The platform has been used to identify patients who are likely to respond to immunotherapy, develop new clinical programs, and understand the mechanisms of action of different therapies.
  • The platform has improved the efficiency and reproducibility of cancer research and has the potential to be used in other data science contexts.
  • The platform is designed to be flexible and extensible, allowing scientists to add new features and functionality as needed.
  • The platform is also designed to be scalable, allowing it to handle large amounts of data and complex queries.
  • The platform is being used by a network of leading cancer researchers and clinicians to accelerate the development of new cancer treatments.
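
To make the Datalog takeaway above more concrete, here is a minimal sketch of how a Datalog-style query works: facts are stored as (entity, attribute, value) triples, and a query is a list of patterns whose shared variables act as implicit joins. This is plain Python written for illustration, not Datomic's API or PICI's code; the attribute names, patient IDs, and the helper functions `is_var`, `match_clause`, and `query` are all hypothetical.

```python
# Hypothetical facts as (entity, attribute, value) triples.
# The attribute names and values are invented for illustration only.
FACTS = [
    ("p1", "patient/id", "PT-001"),
    ("p1", "patient/therapy", "anti-PD-1"),
    ("p2", "patient/id", "PT-002"),
    ("p2", "patient/therapy", "chemotherapy"),
    ("m1", "measurement/patient", "p1"),
    ("m1", "measurement/gene", "CD8A"),
    ("m2", "measurement/patient", "p2"),
    ("m2", "measurement/gene", "CD8A"),
]


def is_var(term):
    """Logic variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_clause(clause, fact, binding):
    """Try to unify one pattern with one fact under an existing binding.

    Returns the extended binding, or None if the fact does not match.
    """
    new_binding = dict(binding)
    for term, value in zip(clause, fact):
        if is_var(term):
            if term in new_binding and new_binding[term] != value:
                return None  # variable already bound to something else
            new_binding[term] = value
        elif term != value:
            return None  # constant in the pattern does not match the fact
    return new_binding


def query(find, where, facts):
    """Evaluate the patterns left to right, carrying bindings forward."""
    bindings = [{}]
    for clause in where:
        next_bindings = []
        for binding in bindings:
            for fact in facts:
                extended = match_clause(clause, fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    # Project the requested variables and remove duplicates.
    return sorted({tuple(b[var] for var in find) for b in bindings})


# "Which genes were measured for patients on anti-PD-1 therapy?"
result = query(
    find=["?pid", "?gene"],
    where=[
        ("?p", "patient/therapy", "anti-PD-1"),
        ("?p", "patient/id", "?pid"),
        ("?m", "measurement/patient", "?p"),
        ("?m", "measurement/gene", "?gene"),
    ],
    facts=FACTS,
)
print(result)
```

Because `?p` and `?m` appear in several patterns, the join between patients and measurements falls out of variable sharing rather than explicit JOIN clauses, which is one reason Datalog queries can stay compact as a schema grows. A real Datomic query expresses the same idea with `:find` and `:where` clauses evaluated by the database.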