"Without Open Data, there is no Ethical Machine Learning" by Erin Mikail Staples (Strange Loop 2023)

Explore the importance of open data in machine learning, discussing its role in ensuring ethical practices, increasing transparency, and driving innovation.

Key takeaways
  • Without open data, there is no ethical machine learning.
  • Machine learning is a way to solve problems at scale, but problems need context to be solved.
  • Data integrity is crucial, including consent, control, consistency, and contextual understanding.
  • Open data is both a philosophy and a practice, requiring that certain data be made publicly available.
  • Government initiatives, such as the US Federal Data Strategy, can drive open data adoption.
  • Private companies, like OpenAI, are also promoting open data initiatives.
  • Open data can increase transparency, trust, and public accountability.
  • Several organizations, like the European Federation for Cancer Images, make open data available.
  • Context is key when working with open data, and making data more accessible and usable is crucial.
  • Without the right context, it’s difficult to solve problems at all, let alone the right problems.
  • Building a better future with machine learning requires solving ethical and technical challenges.
  • Lobbying and public funding influence how open data problems are addressed, and more advocacy is needed.
  • Examples of open data sets include IMDb, Twitter, and Reddit data.