SAINTCON 2023 - Raili Taylor - Banish the Haunting Specter of Black Box Models

Learn how to banish the haunting specter of black box models in machine learning and discover the importance of transparency, data understanding, and feature engineering in AI.

Key takeaways
  • Main points from the talk:
    • Black box models can be troublesome and lack transparency
    • Understanding the data is crucial in machine learning
    • Decision trees are a basic model and useful for anomaly detection
    • Supervised learning requires labeled data, and its accuracy can be measured against those known labels
    • Unsupervised learning can identify patterns and group similar data together
    • Feature engineering is important for selecting relevant data
    • Dimensionality reduction helps make high-dimensional data practical to process
    • Model interpretability is important for understanding how the model works
    • Deep learning models can be useful for anomaly detection and classification
    • Data poisoning is a challenge in machine learning
    • Machine learning in cybersecurity requires careful evaluation of data and models
    • It’s important to consider the limitations of machine learning models
    • Data labeling can be intensive and may require expert knowledge
    • Feature selection and encoding are important steps in preparing data for modeling
    • Accuracy is not the only metric to evaluate a model’s performance
    • Neural networks can be processor-intensive and require large amounts of data
    • Supervised learning can be used for classification and regression tasks
    • Unsupervised learning can be used for identifying new threat patterns
    • Model selection is important for choosing the right algorithm for a particular task
    • The speaker encourages the audience to consider the concept of machine learning in a broader context
  • Other key points:
    • Moving data from one format to another
    • Data sets are messy and inconsistent
    • Confusion between AI and machine learning
    • Importance of understanding how models work
    • The value of transparency and model interpretability
    • The importance of considering the limitations of machine learning models
    • The role of feature engineering and dimensionality reduction in preparing data for modeling
    • The potential challenges of data poisoning
    • The importance of labeling data for supervised learning
    • The importance of considering the costs and benefits of using machine learning models
    • The speaker’s expertise in chemical engineering and computer science
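
To illustrate the takeaway that accuracy is not the only metric for a model's performance, here is a minimal Python sketch. It is not from the talk: the event counts, the two detectors, and the helper names (confusion_counts, summarize) are all hypothetical, chosen only to show how an imbalanced security data set can make a useless model look accurate.

```python
# Hypothetical example: 1000 events, of which only 10 are malicious (label 1).

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Return accuracy, precision, and recall as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when nothing is flagged.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Ground truth: 10 malicious events followed by 990 benign ones (assumed data).
y_true = [1] * 10 + [0] * 990

# Model A never flags anything.
always_benign = [0] * 1000

# Model B catches 8 of the 10 malicious events but raises 20 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970

if __name__ == "__main__":
    print("always benign:", summarize(y_true, always_benign))
    print("detector:     ", summarize(y_true, detector))
```

In this made-up data, the model that never flags anything reaches 99% accuracy while catching no malicious events (recall 0), whereas the detector scores slightly lower on accuracy (97.8%) yet finds 8 of the 10 malicious events. Looking at precision and recall alongside accuracy makes that difference visible.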