Lightning Talks + Closing Remarks DAY 1 | PyData Amsterdam 2024
Key insights from PyData Amsterdam 2024 Day 1 lightning talks: healthcare data bias, timezones, LLMs, NLP trends, clinical trials, data structures & evolving data roles.
- Missing or biased data in healthcare and medical research can have fatal consequences, especially for women and underrepresented groups, since most data is collected from male subjects
- Time zone handling in data pipelines remains challenging: inconsistencies between UTC, local time zones, and daylight saving time can cause data duplication and analysis issues
- Converting documents (like PDFs) to markdown using multimodal LLMs is becoming a viable alternative to traditional OCR approaches, with better accuracy and structure preservation
- In NLP, the rapid evolution of models and techniques (BERT, GPT, LoRA, etc.) creates a steep learning curve for newcomers; understanding core concepts is more important than chasing the latest trends
- Clinical trial data management requires special consideration for security, standardization, and proper handling of sensitive patient information
- When working with data frames and time series, understanding the underlying data structure and intent is more important than simply choosing pandas by default
- Data roles continue to evolve and specialize, from data analysts and scientists to ML engineers, research scientists, and data engineers, each with distinct skill sets
- Test data generation and synthetic data creation remain challenging, especially for complex scenarios and edge cases
- Code readability and maintainability should prioritize clear intent and documentation over minimal line count
- Proper data visualization and communication are critical for conveying insights and findings effectively across teams
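
The time zone takeaway can be illustrated with a minimal sketch. It is not code from the talks: the date, the `Europe/Amsterdam` zone, and all variable names are assumptions chosen for illustration. It uses only Python's standard `datetime` and `zoneinfo` modules, and assumes the system has time zone data available (on some platforms this requires the `tzdata` package). When clocks go back, one local wall-clock time occurs twice, so keying records on naive local time can merge or duplicate rows, while UTC keeps the two instants distinct.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Illustrative zone; any zone that observes daylight saving time behaves similarly.
AMSTERDAM = ZoneInfo("Europe/Amsterdam")

# On 2024-10-27 clocks in Amsterdam go back from 03:00 CEST to 02:00 CET,
# so the wall-clock time 02:30 occurs twice. `fold` selects which occurrence.
first = datetime(2024, 10, 27, 2, 30, tzinfo=AMSTERDAM, fold=0)   # first pass, CEST (UTC+2)
second = datetime(2024, 10, 27, 2, 30, tzinfo=AMSTERDAM, fold=1)  # second pass, CET (UTC+1)

# Converting to UTC yields two different instants, one hour apart.
first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)

# Naive local timestamps (no zone, no fold) collapse into a single key ...
naive_keys = {
    first.replace(tzinfo=None, fold=0),
    second.replace(tzinfo=None, fold=0),
}
# ... while UTC timestamps remain two distinct keys.
utc_keys = {first_utc, second_utc}

print("naive local keys:", len(naive_keys))
print("UTC keys:", len(utc_keys))
print("gap between the two instants:", second_utc - first_utc)
```

One common way to avoid this class of problem is to store and join on UTC and convert to local time only for display; whether that fits a given pipeline depends on its requirements.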