Karel Boháček - Sign language recognition: Enabling communication for the hearing-impaired via ML

Discover how machine learning enables sign language recognition for the hearing-impaired, featuring the MediaPipe framework, Czech Sign Language examples, and real-time processing.

Key takeaways
  • Sign language recognition aims to convert sign language into text or speech, helping hearing-impaired people communicate more effectively

  • The open-source MediaPipe framework was used for hand detection and tracking, providing 21 key points per hand, each with x/y/z coordinates (see the landmark-extraction sketch after this list)

  • The project focused primarily on hand detection and tracking, though full sign language recognition requires whole-body tracking, including facial expressions

  • Czech Sign Language served as the test case, with 17 basic signs created for the proof of concept

  • A decision tree classifier achieved the best results among the tested models and was chosen for its interpretability and straightforward tweaking (see the training sketch after this list)

  • Main challenges included:

    • Dynamic nature of signs (continuous motion)
    • Need for robustness to varying lighting and camera conditions
    • Difficulty determining where one sign ends and another begins
    • Grammar differences between signed and spoken languages
  • A dataset of 200 images per sign was created for training

  • Real-time processing was achieved, though stuttering and per-frame processing cost remained challenges (see the webcam-loop sketch after this list)

  • The system could detect left and right hands independently and track hand orientation

  • Future improvements would need:

    • Larger datasets
    • Full body tracking
    • Better handling of dynamic movements
    • Integration with speech-to-text for two-way communication
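
To ground the hand-tracking takeaways, here is a minimal sketch of how the 21 key points per hand can be extracted with MediaPipe's (legacy) Hands solution in Python. The flattening of the landmarks into a 63-value feature vector is an assumption about how they might feed a classifier, not the project's exact pipeline; the handedness label illustrates how left and right hands are told apart.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def extract_landmarks(image_bgr):
    """Return a list of (handedness label, 63-float vector) per detected hand.

    Each hand yields 21 landmarks with x/y/z coordinates (21 * 3 = 63 values).
    """
    with mp_hands.Hands(static_image_mode=True,
                        max_num_hands=2,
                        min_detection_confidence=0.5) as hands:
        # MediaPipe expects RGB frames; OpenCV loads images as BGR.
        results = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))

    features = []
    if results.multi_hand_landmarks:
        for landmarks, handedness in zip(results.multi_hand_landmarks,
                                         results.multi_handedness):
            label = handedness.classification[0].label  # "Left" or "Right"
            vector = [coord
                      for lm in landmarks.landmark
                      for coord in (lm.x, lm.y, lm.z)]
            features.append((label, vector))
    return features
```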
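
Training a decision tree on such feature vectors could look like the sketch below. scikit-learn's DecisionTreeClassifier is assumed, the landmarks.npy and labels.npy files are hypothetical, and the hyperparameters are illustrative rather than the project's.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Hypothetical files: one 63-value landmark vector per image
# (e.g. 17 signs x 200 images) and one sign label per vector.
X = np.load("landmarks.npy")
y = np.load("labels.npy")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

clf = DecisionTreeClassifier(max_depth=12, random_state=42)
clf.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

One appeal of a decision tree here is that the learned splits on individual landmark coordinates can be inspected directly, which matches the interpretability argument in the takeaways.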
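
Finally, a sketch of a real-time webcam loop, reusing clf from the training sketch above. Classifying every frame independently is the simplest approach and one plausible source of the stuttering mentioned in the takeaways; smoothing predictions over several frames is left out.

```python
import cv2
import mediapipe as mp

cap = cv2.VideoCapture(0)
with mp.solutions.hands.Hands(static_image_mode=False,
                              max_num_hands=2,
                              min_detection_confidence=0.5,
                              min_tracking_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for landmarks in results.multi_hand_landmarks:
                vector = [c for lm in landmarks.landmark
                          for c in (lm.x, lm.y, lm.z)]
                sign = clf.predict([vector])[0]  # clf: the trained tree above
                cv2.putText(frame, str(sign), (10, 30),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("sign recognition", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```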