Sign Language to Speech Challenge

We just launched the Sign Language to Speech Challenge: detect sign language in videos or camera feeds and generate human-like speech for the equivalent words. The winner will receive $500 plus a paid internship with the DeepQuestAI team.

All questions and issues related to the challenge can be posted here, including but not limited to:

  • Datasets
  • DeepStack custom model APIs
  • Speech synthesis models
  • etc.

Please, what kind of dataset are we supposed to use? I tried collecting a video dataset, but I discovered that DeepStack doesn't support video classification.


@CaptainVee At the moment, DeepStack doesn't support APIs that process a video in a single API request. What you can do in the case of a video dataset is:

  • The video sequences you have will usually come with annotations tied to specific times or frames in the video (e.g. 2 min, 5 s, or frame 12; in some cases the frame may not be specified).
  • Extract those image frames with a Python script and assign the corresponding annotation to each frame (see the first sketch after this list).
  • If your dataset is not in the YOLO format, convert it to YOLO format ( more details here ; see the second sketch after this list).
  • Verify that your annotations are correct by visualizing them with a tool like LabelImg or LabelMe.
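For the frame-extraction step, here is a minimal sketch using OpenCV. The video file name and the list of annotated timestamps are hypothetical; substitute the times from your own dataset's annotations.

```python
# Extract one image per annotated timestamp from a video with OpenCV.
import cv2

def extract_frames(video_path, timestamps_sec, out_prefix="frame"):
    """Save a .jpg for each annotated timestamp (in seconds)."""
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"Could not open {video_path}")
    for i, t in enumerate(timestamps_sec):
        # Seek to the timestamp (in milliseconds), then grab that frame.
        cap.set(cv2.CAP_PROP_POS_MSEC, t * 1000)
        ok, frame = cap.read()
        if ok:
            cv2.imwrite(f"{out_prefix}_{i:04d}.jpg", frame)
    cap.release()

# Example: annotations at 2 min 5 s and at 12 s into the clip.
extract_frames("sign_clip.mp4", [125.0, 12.0])
```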
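And for the conversion step, a sketch of turning a pixel-space bounding box (x_min, y_min, x_max, y_max, as in Pascal VOC annotations) into a YOLO-format line. YOLO expects one .txt file per image, with one line per object: a class index followed by the box centre and size, all normalised to [0, 1]. The box values and class mapping below are made-up examples.

```python
# Convert a Pascal VOC-style pixel box to a YOLO annotation line.
def voc_box_to_yolo(box, img_w, img_h, class_id):
    x_min, y_min, x_max, y_max = box
    # YOLO stores centre coordinates and width/height, normalised to [0, 1].
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# One .txt per image, one line per object, e.g. for frame_0000.jpg:
with open("frame_0000.txt", "w") as f:
    f.write(voc_box_to_yolo((120, 80, 360, 400), img_w=640, img_h=480,
                            class_id=0))  # 0 = "hello" sign (example class)
```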