Our dataset is divided into three partitions: the original frames, the computed optical flow, and the extracted skeletons. We have recently updated the dataset to include six additional dances, which are reflected below. Since the publication of our original work, we have also had to remove videos from each of the categories because they have been taken down from YouTube. The optical flow is now computed using FlowNet2, and the pose is extracted using Facebook's DensePose.
The skeletal data follows the same folder structure as the original frames, and the skeletal points are in the same resolution as their original frame. The data is stored as JSON, so parsing it with a JSON library is likely the most convenient approach (we use simplejson in Python).
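
As a rough illustration of the JSON parsing described above, here is a minimal Python sketch. It uses the standard-library `json` module rather than simplejson so that it runs without extra installs; `simplejson.load` should behave the same way here. The helper names `load_skeleton` and `iter_skeletons` are hypothetical, the `.json` file extension is an assumption, and nothing is assumed about the keys inside each file, so the parsed value is returned as-is.

```python
import json
from pathlib import Path


def load_skeleton(path):
    """Parse one skeleton file and return whatever JSON value it contains."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def iter_skeletons(root, pattern="*.json"):
    """Yield (relative_path, parsed_json) for every matching file under root.

    The "*.json" pattern is an assumption about how the skeleton files
    are named; adjust it to match the actual files.
    """
    root = Path(root)
    for path in sorted(root.rglob(pattern)):
        # The relative path mirrors the folder structure of the frames,
        # so it can be used to look up the corresponding original frame.
        yield path.relative_to(root), load_skeleton(path)
```

Because the skeleton folders mirror the frame folders, the relative path yielded by `iter_skeletons` can be joined onto the frames directory to locate the matching frame, after swapping the file extension as needed.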