YouTube-BoundingBoxes is a large-scale data set of video URLs with densely-sampled high-quality single-object bounding box annotations.
The data set consists of approximately 380,000 15-20s video segments extracted from 240,000 different publicly visible YouTube videos, automatically selected to feature objects in natural settings without editing or post-processing, with a recording quality often akin to that of a hand-held cell phone camera.
All of these video segments were human-annotated with high-precision classifications and bounding boxes at 1 frame per second.
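To get a feel for the scale those figures imply, here is a back-of-envelope sketch in Python. It uses only the numbers quoted above (about 380,000 segments, 15-20 s each, annotated at 1 frame per second). The function name `estimate_annotated_frames` is made up for illustration, and the result is a rough range, not an official count from the data set.

```python
# Back-of-envelope estimate using only the figures quoted in the description.
NUM_SEGMENTS = 380_000       # approximate number of video segments
SEGMENT_SECONDS = (15, 20)   # stated range of segment lengths, in seconds
ANNOTATION_FPS = 1           # annotations are sampled at 1 frame per second


def estimate_annotated_frames(num_segments, seconds, fps):
    """Approximate number of annotated frames (hypothetical helper)."""
    return num_segments * seconds * fps


low = estimate_annotated_frames(NUM_SEGMENTS, SEGMENT_SECONDS[0], ANNOTATION_FPS)
high = estimate_annotated_frames(NUM_SEGMENTS, SEGMENT_SECONDS[1], ANNOTATION_FPS)
print(f"Roughly {low:,} to {high:,} annotated frames")
```

Since the segment count is itself approximate, the range should be read as an order-of-magnitude estimate: several million annotated frames.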
WHY IT MATTERS: This report from Gartner reviews the top 12 buzzwords you need to master for your geek Xmas cocktail conversations ;-) As is often the case with these lists, the items are very generic and could end up being very difficult to implement. Nevertheless, they point in the right direction overall.