Demystifying Human Action Recognition in Deep Learning with Space-Time Feature Descriptors