Attempting to Aggregate Perceptual Constructs From Deep Neural Networks for Video and Audio Interaction Representation | IEEE Conference Publication | IEEE Xplore