Combining Supervised Machine Learning and Structured Knowledge for Difficult Perceptual Tasks


Learning models of visual perception lies at the heart of a number of computer vision problems, including object detection, image description, and motion tracking. A variety of models can perform such tasks, but the tasks themselves are usually assumed to share the same requirements: receive visual input, and perceive some desired content in that input. For certain tasks, however, the desired outputs are very difficult to predict from input images alone. Many perceptual tasks require not only the ability to parse the content of a visual scene, but also the ability to combine visual information with auxiliary knowledge to reach conclusions. Rather than attempting to incorporate auxiliary knowledge into the parameters of a learned model, this work presents an alternative approach.