Events
LMSS @ Cornell Tech: Kristen Grauman (UT Austin/Facebook AI Research)
Anticipating the Unseen and Unheard for Embodied Perception
Computer vision has seen major success in learning to recognize objects from massive “disembodied” Web photo collections labeled by human annotators. Yet cognitive science tells us that perception develops in the context of acting in the world, and without intensive supervision. Meanwhile, many realistic vision tasks require not only categorizing a well-composed human-taken photo, but also actively deciding where to look in the first place. In the context of these challenges, we are exploring how perception benefits from anticipating the sights and sounds an agent will experience as a function of its own actions. We introduce methods to learn predictive representations from unlabeled video accompanied by multi-modal sensory data like egomotion and sound. Using these representations, we demonstrate their impact on low-shot visual recognition and visually-guided audio source separation. Moving from passively captured video to agents that control their own first-person cameras, we investigate how agents can learn to move intelligently to acquire visual observations.
Speaker Bio
Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Scientist at Facebook AI Research. Her research in computer vision and machine learning focuses on visual recognition and search. Before joining UT Austin in 2007, she received her Ph.D. at MIT. She is a AAAI Fellow, a Sloan Fellow, and a recipient of the NSF CAREER, ONR YIP, PECASE, and PAMI Young Researcher awards, and she received the 2013 IJCAI Computers and Thought Award. She and her collaborators were recognized with best paper awards at CVPR 2008, ICCV 2011, and ACCV 2016, and with a 2017 Helmholtz Prize “test of time” award. She previously served as Program Chair of the Conference on Computer Vision and Pattern Recognition (CVPR) in 2015 and of Neural Information Processing Systems (NeurIPS) in 2018, and she currently serves as Associate Editor-in-Chief for the Transactions on Pattern Analysis and Machine Intelligence (PAMI).