Machine perception has been widely researched in the computer vision community because of its importance in many real-life applications. Recent breakthroughs brought about by deep learning have significantly increased the use of visual machine perception. In particular, hand and body pose estimation, object detection, object recognition, and pixel-wise segmentation have opened the door to many real-life applications.
Deep Learning for 3D Information
Technical advances in 3D printing, virtual reality, and augmented reality have greatly increased interest in handling three-dimensional shapes, for example in three-dimensional object synthesis and reconstruction, which have been studied in depth in the computer vision community. The emergence of neural networks and the creation of large-scale three-dimensional object datasets have inspired researchers to revisit three-dimensional object representation learning and synthesis.
Data Fusion of Multi-modal Data
Video contains not only visual information but also sound. This multi-modal information helps in understanding content and is more descriptive than visual information alone. How best to fuse multi-modal information is an open question, and it is a first step toward general artificial intelligence. Additionally, there are many different kinds of visual sensors that provide complementary information. Fusing this heterogeneous information reduces the uncertainty of the estimation model. In our lab, we research the integration of data from multiple sensors with deep learning models.
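The claim that fusing complementary sensors reduces uncertainty can be illustrated with a minimal sketch. This is not the lab's deep learning approach; it swaps in classical inverse-variance weighting, assuming independent scalar estimates with known variances. The function name `fuse_estimates` and the depth readings are hypothetical.

```python
def fuse_estimates(means, variances):
    """Fuse independent scalar estimates by inverse-variance weighting.

    Assumption: each sensor reports an unbiased estimate with a known,
    strictly positive variance, and sensor errors are independent.
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")

    # More certain sensors (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)

    fused_mean = sum(w * m for w, m in zip(weights, means)) / total_weight
    # The fused variance is never larger than the smallest input variance.
    fused_variance = 1.0 / total_weight
    return fused_mean, fused_variance


# Hypothetical depth readings in meters from two different sensors.
sensor_means = [2.0, 2.4]
sensor_variances = [0.04, 0.16]
fused_mean, fused_variance = fuse_estimates(sensor_means, sensor_variances)
print(fused_mean, fused_variance)
```

Under these assumptions the fused variance is smaller than that of either sensor alone, which is the sense in which fusion reduces uncertainty.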
Novel & Future View Synthesis
In computer vision, view synthesis has been used to apply changes in lighting and viewpoint to single-view images of rigid and non-rigid objects. In real-life applications, synthetic views can help predict the locations of unobserved parts and can improve the performance of object grasping with manipulators and of path planning in autonomous driving systems.
Large Scale Dataset Curation
Large-scale dataset curation and efficient annotation methods are crucial for deep learning, since optimizing deep neural networks requires large-scale datasets. For this reason, model performance is positively correlated with the size and quality of the dataset.
Dynamic Vision Sensor
The dynamic vision sensor is a next-generation vision camera that mimics the human eye to capture motion. Unlike a conventional camera, it reports changes at individual pixel locations as event data with microsecond resolution. It therefore has low latency and low power consumption, and it is robust to motion blur. With these advantages, it has great potential in AR/VR applications and autonomous driving.
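To make the idea of event data concrete, here is a minimal sketch of how a stream of events might be accumulated into an image-like frame. The event layout `(x, y, timestamp_us, polarity)` and the function name `accumulate_events` are assumptions for illustration, not the output format of any particular sensor.

```python
def accumulate_events(events, width, height, t_start, t_end):
    """Sum event polarities per pixel over the time window [t_start, t_end).

    Assumption: each event is a tuple (x, y, timestamp_us, polarity), where
    polarity > 0 marks a brightness increase and anything else a decrease.
    """
    frame = [[0] * width for _ in range(height)]
    for x, y, timestamp_us, polarity in events:
        inside_window = t_start <= timestamp_us < t_end
        inside_sensor = 0 <= x < width and 0 <= y < height
        if inside_window and inside_sensor:
            frame[y][x] += 1 if polarity > 0 else -1
    return frame


# Hypothetical events; timestamps are in microseconds.
events = [
    (0, 0, 10, 1),     # brightness increase at pixel (0, 0)
    (1, 0, 20, -1),    # brightness decrease at pixel (1, 0)
    (0, 0, 30, 1),     # second increase at pixel (0, 0)
    (2, 1, 5000, 1),   # falls outside the 1 ms window used below
]
frame = accumulate_events(events, width=3, height=2, t_start=0, t_end=1000)
print(frame)
```

Pixels with no brightness change produce no events and stay at zero, so only moving or changing parts of the scene generate data.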