Dissertation

Deep learning for visual understanding

Author: Y. Guo
Date: 05 October 2017
Links: Thesis in Leiden Repository

With the dramatic growth of image data on the web, there is an increasing demand for algorithms capable of understanding visual information automatically. As one of the most significant breakthroughs, deep learning has brought revolutionary success to diverse visual applications, including image classification, object detection, image segmentation and image captioning. The purpose of this thesis is to explore and design new deep learning algorithms that improve the understanding of images. To this end, it focuses on two visual applications: image classification and image captioning. Image classification aims to assign images to pre-defined categories, and helps people know what objects the images contain. Image captioning attempts to generate a sentence that describes an image. In addition to the objects, the generated sentence should also contain other elements, such as actions and relations.