Visual Relation Extraction Based on Deep Cross-media Transfer Network
Building a Deep Cross-media Transfer Network to extract visual relations, relieving the problem of insufficient training data for visual tasks.
- 2017 - 2018
- Fons Verbeek
Visual relation extraction is still a challenging problem, and the performance of existing methods usually relies on labeled data for model training. However, labeled relation datasets are very scarce. Insufficient training data is a common and severe challenge, especially for DNN-based visual relation extraction methods.
Text contains a wealth of knowledge, and humans naturally adapt knowledge from previously learned tasks to new ones. The Deep Cross-media Transfer Network aims to simulate this mechanism and relieve the problem of insufficient training data for a specific task. Transfer learning, which focuses on reducing the discrepancy between source and target domains, is widely used in DNN-based methods to relieve the problem of insufficient training data, but it mainly deals with single-media scenarios.
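One common way to quantify the domain discrepancy that transfer learning tries to reduce is the Maximum Mean Discrepancy (MMD) between feature distributions. The sketch below is illustrative only and is not taken from the project itself; the function names, the RBF kernel choice, and the bandwidth `sigma` are all assumptions.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Pairwise RBF (Gaussian) kernel between the rows of x and y.
    d2 = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2 * x @ y.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd(source, target, sigma=1.0):
    # Biased estimate of the squared Maximum Mean Discrepancy between
    # source-domain and target-domain feature samples (rows = examples).
    k_ss = rbf_kernel(source, source, sigma)
    k_tt = rbf_kernel(target, target, sigma)
    k_st = rbf_kernel(source, target, sigma)
    return k_ss.mean() + k_tt.mean() - 2 * k_st.mean()

rng = np.random.default_rng(0)
# Matched domains: both sampled from the same distribution.
same = mmd(rng.normal(0, 1, (100, 8)), rng.normal(0, 1, (100, 8)))
# Shifted target domain: mean moved away from the source.
shifted = mmd(rng.normal(0, 1, (100, 8)), rng.normal(2, 1, (100, 8)))
print(same < shifted)
```

In a transfer setting such a discrepancy term is typically added to the task loss, so that the network learns features on which source and target domains are hard to tell apart.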
The Deep Cross-media Transfer Network will exploit general relations from a source domain (usually a large-scale dataset) to relieve the data-scarcity problem in visual relation extraction.