In this chapter, we address the visual learning of automatic concept
detectors from web video as available from services like YouTube. While learning from
web video is much more efficient, flexible, and scalable than learning from expert
labels, web-based detectors perform poorly when applied to different domains (such
as specific TV channels). We address this domain change problem using a novel
approach, which – after an initial training on web content – performs a highly
efficient online adaptation on the target domain.
In quantitative experiments on data from YouTube and from the TRECVID
campaign, we first show that domain change appears to be the key problem
for web-based concept learning, with a much more significant impact than other
phenomena like label noise. Second, the proposed adaptation approach is shown to
improve the accuracy of web-based detectors significantly, even outperforming SVMs trained
on the target domain. Finally, we extend our approach with active learning such that
adaptation can be interleaved with manual annotation for an efficient exploration
of novel domains.
Keywords: content-based video retrieval, concept detection, visual learning, web video,
domain change, cross-domain, online learning, classifier adaptation, TRECVID,
YouTube