Video Content Search

One of my favorite research topics is video content search: systems and methods to retrieve particular scenes and moments from a large video archive (e.g., 1,000 hours of content). This challenging topic is highly relevant in practice – with the ever-increasing role of video in everyday life – and spans several sub-topics.

The following video shows an interactive video retrieval system that we have developed over several years for content-based search in videos. The system, called diveXplore, was used at the Video Browser Showdown in 2017 and 2018, as well as (in slightly modified form) at the Lifelog Search Challenge 2018. The video demonstrates the system's features on the TRECVID IACC.3 dataset, which consists of 600 hours of video content (around 300,000 shots):


Interactive video search (also known as exploratory video search) is a form of interactive information retrieval in video content that is not limited to the typical query-and-browse-results approach of traditional retrieval tools. Instead, exploratory video search tools are highly interactive and integrate the user into several stages of the search process through iterative human-computer interaction. They provide both visual browsing facilities (e.g., smart navigation features, content overviews, and structuring means) and query features that let users translate their mental image of the target into an automatic content retrieval query (e.g., query by filtering, query by sketch, similarity search, etc.). Exploratory video search therefore supports search situations where users cannot formulate a concrete query, or simply want to inspect a video in order to see its content.
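As one concrete illustration of such a query feature, similarity search over precomputed shot descriptors can be sketched as below. This is a minimal, hypothetical example (the 4-dimensional feature vectors and shot indices are made up for illustration and are not those used by diveXplore, which works with far higher-dimensional descriptors):

```python
import numpy as np

def similar_shots(query_vec, shot_vecs, k=3):
    """Rank shots by cosine similarity to a query descriptor."""
    q = query_vec / np.linalg.norm(query_vec)
    m = shot_vecs / np.linalg.norm(shot_vecs, axis=1, keepdims=True)
    sims = m @ q                   # cosine similarity per shot
    order = np.argsort(-sims)      # best match first
    return order[:k], sims[order[:k]]

# Toy 4-dimensional descriptors for five shots (hypothetical data).
shots = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
])
query = np.array([1.0, 0.05, 0.0, 0.0])
top, scores = similar_shots(query, shots)
```

In a real system the descriptors would come from a content analysis stage (e.g., deep features per shot), and an approximate nearest-neighbor index would replace the brute-force matrix product for large archives.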

This kind of modern video browsing supports two content search scenarios: directed and undirected search. In the first scenario, users have a clear information need and want to find a specific target segment in the video (e.g., the weather forecast in a news show); such a search is also known as known-item search or target search. In the second scenario, users have no concrete search goal but want to explore the content in order to learn or find something interesting (e.g., a violent scene in surveillance videos); such a search scenario is known as exploratory search.

Over the years, many tools for interactive video search and exploration have been proposed in the literature (see [1],[2] for surveys), and these tools have been shown to effectively help users find desired content in videos (e.g., [3],[4]). Some tools combine sophisticated content analysis methods that the user steers toward their personal needs; others offer rather simple content navigation features but give users more interactivity, letting them effectively exploit their knowledge of the content and its structure. Interestingly, tools of the latter kind have been shown to outperform tools of the former kind on some search tasks [5].

Here are the slides of our tutorial on interactive video search, presented recently at the ACM Multimedia 2018 Conference in Seoul, South Korea (on Monday, October 23, 2018).

Slides of a tutorial on Interactive Video Search, presented at the ACM International Conference on Multimedia 2015 (ACM MM’15) in Brisbane, Australia, can be found here:


Our research on video content search also includes collaboration features for interactive video retrieval, so that several users can work together to jointly retrieve information from a large video archive. As shown in the figure below, this enables fast and flexible content-based search in which

  • several users perform the same type of search but in different areas of the video collection, or
  • several users perform different types of search (e.g., one searches by semantic concept browsing, the second by color sketches/queries, and the third by motion sketches/queries) but work together to speed up the whole search process. In this way, different users follow different search paths and perform distributed facet-based search.
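The first collaboration mode – the same type of search over disjoint parts of the collection – amounts to partitioning the shot list among users. A minimal sketch, with made-up shot IDs and user count:

```python
def partition_shots(shot_ids, num_users):
    """Split a shot list into disjoint, near-equal parts, one per user."""
    parts = [[] for _ in range(num_users)]
    for i, shot in enumerate(shot_ids):
        parts[i % num_users].append(shot)  # round-robin assignment
    return parts

# Hypothetical example: 10 shots divided among 3 collaborating users.
parts = partition_shots(list(range(10)), 3)
```

Round-robin is only one possible split; partitioning by video, by time range, or by detected concept would also fit the distributed facet-based search described above.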


Collaborative Video Search


Either way, the search system of each user needs to communicate with the systems of the other users in order to perform some kind of synchronization. In our system (used at VBS 2017) this is provided through the following features:

  • a collaborative mini-map that communicates with all connected systems and shows which content is currently being inspected by each user (and which has already been inspected),
  • automatic collaborative re-ranking of retrieved results, such that already inspected videos are down-ranked while unchecked content is up-ranked, and
  • manual notifications among users, to make search colleagues aware of interesting areas in the video collection.
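The collaborative re-ranking in the list above can be sketched as a simple score adjustment: results that any user has already inspected are pushed down the list. This is an illustrative reconstruction, not the exact VBS 2017 implementation; the penalty factor and shot IDs are made-up parameters:

```python
def collaborative_rerank(results, inspected, penalty=0.5):
    """Re-rank (shot_id, score) pairs: down-rank shots that any user
    has already inspected, so unchecked content surfaces first."""
    adjusted = [
        (shot, score * penalty if shot in inspected else score)
        for shot, score in results
    ]
    return sorted(adjusted, key=lambda p: -p[1])

# Hypothetical retrieval scores and a shared set of inspected shots.
results = [("s1", 0.9), ("s2", 0.8), ("s3", 0.7)]
inspected = {"s1"}                 # synchronized across all users
ranked = collaborative_rerank(results, inspected)
```

In a live collaborative session, the `inspected` set would be the state synchronized between the connected systems, updated whenever any user views a result.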


The figure below shows how a sophisticated video retrieval tool could work together with a mobile video browsing tool (optimized for visual human inspection), so that both systems benefit from each other.


If you want to know more details about collaborative video search, here are a few papers:


ViDive Screenshot 1
ViDive Screenshot 2