|Title:||Natural language description of surveillance events|
|Keywords:||Surveillance video description; Video to text|
|Abstract:||This paper presents a novel method to represent hours of surveillance video in a pattern-based text log. We present a tag- and template-based technique that automatically generates natural language descriptions of surveillance events. We combine the outputs of existing object trackers, deep-learning-guided object and action classifiers, and graph-based scene knowledge to assign hierarchical tags and generate natural language descriptions of surveillance events. Unlike some state-of-the-art image and short-video descriptor methods, our approach can describe long videos, specifically surveillance videos, by combining frame-level, temporal-level, and behavior-level target tags/features. We evaluate our method against two baseline video descriptors, and our analysis suggests that supervised scene knowledge and templates can improve video descriptions, especially for surveillance videos. © 2019, Springer Nature Singapore Pte Ltd.|
|Appears in Collections:||Research Publications|