Conference Program


Preliminary conference program; more information on exact paper timing will be posted here within the next few days.

Tuesday, January 8th, 2019
09:30-12:45 Tutorial 1: Multimodal Deep Learning, by Prof. Xavier Giro-i-Nieto
09:30-12:45 Workshop 1: MANPU Workshop
13:45-17:00 Tutorial 2: New Trends of Simulation and Augmented Visualization in Medicine, by Prof. Lucio Tommaso De Paolis
13:45-17:00 VBS rehearsal and closed session
Wednesday, January 9th, 2019
09:00-09:20 Conference Opening
09:20-10:20 Keynote Talk 1: Prof. Daniel Gatica-Perez
10:50-12:30 Oral Session 1: Best Paper Session
10:50-11:10 Junyi Wang, Bing-Kun Bao and Changsheng Xu. Sentiment-aware Multi-modal Recommendation on Tourist Attractions
11:10-11:30 Kaijun Zhang, Chenghao Guo, Zhonghan Niu, Lufei Liu and Yubin Yang. SCOD: Dynamical Spatial Constraints for Object Detection
11:30-11:50 Guang Chen, Yuexian Zou and Can Zhang. STMP: Spatial Temporal Multi-level Proposal Network for Activity Detection
11:50-12:10 Junchao Zhang and Yuxin Peng. Hierarchical Vision-Language Alignment for Video Captioning
12:10-12:30 Alex Kupin, Benjamin Moeller, Yijun Jiang, Natasha Kholgade Banerjee and Sean Banerjee. Task-Driven Biometric Authentication of Users in Virtual Reality (VR) Environments
13:30-15:10 Oral Session 2A: 3D & VR
13:30-13:50 Lingyun Yu, Jun Yu and Qiang Ling. Deep Neural Network Based 3D Articulatory Movement Prediction Using Both Text and Audio Inputs
13:50-14:10 Kyriaki Christaki, Emmanouil Christakis, Petros Drakoulis, Alexandros Doumanoglou, Nikolaos Zioulis, Dimitrios Zarpalas and Petros Daras. Subjective Visual Quality Assessment of Immersive 3D Media Compressed by Open-Source Static 3D Mesh Codecs
14:10-14:30 Kedong Liu, Yanwei Liu, Jinxia Liu, Antonios Argyriou and Ying Ding. Joint EPC and RAN Caching of Tiled VR Videos for Mobile Networks
14:30-14:50 Adam Siekawa, Michał Chwesiuk, Radosław Mantiuk and Rafał Piórkowski. Foveated Ray Tracing for VR Headsets
14:50-15:10 Marek Wernikowski, Radoslaw Mantiuk and Rafał Piórkowski. Preferred Model of Adaptation to Dark for Virtual Reality Headsets
13:30-15:10 Oral Session 2B: Special Session 2 – MAPTA
Manuel Stein, Daniel Seebacher, Tassilo Karge, Tom Polk, Michael Grossniklaus and Daniel A. Keim. From Movement to Events: Improving Soccer Match Annotations
Lyndon Nixon, Evlampios Apostolidis, Foteini Markatopoulou, Ioannis Patras and Vasileios Mezaris. Hybrid Video Annotation for Retrieval and Discovery of Newsworthy Video in a News Verification Scenario
Björn Þór Jónsson, Snorri Gíslason and Laurent Amsaleg. Integration of Exploration and Search: A Case Study of the M^3 Model
Werner Bailer. Face Swapping for Solving Collateral Privacy Issues in Multimedia Analytics
Alan Smeaton, Yvette Graham, Kevin McGuinness, Noel O'Connor, Seán Quinn and Eric Arazo Sanchez. The Impact of Training Data Bias on Automatic Generation of Video Captions
16:00-19:00 Video Browser Showdown
Klaus Schoeffmann, Bernd Muenzer, Andreas Leibetseder, Manfred Jürgen Primus and Sabrina Kletz. Autopiloting Feature Maps: The Deep Interactive Video Exploration (diveXplore) System at VBS2019
Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Franca Debole, Fabrizio Falchi, Claudio Gennaro, Lucia Vadicamo and Claudio Vairo. VISIONE at VBS2019
Jakub Lokoc, Gregor Kovalcik, Tomáš Souček, Jaroslav Moravec, Jan Bodnar and Premysl Cech. VIRET Tool Meets NasNet
Stelios Andreadis, Anastasia Moumtzidou, Damianos Galanopoulos, Foteini Markatopoulou, Konstantinos Apostolidis, Thanassis Mavropoulos, Ilias Gialampoukidis, Stefanos Vrochidis, Vasileios Mezaris, Ioannis Kompatsiaris and Ioannis Patras. VERGE in VBS 2019
Phuong Anh Nguyen, Chong-Wah Ngo, Danny Francis and Benoit Huet. VIREO @ Video Browser Showdown 2019
Luca Rossetto, Mahnaz Amiri Parian, Ralph Gasser, Ivan Giangreco, Silvan Heller and Heiko Schuldt. Deep Learning-based Concept Detection in vitrivr
Thursday, January 10th, 2019
09:20-10:20 Keynote Talk 2: Prof. Andreas Symeonidis
10:50-12:30 Oral Session 3A: Special Session 1 – PDAL
Owen Corrigan. Fashion Police: Towards Semantic Indexing of Clothing Information In Surveillance Data
Yijun Jiang, Elim Schenck, Spencer Kranz, Sean Banerjee and Natasha Kholgade Banerjee. CNN-Based Non-Contact Detection of Food Level in Bottles from RGB Images
Zhixiang Ji, Jie Tang and Gangshan Wu. Personalized Recommendation of Photography based on Deep Learning
Xiaohua Wang, Muzi Peng, Lijuan Pan, Min Hu, Chunhua Jin and Fuji Ren. Two-level Attention with Multi-task Learning for Facial Emotion Estimation
Aaron Duane and Cathal Gurrin. User Interaction for Visual Lifelog Retrieval in a Virtual Environment
10:50-12:30 Oral Session 3B: MM Indexing and Mining
10:50-11:10 Shuhei Tsuchida, Satoru Fukayama and Masataka Goto. Query-by-Dancing: A Dance Music Retrieval System Based on Body-Motion Similarity
11:10-11:30 Xuelin Zhu, Jiuxin Cao, Shuai Xu and Bo Liu. Joint Visual-Textual Sentiment Analysis Based on Cross-modality Attention Mechanism
11:30-11:50 Chang Zhou, Lai Man Po, Mengyang Liu, Wilson Y.F. Yuen, Peter H.W. Wong, Hon-Tung Luk, Kin Wai Lau and How Kwan Cheung. Deep Hashing with Triplet Labels and Unification Binary Code Selection for Fast Image Retrieval
11:50-12:10 Martin Winter and Werner Bailer. Incremental Training for Face Recognition
12:10-12:30 Ke Sun, Zhuo Lei, Jiasong Zhu, Xianxu Hou, Bozhi Liu and Guoping Qiu. Character Prediction in TV Series via Semantic Projection Network
13:30-15:10 Oral Session 4A: Special Session 3 – MDRE
Cathal Gurrin, Klaus Schoeffmann, Hideo Joho, Bernd Munzer, Rami Albatal, Frank Hopfgartner, Liting Zhou and Duc-Tien Dang-Nguyen. A Test Collection for Interactive Lifelog Retrieval
Minh-Son Dao, Tomohiro Sato, Kota Kuribayashi and Koji Zettsu. DATASON: Challenges and Opportunities within Environment - Personal Health Archives
Theodoros Giannakopoulos and Margarita Orfanidi. Athens Urban Soundscape (ATHUS): A Dataset for Urban Soundscape Quality Recognition
Luca Rossetto, Heiko Schuldt, George Awad and Asad Butt. V3C - a Research Video Collection
13:30-15:10 Oral Session 4B: Deep Learning & Applications
13:30-13:50 Minho Park, Hak Gu Kim and Yong Man Ro. Photo-realistic Facial Emotion Synthesis using Multi-level Critic Networks with Multi-level Generative Model
13:50-14:10 Xierong Zhu, Jiawei Liu, Hongtao Xie and Zhengjun Zha. Adaptive Alignment Network for Person Re-identification
14:10-14:30 Yongchao Xu, Chaoran Cui and Cheng Shi. Visual Urban Perception with Deep Semantic-Aware Network
14:30-14:50 Zhuopeng Li and Xiaoyan Zhang. Deep Reinforcement Learning for Automatic Thumbnail Generation
14:50-15:10 Yu-Chieh Chen, Daniel Stanley Tan, Wen-Huang Cheng and Kai-Lung Hua. 3D Object Completion via Class-conditional Generative Adversarial Network
15:40-17:20 Poster Session 1: Posters
Konstantinos Apostolidis and Vasileios Mezaris. Image Aesthetics Assessment using Fully Convolutional Neural Networks
Markos Zampoglou, Fotini Markatopoulou, Gregoire Mercier, Despoina Touska, Evlampios Apostolidis, Symeon Papadopoulos, Roger Cozien, Ioannis Patras, Vasileios Mezaris and Ioannis Kompatsiaris. Detecting Tampered Videos with Multimedia Forensics and Deep Learning
Boubacar Diallo, Thierry Urruty, Pascal Bourdon and Christine Fernandez-Maloigne. Improving Robustness of Image Tampering Detection for Compression
Patrice Guyot, Thierry Malon, Geoffrey Roman-Jimenez, Sylvie Chambon, Vincent Charvillat, Alain Crouzil, André Périnou, Julien Pinquier, Florence Sédes and Christine Sénac. Audiovisual Annotation Procedure for Multi-view Field Recordings
Nan Ran, Longteng Kong, Yunhong Wang and Qingjie Liu. A Robust Multi-Athlete Tracking Algorithm by Exploiting Discriminant Features and Long-Term Dependencies
Marios Krestenitis, Georgios Orfanidis, Konstantinos Ioannidis, Konstantinos Avgerinakis, Stefanos Vrochidis and Ioannis Kompatsiaris. Early Identification of Oil Spills in Satellite Images Using Deep CNNs
Xu Cao and Katashi Nagao. Point Cloud Colorization Based on Densely Annotated 3D Shape Database
Nikolaos Bastas, Theodoros Semertzidis, Apostolos Axenopoulos and Petros Daras. evolve2vec: Learning Network Representations Using Temporal Unfolding
Dunja Vucic and Lea Skorin-Kapov. The Impact of Packet Loss and Google Congestion Control on QoE for WebRTC-based Mobile Multiparty Audiovisual Telemeetings
Can Zhang, Yuexian Zou and Guang Chen. Hierarchical Temporal Pooling for Efficient Online Action Recognition
Xianyu Wu, Xiaojie Li, Jia He, Xi Wu and Imran Mumtaz. Generative Adversarial Networks with Enhanced Symmetric Residual Units for Single Image Super-Resolution
Anastasia Ioannidou, Elisavet Chatzilari, Spiros Nikolopoulos and Yiannis Kompatsiaris. 3D ResNets for 3D Object Classification
Xin Lai, Xirong Li, Rui Qian, Dayong Ding, Jun Wu and Jieping Xu. Four Models for Automatic Recognition of Left and Right Eye in Fundus Images
Alexander Schindler and Andreas Rauber. On the unsolved problem of Shot Boundary Detection for Music Videos
Chao Liu, Dongming Yang and Yuexian Zou. Enhancing Scene Text Detection via Fused Semantic Segmentation Network with Attention
Zhipeng Wu, Hui Tian, Xuzhen Zhu, Shaoshuai Fan and Shuo Wang. Exploiting Incidence Relation Between Subgroups for Improving Clustering-Based Recommendation Model
Yirui Wu, Weigang Xu, Qinghan Yu, Jun Feng and Tong Lu. Hierarchical Bayesian Network based Incremental Model for Flood Prediction
Dan Wang, Yun Sheng and Guixu Zhang. A New Female Body Segmentation and Feature Localisation Method for Image-based Anthropometry
Ioannis Mademlis, Anastasios Tefas and Ioannis Pitas. Greedy Salient Dictionary Learning For Activity Video Summarization
Jinzhong Lin, Junbiao Pang, Li Su, Yugui Liu and Qingming Huang. Accelerating Web Topic Detection For a Large-Scale Data Set via Stochastic Poisson Deconvolution
Siming Cui, Xuanjing Shen and Yingda Lyu. Automatic Segmentation of Brain Tumor Images Based on Region Growing with Co-constraint
Nami Iino, Mayumi Shimada, Takuichi Nishimura and Masatoshi Hamanaka. A Proposal of an Annotation Method of Instrument Performance Knowledge using GTTM Time-Span Tree
Wenliang Zeng and Ji Liu. A Hierarchical Level Set Approach to for RGBD Image Matting
Wei-Ta Chu and Hao-An Chu. A Genetic Programming Approach to Integrate Multilayer CNN Features for Image Classification
Madhumita Takalkar, Haimin Zhang and Min Xu. Improving Micro-Expression Recognition Accuracy using Twofold Feature Extraction
Li Yao, Ya Lin, Chunbo Zhu and Zuolong Wang. An Effective Dual-fisheye Lens Stitching Method based on Feature Points
Xin Liu and Guoying Zhao. 3D Skeletal Gesture Recognition via Sparse Coding of Time-Warping Invariant Riemannian Trajectories
Hengtong Hu, Weijie Fu and Richang Hong. Efficient Graph based Multi-View Learning
Jesús Jorrín and Luis Buera. DANTE Speaker Recognition Module. An Efficient and Robust Automatic Speaker Searching Solution for Terrorism-related Scenarios
Friday, January 11th, 2019
09:20-10:20 Keynote Talk 3: Prof. Martha Larson
10:50-12:30 Oral Session 5A: Special Session 4 – CTA
Luis Lebron Casas and Eugenia Koblents. Video Summarization with LSTM and Deep Attention Models
Jodie Gauvain, Lori Lamel, Viet Bac Le, Julien Despres, Abdel Messaoudi, Jean-Luc Gauvain and Bianca Vieru. Challenges in Audio Processing of Terrorist-related Data
George Kalpakis, Theodora Tsikrika, Stefanos Vrochidis and Yiannis Kompatsiaris. Identifying Terrorism-related Key Actors in Multidimensional Social Networks
Alexander Schindler, Andrew Lindley, David Schreiber, Martin Boyer and Thomas Philipp. Large Scale Audio-Visual Video Analytics Platform for Forensic Investigations of Terroristic Attacks
Andrea Ciapetti, Giulia Ruggiero and Daniele Toti. A Semantic Knowledge Discovery Framework for Detecting Online Terrorist Networks
Konstantinos Gkountakos, Theodoros Semertzidis, Georgios Papadopoulos and Petros Daras. A Reliability Object Layer for Deep Hashing-based Visual Indexing
10:50-12:30 Oral Session 5B: Audio & Speech
10:50-11:10 Rui Zhang, Ruimin Hu, Gang Li and Xiaochen Wang. Spectral Tilt Estimation for Speech Intelligibility Enhancement using RNN based on All-pole Model
11:10-11:30 Dading Chong, Yuexian Zou and Wenwu Wang. Multi-Channel Convolutional Neural Networks with Multi-level Feature Fusion for Environmental Sound Classification
11:30-11:50 Hirofumi Takamori, Takayuki Nakatsuka, Satoru Fukayama, Masataka Goto and Shigeo Morishima. Audio-Based Automatic Generation of a Piano Reduction Score by Considering the Musical Structure
11:50-12:10 Alfonso Perez-Carrillo. Violin Timbre Navigator: Real-time visual feedback of violin bowing based on Audio Analysis and Machine Learning
12:10-12:30 Odette Scharenborg, Nikki van der Gouw, Martha Larson and Elena Marchiori. The Representation of Speech in Deep Neural Networks
13:30-15:10 Oral Session 6A: Special Session 5 – TCMA
Tairan Zhang, Congyan Lang and Junliang Xing. Realtime Human Segmentation in Video
Chunyang Li, Caiyan Jia, Zhineng Chen, Xiaoyan Gu and Hongyun Bao. psDirector: An Automatic Director for TV View Generation from Panoramic Soccer Video
Li Su and Pamela Cosman. No Reference Video Quality Assessment Based on Ensemble of Knowledge and Data-driven Models
Jiajie Dai and Simon Dixon. Modelling Intonation Trajectories and Understanding the Pattern of Singing Notes
13:30-15:10 Oral Session 6B: Industry Session
Nudrat Nida, Muhammad Haroon Yousaf, Aun Irtaza and Sergio Velastin. Bag of Deep Features for Instructor Activity Recognition in Lecture Room
Srijan Das, Monique Thonnat, Kaustubh Sakhalkar, Michal Koperski, Francois Bremond and Gianpiero Francesca. A New Hybrid Architecture for Human Activity Recognition from RGB-D videos
Tom Durand, Xiyan He, Ionel Pop and Lionel Robinault. Utilizing Deep Object Detector for Video Surveillance Indexing and Retrieval
Mehryar Emambakhsh, Alessandro Bay and Eduard Vazquez. Deep Recurrent Neural Network for Multi-target Filtering
Renjie Xie, Yuancheng Wang, Tian Xie, Yuhao Zhang, Li Xu, Jian Lu and Qiao Wang. Adversarial Training for Video Disentangled Representation
15:40-17:20 Poster Session 2: Posters and Demos
(VBS systems will also be demonstrated in this session)
Damianos Galanopoulos and Vasileios Mezaris. Temporal Lecture Video Fragmentation using Word Embeddings
Chaohao Lu and Yuexian Zou. Using Coarse Label Constraint for Fine-grained Visual Classification
Danny Francis, Benoit Huet and Bernard Merialdo. Gated Recurrent Capsules for Visual Word Embeddings
Yisheng Yue, Palaiahnakote Shivakumara, Yirui Wu, Liping Zhu, Tong Lu and Umapada Pal. An Automatic System for Generating Artificial Fake Character Images
Wenfeng Zhang, Zhiqiang Wei, Lei Huang, Jie Nie, Lei Lv and Guanqun Wei. Person Re-Identification Based on Pose-aware Segmentation
Chih-Wei Lin and Qilu Ding. Neuropsychiatric Disorders Identification using Convolutional Neural Network
Efstratios Kakaletsis, Maria Tzelepi, Pantelis I. Kaplanoglou, Charalampos Symeonidis, Nikos Nikolaidis, Anastasios Tefas and Ioannis Pitas. Semantic Map Annotation through UAV Video Analysis using Deep Learning Models in ROS
Minglei Yang, Yan Song, Xiangbo Shu and Jinhui Tang. Temporal Action Localization Based on Temporal Evolution Model and Multiple Instance Learning
Jia-Li Tao, Jian-Ming Zhang, Liang-Jun Wang, Xiang-Jun Shen and Zheng-Jun Zha. Near-duplicate Video Retrieval through Toeplitz Kernel Partial Least Squares
Li Hongyang, Chen Jun, Hu Ruimin, Yu Mei, Chen Huafeng and Xu Zengmin. Action Recognition Using Visual Attention with Reinforcement Learning
Junqing Yu, Aiping Lei and Yangliu Hu. Soccer Video Event Detection Based on Deep Learning
Jinna Lv and Bin Wu. Spatio-Temporal Attention Model Based on Multi-View for Social Relation Understanding
Ting Wu, Qing Xu, Yunhe Li, Yuejun Guo and Klaus Schoeffmann. Detail-Preserving Trajectory Summarization Based on Segmentation and Group-Based Filtering
Fang Wen, Zehang Lin, Zhenguo Yang and Wenyin Liu. Single-Stage Detector with Semantic Attention for Occluded Pedestrian Detection
Xian Zhong, Meng Feng, Wenxin Huang, Zheng Wang and Shin’ichi Satoh. Poses Guide Spatiotemporal Model for Vehicle Re-identification
Jui-Yuan Su, Shyi-Chyi Cheng, Pei-Hua Zhang and Jing-Min Chen. Alignment of Deep Features in 3D Models for Camera Pose Estimation
Wenzhe Wang, Bin Wu, Jinna Lv and Pilin Dai. Regular and Small Target Detection
Yannick Le Cacheux, Hervé Le Borgne and Michel Crucianu. From Classical to Generalized Zero-Shot Learning: a Simple Adaptation Process
Masayuki Tamura and Satoshi Nakamura. A Method for Enriching Video-watching Experience with Applied Effects Based on Eye Movements
Junki Saito and Satoshi Nakamura. Fontender: Interactive Japanese Text Design with Dynamic Font Fusion Method for Comics
Iacopo Vagliano, Angela Fessl, Franziska Guenther, Thomas Koehler, Vasileios Mezaris, Ahmed Saleh, Ansgar Scherp and Ilija Simic. Training Researchers with the MOVING Platform
Kyriaki Christaki, Konstantinos C. Apostolakis, Alexandros Doumanoglou, Nikolaos Zioulis, Dimitrios Zarpalas and Petros Daras. Space Wars: An AugmentedVR Game
Bernd Münzer, Andreas Leibetseder, Sabrina Kletz and Klaus Schöffmann. ECAT - Endoscopic Concept Annotation Tool
Juan Soler-Company and Leo Wanner. Automatic Classification and Linguistic Analysis of Extremist Online Material