Research
Open Set Recognition
Artificial Intelligence
Computer Vision
Data Mining
Machine Learning
Instruction Tuning
Instructional Videos
Visual Explanation
Activity Driven
Image Reconstruction
Deep Convolutional Neural Network
Global Objective
Network Flow
Variational Autoencoder
Video Question Answering
Human Interaction
Lifelong Learning
Special Session
Supplement Materials
Invariant Features
Supporting Regions
Image Classification
Large Databases
Activity Recognition
Random Forest, Random Forests
Geographic Location, Geographical Locations
Multiple Temporal Scales
Semantic Descriptions
Hierarchical Model, Hierarchical Modeling
Level Set Method
Shape Recognition
List of Publications (100)
In 2025
100
Facial affective behavior analysis with instruction tuning. Y Li, A Dao, W Bao, Z Tan, T Chen, H Liu, Y Kong European Conference on Computer Vision, 165-186, 2025.
Found on Publication Page
99
Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment. Y Chen, K Li, W Bao, D Patel, Y Kong, MR Min, DN Metaxas European Conference on Computer Vision, 193-210, 2025.
In 2024
98
SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding. Z Cheng, Y Pu, S Gong, P Kordjamshidi, Y Kong arXiv preprint arXiv:2407.05118, 2024.
97
The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative. Z Tan, C Zhao, R Moraffah, Y Li, Y Kong, T Chen, H Liu arXiv preprint arXiv:2402.14859, 2024.
In 2023
96
CSGNN: Conquering Noisy Node labels via Dynamic Class-wise Selection. Y Li, Z Tan, K Shu, Z Cao, Y Kong, H Liu arXiv preprint arXiv:2311.11473, 2023.
95
Catch Missing Details: Image reconstruction with frequency augmented variational autoencoder. X Lin, Y Li, J Hsiao, C Ho, Y Kong Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2023.
94
Latent space energy-based model for fine-grained open set recognition. W Bao, Q Yu, Y Kong arXiv preprint arXiv:2309.10711, 2023.
93
Prompting language-informed distribution for compositional zero-shot learning. W Bao, L Chen, H Huang, Y Kong arXiv preprint arXiv:2305.14428, 2023.
92
Ancestor search: Generalized open set recognition via hyperbolic side information learning. X Dengxiong, Y Kong Proceedings of the IEEE/CVF Winter Conference on Applications of Computer ..., 2023.
91
On Model Explanations with Transferable Neural Pathways. X Lin, W Bao, Q Yu, Y Kong arXiv preprint arXiv:2309.09887, 2023.
90
ATM: Action Temporality Modeling for Video Question Answering. J Chen, J Zhu, Y Kong Proceedings of the 31st ACM International Conference on Multimedia, 4886-4895, 2023.
89
Uncertainty-aware state space transformer for egocentric 3D hand trajectory forecasting. W Bao, L Chen, L Zeng, Z Li, Y Xu, J Yuan, Y Kong Proceedings of the IEEE/CVF International Conference on Computer Vision ..., 2023.
In 2022
88
A dynamic meta-learning model for time-sensitive cold-start recommendations. KP Neupane, E Zheng, Y Kong, Q Yu Proceedings of the AAAI Conference on Artificial Intelligence 36 (7), 7868-7876, 2022.
87
OpenTAL: Towards open set temporal action localization. W Bao, Q Yu, Y Kong Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2022.
86
Learning of global objective for network flow in multi-object tracking. S Li, Y Kong, H Rezatofighi Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2022.
85
GateHUB: Gated history unit with background suppression for online action detection. J Chen, G Mittal, Y Yu, Y Kong, M Chen Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2022.
84
An eye for an eye: Defending against gradient-based attacks with gradients. H Hong, Y Hong, Y Kong arXiv preprint arXiv:2202.01117, 2022.
83
Universal 3-dimensional perturbations for black-box attacks on video recognition systems. S Xie, H Wang, Y Kong, Y Hong 2022 IEEE Symposium on Security and Privacy (SP), 1390-1407, 2022.
82
Human action recognition and prediction: A survey. Y Kong, Y Fu International Journal of Computer Vision 130 (5), 1366-1401, 2022.
81
A Dynamic Meta-Learning Model for Time-Sensitive Cold-Start Recommendations. K Prasad Neupane, E Zheng, Y Kong, Q Yu arXiv e-prints, arXiv:2204.00970, 2022.
In 2021
80
DRIVE: Deep reinforced accident anticipation with visual explanation. W Bao, Q Yu, Y Kong Proceedings of the IEEE/CVF International Conference on Computer Vision ..., 2021.
79
Adversarial Memory Networks for Action Prediction. Z Tao, Y Bai, H Zhao, S Li, Y Kong, Y Fu arXiv preprint arXiv:2112.09875, 2021.
78
Gradient frequency modulation for visually explaining video understanding models. X Lin, W Bao, M Wright, Y Kong arXiv preprint arXiv:2111.01215, 2021.
77
From ensemble clustering to subspace clustering: Cluster structure encoding. Z Tao, J Li, H Fu, Y Kong, Y Fu IEEE Transactions on Neural Networks and Learning Systems 34 (5), 2670-2681, 2021.
76
Few-shot human motion prediction via learning novel motion dynamics. C Zang, M Pei, Y Kong Proceedings of the Twenty-Ninth International Conference on International ..., 2021.
75
Privacy attributes-aware message passing neural network for visual privacy attributes classification. H Hong, W Bao, Y Hong, Y Kong 2020 25th International Conference on Pattern Recognition (ICPR), 4245-4251, 2021.
74
Evidential deep learning for open set action recognition. W Bao, Q Yu, Y Kong Proceedings of the IEEE/CVF International Conference on Computer Vision ..., 2021.
73
Accurate and fast image denoising via attention guided scaling. Y Zhang, K Li, K Li, G Sun, Y Kong, Y Fu IEEE Transactions on Image Processing 30, 6255-6265, 2021.
72
Coupling Adversarial Graph Embedding for transductive zero-shot action recognition. Y Tian, Y Huang, W Xu, Y Kong Neurocomputing 452, 239-252, 2021.
71
Multiple Instance Relational Learning for Video Anomaly Detection. X Dengxiong, W Bao, Y Kong 2021 International Joint Conference on Neural Networks (IJCNN), 1-8, 2021.
70
Explainable video entailment with grounded visual evidence. J Chen, Y Kong Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021.
In 2020
69
Object-aware centroid voting for monocular 3D object detection. W Bao, Q Yu, Y Kong 2020 IEEE/RSJ international conference on intelligent robots and systems ..., 2020.
68
Group activity prediction with sequential relational anticipation model. J Chen, W Bao, Y Kong European Conference on Computer Vision, 581-597, 2020.
67
Publishing video data with indistinguishable objects. H Wang, Y Kong, Y Hong, J Vaidya Advances in database technology: proceedings. International Conference on ..., 2020.
66
Activity-driven weakly-supervised spatio-temporal grounding from untrimmed videos. J Chen, W Bao, Y Kong Proceedings of the 28th ACM International Conference on Multimedia, 3789-3797, 2020.
65
RIT-18: A novel dataset for compositional group activity understanding. J Chen, H Hao, H Hong, Y Kong Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2020.
64
Uncertainty-based traffic accident anticipation with spatio-temporal relational learning. W Bao, Q Yu, Y Kong Proceedings of the 28th ACM International Conference on Multimedia, 2682-2690, 2020.
In 2019
63
Action Recognition. Y Kong, Y Fu Deep Learning Through Sparse and Low-Rank Modeling, 183-212, 2019.
62
Aligned dynamic-preserving embedding for zero-shot action recognition. Y Tian, Y Kong, Q Ruan, G An, Y Fu IEEE Transactions on Circuits and Systems for Video Technology 30 (6), 1597-1612, 2019.
61
Semi-supervised cross-modality action recognition by latent tensor transfer learning. C Jia, Z Ding, Y Kong, Y Fu IEEE Transactions on Circuits and Systems for Video Technology 30 (9), 2801-2814, 2019.
60
Visual object tracking via multi-stream deep similarity learning networks. K Li, Y Kong, Y Fu IEEE Transactions on Image Processing 29, 3311-3320, 2019.
In 2018
59
Clustered lifelong learning via representative task selection. G Sun, Y Cong, Y Kong, X Xu 2018 IEEE International Conference on Data Mining (ICDM), 1248-1253, 2018.
58
Action prediction from videos via memorizing hard-to-predict samples. Y Kong, S Gao, B Sun, Y Fu Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018.
57
Residual dense network for image super-resolution. Y Zhang, Y Tian, Y Kong, B Zhong, Y Fu Proceedings of the IEEE conference on computer vision and pattern ..., 2018.
56
Adversarial action prediction networks. Y Kong, Z Tao, Y Fu IEEE transactions on pattern analysis and machine intelligence 42 (3), 539-553, 2018.
In 2017
55
Max-margin heterogeneous information machine for RGB-D action recognition. Y Kong, Y Fu International Journal of Computer Vision 123, 350-371, 2017.
54
Sparse subspace clustering by learning approximation l0 codes. J Li, Y Kong, Y Fu Proceedings of the AAAI Conference on Artificial Intelligence 31 (1), 2017.
53
Deep Sequential Context Networks for Action Prediction. Y Kong, Z Tao, Y Fu Computer Vision and Pattern Recognition, 2017.
52
Deeply learned view-invariant features for cross-view action recognition. Y Kong, Z Ding, J Li, Y Fu IEEE Transactions on Image Processing 26 (6), 3028-3037, 2017.
51
Multi-stream deep similarity learning networks for visual tracking. K Li, Y Kong, Y Fu IJCAI, 2017.
50
Probabilistic low-rank multitask learning. Y Kong, M Shao, K Li, Y Fu IEEE transactions on neural networks and learning systems 29 (3), 670-680, 2017.
49
Hierarchical and spatio-temporal sparse representation for human action recognition. Y Tian, Y Kong, Q Ruan, G An, Y Fu IEEE Transactions on Image Processing 27 (4), 1748-1762, 2017.
48
Deep Geo-Constrained Auto-Encoder for Non-Landmark GPS Estimation. S Jiang, Y Kong, Y Fu IEEE Transactions on Big Data 5 (2), 120-133, 2017.
47
Deep active learning through cognitive information parcels. W Zhao, Y Kong, Z Ding, Y Fu Proceedings of the 25th ACM international conference on Multimedia, 952-960, 2017.
In 2016
46
RGB-D action recognition. C Jia, Y Kong, Z Ding, Y Fu Human Activity Recognition and Prediction, 87-106, 2016.
45
Action Recognition and Human Interaction. Y Kong, Y Fu Human Activity Recognition and Prediction, 23-48, 2016.
43
Learning hierarchical 3D kernel descriptors for RGB-D action recognition. Y Kong, B Satarboroujeni, Y Fu Computer Vision and Image Understanding 144, 14-23, 2016.
41
Deep convolutional neural network with independent softmax for large scale face recognition. Y Wu, J Li, Y Kong, Y Fu Proceedings of the 24th ACM international conference on Multimedia, 1063-1067, 2016.
40
Learning fast low-rank projection for image classification. J Li, Y Kong, H Zhao, J Yang, Y Fu IEEE Transactions on Image Processing 25 (10), 4803-4814, 2016.
39
Discriminative relational representation learning for RGB-D action recognition. Y Kong, Y Fu IEEE Transactions on Image Processing 25 (6), 2856-2865, 2016.
38
Efficient image geotagging using large databases. D Kit, Y Kong, Y Fu IEEE Transactions on Big Data 2 (4), 325-338, 2016.
In 2015
37
Modeling supporting regions for close human interaction recognition. Y Kong, Y Fu Computer Vision-ECCV 2014 Workshops: Zurich, Switzerland, September 6-7 and ..., 2015.
36
Bilinear heterogeneous information machine for RGB-D action recognition. Y Kong, Y Fu Proceedings of the IEEE conference on computer vision and pattern ..., 2015.
35
Hierarchical 3D kernel descriptors for action recognition using depth sequences. Y Kong, B Satarboroujeni, Y Fu 2015 11th IEEE international conference and workshops on automatic face and ..., 2015.
34
Max-margin action prediction machine. Y Kong, Y Fu IEEE transactions on pattern analysis and machine intelligence 38 (9), 1844-1858, 2015.
33
Close human interaction recognition using patch-aware models. Y Kong, Y Fu IEEE Transactions on Image Processing 25 (1), 167-178, 2015.
In 2014
32
A discriminative model with multiple temporal scales for action prediction. Y Kong, D Kit, Y Fu Computer Vision-ECCV 2014: 13th European Conference, Zurich, Switzerland ..., 2014.
31
Learning a discriminative mid-level feature for action recognition. CW Liu, MT Pei, XX Wu, Y Kong, YD Jia Science China Information Sciences 57, 1-13, 2014.
30
Interactive Phrases: Semantic Descriptions for Human Interaction Recognition. Y Kong, Y Jia, Y Fu IEEE Transactions on Pattern Analysis and Machine Intelligence, 1, 2014.
29
Latent tensor transfer learning for RGB-D action recognition. C Jia, Y Kong, Z Ding, YR Fu Proceedings of the 22nd ACM international conference on Multimedia, 87-96, 2014.
28
Recognising human interaction from videos by a discriminative model. Y Kong, W Liang, Z Dong, Y Jia IET Computer vision 8 (4), 277-286, 2014.
27
LASOM: Location Aware Self-Organizing Map for Discovering Similar and Unique Visual Features of Geographical Locations. D Kit, Y Kong, Y Fu.
In 2013
26
Activity recognition by learning structural and pairwise mid-level features using random forest. J Hu, Y Kong, Y Fu 2013 10th IEEE International Conference and Workshops on Automatic Face and ..., 2013.
In 2012
25
Contour-HOG: A Stub Feature based Level Set Method for Learning Object Contour. Z Yang, Y Kong, Y Fu BMVC, 1-11, 2012.
24
Action recognition with discriminative mid-level features. C Liu, Y Kong, X Wu, Y Jia Proceedings of the 21st International Conference on Pattern Recognition ..., 2012.
23
Decomposed contour prior for shape recognition. Z Yang, Y Kong, Y Fu Proceedings of the 21st International Conference on Pattern Recognition ..., 2012.
22
Learning human interaction by interactive phrases. Y Kong, Y Jia, Y Fu European Conference on Computer Vision, 300-313, 2012.
21
A hierarchical model for human interaction recognition. Y Kong, Y Jia 2012 IEEE International Conference on Multimedia and Expo, 1-6, 2012.
In 2011
20
Adaptive learning codebook for action recognition. Y Kong, X Zhang, W Hu, Y Jia Pattern Recognition Letters 32 (8), 1178-1186, 2011.
19
Recognizing human interaction by multiple features. Z Dong, Y Kong, C Liu, H Li, Y Jia The First Asian Conference on Pattern Recognition, 77-81, 2011.
In 2010
18
Learning human actions with an adaptive codebook. Y Kong, X Zhang, W Hu, Y Jia 2010 16th International Conference on Virtual Systems and Multimedia, 13-20, 2010.
17
A swarm intelligence based searching strategy for articulated 3D human body tracking. X Zhang, W Hu, X Wang, Y Kong, N Xie, H Wang, H Ling, S Maybank 2010 IEEE Computer Society Conference on Computer Vision and Pattern ..., 2010.
16
Compact visual codebook for action recognition. Q Wei, X Zhang, Y Kong, W Hu, H Ling 2010 IEEE International Conference on Image Processing, 3805-3808, 2010.
In 2009
15
Group action recognition using space-time interest points. Q Wei, X Zhang, Y Kong, W Hu, H Ling Advances in Visual Computing: 5th International Symposium, ISVC 2009, Las ..., 2009.
14
Learning group activity in soccer videos from local motion. Y Kong, W Hu, X Zhang, H Wang, Y Jia Asian Conference on Computer Vision, 103-112, 2009.
In 2008
13
Group action recognition in soccer videos. Y Kong, X Zhang, Q Wei, W Hu, Y Jia 2008 19th International Conference on Pattern Recognition, 1-4, 2008.
Unspecified
10
Supplemental Material Max-Margin Heterogeneous Information Machine for RGB-D Action Recognition. Y Kong, Y Fu.
9
Special Session 1: Human Activity Recognition in Smart Environment. G Jeong, HS Yang, Y Kong, X Zhang, W Hu, Y Jia, MS Ryoo, J Joung, ... .
7
Supplement Materials for Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder. X Lin, Y Li, J Hsiao, C Ho, Y Kong.
6
Moamar Sayed-Mouchaweh, High National Eng School of Mines of Douai, France Yun Raymond Fu, Northeastern University John Anderson, University of Manitoba Plamen Angelov .... G Batista, R Bayindir, F Luo, N Bouguila, L Liao, M Shao, A Dourado, ... .
5
Deep Reinforced Accident Anticipation with Visual Explanation Supplementary Materials. W Bao, Q Yu, Y Kong.
4
Evidential Deep Learning for Open Set Action Recognition Supplementary Materials. W Bao, Q Yu, Y Kong.
3
OpenTAL: Towards Open Set Temporal Action Localization Supplementary Material. W Bao, Q Yu, Y Kong.
2
Supplementary: GateHUB: Gated History Unit with Background Suppression for Online Action Detection. J Chen, G Mittal, Y Yu, Y Kong, M Chen.
1
Call for Papers on Blockchain in Healthcare. H Kolivand, Y Kong, A Bagula, P Kieseberg, B Balamurugan.