Yu Kong

Title
Assistant Professor
Department
PhD Program in Computing and Information Sciences
Institution
Rochester Institute of Technology

Education

  • Beijing Institute of Technology

Research Interests

  • Open Set Recognition
  • Artificial Intelligence
  • Computer Vision

List of Publications (100)
In 2025
100

Facial affective behavior analysis with instruction tuning. Y Li, A Dao, W Bao, Z Tan, T Chen, H Liu, Y Kong European Conference on Computer Vision, 165-186, 2025.

99

Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment. Y Chen, K Li, W Bao, D Patel, Y Kong, MR Min, DN Metaxas European Conference on Computer Vision, 193-210, 2025.

In 2024
98

SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding. Z Cheng, Y Pu, S Gong, P Kordjamshidi, Y Kong arXiv preprint arXiv:2407.05118, 2024.

97

The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative. Z Tan, C Zhao, R Moraffah, Y Li, Y Kong, T Chen, H Liu arXiv preprint arXiv:2402.14859, 2024.

In 2023
96

CSGNN: Conquering Noisy Node labels via Dynamic Class-wise Selection. Y Li, Z Tan, K Shu, Z Cao, Y Kong, H Liu arXiv preprint arXiv:2311.11473, 2023.

95

Catch Missing Details: Image reconstruction with frequency augmented variational autoencoder. X Lin, Y Li, J Hsiao, C Ho, Y Kong Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2023.

94

Latent space energy-based model for fine-grained open set recognition. W Bao, Q Yu, Y Kong arXiv preprint arXiv:2309.10711, 2023.

93

Prompting language-informed distribution for compositional zero-shot learning. W Bao, L Chen, H Huang, Y Kong arXiv preprint arXiv:2305.14428, 2023.

92

Ancestor search: Generalized open set recognition via hyperbolic side information learning. X Dengxiong, Y Kong Proceedings of the IEEE/CVF Winter Conference on Applications of Computer ..., 2023.

91

On Model Explanations with Transferable Neural Pathways. X Lin, W Bao, Q Yu, Y Kong arXiv preprint arXiv:2309.09887, 2023.

90

ATM: Action Temporality Modeling for Video Question Answering. J Chen, J Zhu, Y Kong Proceedings of the 31st ACM International Conference on Multimedia, 4886-4895, 2023.

89

Uncertainty-aware state space transformer for egocentric 3D hand trajectory forecasting. W Bao, L Chen, L Zeng, Z Li, Y Xu, J Yuan, Y Kong Proceedings of the IEEE/CVF International Conference on Computer Vision ..., 2023.

In 2022
88

A dynamic meta-learning model for time-sensitive cold-start recommendations. KP Neupane, E Zheng, Y Kong, Q Yu Proceedings of the AAAI Conference on Artificial Intelligence 36 (7), 7868-7876, 2022.

87

OpenTAL: Towards open set temporal action localization. W Bao, Q Yu, Y Kong Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2022.

86

Learning of global objective for network flow in multi-object tracking. S Li, Y Kong, H Rezatofighi Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2022.

85

GateHUB: Gated history unit with background suppression for online action detection. J Chen, G Mittal, Y Yu, Y Kong, M Chen Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2022.

84

An eye for an eye: Defending against gradient-based attacks with gradients. H Hong, Y Hong, Y Kong arXiv preprint arXiv:2202.01117, 2022.

83

Universal 3-dimensional perturbations for black-box attacks on video recognition systems. S Xie, H Wang, Y Kong, Y Hong 2022 IEEE Symposium on Security and Privacy (SP), 1390-1407, 2022.

82

Human action recognition and prediction: A survey. Y Kong, Y Fu International Journal of Computer Vision 130 (5), 1366-1401, 2022.

81

A Dynamic Meta-Learning Model for Time-Sensitive Cold-Start Recommendations. K Prasad Neupane, E Zheng, Y Kong, Q Yu arXiv e-prints, arXiv: 2204.00970, 2022.

In 2021
80

DRIVE: Deep reinforced accident anticipation with visual explanation. W Bao, Q Yu, Y Kong Proceedings of the IEEE/CVF International Conference on Computer Vision ..., 2021.

79

Adversarial Memory Networks for Action Prediction. Z Tao, Y Bai, H Zhao, S Li, Y Kong, Y Fu arXiv preprint arXiv:2112.09875, 2021.

78

Gradient frequency modulation for visually explaining video understanding models. X Lin, W Bao, M Wright, Y Kong arXiv preprint arXiv:2111.01215, 2021.

77

From ensemble clustering to subspace clustering: Cluster structure encoding. Z Tao, J Li, H Fu, Y Kong, Y Fu IEEE Transactions on Neural Networks and Learning Systems 34 (5), 2670-2681, 2021.

76

Few-shot human motion prediction via learning novel motion dynamics. C Zang, M Pei, Y Kong Proceedings of the Twenty-Ninth International Conference on International ..., 2021.

75

Privacy attributes-aware message passing neural network for visual privacy attributes classification. H Hong, W Bao, Y Hong, Y Kong 2020 25th International Conference on Pattern Recognition (ICPR), 4245-4251, 2021.

74

Evidential deep learning for open set action recognition. W Bao, Q Yu, Y Kong Proceedings of the IEEE/CVF International Conference on Computer Vision ..., 2021.

73

Accurate and fast image denoising via attention guided scaling. Y Zhang, K Li, K Li, G Sun, Y Kong, Y Fu IEEE Transactions on Image Processing 30, 6255-6265, 2021.

72

Coupling Adversarial Graph Embedding for transductive zero-shot action recognition. Y Tian, Y Huang, W Xu, Y Kong Neurocomputing 452, 239-252, 2021.

71

Multiple Instance Relational Learning for Video Anomaly Detection. X Dengxiong, W Bao, Y Kong 2021 International Joint Conference on Neural Networks (IJCNN), 1-8, 2021.

70

Explainable video entailment with grounded visual evidence. J Chen, Y Kong Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021.

In 2020
69

Object-aware centroid voting for monocular 3D object detection. W Bao, Q Yu, Y Kong 2020 IEEE/RSJ international conference on intelligent robots and systems ..., 2020.

68

Group activity prediction with sequential relational anticipation model. J Chen, W Bao, Y Kong European Conference on Computer Vision, 581-597, 2020.

67

Publishing video data with indistinguishable objects. H Wang, Y Kong, Y Hong, J Vaidya Advances in database technology: proceedings. International Conference on ..., 2020.

66

Activity-driven weakly-supervised spatio-temporal grounding from untrimmed videos. J Chen, W Bao, Y Kong Proceedings of the 28th ACM International Conference on Multimedia, 3789-3797, 2020.

65

RIT-18: A novel dataset for compositional group activity understanding. J Chen, H Hao, H Hong, Y Kong Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern ..., 2020.

64

Uncertainty-based traffic accident anticipation with spatio-temporal relational learning. W Bao, Q Yu, Y Kong Proceedings of the 28th ACM International Conference on Multimedia, 2682-2690, 2020.

In 2019
63

Action Recognition. Y Kong, Y Fu Deep Learning Through Sparse and Low-Rank Modeling, 183-212, 2019.

62

Aligned dynamic-preserving embedding for zero-shot action recognition. Y Tian, Y Kong, Q Ruan, G An, Y Fu IEEE Transactions on Circuits and Systems for Video Technology 30 (6), 1597-1612, 2019.

61

Semi-supervised cross-modality action recognition by latent tensor transfer learning. C Jia, Z Ding, Y Kong, Y Fu IEEE Transactions on Circuits and Systems for Video Technology 30 (9), 2801-2814, 2019.

60

Visual object tracking via multi-stream deep similarity learning networks. K Li, Y Kong, Y Fu IEEE Transactions on Image Processing 29, 3311-3320, 2019.

In 2018
59

Clustered lifelong learning via representative task selection. G Sun, Y Cong, Y Kong, X Xu 2018 IEEE International Conference on Data Mining (ICDM), 1248-1253, 2018.

58

Action prediction from videos via memorizing hard-to-predict samples. Y Kong, S Gao, B Sun, Y Fu Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018.

57

Residual dense network for image super-resolution. Y Zhang, Y Tian, Y Kong, B Zhong, Y Fu Proceedings of the IEEE conference on computer vision and pattern ..., 2018.

56

Adversarial action prediction networks. Y Kong, Z Tao, Y Fu IEEE transactions on pattern analysis and machine intelligence 42 (3), 539-553, 2018.

In 2017
55

Max-margin heterogeneous information machine for RGB-D action recognition. Y Kong, Y Fu International Journal of Computer Vision 123, 350-371, 2017.

54

Sparse subspace clustering by learning approximation l0 codes. J Li, Y Kong, Y Fu Proceedings of the AAAI Conference on Artificial Intelligence 31 (1), 2017.

53

Deep Sequential Context Networks for Action Prediction. Y Kong, Z Tao, Y Fu Computer Vision and Pattern Recognition, 2017.

52

Deeply learned view-invariant features for cross-view action recognition. Y Kong, Z Ding, J Li, Y Fu IEEE Transactions on Image Processing 26 (6), 3028-3037, 2017.

51

Multi-stream deep similarity learning networks for visual tracking. K Li, Y Kong, Y Fu IJCAI, 2017.

50

Probabilistic low-rank multitask learning. Y Kong, M Shao, K Li, Y Fu IEEE transactions on neural networks and learning systems 29 (3), 670-680, 2017.

49

Hierarchical and spatio-temporal sparse representation for human action recognition. Y Tian, Y Kong, Q Ruan, G An, Y Fu IEEE Transactions on Image Processing 27 (4), 1748-1762, 2017.

48

Deep Geo-Constrained Auto-Encoder for Non-Landmark GPS Estimation. S Jiang, Y Kong, Y Fu IEEE Transactions on Big Data 5 (2), 120-133, 2017.

47

Deep active learning through cognitive information parcels. W Zhao, Y Kong, Z Ding, Y Fu Proceedings of the 25th ACM international conference on Multimedia, 952-960, 2017.

In 2016
46

RGB-D action recognition. C Jia, Y Kong, Z Ding, Y Fu Human Activity Recognition and Prediction, 87-106, 2016.

45

Action Recognition and Human Interaction. Y Kong, Y Fu Human Activity Recognition and Prediction, 23-48, 2016.

44

Activity Prediction. Y Kong, Y Fu Human Activity Recognition and Prediction, 107-122, 2016.

43

Learning hierarchical 3D kernel descriptors for RGB-D action recognition. Y Kong, B Satarboroujeni, Y Fu Computer Vision and Image Understanding 144, 14-23, 2016.

42

Introduction. Y Kong, Y Fu Human Activity Recognition and Prediction, 1-22, 2016.

41

Deep convolutional neural network with independent softmax for large scale face recognition. Y Wu, J Li, Y Kong, Y Fu Proceedings of the 24th ACM international conference on Multimedia, 1063-1067, 2016.

40

Learning fast low-rank projection for image classification. J Li, Y Kong, H Zhao, J Yang, Y Fu IEEE Transactions on Image Processing 25 (10), 4803-4814, 2016.

39

Discriminative relational representation learning for RGB-D action recognition. Y Kong, Y Fu IEEE Transactions on Image Processing 25 (6), 2856-2865, 2016.

38

Efficient image geotagging using large databases. D Kit, Y Kong, Y Fu IEEE Transactions on Big Data 2 (4), 325-338, 2016.

In 2015
37

Modeling supporting regions for close human interaction recognition. Y Kong, Y Fu Computer Vision-ECCV 2014 Workshops: Zurich, Switzerland, September 6-7 and ..., 2015.

36

Bilinear heterogeneous information machine for RGB-D action recognition. Y Kong, Y Fu Proceedings of the IEEE conference on computer vision and pattern ..., 2015.

35

Hierarchical 3D kernel descriptors for action recognition using depth sequences. Y Kong, B Satarboroujeni, Y Fu 2015 11th IEEE international conference and workshops on automatic face and ..., 2015.

34

Max-margin action prediction machine. Y Kong, Y Fu IEEE transactions on pattern analysis and machine intelligence 38 (9), 1844-1858, 2015.

33

Close human interaction recognition using patch-aware models. Y Kong, Y Fu IEEE Transactions on Image Processing 25 (1), 167-178, 2015.

In 2014
32

A discriminative model with multiple temporal scales for action prediction. Y Kong, D Kit, Y Fu Computer Vision-ECCV 2014: 13th European Conference, Zurich, Switzerland ..., 2014.

31

Learning a discriminative mid-level feature for action recognition. CW Liu, MT Pei, XX Wu, Y Kong, YD Jia Science China Information Sciences 57, 1-13, 2014.

30

Interactive Phrases: Semantic Descriptions for Human Interaction Recognition. Y Kong, Y Jia, Y Fu IEEE Transactions on Pattern Analysis and Machine Intelligence, 1, 2014.

29

Latent tensor transfer learning for RGB-D action recognition. C Jia, Y Kong, Z Ding, YR Fu Proceedings of the 22nd ACM international conference on Multimedia, 87-96, 2014.

28

Recognising human interaction from videos by a discriminative model. Y Kong, W Liang, Z Dong, Y Jia IET Computer vision 8 (4), 277-286, 2014.

27

LASOM: Location Aware Self-Organizing Map for Discovering Similar and Unique Visual Features of Geographical Locations. D Kit, Y Kong, Y Fu.

In 2013
26

Activity recognition by learning structural and pairwise mid-level features using random forest. J Hu, Y Kong, Y Fu 2013 10th IEEE International Conference and Workshops on Automatic Face and ..., 2013.

In 2012
25

Contour-HOG: A Stub Feature based Level Set Method for Learning Object Contour. Z Yang, Y Kong, Y Fu BMVC, 1-11, 2012.

24

Action recognition with discriminative mid-level features. C Liu, Y Kong, X Wu, Y Jia Proceedings of the 21st International Conference on Pattern Recognition ..., 2012.

23

Decomposed contour prior for shape recognition. Z Yang, Y Kong, Y Fu Proceedings of the 21st International Conference on Pattern Recognition ..., 2012.

22

Learning human interaction by interactive phrases. Y Kong, Y Jia, Y Fu European Conference on Computer Vision, 300-313, 2012.

21

A hierarchical model for human interaction recognition. Y Kong, Y Jia 2012 IEEE International Conference on Multimedia and Expo, 1-6, 2012.

In 2011
20

Adaptive learning codebook for action recognition. Y Kong, X Zhang, W Hu, Y Jia Pattern Recognition Letters 32 (8), 1178-1186, 2011.

19

Recognizing human interaction by multiple features. Z Dong, Y Kong, C Liu, H Li, Y Jia The First Asian Conference on Pattern Recognition, 77-81, 2011.

In 2010
18

Learning human actions with an adaptive codebook. Y Kong, X Zhang, W Hu, Y Jia 2010 16th International Conference on Virtual Systems and Multimedia, 13-20, 2010.

17

A swarm intelligence based searching strategy for articulated 3D human body tracking. X Zhang, W Hu, X Wang, Y Kong, N Xie, H Wang, H Ling, S Maybank 2010 IEEE Computer Society Conference on Computer Vision and Pattern ..., 2010.

16

Compact visual codebook for action recognition. Q Wei, X Zhang, Y Kong, W Hu, H Ling 2010 IEEE International Conference on Image Processing, 3805-3808, 2010.

In 2009
15

Group action recognition using space-time interest points. Q Wei, X Zhang, Y Kong, W Hu, H Ling Advances in Visual Computing: 5th International Symposium, ISVC 2009, Las ..., 2009.

14

Learning group activity in soccer videos from local motion. Y Kong, W Hu, X Zhang, H Wang, Y Jia Asian Conference on Computer Vision, 103-112, 2009.

In 2008
13

Group action recognition in soccer videos. Y Kong, X Zhang, Q Wei, W Hu, Y Jia 2008 19th International Conference on Pattern Recognition, 1-4, 2008.

Unspecified
12

Conference Proceeding. Y Kong, Y Fu.

11

Recognizing Human Interaction by Multiple. Z Dong, Y Kong, C Liu, Y Jia.

10

Supplemental Material: Max-Margin Heterogeneous Information Machine for RGB-D Action Recognition. Y Kong, Y Fu.

9

Special Session 1: Human Activity Recognition in Smart Environment. G Jeong, HS Yang, Y Kong, X Zhang, W Hu, Y Jia, MS Ryoo, J Joung, ...

8

ICME 2012. Y Kong, Y Jia.

7

Supplement Materials for Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder. X Lin, Y Li, J Hsiao, C Ho, Y Kong.

6

Moamar Sayed-Mouchaweh, High National Eng School of Mines of Douai, France; Yun Raymond Fu, Northeastern University; John Anderson, University of Manitoba; Plamen Angelov .... G Batista, R Bayindir, F Luo, N Bouguila, L Liao, M Shao, A Dourado, ...

5

Deep Reinforced Accident Anticipation with Visual Explanation Supplementary Materials. W Bao, Q Yu, Y Kong.

4

Evidential Deep Learning for Open Set Action Recognition Supplementary Materials. W Bao, Q Yu, Y Kong.

3

OpenTAL: Towards Open Set Temporal Action Localization Supplementary Material. W Bao, Q Yu, Y Kong.

2

Supplementary: GateHUB: Gated History Unit with Background Suppression for Online Action Detection. J Chen, G Mittal, Y Yu, Y Kong, M Chen.

1

Call for Papers on Blockchain in Healthcare. H Kolivand, Y Kong, A Bagula, P Kieseberg, B Balamurugan.
