GitHub - LimGeunTaekk/LimGeunTaekk

My research focuses on multimodal AI, with a particular emphasis on video-centric AI modeling. I aim to empower AI systems to help humans interpret complex video content, facilitating higher-level reasoning in applications such as sports analytics, surveillance, and media content. I am deeply interested in exploring the following three key areas for a comprehensive understanding of videos:
- Efficient Video Representation: The high computational demands of video data necessitate the use of efficient techniques like keyframe selection and tokenization.
- Perception and Reasoning in Videos: Understanding temporal information in video, such as frame continuity, causality, and diversity, remains a challenging problem.
- Multimodal Learning: Audio, when encoded in video streams, introduces semantic information that complements, but is distinct from, visual semantics.

G Lim, H Kim, J Kim, Y Choi, Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization. July 2024, ACM MM. [paper][code]
W Jo, G Lim, G Lee, H Kim, B Ko, Y Choi, VVS: Video-to-Video Retrieval with Irrelevant Frame Suppression. February 2024, AAAI. [paper][code]
W Jo, G Lim, Y Hwang, G Lee, J Kim, J Yun, J Jung, Y Choi, Simultaneous Video Retrieval and Alignment. March 2023, IEEE Access. [paper]
J Kim, W Jo, G Lim, J Yun, S Kwak, S Jung, W Cheong, H Choo, J Seo, Y Choi, Compression Method for MPEG CDVA Global Feature Descriptors. May 2022, Journal of Broadcast Engineering. [paper]
W Jo, G Lim, J Kim, J Yun, Y Choi, Exploring the Temporal Cues to Enhance Video Retrieval on Standardized CDVA. April 2022, IEEE Access. [paper][code]
