My research interests are at the intersection of security, privacy, and AI. Specifically, most of our research aims to improve the trustworthiness and safety of generative AI systems and to investigate how AI can strengthen security practice, for example in vulnerability discovery and in static and dynamic analysis. My current primary research interests are as follows.
1. AI Security and Privacy.
The rapid advancement of large models has transformed the AI landscape, leading to a wide array of applications in real-world scenarios. However, this progress has also revealed significant security and privacy concerns, including unintended system behaviors, privacy breaches, and the spread of harmful information. Our objective is to investigate these security vulnerabilities and develop defenses that make AI systems trustworthy and responsible. Recently, we have focused on LLM-based and VLM-based agents, which are increasingly deployed across diverse applications, including safety-critical domains. We examine their vulnerabilities and explore defenses against threats such as jailbreaking and prompt injection.
• Jailbreak/prompt injection attacks and defenses
• Data/model extraction
• Safety alignment, red-team testing
• Large model memorization, machine unlearning
• Intellectual Property (IP) Protection of Models and Datasets
2. AI Safety.
The emergence of AIGC (Artificial Intelligence Generated Content) has revolutionized the content creation landscape, enabling a multitude of applications across various industries. However, this rapid evolution has introduced critical challenges, including issues of content authenticity, ethical concerns, and the risk of generating misleading information, especially through DeepFake technology. DeepFake techniques can create hyper-realistic audio and video content, which raises significant risks such as misinformation and malicious use. Our objective is to explore these challenges and develop frameworks for responsible AI-generated content. We focus on advanced generative models, assessing their impact on content quality and investigating the biases and misinformation they can introduce. As AIGC, including DeepFakes, becomes more prevalent in marketing, journalism, and entertainment, addressing these concerns is vital for fostering trust and accountability in AI-driven content creation.
• DeepFake Passive Detection and Proactive Defense
• DeepFake Evasion Attack
• NSFW Attacks and Defenses
• Multimodal Learning, Vision-language Models
• Diffusion Models, Text2Image Generation, Text2Video Generation
3. AI for Security.
The rapid evolution of large language models (LLMs) has significantly transformed the cybersecurity landscape, creating new opportunities for dynamic and static analysis, as well as vulnerability discovery and exploitation. Our goal is to expand the use of LLMs in cybersecurity by developing intelligent frameworks for automated vulnerability detection and exploitation. Currently, we are enhancing dynamic analysis through advanced protocol fuzz testing, where LLMs generate diverse input data to simulate attack scenarios and uncover vulnerabilities. LLMs also facilitate comprehensive code audits by automatically identifying security flaws such as buffer overflows and injection vulnerabilities. Additionally, LLMs can generate exploit code for identified vulnerabilities, helping security researchers understand these weaknesses and evaluate the robustness of defenses. As these AI tools are increasingly adopted in safety-critical domains, it is crucial to address their limitations and improve their reliability to advance cybersecurity.
• LLM-assisted Program Analysis
• Vulnerability Discovery and Exploitation
[2023.1-2025.12] PI, The National Natural Science Foundation of China. Research on the Key Technologies of DeepFake Video Forensics for Key Figures in Open World Settings
[2021.12-2024.11] Co-PI, The National Key Research and Development Program of China. The Defense and Evaluation Technology for AI Security
[2021.10-2023.09] PI, Natural Science Foundation of Hubei Province. Research on DeepFake Video Passive Detection and Proactive Defense
[2021.07-2023.06] PI, The Fellowship of China National Postdoctoral Program for Innovative Talents. Research on DeepFake Video Detection and Provenance
[2021.05-2021.10] Co-PI, The Equipment Development Department of the Central Military Commission. Research on Deep Learning based DeepFake Video Detection
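FakeSpotter identifies AI-synthesized fake faces by monitoring neuron behaviors. Studies of neuron coverage and neuron interactions have shown that they can serve as testing criteria for deep learning systems, especially under exposure to adversarial attacks. Here, we conjecture that monitoring neuron behavior can also serve as an asset in detecting fake faces, since layer-by-layer neuron activation patterns may capture subtler features that are important for a fake-face detector. Experiments on detecting four types of fake faces synthesized with state-of-the-art GANs, and on resisting four perturbation attacks, show the effectiveness and robustness of our approach.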
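The idea of turning layer-by-layer activation patterns into detector features can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy ReLU network, the function names, and the choice of the mean training activation as a per-layer threshold are hypothetical and are not taken from the FakeSpotter implementation.

```python
import numpy as np

def layer_activations(x, weights):
    """Forward a batch through a ReLU MLP and return each layer's activations."""
    acts = []
    h = x
    for w in weights:
        h = np.maximum(h @ w, 0.0)  # ReLU layer without bias, for brevity
        acts.append(h)
    return acts

def layer_thresholds(train_x, weights):
    """Per-layer threshold: mean activation over all neurons and training inputs."""
    return [a.mean() for a in layer_activations(train_x, weights)]

def activation_pattern_features(x, weights, thresholds):
    """For each input, count the neurons in each layer that fire above the threshold."""
    acts = layer_activations(x, weights)
    counts = [(a > t).sum(axis=1) for a, t in zip(acts, thresholds)]
    return np.stack(counts, axis=1)  # shape: (n_inputs, n_layers)

# Toy stand-in network and data (hypothetical; a real system would use a face model).
rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 32)), rng.normal(size=(32, 8))]
train_x = rng.normal(size=(100, 16))
thresholds = layer_thresholds(train_x, weights)
features = activation_pattern_features(rng.normal(size=(5, 16)), weights, thresholds)
print(features.shape)  # (5, 2)
```

In a sketch like this, the per-layer counts would be passed to a simple binary classifier in place of the raw image, which is the role the text assigns to neuron activation patterns.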
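DeepSonar monitors the neuron behaviors of a speaker recognition (SR) system, i.e., a deep neural network (DNN), to discern AI-synthesized fake voices. Layer-wise neuron behaviors provide important insight for carefully capturing the differences among inputs, and they are widely used in building safe, robust, and interpretable DNNs. In this work, we leverage the power of layer-wise neuron activation patterns, conjecturing that they can capture the subtle differences between real and AI-synthesized fake voices, so that classifiers operate on these patterns rather than on raw inputs. Experiments on three datasets (including commercial products from Google, Baidu, etc.) containing both English and Chinese corroborate the high detection rate (98.1% average accuracy) and low false-alarm rate (about 2% error rate) of DeepSonar in discerning fake voices. Furthermore, extensive experimental results also demonstrate its robustness against manipulation attacks (e.g., voice conversion and added real-world noise).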
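In this work, we investigate the architecture of existing GAN-based face manipulation methods and observe that the imperfection of their upsampling methods can serve as an important asset for detecting GAN-synthesized fake images and localizing forgeries. Based on this observation, we propose a novel approach, termed FakeLocator, that obtains high localization accuracy, at full resolution, on manipulated facial images. To the best of our knowledge, this is the first attempt to solve the GAN-based fake localization problem with a gray-scale fakeness map, which preserves more information about fake regions. To improve the universality of FakeLocator across various facial attributes, we introduce an attention mechanism to guide the training of the model. To improve the universality of FakeLocator across different DeepFake methods, we propose partial data augmentation and single-sample clustering on the training images. Experimental results on the popular FaceForensics++ and DFFD datasets and on seven different state-of-the-art GAN-based face generation methods demonstrate the effectiveness of our method.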
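We devise a simple yet powerful approach, termed FakePolisher, that uses a learned linear dictionary to effectively reduce the artifacts introduced during image synthesis. In particular, we first train a dictionary model to capture the patterns of real images. Based on this dictionary, we seek the representation of DeepFake images in a low-dimensional subspace through linear projection or sparse coding. We can then perform a shallow reconstruction of a "fake-free" version of the DeepFake image, which largely reduces the artifact patterns that DeepFake introduces. A comprehensive evaluation on 3 state-of-the-art DeepFake detection methods and on fake images generated by 16 popular GAN-based fake image generation techniques demonstrates its effectiveness.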
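The dictionary, projection, and shallow reconstruction steps can be sketched on toy vectors. This is a minimal illustration under stated assumptions, not the FakePolisher implementation: it uses a PCA basis as one simple choice of learned linear dictionary, covers only the linear projection variant (not sparse coding), and the function names, dimensions, and synthetic data are all hypothetical.

```python
import numpy as np

def learn_dictionary(real, k):
    """Learn a k-atom orthonormal dictionary (PCA basis) from flattened real samples."""
    mean = real.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(real - mean, full_matrices=False)
    return mean, vt[:k]

def shallow_reconstruct(x, mean, atoms):
    """Project samples onto the dictionary subspace and map them back."""
    codes = (x - mean) @ atoms.T   # linear projection to k coefficients
    return mean + codes @ atoms    # reconstruction inside the learned subspace

# Toy "real" data that lives in a 4-dimensional subspace of a 64-dimensional space.
rng = np.random.default_rng(0)
basis = rng.normal(size=(4, 64))
real = rng.normal(size=(200, 4)) @ basis
mean, atoms = learn_dictionary(real, k=4)

# A toy "fake": a sample from the same subspace plus a small off-subspace artifact.
clean = rng.normal(size=(1, 4)) @ basis
fake = clean + 0.1 * rng.normal(size=(1, 64))
polished = shallow_reconstruct(fake, mean, atoms)

# Distance to the artifact-free sample before and after reconstruction.
print(np.linalg.norm(fake - clean), np.linalg.norm(polished - clean))
```

Because the reconstruction keeps only the component of the input that the dictionary can express, the part of the artifact lying outside the learned subspace is discarded, so the second printed distance is smaller than the first.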