Found 21 repositories (showing 21)
tml-epfl
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
ZJU-LLM-Safety
[AAAI-2026] MAJIC: Markovian Adaptive Jailbreaking. An automated black-box attack framework against LLMs that iteratively selects and fuses innovative disguise strategies guided by a dynamically updated Markov transition matrix.
gnekt
Pirates of the RAG: Adaptively Attacking LLMs to Leak Knowledge Bases
Neural-alchemy
LLM Penetration Testing Framework - Discover vulnerabilities in AI applications before attackers do. 100 attacks + AI-powered adaptive mode.
dbsxodud-11
Official Code for Active Attacks: Red-teaming LLMs via Adaptive Environments
ML-Watermarking
Adaptive Attacks against LLM content watermarks
A small experiment run in an educational environment. Tested the resistance of qwen2.5:7b-instruct to potential attacks. Replicated with qwen3.5:2b, launched 02/03/2026.
jiakaiy
No description available
kingstarfly
No description available
lhurr
Adapted from: https://github.com/tml-epfl/llm-adaptive-attacks/tree/main
gengirish
Bayesian attack planning engine for LLM security — turns garak scan results into adaptive red team campaigns
francordel
Agentic red-teaming framework for LLMs, combining DirectRequest, FewShot, and GPTFuzz with adaptive attack selection and reproducible evaluation.
NewLJing
A privacy-preserving fine-tuning framework for LLMs, and an adaptive invisible backdoor attack against vision-language LLMs under that framework.
d12o6aa
A multi-layer defense system to protect LLMs from prompt injection attacks, featuring dynamic regex generation with RL, transformer analysis, and adaptive reverse defense. Optimized for on-premise use with <2s latency.
Sherazkarim1
A lightweight Prompt-Injection Firewall for LLMs. Screens user inputs, detects malicious or jailbreak attempts, sanitizes risky text, and blocks repeated attacks using adaptive memory. Built with Python, scikit-learn, and Streamlit.
codernate92
Adversarial red-teaming framework for LLM agents — ATT&CK-inspired taxonomy, 39 attack probes, adaptive campaign execution, CVSS-analog vulnerability scoring, and professional red-team report generation
Trinity-SYT-SECURITY
This is an AI-driven penetration testing framework that uses LLMs to autonomously plan, execute, and adapt security attacks. Unlike traditional scanners, it understands application logic, generates context-aware payloads, and learns from failed attempts.
manishkhadka13
Adaptive red-teaming framework to evaluate LLM safety degradation under post-training quantization (FP16 → INT8 → INT4). Features a chain-of-thought mutation attacker (Qwen2.5-14B) with Crescendo escalation and an adaptive Vector DB defense (ChromaDB) that learns from successful jailbreaks. Evaluated on HarmBench using LlamaGuard3-8B as judge.
TanmayKhedekar
**LLMNetGuard** is a network threat detection system that uses Large Language Models (LLMs) to analyze network traffic in real time. It detects potential security threats such as malware and DDoS attacks by identifying abnormal traffic patterns. With adaptive ML, LLMNetGuard continuously improves, providing smarter, proactive protection for networks.
We design ARUGUS, which uses deterministic statistical and structural signals not only to score webpages, but also to dynamically decide what to verify, which evidence sources to trust, and how to constrain LLM reasoning so that decisions remain grounded under adaptive attack.
rattlesczck
SecuRL – AI-Powered Cyber Threat Detection for 6G Networks. SecuRL is a Reinforcement Learning-based cybersecurity framework that detects and mitigates threats in real time using Deep Q-Networks, a Cyber Knowledge Graph (CKG), and NASim. It adapts to evolving attacks, integrates real-time threat intelligence, and enables LLM-powered threat analysis.