Search Results

Found 21 repositories (showing 21)

llm-adaptive-attacks

tml-epfl

🧡56

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]

380

MIT

Shell

Updated 1 week ago

MAJIC-AAAI2026

[AAAI 2026] MAJIC: Markovian Adaptive Jailbreaking. An automated black-box attack framework against LLMs that iteratively selects and fuses innovative disguise strategies guided by a dynamically updated Markov transition matrix.

MIT

Python

Updated 1 day ago

Pirates-of-the-RAG

gnekt

💛70

Pirates of the RAG: Adaptively Attacking LLMs to Leak Knowledge Bases

MIT

Python

Updated 18 hours ago

artificial-intelligence robustness

promptxploit

Neural-alchemy

🧡50

LLM Penetration Testing Framework - Discover vulnerabilities in AI applications before attackers do. 100 attacks + AI-powered adaptive mode.

MIT

Python

Updated 1 month ago

ai-pentesting ai-security chatgpt-security +12

active_attacks

dbsxodud-11

🧡65

Official Code for Active Attacks: Red-teaming LLMs via Adaptive Environments

Python

Updated 4 days ago

ada-llm-wm

ML-Watermarking

❤️35

Adaptive Attacks against LLM content watermarks

Updated 1 year ago

The-Attacker-Moves-Second-Adaptive-Attacks-on-LLM-Safety-Mechanisms

Attila0769

❤️45

A small experiment run in an educational environment. Tested the resistance of qwen2.5:7b-instruct to potential attacks, then replicated it with qwen3.5:2b, launched 02/03/2026.

Updated 1 month ago

llm-adaptive-attacks

jiakaiy

❤️25

No description available

NOASSERTION

Python

Updated 7 months ago

llm-adaptive-attacks

kingstarfly

❤️30

No description available

MIT

Shell

Updated 1 year ago

llm-jailbreak

lhurr

❤️35

Adapted from: https://github.com/tml-epfl/llm-adaptive-attacks/tree/main

MIT

Jupyter Notebook

Updated 9 months ago

adversary-planner

gengirish

🧡65

Bayesian attack planning engine for LLM security — turns garak scan results into adaptive red team campaigns

Python

Updated 2 days ago

manipulador

francordel

❤️40

Agentic red-teaming framework for LLMs, combining DirectRequest, FewShot, and GPTFuzz with adaptive attack selection and reproducible evaluation.

MIT

Python

Updated 7 months ago

PFT-BabPFT

NewLJing

❤️35

A privacy-preserving fine-tuning framework for LLMs, and an adaptive invisible backdoor attack against vision-language LLMs under that framework.

Python

Updated 6 months ago

prompt-injection-defense

d12o6aa

❤️25

A multi-layer defense system to protect LLMs from prompt injection attacks, featuring dynamic regex generation with RL, transformer analysis, and adaptive reverse defense. Optimized for on-premise use with <2s latency.

MIT

Jupyter Notebook

Updated 4 months ago

prompt-firewall

Sherazkarim1

❤️35

A lightweight Prompt-Injection Firewall for LLMs. Screens user inputs, detects malicious or jailbreak attempts, sanitizes risky text, and blocks repeated attacks using adaptive memory. Built with Python, scikit-learn, and Streamlit.

Updated 6 months ago

red-agent

codernate92

🧡60

Adversarial red-teaming framework for LLM agents — ATT&CK-inspired taxonomy, 39 attack probes, adaptive campaign execution, CVSS-analog vulnerability scoring, and professional red-team report generation

MIT

Python

Updated 2 weeks ago

CogniSploit

Trinity-SYT-SECURITY

🧡50

An AI-driven penetration testing framework that uses LLMs to autonomously plan, execute, and adapt security attacks. Unlike traditional scanners, it understands application logic, generates context-aware payloads, and learns from failed attempts.

MIT

Python

Updated 2 months ago

adaptive_red_teaming

manishkhadka13

🧡55

Adaptive red-teaming framework to evaluate LLM safety degradation under post-training quantization (FP16 → INT8 → INT4). Features a chain-of-thought mutation attacker (Qwen2.5-14B) with Crescendo escalation and an adaptive Vector DB defense (ChromaDB) that learns from successful jailbreaks. Evaluated on HarmBench using LlamaGuard3-8B as judge.

Python

Updated 1 week ago

LLMNetGuard

TanmayKhedekar

❤️35

LLMNetGuard is a network threat detection system that uses Large Language Models (LLMs) to analyze network traffic in real time. It detects potential security threats such as malware and DDoS attacks by identifying abnormal traffic patterns. With adaptive ML, LLMNetGuard continuously improves, providing smarter, proactive protection for networks.

Python

Updated 6 months ago

ARGUS-Adaptive-Reasoning-Grounded-Under-Statistics-for-Phishing-Detection

swei8272

🧡50

We design ARGUS, which uses deterministic statistical and structural signals not only to score webpages but also to dynamically decide what to verify, which evidence sources to trust, and how to constrain LLM reasoning so that decisions remain grounded under adaptive attack.

MIT

Python

Updated 1 month ago

SecuRL

rattlesczck

❤️35

SecuRL: AI-Powered Cyber Threat Detection for 6G Networks. SecuRL is a Reinforcement Learning-based cybersecurity framework that detects and mitigates threats in real time using Deep Q-Networks, a Cyber Knowledge Graph (CKG), and NASim. It adapts to evolving attacks, integrates real-time threat intelligence, and enables LLM-powered threat analysis.

Jupyter Notebook

Updated 1 year ago
