Found 997 repositories (showing 30)
llm-attacks
Universal and Transferable Attacks on Aligned Language Models
MorDavid
Advanced LLM-powered brute-force tool combining AI-driven intelligence with automated login attacks
ethz-spylab
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
agencyenterprise
PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022
knostic
OpenAnt from Knostic is an open source LLM-based vulnerability discovery product that helps defenders proactively find verified security flaws while minimizing both false positives and false negatives. Stage 1 detects. Stage 2 attacks. What survives is real.
liu00222
This repository provides a benchmark for prompt injection attacks and defenses in LLMs
tml-epfl
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
Yu-Fangxu
[ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
praetorian-inc
LLM security testing framework for detecting prompt injection, jailbreaks, and adversarial attacks — 190+ probes, 28 providers, single Go binary
PKU-YuanGroup
An attack to induce hallucinations in LLMs
romovpa
Automated research on LLM adversarial attacks
usail-hkust
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024)
BishopFox
A productionized greedy coordinate gradient (GCG) attack tool for large language models (LLMs)
microsoft
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
GodXuxilie
An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)
MrMoshkovitz
Automated red-team toolkit for stress-testing LLM defences - Vector Attacks on LLMs (Gendalf Case Study)
niconi19
[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
DmitrL-dev
AI Security Platform: Defense (61 Rust engines + Micro-Model Swarm) + Offense (39K+ payloads)
uw-nsl
[ACL24] Official Repo of Paper `ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs`
SaFo-Lab
[COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and further assess the robustness and safety of MLLMs against a variety of jailbreak attacks.
Junjie-Chu
This is the public code repository for the paper 'Comprehensive Assessment of Jailbreak Attacks Against LLMs'
OSU-NLP-Group
AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLMs
ezztahoun
Find the logs, events, and alerts relevant to all of your incidents. [Attack Flows, Attack Chains, & Root Cause Discovery - NO LLMs, NO Queries, Just Explainable Machine Learning] >> Use it for free here: https://app.cypienta.io
LiuYuancheng
The objective of this program is to leverage AI-LLM technology to process human-language CTI documents and succinctly summarize the attack flow paths outlined in such materials, mapping attack behaviors to MITRE ATT&CK and matching vulnerabilities to MITRE CWE.
Buyun-Liang
[NeurIPS 2025] SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations
XHMY
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
Beijing-AISI
Panda Guard is designed for researching jailbreak attacks, defenses, and evaluation algorithms for large language models (LLMs).
requie
A comprehensive reference for securing Large Language Models (LLMs). Covers OWASP GenAI Top-10 risks, prompt injection, adversarial attacks, real-world incidents, and practical defenses. Includes catalogs of red-teaming tools, guardrails, and mitigation strategies to help developers, researchers, and security teams deploy AI responsibly.
facebookresearch
Repo for the paper "Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks".
datasec-lab
[USENIX Security '24] An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection