Found 24 repositories (showing 24)
Coldwave96
Scripts for evaluating LLM security abilities.
No description available
A secure multi-PDF Retrieval-Augmented Generation (RAG) chatbot that enables question answering over documents with built-in safety guards. The system integrates LLM-based input/output validation and automatic evaluation of responses on faithfulness, coherence, and completeness.
Secure AI sandbox for patient portal messaging, developed in collaboration with Johns Hopkins Bayview Medical Center and a faculty member from the Johns Hopkins School of Nursing, using IRB-approved data. Due to data governance and institutional policies, the full source code and datasets are not publicly shared.
manjunathnp
No description available
Harshitha0531
No description available
Nikhil-UCEOU
No description available
LLM-as-Judge framework for automated chatbot security evaluation (MLOps + Docker)
IDSDataset
Supplementary appendix describing the full evaluation setup, threat model, defense layers, metrics, and release-gate procedures for secure agentic LLM systems. Includes system configuration, attacker capabilities, telemetry, reproducibility details, and parameter glossary.
No description available
SiddharthWayne
No description available
Evaluating LLMs on SEC 10K filings using RAG
No description available
anonymous-project-2026
No description available
No description available
No description available
achamorrofdz14
Testing LLM evaluation framework for financial RAG systems using Promptfoo. Evaluates models, retrievers, prompts, and security with SEC 10-Q filings.
faheemahmad02042019
RAG-powered financial filing analyst using LangChain, LlamaIndex, and LLMs for SEC 10-K analysis with numerical reasoning, guardrails, and evaluation pipelines
kumaraadya
Production RAG system with multi-stage retrieval (FAISS dense + BM25 sparse + cross-encoder reranking) over SEC 10-K filings. Fine-tuned transformer models, LLM integration (GPT-4), evaluation framework, and FastAPI deployment. Python | PyTorch | FAISS | FastAPI
nilesh-auradkar05
AI-powered financial analyst agent that autonomously researches companies, analyzes SEC filings, evaluates market sentiment, and generates investment memos with citations. Built with LangGraph, RAG, and local LLMs.
An end-to-end local AI assistant running open-source LLMs via Ollama with a FastAPI interface. Benchmarks multiple models (Llama3, Mistral, Phi) using metrics like latency, tokens/sec, and time-to-first-token. Includes Pydantic-validated structured outputs, retry logic, and a model evaluation framework.
xnaleb-ml
A dual-architecture agentic RAG pipeline for auditing SEC filings. Version 1 features a FastAPI backend powered by cloud LLMs. Version 2 is fully localized, serving open-weight models via vLLM. Built with LangGraph and evaluated on FinanceBench.
iliaadam
Build a financial QA system using SEC filings data. Compare two pre-trained language models, BERT and ELECTRA, using BLEU scores and latency. Includes code to implement, preprocess, and evaluate the QA system.
RhondaMeloMsc
Comprehensive adversarial red-teaming audit of a simulated institutional financial advisory LLM. This repository documents high-fidelity attack vectors—including persona-based context injection and narrative stress induction—to evaluate regulatory compliance (SEC/MiFID II) and logic integrity in high-stakes AI governance scenarios.