Found 13 repositories (showing 13)
darkmatter2222
Python pipeline for synthetic data generation with a custom Llama sentence generator. It creates field values, prompts & validated sentences (stored in JSON) and includes a training template focused on PII redaction, data sensitivity & compliance.
Michael-A-Kuykendall
FeedMe: A hungry, memory-safe streaming data pipeline in Rust. Efficient streaming ETL with ownership transfer, bounded resources, PII redaction, validation, dead-letter queues, and Prometheus metrics. Production-ready with comprehensive testing.
miniarjabri
No description available
simpli-support
PII detection, data redaction, and privacy risk scanning for AI-safe support data pipelines
Mmm11222
Advanced Data Analysis Pipeline using Python to process 32M+ rows of Instacart sales data. Features: Data Wrangling, PII Redaction, and Customer Profiling.
ahlemtr
A secure data pipeline in Python that automates PII redaction and implements AES-based symmetric encryption to protect sensitive financial data in transit.
DevanshuNEU
Serverless data ingestion pipeline on GCP - Cloud Run, Pub/Sub, Firestore. Handles 1000+ RPM with multi-tenant isolation and automatic PII redaction.
Automated PDF redaction pipeline using Python, AI (Gemini), and OCR. Permanently removes sensitive data (PII) from text, images, and metadata instead of just hiding it.
abdulmdev
An event-driven, zero-trust microservices pipeline using Spring Boot and Kafka to asynchronously extract structured data from medical claims via LLMs, featuring edge-level PII redaction and automated rate-limit handling.
An event-driven PII redaction pipeline built on Apache Kafka and Spark Structured Streaming. The system detects and masks sensitive data in real time, with Dockerized microservices for scalability and data flow across the ingestion, processing, and storage layers.
ddihora1604
A whitebox LLMOps framework designed to enhance security and transparency in AI pipelines. It integrates secure RAG, automated PII redaction, and governance layers to protect sensitive data while enabling auditable, privacy-preserving, and compliant large language model deployments.
A zero-trust multi-agent system that continuously audits enterprise data pipelines for PII leaks and compliance violations. It auto-generates data redaction policies, simulates them in ephemeral sandboxes, and uses XAI to send human-readable risk summaries to Data Protection Officers via Slack/Teams for 1-click remediation approval.
edwardtatem38-pixel
Enterprise-grade HIPAA document pipeline built in n8n. Features a custom JavaScript redaction layer to scrub PII (SSNs/Phone) before processing via Groq Llama 3.1. Automatically generates clinical summaries in Google Docs and pushes real-time data to external REST APIs via Webhooks, demonstrating secure, decoupled AI workflow architecture.