Found 8 repositories (showing 8)
Generate synthetic test data for Retrieval-Augmented Generation (RAG) pipelines using LangChain and Ollama with the Llama 3 and Gemma 2 models
TaylorBeck
FastAPI backend for structured synthetic text data generation using LLMs
privet1mir
Comparative study of LLM-based synthetic data generation strategies for emotion classification in Russian texts.
mcleverson
End-to-end pipeline to fine-tune a small LLM for character-style text generation, using synthetic data and LoRA.
danielrosehill
GUI to facilitate capturing voice data for TTS / voice clone training with LLM synthetic text generation and saving logic (Ubuntu Linux)
nitinp14920914
Hybrid AI framework that combines synthetic data generation using LLMs, a domain-informed rule engine, and a spaCy- and LLM-driven text classification pipeline to predict depressive tendencies from textual data. Includes an LLM-as-a-Judge pipeline that uses in-house tools to validate synthetic samples for clinical plausibility and cultural relevance.
hoaanna
Designed and implemented an automatic Question Answering (QA) system for Vietnamese historical texts using LLMs and RAG, including OCR preprocessing, synthetic data generation, and model fine-tuning with LoRA to achieve high accuracy in information retrieval.
kaiyr666
A fine-tuned LLM chatbot expert on the history of Kazakhstan, capable of answering detailed questions based on specialized historical texts. This project uses OCR for data extraction, GPT-4o-mini for synthetic dataset generation, and Unsloth for efficient fine-tuning of Llama 3.1.