Found 35 repositories (showing 30)
ServiceNow
Data Augmentation for Intent Classification with Off-the-Shelf Large Language Models is a ServiceNow Research project
kkyuhun94
[ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling
VITA-MLLM
Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation
J4NN0
LLM prompt augmentation with RAG that integrates external custom data from a variety of sources, allowing chat with those documents
97kjmin
This repository includes the data and scripts utilized in the study titled "Improving LLM-based Verilog Code Generation with Data Augmentation and RL (DATE25)".
TTriantoro
Data Synthesis, Augmentation, and NLP Insights with LLMs
LLaMA Factory is an end-to-end LLM fine-tuning and deployment pipeline, integrating data augmentation & engineering, cloud-based training, and cloud data management. With one-click fine-tuning and local deployment, it enables rapid iteration of models while keeping your infrastructure flexible and secure. Perfect for enterprise-grade applications.
Noxfr69
Text algorithms, CNN and transfer learning, data augmentation. Authored a paper on Text classification with LLM
EunCheolChoi0123
The repository contains scripts and data used in the paper FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs (https://doi.org/10.1145/3589335.3651504).
rradhakr-git
Fine-tunes a DistilBERT model for health payer intent classification. The project uses open-weight, locally hosted LLMs for data augmentation. Fine-tuning uses PyTorch libraries with Hugging Face interfaces.
Our contribution to Subtask A of SemEval-2025 Task 1. The task required ranking captions by how effectively they used a nominal compound in alignment with its usage in a query sentence. Our approach involved data augmentation, a novel loss function, and fine-tuning LLMs.
Text training and test data are augmented with LLMs, and the resulting change in accuracy is observed.
JunJin1218
No description available
An implementation of an LLM-based text-augmentation pipeline that enlarges the IMDB and AG News datasets with PEGASUS/T5 paraphrasers, embeds all texts with all-MiniLM-L6-v2, trains an MLP classifier, and reports accuracy gains of up to ~10 pp from augmented train/test ensembles.
No description available
seunghee6022
Tabular Data Augmentation with LLM
jayyuci
Data augmentation with LLM-Diffusion Augmentation
vaventt
Adaptive Data Augmentation with Diffusion Models and LLMs
yerimmms
Efficient Korean Domain-Specific LLM with Novel Iterative Data Augmentation
leokam
Get item-level analytics for your scanned supermarket receipts with LLM data augmentation
ercanasli
Prompt engineering and data augmentation with LLMs like OpenAI's GPT involve techniques to improve model performance.
Jmichael-Labs
👁️ Advanced LLM Bias Detection for Meta Superintelligence Labs - Multi-lingual bias analysis with counterfactual data augmentation
rajavavek
DAugSindhi addresses the challenges of Sindhi text classification in Natural Language Processing (NLP) due to limited annotated datasets. The study uses data augmentation techniques like Easy Data Augmentation (EDA), Back Translation, Paraphrasing, and Text Generation with Large Language Models (LLMs) to artificially expand the dataset.
NIGASH333
Agentic Graph RAG as a Service — a system that integrates LLMs with graph databases like Neo4j to enable intelligent retrieval, reasoning, and knowledge augmentation across structured and unstructured data.
sarahsorahi
LLM-based data augmentation for different functional uses of the English verb look. Synthetic samples are generated with Ollama and labeled for function classification.
Talha9509
A high-performance RESTful API built with FastAPI to manage student data with full CRUD functionality. Integrates with Ollama for real-time AI-generated student summaries using LLMs. Designed for scalability, clean architecture, and seamless AI augmentation.
Qiulin-1018
Here we present MagMatLLM, a hybrid framework that integrates LLM-driven crystal generation with genetic algorithms and targeted data augmentation for iterative, property-driven exploration of vast chemical spaces.
luckykumar25
VecbBloom is a research-driven toolkit combining vector-enhanced Bloom filters with synthetic data generation via large language model (LLM) distillation. This hybrid system is designed for efficient, privacy-preserving data filtering and augmentation in modern AI workflows.
AI-powered credit card fraud detection using fine-tuned LLMs (Qwen2.5 + QLoRA) for synthetic fraud data augmentation, classical ML classifiers (Random Forest, Gradient Boosting, Logistic Regression), a DuckDB transaction store, and a Gradio dashboard with a natural language → SQL query agent. Built on Google Colab with T4 GPU support.
This project explores sentiment classification using transformer-based LLMs. The GoEmotions dataset was processed and mapped into positive, negative, and neutral classes. RoBERTa and DeBERTa were fine-tuned with data augmentation, class balancing, and weighted loss, and their performance was compared using accuracy and F1-macro.