Found 3 repositories (showing 3)
Implementation of an LLM-based text-augmentation pipeline that enlarges the IMDB and AG News datasets with PEGASUS/T5 paraphrasers, embeds all texts with all-MiniLM-L6-v2, trains an MLP classifier, and reports accuracy gains of up to ~10 pp from augmented train/test ensembles.
rajavavek
DAugSindhi addresses the challenges that limited annotated datasets pose for Sindhi text classification in Natural Language Processing (NLP). The study uses data augmentation techniques such as Easy Data Augmentation (EDA), Back Translation, Paraphrasing, and Text Generation with Large Language Models (LLMs) to artificially expand the dataset.
mojing122
HSOP (Hidden State Optimized Prompt-tuning) is a new framework that combines soft prompt-tuning with hidden state-based data augmentation for more reliable detection. HSOP uses a small-scale LLM (≤4B parameters) to generate paraphrased versions of each input text as augmented samples.
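Easy Data Augmentation (EDA), named in the DAugSindhi description above, is the simplest of the listed techniques to illustrate. The sketch below is a minimal, assumption-laden version that implements only two of the four EDA operations (random swap and random deletion) on whitespace tokens. The function names, default parameters, and the alternating schedule in `augment` are hypothetical and are not taken from any of the listed repositories; the synonym-based EDA operations are omitted because they need a language-specific lexicon.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random position pairs exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p, keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(text, n_aug=4, p_delete=0.1, n_swaps=1, seed=0):
    """Generate n_aug noisy variants of text (hypothetical helper).

    Even-numbered variants use random swap, odd-numbered ones use
    random deletion; this schedule is an illustrative assumption.
    """
    rng = random.Random(seed)
    tokens = text.split()
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, n_swaps, rng)
        else:
            new_tokens = random_deletion(tokens, p_delete, rng)
        variants.append(" ".join(new_tokens))
    return variants


if __name__ == "__main__":
    for variant in augment("data augmentation helps low resource text classification"):
        print(variant)
```

Each variant would normally inherit the label of the original sentence and be added to the training set only. Whitespace tokenization is a simplification; a real pipeline for Sindhi or any other language would likely need a proper tokenizer.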