Search Results

Found 35 repositories (showing 30)

data-augmentation-with-llms

ServiceNow

❤️30

Data Augmentation for Intent Classification with Off-the-Shelf Large Language Models is a ServiceNow Research project

Apache-2.0

Python

Updated 5 months ago

dalda

kkyuhun94

❤️45

[ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling

Python

Updated 1 month ago

data-augmentation, diffusion-model, large-language-model, +1

Sparrow

VITA-MLLM

❤️35

Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation

Apache-2.0

Jupyter Notebook

Updated 4 months ago

llm-rag

J4NN0

❤️35

LLMs prompt augmentation with RAG by integrating external custom data from a variety of sources, allowing chat with such documents

MIT

Python

Updated 11 months ago

chat-application, chatapp, chatbot, +17

VeriLogos

97kjmin

🧡55

This repository includes the data and scripts utilized in the study titled "Improving LLM-based Verilog Code Generation with Data Augmentation and RL (DATE25)".

BSD-3-Clause

Python

Updated 3 weeks ago

ODSC_East2024

TTriantoro

❤️35

Data Synthesis, Augmentation, and NLP Insights with LLMs

Updated 1 year ago

Enterprise-Enhanced-LLaMA-Factory-Pro-Advanced-FineTuning-Local-Deployment-Pipeline

SuleynanAuir

❤️40

LLaMA Factory is an end-to-end LLM fine-tuning and deployment pipeline, integrating data augmentation & engineering, cloud-based training, and cloud data management. With one-click fine-tuning and local deployment, it enables rapid iteration of models while keeping your infrastructure flexible and secure. Perfect for enterprise-grade applications.

MIT

Python

Updated 3 weeks ago

autodl, easy-data, fine-tuning, +2

Text_image_classification

Noxfr69

❤️35

Text algorithms, CNN and transfer learning, data augmentation. Authored a paper on Text classification with LLM

Jupyter Notebook

Updated 1 year ago

web24-short-fact-gpt

EunCheolChoi0123

❤️35

The repository contains scripts and data used in the paper FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs (https://doi.org/10.1145/3589335.3651504).

Jupyter Notebook

Updated 1 year ago

-distilbert-health-payer-intent-classification

rradhakr-git

❤️40

Fine-tune a DistilBERT model for health payer intent classification. The project uses open-weight, locally hosted LLMs for data augmentation. Fine-tuning uses PyTorch libraries with Hugging Face interfaces.

Apache-2.0

Jupyter Notebook

Updated 3 months ago

ADMIRE-Ranking-Idiomatic-Phrases-as-per-Usage

arygup

❤️35

Our contribution to Subtask A of SemEval 2025 Task 1. The task required ranking captions by how effectively they used a nominal compound in alignment with its usage in a query sentence. Our approach involved data augmentation, developing a novel loss function, and fine-tuning LLMs.

Jupyter Notebook

Updated 3 months ago

Data-Augmentation-with-LLMs

YEnesK

❤️35

Text train and test data are augmented with LLMs, and the resulting change in accuracy is observed.

Jupyter Notebook

Updated 1 year ago

Data-Augmentation-with-LLMs

JunJin1218

❤️25

No description available

Python

Updated 4 months ago

data-augmentation-with-paraphrase-llms

YEnesK

❤️40

Implementation of an LLM-based text-augmentation pipeline that enlarges the IMDB and AG News datasets with PEGASUS/T5 paraphrasers, embeds all texts with all-MiniLM-L6-v2, trains an MLP classifier, and reports accuracy gains of up to ~10 pp thanks to augmented train/test ensembles.

MIT

Jupyter Notebook

Updated 8 months ago

dataset-codemixed-indonesian-javanese-and-indonesian-sundanese-augmentation-with-llms

abusifyid

❤️25

No description available

Updated 5 months ago

tabular-data-augmentation

seunghee6022

❤️40

Tabular Data Augmentation with LLM

MIT

Updated 9 months ago

LLM-DiffAug

jayyuci

❤️35

Data augmentation with LLM-Diffusion Augmentation

Updated 1 year ago

ada-fusion

vaventt

❤️40

Adaptive Data Augmentation with Diffusion Models and LLMs

MIT

Python

Updated 1 year ago

llm2llm-kor-med

yerimmms

❤️40

Efficient Korean Domain-Specific LLM with Novel Iterative Data Augmentation

MIT

Python

Updated 1 year ago

receipt-contextualizer

leokam

❤️40

Get item-level analytics for your scanned supermarket receipts with LLM data augmentation

MIT

Python

Updated 2 years ago

LLM_Project

ercanasli

❤️35

Prompt engineering and data augmentation with LLMs like OpenAI's GPT involve techniques to improve model performance.

Jupyter Notebook

Updated 9 months ago

project-argus-bias-detection

Jmichael-Labs

❤️40

👁️ Advanced LLM Bias Detection for Meta Superintelligence Labs - Multi-lingual bias analysis with counterfactual data augmentation

NOASSERTION

Python

Updated 8 months ago

DAugSindhi

rajavavek

❤️40

DAugSindhi addresses the challenges of Sindhi text classification in Natural Language Processing (NLP) due to limited annotated datasets. The study uses data augmentation techniques like Easy Data Augmentation (EDA), Back Translation, Paraphrasing, and Text Generation with Large Language Models (LLMs) to artificially expand the dataset.

Apache-2.0

Jupyter Notebook

Updated 9 months ago

AgenticGraphHub_Backend

NIGASH333

❤️45

Agentic Graph RAG as a Service — a system that integrates LLMs with graph databases like Neo4j to enable intelligent retrieval, reasoning, and knowledge augmentation across structured and unstructured data.

Python

Updated 2 months ago

look-function-augmentation

sarahsorahi

❤️35

LLM-based data augmentation for different functional uses of the English verb look. Synthetic samples are generated with Ollama and labeled for function classification.

Updated 3 months ago

FealtyX-assignment

Talha9509

❤️35

A high-performance RESTful API built with FastAPI to manage student data with full CRUD functionality. Integrates with Ollama for real-time AI-generated student summaries using LLMs. Designed for scalability, clean architecture, and seamless AI augmentation.

Python

Updated 8 months ago

MagMatLLM

Qiulin-1018

❤️35

Here we present MagMatLLM, a hybrid framework that integrates LLM-driven crystal generation with genetic algorithms and targeted data augmentation for iterative, property-driven exploration of vast chemical spaces.

Updated 6 months ago

vecbloom

luckykumar25

❤️35

VecBloom is a research-driven toolkit combining vector-enhanced Bloom filters with synthetic data generation via large language model (LLM) distillation. This hybrid system is designed for efficient, privacy-preserving data filtering and augmentation in modern AI workflows.

Updated 11 months ago

ai_enhanced_credit_card_fraud_detection

omkargp1

❤️45

AI-powered credit card fraud detection using fine-tuned LLMs (Qwen2.5 + QLoRA) for synthetic fraud data augmentation, classical ML classifiers (Random Forest, Gradient Boosting, Logistic Regression), a DuckDB transaction store, and a Gradio dashboard with a natural language → SQL query agent. Built on Google Colab with T4 GPU support.

Jupyter Notebook

Updated 1 month ago

Comparison-between-RoBERTa-and-DeBERTa-for-Sentiment-Analysis

sstanoevska

❤️45

This project explores sentiment classification using transformer-based LLMs. The GoEmotions dataset was processed and mapped into positive, negative, and neutral classes. RoBERTa and DeBERTa were fine-tuned with data augmentation, class balancing, and weighted loss, and their performance was compared using accuracy and F1-macro.

Jupyter Notebook

Updated 1 month ago
