Search Results

Found 2,654 repositories (showing 30)

TextAttack

QData

💛78

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/

3.4k

441

MIT

Python

Updated 2 days ago

adversarial-attacks adversarial-examples adversarial-machine-learning +5

ray-educational-materials

ray-project

🧡66

A suite of hands-on training materials that show how to scale CV, NLP, and time-series forecasting workloads with Ray.

457

Apache-2.0

Jupyter Notebook

Updated 6 days ago

deep-learning distributed-machine-learning generative-ai +9

SentAugment

SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in combination with self-training and knowledge distillation, or for retrieving paraphrases.

359

NOASSERTION

Python

Updated 2 months ago

pignlproc

ogrisel

❤️46

Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps.

161

Java

Updated 2 months ago

DinkyTrain

princeton-nlp

🧡60

Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃

117

MIT

Python

Updated 2 weeks ago

GTS-Engine

IDEA-CCNL

❤️35

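GTS Engine: A powerful NLU training system. GTS-Engine is an out-of-the-box, high-performance natural language understanding engine focused on few-shot tasks; it can automatically produce NLP models from only a small number of samples.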

Apache-2.0

Python

Updated 5 months ago

natural-language-processing nli nlp +6

yoruba-text

Niger-Volta-LTI

🧡55

Yorùbá language training text for NLP, ASR and TTS tasks

GPL-3.0

Python

Updated 2 weeks ago

african-languages asr diacritization +7

LightLM

CLUEbenchmark

❤️35

High-performance small model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task

Python

Updated 7 months ago

bert chinese chinese-language-model +4

document2slides

IBM

❤️30

This repository contains the code to reconstruct the training dataset from NLP/ML Papers in PDF format together with their corresponding slides.

Apache-2.0

Python

Updated 3 months ago

debias

chrisc36

❤️35

Methods of training NLP models to ignore biased strategies

Apache-2.0

Python

Updated 1 year ago

Fake-News-Detector

AmirhosseinHonardoust

🧡60

A complete NLP and machine learning project to distinguish fake from real news using TF-IDF and logistic regression. Includes a full training pipeline, evaluation charts, and an interactive Streamlit web app for real-time credibility analysis. Dataset adapted from Kaggle's Fake and Real News Dataset.

MIT

Python

Updated 1 week ago

ai-project data-science data-visualization +12

Udacity-Deep-Learning-Nanodegree

MrinmoiHossain

❤️35

The course covers knowledge that is useful for working on deep learning as an engineer. It includes simple neural networks and training, CNNs, autoencoders and feature extraction, transfer learning, RNNs, LSTMs, NLP, data augmentation, GANs, hyperparameter tuning, and model deployment and serving.

GPL-3.0

Jupyter Notebook

Updated 8 months ago

convolutional-networks convolutional-neural-networks deep-learning +16

NLP-Series-NewWordsMining-PTMPretraining

zhoujx4

❤️35

NLP experiment: new word mining + continued pre-training of pretrained models

Python

Updated 3 months ago

keras_adversarial_training

bojone

❤️35

Adversarial Training for NLP in Keras

Python

Updated 2 years ago

wikipedia2corpus

GermanT5

❤️35

Wikipedia text corpus for self-supervised NLP model training

MIT

Python

Updated 5 months ago

corpus german-nlp machine-learning +4

stanza-train

stanfordnlp

❤️25

Model training tutorials for the Stanza Python NLP Library

Python

Updated 4 months ago

natural-language-processing nlp stanza

bothub

bothub-it

❤️25

Bothub is an open platform for predicting, training and sharing NLP datasets in multiple languages

Makefile

Updated 2 months ago

bothub bots chatbot +12

Building-a-Small-Language-Model-SLM-

ChaitanyaK77

💛70

This repository provides a Jupyter Notebook for building a small language model from scratch using the 'TinyStories' dataset. It covers data preprocessing, BPE tokenization, binary storage, GPU memory management, and training a Transformer in PyTorch. Generate sample stories to test your model. Ideal for learning NLP and PyTorch.

MIT

Jupyter Notebook

Updated 5 days ago

gpu-computing llm small-language-models +2

ner-tools

georgebrock

❤️35

Tools for training Stanford NLP's NER models

Makefile

Updated 1 year ago

Ling-Gender

ksdkamesh99

❤️35

A natural language processing model, trained on over 1,00,000 (1 lakh) names, that predicts a person's gender from their first name. The model uses Long Short-Term Memory (LSTM), a variant of the recurrent neural network. It reaches a training accuracy of 99.35% and, tested on over 11,000 samples, a test accuracy of 89.08%, which is quite high in NLP for out-of-sample test cases.

MIT

Jupyter Notebook

Updated 2 months ago

deep-learning gender-classification gender-detection +5

Prompt-Engineering-with-50000-Prompts

rohanmistry231

💛70

A comprehensive collection of 50,000 prompts for AI model training and prompt engineering, designed to enhance NLP model performance and creativity. Includes categorized prompts and tools for generating, testing, and optimizing prompts for various AI applications.

MIT

Updated 6 days ago

prompt prompt-engineering prompt-learning

TextAttack-A2T

QData

❤️40

A2T: Towards Improving Adversarial Training of NLP Models (EMNLP 2021 Findings)

MIT

Python

Updated 1 month ago

Russian_subtitles_dataset

dbklim

🧡50

Preprocessing of the dataset of 347 subtitles for the TV series (thanks to Taiga Corpus) for building a word2vec model or a JamSpell model, for neural network or chatbot training, or for any other NLP task.

Apache-2.0

Python

Updated 1 month ago

bot cnn corpus +14

ai-engineering-workshop

dwhitena

❤️40

Materials for the "Modern NLP: Pre-training, Fine-tuning, Prompt Engineering, and Human Feedback" workshop at ODSC East 2023

Apache-2.0

Jupyter Notebook

Updated 7 months ago

pytorch-nlp-multitask

AaronGrainer

❤️35

A simple project training 3 separate NLP tasks simultaneously using Multitask-Learning

Apache-2.0

Python

Updated 1 year ago

concept-tagging-training

nasa

❤️25

Contains code for training NLP models that take in text and predict concepts and keywords from a list of standardized NASA keywords. Code for the API that uses models trained by this repo is in the `concept-tagging-api` repository.

MIT

Python

Updated 7 months ago

concept-tagging machine-learning makefile +4

nmatheg

ARBML

❤️45

A simple strategy for training and finetuning NLP models for Arabic. Specify the parameters and just wait for the results. A simple design that makes use of the different tools in our NLP pipeline.

Jupyter Notebook

Updated 1 month ago

arabic arabic-nlp auto-train +1

nlpboost

avacaondata

❤️35

Python library for automatic training, optimization and comparison of Transformer models on most NLP tasks.

MIT

Python

Updated 5 months ago

deep-learning hyperparameter-optimization hyperparameter-tuning +7

atra

flozi00

❤️40

An open-source NLP-as-a-service project focused on providing state-of-the-art systems with ease. Training and inference via simple Docker commands.

MIT

Jupyter Notebook

Updated 1 year ago

asr chatbot inference +5

NLP_Training

Droidtown

❤️35

NLP Training/Teaching Materials with Articut

Python

Updated 6 months ago
