Search Results

Found 347 repositories (showing 30)

monolingual

reader-dict

🧡66

The most comprehensive universal, multilingual, and monolingual dictionaries—perfect for e-readers, phones, tablets, and desktop apps. Powered by Wiktionary.

871

MIT

Python

Updated 1 day ago

bilingual, dict, dictionary +15

ipa-dict

open-dict-data

🧡67

Monolingual wordlists with pronunciation information in IPA

742

111

MIT

Updated 49 minutes ago

dictionaries, g2p, grapheme-to-phoneme +8

Monolingual

IngmarStein

🧡56

Remove unnecessary language resources from macOS.

565

GPL-3.0

Swift

Updated 1 week ago

macos, swift, utility +1

stopes

facebookresearch

❤️46

A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.

300

MIT

Python

Updated 2 weeks ago

dataset, dataset-generation, machine-learning +4

TED-Multilingual-Parallel-Corpus

ajinkyakulkarni14

🧡56

TED Parallel Corpora is a growing collection of bilingual parallel corpora, multilingual parallel corpora, and monolingual corpora extracted from TED talks (www.ted.com) for 109 world languages.

255

Updated 2 weeks ago

bertje

BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s layers? A closer look at the NLP pipeline in monolingual and multilingual models"

143

Apache-2.0

Python

Updated 1 month ago

good-translation-wrong-in-context

lena-voita

🧡50

This is a repository with the data and code for the ACL 2019 paper "When a Good Translation is Wrong in Context: ..." and the EMNLP 2019 paper "Context-Aware Monolingual Repair for Neural Machine Translation"

Python

Updated 2 weeks ago

wechsel

CPJKU

🧡60

Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

MIT

Python

Updated 2 weeks ago

bert, language-model, natural-language-processing +3

copyisallyouneed

jcyk

❤️30

Code for our ACL 2021 paper "Neural Machine Translation with Monolingual Translation Memory"

Python

Updated 11 months ago

monolingual-word-aligner

ma-sultan

❤️25

No description available

Python

Updated 2 years ago

MetaVec

ikergarcia1996

❤️40

A monolingual and cross-lingual meta-embedding generation and evaluation framework

GPL-3.0

Python

Updated 5 months ago

embedding, embedding-evaluation, embedding-models +8

bivec

lmthang

❤️45

Train bilingual embeddings as described in our NAACL 2015 workshop paper "Bilingual Word Representations with Monolingual Quality in Mind". Besides, it has all the functionalities of word2vec with added features and code clarity. See README for more info.

Apache-2.0

Matlab

Updated 1 month ago

UNMT

IlyaGusev

❤️35

Code inspired by Unsupervised Machine Translation Using Monolingual Corpora Only

Apache-2.0

Jupyter Notebook

Updated 2 years ago

lm-prior-for-nmt

cbaziotis

❤️35

This repository contains source code for the paper "Language Model Prior for Low-Resource Neural Machine Translation"

MIT

Jupyter Notebook

Updated 12 months ago

language-model, lm-priors, machine-translation +5

parallel-corpora-tools

M4t1ss

❤️40

Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.

MIT

PHP

Updated 1 year ago

cleaning, corpora, corpus-tools +14

TransferLearning-CLVC

cjerry1243

❤️25

Transfer Learning from Monolingual ASR to Transcription-free Cross-lingual Voice Conversion

Python

Updated 10 months ago

Lingua-Corpus

Caucasus-Rosetta

❤️40

Multilingual and monolingual corpora focused on Caucasus languages, for Natural Language Processing (NLP)

Apache-2.0

Python

Updated 1 month ago

neural-machine-translation, parallel-corpus

focus

konstantinjdobler

🧡50

[EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models"

MIT

Python

Updated 2 months ago

indonesian-text-classification-multilingual

ilhamfp

❤️35

Improving Indonesian text classification using multilingual language model

Jupyter Notebook

Updated 6 months ago

cross-lingual-transfer, english-language, hate-speech-detection +11

microbert

lgessler

❤️30

A tiny BERT for low-resource monolingual models

HTML

Updated 3 months ago

nlp, pytorch

UncSamp

wxjiao

❤️30

Implementation of our paper "Self-training Sampling with Monolingual Data Uncertainty for Neural Machine Translation" to appear in ACL-2021.

Python

Updated 1 year ago

neural-machine-translation, self-training, translation-uncertainty

hgiyt

adapter-hub

❤️45

Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"

Python

Updated 2 months ago

poki

kulupu-lapo

❤️45

A library / monolingual corpus of Toki Pona texts.

Python

Updated 2 days ago

toki-pona

goldfish

tylerachang

🧡50

Goldfish: Monolingual language models for 350 languages.

Shell

Updated 1 week ago

programming-language-translator

AmrHendy

❤️35

An easy way to use TransCoder, released by Facebook AI Research, to convert code from one programming language to another. It relies on unsupervised neural machine translation (NMT), a deep-learning approach used to translate text from one natural language to another, and is trained only on monolingual source data.

Jupyter Notebook

Updated 5 months ago

machine-translation, nlp, programming-language +4

sanskrit_text_gitasupersite

cltk

🧡60

Sanskrit monolingual corpus

NOASSERTION

Python

Updated 1 week ago

BertForRD

yhcc

❤️30

This is the code for the EMNLP 2020 Findings paper "BERT for Monolingual and Cross-Lingual Reverse Dictionary"

Python

Updated 2 years ago

bitext-lexind

facebookresearch

❤️30

Bilingual lexicons map words in one language to their translations in another, and are typically induced by learning linear projections to align monolingual word embedding spaces. In this paper, we show it is possible to produce much higher quality lexicons with methods that combine (1) unsupervised bitext mining and (2) unsupervised word alignment. Directly applying a pipeline that uses recent algorithms for both subproblems significantly improves induced lexicon quality and further gains are possible by learning to filter the resulting lexical entries, with both unsupervised and semi-supervised schemes. Our final approach outperforms the state of the art on the BUCC 2020 shared task by 14 F1 points averaged over 12 language pairs, while also providing a more interpretable approach that allows for rich reasoning of word meaning in context.

MIT

Python

Updated 6 months ago

chatterbox-multilingual-finetuning

99eren99

🧡50

Monolingual Finetuning for Chatterbox Multilingual

MIT

Python

Updated 3 weeks ago

en-hi-codemixed-corpus

mrinaldhar

❤️30

Repository for the English-Hindi Codemixed to Monolingual English Parallel Corpus

Updated 3 years ago
