GitHub Explorer

by Alexey Ratnikov

BhaashikPyAlign

bhaashik•PUBLIC

A Python API library for alignment of linguistic units based on parallel data, such as word alignment, using libraries such as awesome-align. Also meant to be used as the base for using such aligned data.

Apache License 2.0

Created on Nov 2, 2025

Updated on Apr 7, 2026

Repository Health Score

65/100

Fair

Overall repository health assessment

Score Breakdown

Activity: 30/30 (100%). Active development, updated this week.

Recent Commits

feat: add pyproject.toml for uv/setuptools wheel builds

Bhaashik•4 days ago

8c51ac1

Added heuristics-based transfer of linguistic knowledge based on word alignment. Before that, word alignment was also improved using linguistic knowledge (morphological, POS and dependency-parse information) from the source language, for example to reduce NULL alignments and so correct obvious omissions from the alignments. Updated requirements files, YAML files, docs, CLI and usage for this extension. Effectively merges the SyntheticWrdAlignedUDTB project into this project, so that it not only does word alignment and provides a data structure API (with SSF and CoNLL-U support), but also aligns and creates treebanks (not necessarily synthetic).

Anil Kumar Singh•2 months ago

9a38370

Extended docs to include details about the new options for scoring etc.

Anil Kumar Singh•2 months ago

9b13e35

Added alignment scoring using several common methods. Added these options to logging, CLI and usage.

Anil Kumar Singh•2 months ago

d16da96

Integrated the ability to use any BERT model, such as xlm-roberta-large or indic-bert-v2. Improved the CLI and usage help in the main run script.

Anil Kumar Singh•2 months ago

9863bb8

CLI improved with better usage and help options.

bhaashik•3 months ago

784aeec

Version 3.0 working. Added word alignment with awesome-align and fast_align, and a converter from their output to GIZA++ .A3.final files. Some docs and tests added.

bhaashik•3 months ago

6a8c26c
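To illustrate the kind of conversion the Version 3.0 commit describes, here is a minimal sketch. It is not the repository's converter: the function names `parse_pharaoh` and `to_a3_entry` are hypothetical, and the alignment score is a placeholder. It assumes the aligner emits Pharaoh-style `i-j` pairs (0-based source-target indices, as awesome-align and fast_align do) and renders one sentence pair in the three-line layout used by GIZA++ A3 files.

```python
from collections import defaultdict


def parse_pharaoh(line):
    """Parse a Pharaoh-format line such as '0-0 1-2' into (src, tgt) index pairs."""
    pairs = []
    for token in line.split():
        src, tgt = token.split("-")
        pairs.append((int(src), int(tgt)))
    return pairs


def to_a3_entry(src_tokens, tgt_tokens, pairs, sent_id=1, score=0.0):
    """Render one sentence pair as a three-line, GIZA++-style A3 entry.

    Hypothetical helper. The score is a placeholder, because Pharaoh
    output carries no alignment probability.
    """
    # Map each source position to the 1-based target positions it links to.
    aligned = defaultdict(list)
    for src, tgt in pairs:
        aligned[src].append(tgt + 1)

    # Target words without any link are listed under the NULL token.
    linked = {tgt for _, tgt in pairs}
    null_targets = [j + 1 for j in range(len(tgt_tokens)) if j not in linked]

    def group(word, positions):
        inner = " ".join(str(p) for p in sorted(positions))
        return f"{word} ({{ {inner} }})" if inner else f"{word} ({{ }})"

    header = (f"# Sentence pair ({sent_id}) source length {len(src_tokens)} "
              f"target length {len(tgt_tokens)} alignment score : {score}")
    third = " ".join([group("NULL", null_targets)] +
                     [group(w, aligned[i]) for i, w in enumerate(src_tokens)])
    return "\n".join([header, " ".join(tgt_tokens), third])


if __name__ == "__main__":
    pairs = parse_pharaoh("0-0 1-1")
    print(to_a3_entry(["the", "house"], ["das", "Haus", "ja"], pairs))
```

One design caveat: neural aligners can link a target word to several source words, which the IBM-model output of GIZA++ does not do. This sketch simply lists such a target position under every source word it is linked to.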

Version 2.0 working.

Anil Kumar Singh•4 months ago

361f777

Version 1.0: Fully functional API for aligned linguistic units, as well as awesome-align based word alignment. Includes two notebooks for each of them.

Anil Kumar Singh•5 months ago

6b09295

Initial commit

bhaashik•5 months ago

d64c347
