Found 502 repositories (showing 30)
openai
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
jimmc414
Specify a GitHub or local repo, GitHub pull request, arXiv or Sci-Hub paper, YouTube transcript, or documentation URL on the web, and scrape it into a text file and the clipboard for easier LLM ingestion
dqbd
Online playground for OpenAI tokenizers
pkoukk
Go version of tiktoken
niieani
The fastest JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT models (gpt-5, gpt-o*, gpt-4o, etc.). Port of OpenAI's tiktoken with additional features.
M4THYOU
High-Performance Implementation of OpenAI's TikToken.
tiktoken-go
Pure Go implementation of OpenAI's tiktoken tokenizer
karpathy
The missing tiktoken training code
zurawiki
Ready-made tokenizer library for working with GPT and tiktoken
dmitry-brazhenko
SharpToken is a C# library for tokenizing natural language text. It's based on the tiktoken Python library and designed to be fast and accurate.
daulet
Go, Wasm bindings for HF Tokenizers and Tiktoken
CNSeniorious000
An elegant LLM chat UI forked from chatgpt-demo of @anse-app. Index site at https://free-chat.asia
IAPark
Unofficial ruby binding for tiktoken by way of rust
yethee
This is a port of tiktoken
saschaschramm
Analysis of OpenAI's ChatGPT
johannschopplich
📐 Fast token estimation at 96% accuracy of a full tokenizer in a 2kB bundle
aiqinxuancai
Token calculation for OpenAI models, using `o200k_base` `cl100k_base` `p50k_base` encoding.
ceifa
OpenAI's tiktoken, but with Node bindings
MNeMoNiCuZ
Nodes: Wildcard Processor, Get File Path, Save Text File, Download Image from URL, Tiktoken Tokenizer, String Cleaning, String Text Splitter, String Text Extractor, Format Date Time, Load Text-Image Pairs, Metadata Extractor, Audio Visualizer, Load Image Advanced, Colorful Starting Image, Groq LLM, VLM, ALM API
tryAGI
High-performance .NET BPE tokenizer — up to 618 MiB/s, competitive with Rust. Zero-allocation counting, multilingual cache, o200k/cl100k/r50k/p50k encodings + HuggingFace tokenizer.json support.
openshieldai
OpenShield is a new generation security layer for AI models
aespinilla
OpenAI's tiktoken implementation written in Swift
jiangyy
What is learned in tiktoken?
ElmiraGhorbani
The ChatGPT Long Term Memory package is a powerful tool designed to empower your projects with the ability to handle a large number of simultaneous users and external sources.
cahya-wirawan
A fast RWKV Tokenizer written in Rust
kelvich
tiktoken tokenizer for postgres
meta-pytorch
C++ implementations for various tokenizers (sentencepiece, tiktoken etc).
coder
A faster-than-tiktoken tokenizer with first-class support for Vercel's AI SDK.
Systemcluster
Fast tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.
aallam
Kotlin multiplatform BPE tokenizer library for OpenAI models