mothertoken — benchmark explorer

This explorer shows how the bundled token counters handle each language. It is curated, not exhaustive; listed model names are examples, not a complete compatibility map.

language	script	chars/token	fertility	efficiency (rtc)
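The page does not define these columns, so the following is only a minimal sketch of how such metrics are commonly computed, not mothertoken's implementation. It assumes chars/token is characters divided by token count, fertility is tokens per whitespace-separated word, and that "rtc" is a token-count ratio against an English baseline; that last reading is a guess. `toy_tokenize` is a hypothetical stand-in tokenizer, and the two sentences are illustrative inputs, not benchmark data.

```python
def toy_tokenize(text: str, max_piece: int = 4) -> list[str]:
    """Hypothetical stand-in tokenizer (not a real model's tokenizer):
    split on whitespace, then cut each word into pieces of at most
    max_piece characters."""
    pieces: list[str] = []
    for word in text.split():
        pieces.extend(word[i:i + max_piece] for i in range(0, len(word), max_piece))
    return pieces


def chars_per_token(text: str, tokens: list[str]) -> float:
    # Assumed definition: characters in the text (spaces included) per token.
    return len(text) / len(tokens)


def fertility(text: str, tokens: list[str]) -> float:
    # Assumed definition: tokens per whitespace-separated word.
    # Whitespace splitting is not meaningful for scripts written without
    # spaces (e.g. Chinese, Japanese, Thai), which need a real word segmenter.
    return len(tokens) / len(text.split())


def relative_token_cost(tokens: list[str], baseline_tokens: list[str]) -> float:
    # Assumed reading of "rtc": token count relative to a baseline
    # (here English) for parallel text. Values above 1 mean more tokens.
    return len(tokens) / len(baseline_tokens)


# Illustrative parallel sentences, not taken from the benchmark.
english = "the cat sleeps"
german = "die Katze schläft"

en_tokens = toy_tokenize(english)
de_tokens = toy_tokenize(german)

print(chars_per_token(english, en_tokens))
print(fertility(german, de_tokens))
print(relative_token_cost(de_tokens, en_tokens))
```

To try this with a real tokenizer, replace `toy_tokenize` with any function that returns a list of tokens; the three metric functions only depend on the token count.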

benchmark command

The benchmark shown here was generated with the following command:

 $ uv run mothertoken benchmark run \
    --languages eng_Latn,fra_Latn,spa_Latn,por_Latn,deu_Latn,arb_Arab,cmn_Hans,jpn_Jpan,tha_Thai,hin_Deva,kor_Hang,tur_Latn,ukr_Cyrl,vie_Latn,swh_Latn \
    --models gpt-4o,gpt-4,qwen3,mistral,qwen2.5,deepseek-v3,gpt-oss,gpt2,gpt-3,codex,codex-edit,opt,tinyllama,pythia,bert-base-uncased,roberta-base,xlm-roberta-base,distilbert-base-uncased \
    --output src/mothertoken/data/default_benchmark.json