Thai Handwritten OCR (TrOCR)

An OCR model for Thai handwritten text, fine-tuned from Microsoft TrOCR.

Model Details

Model Description

This model converts images of Thai handwriting into text using the TrOCR architecture, which pairs a Vision Transformer (ViT) image encoder with a Transformer decoder for text generation.

Developed by: Warit Sirikosityanggoon
Model type: Vision Encoder-Decoder (TrOCR)
Language(s): Thai (th)
License: Apache 2.0
Finetuned from: microsoft/trocr-base-handwritten

Model Sources

Repository: waritkan/Thai-Hand-Written-TrOCR-Webapp

Uses

Direct Use

The model can be used directly to convert images of Thai handwriting into text. It is suitable for:

Converting Thai handwritten documents
Real-time handwriting recognition systems
Digitizing handwritten notes

Out-of-Scope Use

Not suitable for languages other than Thai
May not perform well on extremely difficult handwriting or low-quality images

Training Details

Training Data

Trained on iapp/thai_handwriting_dataset, which contains Thai handwritten images paired with their corresponding text labels.

Tokenizer

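The model uses a SentencePiece tokenizer with the Unigram algorithm instead of dictionary-based word segmentation because it:

Handles out-of-vocabulary words by falling back to smaller subword pieces
Tolerates misspelled or incomplete words, which are common in handwriting
Requires no pre-tokenization, which matters because Thai is written without spaces between words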

Tokenizer Configuration:

Vocab Size: 30,000
Character Coverage: 0.9995
Algorithm: Unigram
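
The configuration above maps onto the SentencePiece trainer options roughly as follows. This is a minimal sketch, not the author's training script: the corpus path `thai_corpus.txt` and both helper names are hypothetical, and the `thai_sp_30000` prefix is inferred from the model filename used in the usage code.

```python
def build_trainer_args(corpus_path, model_prefix="thai_sp_30000"):
    """Collect the SentencePiece trainer options listed in the model card."""
    return {
        "input": corpus_path,          # plain-text corpus, one sentence per line (hypothetical file)
        "model_prefix": model_prefix,  # trainer writes <prefix>.model and <prefix>.vocab
        "vocab_size": 30000,
        "character_coverage": 0.9995,
        "model_type": "unigram",
    }


def train_tokenizer(corpus_path):
    """Train the tokenizer; requires the sentencepiece package."""
    # Imported lazily so that build_trainer_args has no dependencies.
    import sentencepiece as spm

    spm.SentencePieceTrainer.train(**build_trainer_args(corpus_path))
```

Calling `train_tokenizer("thai_corpus.txt")` would write `thai_sp_30000.model`, the file that the usage code loads.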

Training Hyperparameters

Parameter	Value
Epochs	250
Batch Size	16
Learning Rate	1e-5
Optimizer	AdamW
Training Regime	fp16 mixed precision
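
The hyperparameters above could be wired into a PyTorch training step along these lines. This is a sketch under assumptions rather than the original training code: the names `TrainConfig`, `total_optimizer_steps`, `make_optimizer_and_scaler` and `train_step` are hypothetical, one optimizer update per batch is assumed, and the model is assumed to be a `VisionEncoderDecoderModel` already moved to a CUDA device.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainConfig:
    epochs: int = 250
    batch_size: int = 16
    learning_rate: float = 1e-5


def total_optimizer_steps(num_samples, cfg):
    """Number of optimizer updates, assuming one update per batch."""
    return math.ceil(num_samples / cfg.batch_size) * cfg.epochs


def make_optimizer_and_scaler(model, cfg):
    """Create AdamW and a gradient scaler for fp16 training (requires torch)."""
    import torch

    optimizer = torch.optim.AdamW(model.parameters(), lr=cfg.learning_rate)
    scaler = torch.cuda.amp.GradScaler()  # rescales the loss to avoid fp16 underflow
    return optimizer, scaler


def train_step(model, pixel_values, labels, optimizer, scaler):
    """Run one mixed-precision update and return the loss as a float."""
    import torch

    optimizer.zero_grad()
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        # The model computes the cross-entropy loss when labels are passed;
        # this needs decoder_start_token_id and pad_token_id set in its config.
        loss = model(pixel_values=pixel_values, labels=labels).loss
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```

`total_optimizer_steps` is only a convenience for sizing a learning-rate schedule or estimating run time; the size of the training set is not stated here, so it is left as a parameter.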

Training Infrastructure

Hardware: NVIDIA GPU (HPC Cluster)
Framework: PyTorch + Hugging Face Transformers

Evaluation

Metrics

Metric	Value
CER (Character Error Rate)	0.488%

How to Evaluate

import editdistance

def calculate_cer(pred, label):
    """Character Error Rate (lower is better)"""
    if len(label) == 0:
        return 1.0 if len(pred) > 0 else 0.0
    distance = editdistance.eval(pred, label)
    return distance / len(label)
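
The function above scores a single prediction. To report one number for a whole test set, the edit distances can be summed before dividing by the total label length. The sketch below does this with a pure-Python Levenshtein distance so that it runs without `editdistance`. The names `levenshtein` and `corpus_cer` are hypothetical, and it is not stated whether the reported CER was aggregated this way or averaged per sample.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def corpus_cer(preds, labels):
    """Total edit distance divided by total label length (lower is better)."""
    total_edits = sum(levenshtein(p, l) for p, l in zip(preds, labels))
    total_chars = sum(len(l) for l in labels)
    return total_edits / total_chars if total_chars else 0.0
```

Both functions count Unicode code points, so a Thai base consonant and each vowel or tone mark attached to it are scored as separate characters.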

How to Get Started with the Model

Installation

pip install transformers torch sentencepiece pillow

Usage

import torch
from PIL import Image
import sentencepiece as spm
from transformers import VisionEncoderDecoderModel, ViTImageProcessor

# Load model
model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-base-handwritten')
image_processor = ViTImageProcessor.from_pretrained('microsoft/trocr-base-handwritten')

# Load Thai tokenizer
sp = spm.SentencePieceProcessor()
sp.Load('thai_sp_30000.model')

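# Load trained weights
checkpoint = torch.load('best_model.pt', map_location='cpu')
# Resize the decoder embeddings to the Thai vocabulary before loading,
# otherwise the checkpoint tensors would not match the model's shapes
model.decoder.resize_token_embeddings(sp.GetPieceSize())
# strict=False skips missing or unexpected keys instead of raising an error
model.load_state_dict(checkpoint['model_state_dict'], strict=False)
model.eval()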

# Inference
image = Image.open('handwriting.jpg').convert('RGB')
pixel_values = image_processor(image, return_tensors='pt').pixel_values

with torch.no_grad():
    generated_ids = model.generate(
        pixel_values,
        max_length=128,
        num_beams=4,
    )

# Decode
ids = generated_ids[0].tolist()
text = sp.DecodeIds(ids)
print(text)
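
`generate` returns raw token ids: the sequence begins with the decoder start token and may end with end-of-sequence and padding ids. A small helper such as the one below can trim these before the ids are passed to `sp.DecodeIds`. This is a sketch under assumptions: the function name is hypothetical, and the ids in the example (end-of-sequence 2, padding 1) are illustrative, so check `model.config` and the tokenizer for the values the checkpoint actually uses.

```python
def trim_generated_ids(ids, eos_id, pad_id=None, drop_first=True):
    """Keep only the content ids from one generated sequence."""
    if drop_first:
        ids = ids[1:]  # generate() prepends the decoder start token
    trimmed = []
    for token_id in ids:
        if token_id == eos_id:
            break      # everything after end-of-sequence is padding
        if token_id == pad_id:
            continue
        trimmed.append(token_id)
    return trimmed
```

With the ids assumed above, `trim_generated_ids([2, 10, 11, 2, 1, 1], eos_id=2, pad_id=1)` keeps only `[10, 11]`.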

Model Architecture

Input Image
    |
    v
Vision Transformer (ViT) Encoder
    |
    v
Cross-Attention
    |
    v
Transformer Decoder
    |
    v
SentencePiece Tokenizer (Unigram)
    |
    v
Thai Text Output
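
To make the first stage of the diagram concrete, the sketch below computes how many tokens the ViT encoder hands to the decoder's cross-attention. The 384x384 input size, 16x16 patch size and single extra classification token are assumptions about the base checkpoint, and the function name is hypothetical; the actual values can be read from `model.config.encoder`.

```python
def encoder_sequence_length(image_size=384, patch_size=16, extra_tokens=1):
    """Number of encoder tokens for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    # Each patch becomes one token; extra_tokens covers e.g. a [CLS] token.
    return patches_per_side ** 2 + extra_tokens
```

Under these assumptions each image is encoded as 24 x 24 = 576 patch tokens plus one extra token, and the decoder attends over that sequence at every generation step.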

Limitations

Performance depends on image quality and handwriting clarity
May not perform well on handwriting styles significantly different from training data
Supports Thai language only

Citation

@misc{thai-handwritten-trocr,
  author = {Warit Sirikosityanggoon},
  title = {Thai Handwritten OCR using TrOCR},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://github.com/waritkan/Thai-Hand-Written-TrOCR-Webapp}}
}

Acknowledgements

Microsoft TrOCR for the pretrained model
iApp Technology for the Thai handwriting dataset
SentencePiece for the tokenizer

Model Card Contact

Author: Warit Sirikosityanggoon
GitHub: waritkan/Thai-Hand-Written-TrOCR-Webapp
