AI

Burmese Handwriting Dataset Collector

A data collection tool for gathering high-quality handwritten Burmese character samples for ML training.

Python, Dataset, ML Pipeline, Burmese, Data Collection

Key Highlights

  • Structured character annotation interface
  • Export to training-ready dataset formats
  • Data quality validation and deduplication
  • Designed to scale community data collection

Overview

A purpose-built tool for collecting, annotating, and exporting handwritten Burmese character samples, created to address the lack of open datasets for Burmese script ML research. It provides structured annotation workflows, export to training-ready formats, and data quality validation pipelines.
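
To make the validation, deduplication, and export steps concrete, here is a minimal Python sketch. It is not the project's actual code, which is unpublished. The `Sample` type, the function names, the single-code-point label rule, and the JSON Lines manifest layout are all assumptions made for illustration. The only external fact relied on is that the main Unicode Myanmar block spans U+1000 to U+109F.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical record for one collected sample (not the project's schema)."""
    label: str          # the Burmese character the writer was asked to draw
    image_bytes: bytes  # raw encoded image data from the collector


def is_valid(sample: Sample) -> bool:
    """Accept only single-code-point Myanmar-block labels with non-empty image data."""
    if len(sample.label) != 1:
        return False
    # Main Unicode Myanmar block: U+1000 to U+109F.
    in_myanmar_block = 0x1000 <= ord(sample.label) <= 0x109F
    return in_myanmar_block and len(sample.image_bytes) > 0


def deduplicate(samples):
    """Drop byte-identical images, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for s in samples:
        digest = hashlib.sha256(s.image_bytes).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(s)
    return unique


def to_manifest_lines(samples):
    """Serialize each sample as one JSON line holding its label and content hash."""
    return [
        json.dumps(
            {"label": s.label, "sha256": hashlib.sha256(s.image_bytes).hexdigest()},
            ensure_ascii=False,  # keep Burmese characters readable in the manifest
        )
        for s in samples
    ]
```

Exact hashing only catches byte-identical uploads. Detecting near-duplicates, such as the same scan re-encoded, would need a perceptual comparison, which this sketch does not attempt.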

This project is not yet published on GitHub. The dataset and tooling are being prepared for open release alongside the Myanmar Handwriting Recognition model.