Khmer OCR

This is the official page for Khmer OCR, the technology of Cambodians.

22/02/2025

We fine-tuned non-Khmer-language OCR models on our well-prepared Cambodian ID (CID) card dataset (synthetic and real). As a result, we have developed a test API endpoint that is publicly available from today until the end of February. Feel free to check it out and test it.

API Docs: https://48e6-202-58-16-250.ngrok-free.app/docs#/ocr/extract_api_ocr_extract_post
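For readers who want to try the endpoint from a script, here is a minimal Python sketch of how an image upload request might be built. Only the base URL comes from the post. The path /api/ocr/extract is a guess inferred from the operation ID in the docs link (extract_api_ocr_extract_post), and the upload field name "file" is purely an assumption, so check both against the API docs before relying on them. The tunnel address is temporary and may no longer resolve.

```python
import mimetypes
import urllib.request
import uuid

# Base URL copied from the post; ngrok tunnels are temporary.
BASE_URL = "https://48e6-202-58-16-250.ngrok-free.app"
# ASSUMPTION: path inferred from the operation ID "extract_api_ocr_extract_post".
EXTRACT_PATH = "/api/ocr/extract"
# ASSUMPTION: name of the multipart upload field; confirm it in the API docs.
FILE_FIELD = "file"


def build_extract_request(image_bytes, filename, base_url=BASE_URL):
    """Build (but do not send) a multipart/form-data POST carrying one image."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{FILE_FIELD}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        base_url + EXTRACT_PATH,
        data=head + image_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

To actually send the request, pass the returned object to urllib.request.urlopen. The post does not describe the response format, so inspect the raw response body first.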

07/08/2024

The Techo Startup Center OCR project is designed to strengthen the KYC (Know Your Customer) process for customer documents in Khmer and to increase productivity for general Khmer users. Three research articles related to this project, published by its researchers, are attached in the image below. According to a post on the official Techo Startup Center Facebook page, the Character Error Rate is less than 1%.
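For context on that figure, Character Error Rate (CER) is commonly defined as the edit distance between the OCR output and the reference text, divided by the number of characters in the reference. The Python sketch below follows that common definition. It is not Techo Startup Center's evaluation code, and it counts Unicode code points, which is an assumption: Khmer builds one visible cluster from several code points, so an evaluation that counts clusters would give different numbers.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / reference length, counted in code points (assumption)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition, one wrong character in a 200-character reference gives a CER of 0.005, or 0.5%, which would fall under the 1% mark.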

References: Year in Review 2023, Techo Startup AI Website

Get the latest updates by joining our Telegram channel: https://t.me/khmerocr

05/08/2024

OCR for the Khmer language still needs more work to improve performance and reduce the error rate as much as possible.

OCR - by Techo Startup Center: https://ai.techostartup.center
Tesseract OCR - by Google: https://github.com/tesseract-ocr/tesseract
Khmer OCR - by Institute of Digital Research and Innovation: https://ocr.idri.edu.kh

01/08/2024

Optical Character Recognition (OCR) has evolved from early twentieth-century devices, such as the Optophone and Gustav Tauschek's reading machine of the 1920s, into today's advanced deep learning systems. Initially used for basic text recognition, OCR now integrates with AI to process diverse languages, handwriting, and complex layouts with high accuracy.

31/07/2024

Welcome to the official Khmer OCR page 🤗
Khmer OCR is a technology that converts images into Khmer text.

Address

Phnom Penh
12000
