Slideshows by User: dotronghop

NomNaOCR The First Dataset for Optical Character Recognition on Han-Nom Script.pdf
/slideshow/nomnaocr-the-first-dataset-for-optical-character-recognition-on-hannom-scriptpdf/257143976
Mon, 03 Apr 2023 19:15:14 GMT
In this article, we introduce NomNaOCR, a dataset for the old Hán-Nôm script built from three valuable historical works of Vietnam: Lục Vân Tiên, Truyện Kiều, and Đại Việt Sử Ký Toàn Thư. We collected 2,953 handwritten pages from the Vietnamese Nôm Preservation Foundation, then analyzed them and semi-annotated their bounding boxes to generate an additional 38,318 patches containing text, paired with the corresponding Hán-Nôm strings in digital form. This makes NomNaOCR currently the largest dataset for the Hán-Nôm script in Vietnam, serving two main problems in Optical Character Recognition: Text Detection and Text Recognition. Unlike most previous works, which operate on individual characters, our implementations all work at the sequence level, which not only saves annotation cost but also retains the context within each sequence. For baseline results, we experimented on the validation set of NomNaOCR. Using the DBNet model for Text Detection, we reached an F1-score of up to 99.65%. For Text Recognition, we used the CRNN model and achieved an accuracy of 29.41% at the sequence level and 84.73% at the character level.

NomNaOCR The First Dataset for Optical Character Recognition on Han-Nom Script.pdf from Đỗ Hợp

Kernel fisher discriminant
/slideshow/kernel-fisher-discriminant/4268547
Mon, 24 May 2010 09:52:25 GMT

Kernel fisher discriminant from Đỗ Hợp

Cyclic code
/slideshow/m-vng-cyclic-code/4262414
Mon, 24 May 2010 00:41:42 GMT

Cyclic code from Đỗ Hợp
