Algorithms for certain classes of Tamil spelling correction
Muthu Annamalai ezhillang@gmail.com
T. Shrinivasan tshrinivasan@gmail.com
牀 牆牀園牀牀逗巌 牀む牀む逗萎牀む牀む逗牀橿
牀む牀 牆 牀牆牀巌 - hunspell
Used in Google Docs
Text Processing Library in Python
Free/Open Source Software
牀牆牀む逗牆牀牀逗巌 牀む牀む逗萎牀む牀む
牀 牆牀園牀牀逗巌 牀む 牀む逗萎牀む牀む
Edit Distance Search Algorithm
verb declension,
affix removal,
morpheme extraction,
applying corrections to the root word,
and re-synthesizing the corrected word.
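The search step can be illustrated with a minimal sketch, assuming the standard Levenshtein distance and a small in-memory word list. The names `edit_distance` and `nearest_words` are illustrative and not taken from any library. For Tamil input the arguments would be lists of letters rather than raw strings, since one Tamil letter can span more than one code point.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of letters."""
    prev = list(range(len(b) + 1))          # distances from empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def nearest_words(word, dictionary, max_distance=2):
    """Return dictionary words within max_distance of word, closest first."""
    scored = sorted((edit_distance(word, w), w) for w in dictionary)
    return [w for d, w in scored if d <= max_distance]
```

A linear scan like this is only practical for small word lists; it is shown here to make the distance computation concrete.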
Norvig Algorithm
Candidate corrections are generated from the input word with one letter:
1. Deleted
2. Substituted with an alternate letter
3. Inserted at any position
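The three edit operations listed above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function name `edits1` follows Norvig's published spell corrector, which also generates transpositions; those are left out here because the slide lists only three operations. The `alphabet` parameter is an assumption and would be the set of Tamil letters in practice.

```python
def edits1(letters, alphabet):
    """All letter sequences one delete, substitute, or insert away."""
    letters = tuple(letters)
    # Every way of cutting the word into a left and a right part.
    splits = [(letters[:i], letters[i:]) for i in range(len(letters) + 1)]
    deletes = {left + right[1:] for left, right in splits if right}
    substitutes = {left + (c,) + right[1:]
                   for left, right in splits if right
                   for c in alphabet}
    inserts = {left + (c,) + right
               for left, right in splits
               for c in alphabet}
    return deletes | substitutes | inserts
```

Candidates are returned as tuples of letters so that multi-code-point letters stay intact; joining a tuple with `"".join(...)` gives back a string.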
Driver Algorithm for Spell Checker
This is a non-word error-correcting algorithm,
i.e. it corrects only words that are not in the dictionary.
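A minimal sketch of such a driver, assuming the dictionary is a set of known words and that candidate generation is passed in as a function. The names `check_word` and `generate_candidates` are hypothetical.

```python
def check_word(word, dictionary, generate_candidates):
    """Non-word correction: known words pass, unknown words get suggestions."""
    if word in dictionary:
        return True, []                      # a dictionary word is never altered
    # Keep only the candidates that are themselves dictionary words.
    suggestions = sorted(set(generate_candidates(word)) & set(dictionary))
    return False, suggestions
```

Because the check is a plain dictionary lookup, a correctly spelled word used in the wrong context passes unchanged; that is the limitation of a non-word corrector.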
牀む牀逗巌逗迦 牀牀橿牀 牀牀牀牆牀牀牆牀迦 牀牀巌牀む牀む牀牀橿
牀, , 牀 牀朽萎逗巌 .
牀, 牀 牀朽萎逗巌 .
牀, 牀, 牀 牀朽萎逗巌 .
牀, 牀, 牀朽萎逗巌 .
牀 牆牀迦牀む逗萎牀む牀む逗牀逗迦 牀牀牀逗牀 牀牀逗萎迦
牀 牀牆牀牀朽牀朽牆牀牀逗牀む 牀牀む牀朽牀
牀牀橿牀橿牀牆 牀牀牆牀牆牀牆牀牀牀牆牀 牀 牆牀迦 牀萎逗牆牀牀む, 牀牀迦牀迦む
牀む朽園牀牀む ?
牀む朽園牀 牀 牆牀迦 牀牀牆牀 牀牀牆 牀む牀む逗迦 牀牀む牆 牀牆牀園牀園牆牀牀橿
牀牀牆牀牀牀牆牀 ?
牀萎逗牆牀 牀 牆牀園牀牀橿, 牀牀む牀朽む 牀朽牀朽萎 牀牀牆牀む牀, 牀牆牀牀萎牀牆牀牀
牀牀園牀園牀牆 牆牀牆牀む 牀牀逗萎逗牆牀牀牆牀牀牆牀 牀 牆牀園牀牀橿 牀牀巌牀牀む牀む牀牆
牀萎 牀萎 牀牀逗牆牀牀牀萎牀む逗牀逗迦 牀牆牀牀迦牀牆.
牀牀橿牀橿牆, ,牀 牀牆
Algorithm for conjoined word recognition
牀牀む牆牀園迦牀牆牀園牀園 = 牀牀む牆牀園迦 + 牀牆牀園牀園
牀牆牀萎逗牀巌牀牆牀む =
[['牀牆', '牀牀萎逗牀巌牀牆牀む'],
['牀牆牀萎', '牀牀牀巌牀牆牀む'],
['牀牆牀萎逗牆', '牀牀巌牀牆牀む'],
['牀牆牀萎逗牀巌牀牆牀む', '牀']]
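The split enumeration shown above can be sketched as follows, assuming the word is already tokenized into letters and the dictionary is a set of strings. The function names are illustrative. This sketch ignores any letter changes at the junction between the two parts, which a full Tamil implementation would need to handle.

```python
def two_part_splits(letters):
    """Every way to cut a word into a non-empty head and a non-empty tail."""
    letters = list(letters)
    return [("".join(letters[:i]), "".join(letters[i:]))
            for i in range(1, len(letters))]


def split_conjoined(letters, dictionary):
    """Keep only the splits where both parts are dictionary words."""
    return [(head, tail) for head, tail in two_part_splits(letters)
            if head in dictionary and tail in dictionary]
```

For example, with an English stand-in dictionary `{"note", "book", "no"}`, `split_conjoined("notebook", ...)` keeps only the split `("note", "book")`.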
Typographical error correction in Tamil
牀む牀逗巌99 牀牀迦牀迦む 牀牀牆 牀迦 牀朽牀牆牀牆牀 牀朽逗巌 牀牀橿逗牆 牀朽巌
牀牀橿牀橿牀牆 牀 牀牆牀巌牀牀牀逗迦 牀牀巌牀む牀む牀牆牀牀逗巌 牀牀橿
Recurrent Neural Networks (RNN)
Long Short-term Memory (LSTM)
Word Sense Disambiguation
牀牀牆牀朽牀 牀牀逗朽牆
牀牀牆牀朽牀 牀朽牆
Optimizing a Spell Checker Implementation
Algorithm for fast Unicode letter detection
is_tamil_unicode_predicate = lambda x: chr(2946) <= x <= chr(3066)
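The predicate above tests a single character against the range U+0B82 to U+0BFA (decimal 2946 to 3066). A minimal sketch of extending it to whole words follows; the names `is_tamil_char` and `is_tamil_word` are illustrative.

```python
TAMIL_FIRST = "\u0b82"   # same as chr(2946)
TAMIL_LAST = "\u0bfa"    # same as chr(3066)


def is_tamil_char(ch):
    """True if a single character lies in the Tamil range used above."""
    return TAMIL_FIRST <= ch <= TAMIL_LAST


def is_tamil_word(word):
    """True if the word is non-empty and every code point is Tamil."""
    return bool(word) and all(is_tamil_char(ch) for ch in word)
```

Two comparisons per character avoid a set or table lookup, which is what makes the range test fast.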
Caching results
Redis / distributed DB
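Caching can be sketched in-process with Python's standard `functools.lru_cache`. This is an assumption for illustration only: a Redis or other distributed store would replace the in-memory cache when results must be shared across processes or machines. The helper name `make_cached` is hypothetical.

```python
from functools import lru_cache


def make_cached(suggest, maxsize=100_000):
    """Wrap a suggestion function so repeated words are answered from memory.

    The wrapped function must take a hashable argument (a string is fine)
    and should return an immutable value such as a tuple, because the
    same cached object is handed back on every hit.
    """
    return lru_cache(maxsize=maxsize)(suggest)
```

The wrapper exposes `cache_info()`, so hit and miss counts can be inspected to judge whether the cache size is adequate for a given text.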
 牀牀巌牀 牀牀迦 牀牆牀園牀園
 牀 牆牀迦 牀朽牀む牀迦 - 牀牆牀園牀牆牀牀牀巌牀む牀む
 牀む牀逗巌 牀牀牆牀牀橿
 Tamil Stemmer
 牀む牀逗巌 牀牆牀む逗牆牀牀逗巌 牀む逗萎牀む牀む
 牀牀牀萎牀牆牀牀迦 牀牀牆牀牀朽巌牀
 牀 牆牀迦 牀牀牆牀牀
 牀む牀逗巌 牀牀巌牀む牀む牀牆 牀牀逗巌 牀む逗萎牀む牀む
 牀牀園牀牆牀牆牀園 (牀牆牀牀逗朽牀牆牀牆) 牀牆牀園牀園
 牀む牀逗巌 N-牀牀逗萎牀牆
 Tamil game Minnal
 牀む牀逗巌 牀牆牀牀逗牀逗萎牀牆
 牀む牀逗巌 牀牀牆牀牀逗萎牀牆
 牀 牆牀迦牀巌牀 牀む逗萎牀牆牀牀 牀牀巌牀む
 牀む牀逗巌 牀牀巌牀 牀牆牀萎牀牆牀牀
 Tamil Morse code
