I've been in Copenhagen, attending the CLARA summer school in semantic annotation. The course was great. The city was amazing.
Words to Bytes
computational linguistics
Farsi morphological analyzer
http://pars-morph.appspot.com/
Here is the link to a morphological analyzer for Farsi. For now, it handles only inflection, not derivation.
The guy who developed this and deployed it on Google App Engine is Vahid Mavaji.
one percent more
"Many areas in NLP are like this. You can get 92% accuracy in a few hours of work, and then you can get 93% after a week of work, and then you can write a whole PhD thesis about how you got 94% accuracy."
Farsi POS tagger
Finally, I have a Farsi POS tagger. I trained unigram, bigram and TnT taggers on 2 million tagged words of the BijanKhan corpus. Check the funny part: I trained them on all my data, without splitting it into training and test sets, so now I must re-train them. I just looked at the results, and they were promising. After re-training and evaluating, I will post the results. If my classmate Vahid helps, I would deploy it on a web server.
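The fix for the "funny part" is to hold out some sentences before training and score the tagger only on those. Here is a minimal sketch of that idea in plain Python. It is not the NLTK code I actually ran: the toy sentences, the tag names and the helper functions `split_sentences`, `train_unigram` and `accuracy` are all made up for illustration, and the unigram tagger is just a most-frequent-tag lookup.

```python
import random
from collections import Counter, defaultdict


def split_sentences(tagged_sents, test_fraction=0.1, seed=0):
    """Shuffle the sentences and return (train, test) with no overlap."""
    sents = list(tagged_sents)
    random.Random(seed).shuffle(sents)
    n_test = max(1, int(len(sents) * test_fraction))
    return sents[n_test:], sents[:n_test]


def train_unigram(train_sents):
    """Map each word to its most frequent tag; also return a fallback tag."""
    counts = defaultdict(Counter)
    tag_totals = Counter()
    for sent in train_sents:
        for word, tag in sent:
            counts[word][tag] += 1
            tag_totals[tag] += 1
    default_tag = tag_totals.most_common(1)[0][0]
    model = {word: c.most_common(1)[0][0] for word, c in counts.items()}
    return model, default_tag


def accuracy(model, default_tag, sents):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sent in sents:
        for word, tag in sent:
            correct += model.get(word, default_tag) == tag
            total += 1
    return correct / total if total else 0.0


# Hypothetical toy corpus standing in for the real tagged data.
NOUNS = ["cat", "dog", "bird", "fish", "horse"]
VERBS = ["sleeps", "runs"]
TOY = [[("the", "DET"), (n, "N"), (v, "V")] for n in NOUNS for v in VERBS]

train, test = split_sentences(TOY, test_fraction=0.2)
model, default_tag = train_unigram(train)
print("accuracy on training data:", accuracy(model, default_tag, train))
print("accuracy on held-out data:", accuracy(model, default_tag, test))
```

The number measured on the training sentences is the flattering one I looked at first; only the held-out number says anything about unseen text. The same split should work with real taggers in place of the toy lookup.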
NLTK 3.0
The good news is that NLTK 3.0 should be available by mid-2011. No more Unicode problems: way to go for Farsi and the NLTK modules.
New Year
Happy blah blah... Does anyone have a clue about sense tagging a corpus? Before the lung cancer(!) takes me over, I should know this. There's a kind of romance in the air that I can use. I can picture how my useless life would end: I'd like a Chekhovian end with long, dry coughs. Quitting smoking could be a New Year's resolution, but no way.
Tiger and hyponymy extraction
The curious case of my late studies could be a matter of laughter. At my age I should be lecturing the shi*. Anyway, I finally wrote my very first academic paper! Yeyy!!
"Tiger, tiger, burning bright, in the forests of the night"
Abstract
Pattern recognition is one of the methods for extracting knowledge and discovering relations between linguistic concepts. To extract conceptual knowledge from linguistic data, one therefore has to design and build semantic patterns. This paper reviews existing pattern-based methods and introduces several lexico-syntactic patterns for detecting the hyponymy relation. The data needed to test the patterns was selected from Persian Wikipedia. This choice was made because Wikipedia, as a structured text, is a good source for extracting semantic relations. The patterns introduced in this paper were tested on texts from Wikipedia, and the precision of each pattern was evaluated.
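To give a feel for what a lexico-syntactic pattern looks like, here is a small sketch. It does not use the Persian patterns from the paper, which are not listed in this post. As a stand-in it uses the well-known English "X such as A, B and C" pattern, and the function name `extract_hyponyms` and the example sentence are made up for illustration.

```python
import re

# Stand-in English pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
# Assumption: every term is a single word; real noun phrases need a chunker.
SEPARATOR = r", (?:and |or )?| and | or "
SUCH_AS = re.compile(
    r"(?P<hyper>\w+),? such as (?P<hypos>\w+(?:(?:" + SEPARATOR + r")\w+)*)"
)


def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group("hyper")
        for hyponym in re.split(SEPARATOR, match.group("hypos")):
            pairs.append((hyponym, hypernym))
    return pairs


sentence = "The zoo keeps cats such as tigers, lions and leopards."
for hyponym, hypernym in extract_hyponyms(sentence):
    print(hyponym, "is a kind of", hypernym)
```

A pattern this naive over-generates: any word after "and" gets swallowed into the list, even when a new clause starts there. That is why the precision of each pattern has to be measured separately.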
God knows why?
Blogspot is blocked in Iran! God knows why. Either I go on writing here, or I migrate to a new blog server. I can post, but nobody in Iran can read it. I will be back with "hyponymy extraction"!
Denmark of our heart
There was something rotten in the state of our heart. Our young Hamlet was trying to cheer everybody up ("Dudes, I take back my question!"), but, coffee-sipping and chain-smoking, we were already shocked.