Issue with inflections in glossary and memory
Thread poster: nivaca
Colombia
Local time: 02:20
Mar 23, 2015

The glossary and translation memory don't work well with inflected languages such as Latin. For instance, if the source contains the word "dicendum" (the gerundive of "dico" ["I say"]) and I add it to the glossary as it appears, the glossary will not recognise other inflections of the word: "dicitur", "dicamus", etc.

It would be quite useful if OmegaT offered some way of dealing with inflections in the glossary. One simple way might be to allow alternatives ("or") or a regex in glossary entries. E.g.: "dic[endum, atum]".
Is this possible as of today?
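
For illustration only, here is a minimal Python sketch of what such an entry could mean: a stem followed by a list of alternative endings, compiled into a regular expression. This is not an existing OmegaT feature; the function name `entry_to_regex` and the way the bracket notation is parsed are assumptions made for the example.

```python
import re

def entry_to_regex(entry: str) -> re.Pattern:
    """Compile a hypothetical glossary entry such as 'dic[endum, atum]'
    into a regex matching the stem followed by any listed ending."""
    m = re.fullmatch(r"(\w+)\[([^\]]*)\]", entry)
    if m is None:
        # No bracket list: match the entry literally as a whole word.
        return re.compile(r"\b" + re.escape(entry) + r"\b", re.IGNORECASE)
    stem, endings = m.group(1), m.group(2)
    # Split the comma-separated endings and escape each one.
    alternatives = [re.escape(e.strip()) for e in endings.split(",") if e.strip()]
    pattern = r"\b" + re.escape(stem) + "(?:" + "|".join(alternatives) + r")\b"
    return re.compile(pattern, re.IGNORECASE)

pattern = entry_to_regex("dic[endum, atum, itur, amus]")
text = "Ad primum dicendum quod, sicut dicitur in libro"
matches = pattern.findall(text)  # whole-word matches found in the segment
```

The drawback of this approach is that every ending has to be listed by hand for every entry, which is why a stemming tokenizer is usually the more practical route.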


 
Susan Welsh
United States
Local time: 03:20
Russian to English
It does Mar 23, 2015

The tokenizer function in OmegaT does that. Check the users' manual. How well it works for Latin I can't say, but it works for Russian and German.

Susan


 
nivaca
Colombia
Local time: 02:20
Thread starter
Not for Latin. Mar 24, 2015

But there is no tokenizer for Latin, I'm afraid.

 
Didier Briel
France
Local time: 09:20
English to French
It relies on the Hunspell dictionary Mar 24, 2015

nivaca wrote:

But there is no tokenizer for Latin, I'm afraid.

For languages not covered by Lucene, the tokenizer is provided by Hunspell (you have to install the Hunspell dictionary corresponding to the source language).

I tried and I couldn't get it to work. That might be because the Hunspell dictionary I installed doesn't contain the necessary information, or because it does accept some stemming, but not the one I tried.
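
To make the mechanism concrete: a stemming tokenizer reduces each inflected form to a common stem, and glossary terms are then matched on stems rather than on surface forms. The Python sketch below shows only that idea, using a toy suffix table. It does not call Hunspell or OmegaT, and the suffix list and the names `toy_stem` and `glossary_hits` are assumptions for illustration.

```python
# Toy suffix list for a few forms of Latin "dico"; a real Hunspell
# dictionary would derive this from its affix (.aff) rules instead.
SUFFIXES = ["endum", "itur", "amus", "o"]

def toy_stem(word: str) -> str:
    """Lowercase the word and strip the longest known suffix, if any fits."""
    w = word.lower()
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        # Require at least two characters to remain as the stem.
        if w.endswith(suffix) and len(w) > len(suffix) + 1:
            return w[: -len(suffix)]
    return w

def glossary_hits(segment: str, glossary: dict) -> dict:
    """Map each word in the segment to the glossary entry sharing its stem."""
    by_stem = {toy_stem(term): (term, target) for term, target in glossary.items()}
    hits = {}
    for word in segment.split():
        word = word.strip('.,;:"')
        entry = by_stem.get(toy_stem(word))
        if entry:
            hits[word] = entry
    return hits

glossary = {"dicendum": "it must be said"}
hits = glossary_hits("Sicut dicitur, ita dicamus.", glossary)
```

If the installed dictionary carries no usable affix information for a word, the stem step falls back to the surface form and only exact matches are found, which would be consistent with the behaviour described above.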

I tried these two dictionaries:
http://rpmfind.net/linux/rpm2html/search.php?query=hunspell-la
http://extensions.openoffice.org/en/project/latin-spelling-and-hyphenation-dictionaries

You can find details on Hunspell stemming here:
http://manpages.ubuntu.com/manpages/dapper/man4/hunspell.4.html

Didier


 
nivaca
Colombia
Local time: 02:20
Thread starter
Worked in Linux Mar 24, 2015

Didier,

Your recommendation of using Hunspell plus a Latin dictionary worked fine on Linux (Xubuntu 14.10). The glossary seems to work correctly now with inflections.

However, it doesn't work for me on Mac OS X. (I installed Hunspell with Brew, and used the very same dictionary.) I suppose there's still some fiddling to do in order to make it work.

Thanks.

Nicolas


 

