How to extract acronyms from source text? Thread poster: Erik Freitag
Erik Freitag Germany Local time: 10:33 Member (2006) Dutch to German + ...
Dear colleagues, This may not be the best forum for my question, but here goes: I'm looking for a convenient way to extract all acronyms/abbreviations from the source text, by which I basically (as a working definition) mean words that are not found in standard monolingual dictionaries and are written in capitals. Ideally, I'd like to have them exported as a list, possibly with the whole sentence they appear in for context. If anyone knows a way to achieve this with SDL Trados Studio 2017, TermExtract, or third-party software, I'd be grateful for a hint. Many thanks in advance, kind regards, Erik

Adam Łobatiuk Poland Local time: 10:33 Member (2009) English to Polish + ...
For a rough list of acronyms with capital letters, you can copy and paste the text into MS Word, search with wildcards for <[A-Z]{2;}> (see note below) and replace with just bold formatting, and then search (without wildcards) for non-bold formatting and replace with ^p. That should leave you with just the words of two or more letters in ALL CAPS, separated by line breaks. In the regular expression, you may need to use {2,} instead of {2;} depending on your system settings.
[Edited at 2018-02-22 20:00 GMT]
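If a script is an option, the same idea can be sketched in Python, with the surrounding sentence kept for context as asked in the original question. This is only a rough sketch under a few assumptions: the source text is available as plain text, sentences are split naively on ., ! or ? followed by whitespace, and the function name `extract_acronyms` is made up for illustration. It does not check words against a dictionary, and like the Word pattern it only sees the plain letters A-Z, so something like TÜV would be missed.

```python
import re

# Two or more consecutive capital letters as a whole word (ASCII A-Z only,
# mirroring the Word wildcard pattern <[A-Z]{2,}>).
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")

# Naive sentence splitter (an assumption): break after ., ! or ? plus whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def extract_acronyms(text):
    """Return a dict mapping each all-caps word to the sentences it occurs in."""
    found = {}
    for sentence in SENTENCE_END.split(text.strip()):
        for match in ACRONYM.findall(sentence):
            sentences = found.setdefault(match, [])
            if sentence not in sentences:  # list each sentence once per word
                sentences.append(sentence)
    return found


if __name__ == "__main__":
    sample = "The CAT tool exports TMX files. Send the TMX file to the PM today."
    # Print one tab-separated line per (word, sentence) pair.
    for acronym, sentences in sorted(extract_acronyms(sample).items()):
        for sentence in sentences:
            print(f"{acronym}\t{sentence}")
```

The tab-separated output can be pasted into a spreadsheet for review. Abbreviations that end in a full stop will confuse the sentence splitter, so the result is a rough list to check by hand, just like the Word approach.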