How to extract acronyms from source text? Thread poster: Erik Freitag
| Erik Freitag Germany Local time: 01:00 Member (2006) Dutch to German + ...
Dear colleagues, This may not be the best forum for my question, but here goes: I'm looking for a convenient way to extract all acronyms/abbreviations from the source text, by which I basically (as a working definition) mean words that are not found in standard monolingual dictionaries and are written in capitals. Ideally, I'd like to have them exported as a list, possibly with the whole sentence they appear in for context. If anyone knows a way to achieve this with SDL Trados Studio 2017, TermExtract, or third party software, I'd be grateful for a hint. Many thanks in advance, kind regards, Erik
| Adam Łobatiuk Poland Local time: 01:00 Member (2009) English to Polish + ...
For a rough list of acronyms in capital letters, you can copy and paste the text into MS Word, search with wildcards for <[A-Z]{2;}> (see note below) and replace with just bold formatting, and then search (without wildcards) for non-bold formatting and replace with ^p. That should leave you with just the words of 2 or more letters in ALL CAPS, separated by line breaks. Note: in the wildcard expression, you may need to use {2,} instead of {2;}, depending on the list separator in your system settings.
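If working outside Word is an option, the same idea can be sketched in a few lines of Python. This is only a rough illustration, not a Trados or TermExtract feature: the function name `extract_acronyms`, the `known_words` exclusion set and the naive sentence splitter are all assumptions made for the example. It looks for whole words of two or more ASCII capitals (the equivalent of the wildcard pattern above) and keeps the sentence each one appears in, as asked for in the original question.

```python
import re

# Whole words made of two or more ASCII capitals, mirroring the Word
# wildcard pattern <[A-Z]{2,}>. Accented capitals (e.g. umlauts) are not covered.
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")

# Naive sentence splitter (an assumption): break after ., ! or ? plus whitespace.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def extract_acronyms(text, known_words=frozenset()):
    """Map each all-caps word to the list of sentences it appears in.

    known_words is a hypothetical stand-in for a dictionary lookup:
    any all-caps word in that set is skipped.
    """
    found = {}
    for sentence in SENTENCE_END.split(text):
        sentence = sentence.strip()
        for match in ACRONYM.finditer(sentence):
            word = match.group()
            if word in known_words:
                continue
            contexts = found.setdefault(word, [])
            if sentence not in contexts:
                contexts.append(sentence)
    return found


if __name__ == "__main__":
    sample = "The TM was exported as TMX. Send the TMX file to the PM. DO NOT reply."
    result = extract_acronyms(sample, known_words={"DO", "NOT"})
    # Print a tab-separated list: acronym, then one sentence of context.
    for acronym in sorted(result):
        for context in result[acronym]:
            print(f"{acronym}\t{context}")
```

The tab-separated output can be pasted into a spreadsheet or a glossary import file. Ordinary words typed in capitals (headings, warnings) will still show up unless they are listed in `known_words`, so the result is a rough list to review by hand, just like the Word approach.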
[Edited at 2018-02-22 20:00 GMT]