LaTeX + SDL Trados 2011 - how to?
Thread poster: Vitals
Vitals
Vitals  Identity Verified
Lithuania
Local time: 11:03
English to Lithuanian
+ ...
Mar 20, 2015

Dear all,

I was offered some LaTeX files to translate in SDL Trados 2011.

How do I configure Trados for LaTeX tags, etc.?

Do any of you have experience with this, and could you provide instructions?

Would be very grateful for your help!

Sincerely,
Vitaly


 
Erik Freitag
Erik Freitag  Identity Verified
Germany
Local time: 10:03
Member (2006)
Dutch to German
+ ...
No easy answer Mar 20, 2015

Dear Vitaly,

I do sometimes get LaTeX files to translate. It looks like there is still no filter available, and it seems that creating one will be worth your while only if you get a whole lot of these files to translate.

Others have invested quite some time before ultimately failing (look here: http://www.proz.com/forum/sdl_trados_support/238618-regular_expression_delimited_text_filter_for_latex.html)

You may want to search for a product called "Tortoise Tagger" - I have no personal experience with it, though.

My solution ultimately is to translate LaTeX files as plain text and just leave the tags untouched. No problem if you know what you're doing.

Sorry that I can't be of any real help. If you get a working filter, please let me know - it's been quite some time since I looked into this.

Kind regards,
Erik


 
RWS Community
RWS Community
United Kingdom
Local time: 10:03
English
There is an article here... Mar 20, 2015

... by Richard Puschmann that might be useful: http://rpuschmann.jimdo.com/2015/03/20/translate-latex-documents-in-studio-sure/

Regards

Paul
SDL Community Support


 
Erik Freitag
Erik Freitag  Identity Verified
Germany
Local time: 10:03
Member (2006)
Dutch to German
+ ...
Too easy? Mar 20, 2015

Dear Paul,

Thanks for the link - looks promising, but it looks like you'd have to repeat steps 8ff. for every LaTeX command (of which there is a large number if you take the standard commands, and an infinite number if you want to include user-defined commands or packages, which is quite a standard procedure in LaTeX)?

Regards,
Erik


 
Meta Arkadia
Meta Arkadia
Local time: 15:03
English to Indonesian
+ ...
Theoretical workflow Mar 22, 2015

[still downloading tons of LaTeX-related stuff, fascinating stuff]

I think the following could work:

- Open the TeX file in an editor that can handle it and can print the resulting text as PDF (requires a plug-in). That PDF file would contain the visible text (that needs to be translated) plus images, formulas, and things, but no code
- Extract the text from the PDF, save as plain TXT, delete any (references to) images/formulas
- Translate in your favourite CAT tool, and save the translation as a TMX file (or other suitable memory format)
- Start a new project with the original .tex file as a plain text file*
- Run "Insert perfect matches from memory"**
- Populate empty source segments**
- Export.

* You may want to add " as a segment delimiter.
**I take it the CAT tool offers those features, don't know about Trados.

Cheers,

Hans


[Edited at 2015-03-22 11:37 GMT]

[Edited at 2015-03-22 12:33 GMT]


 
Meta Arkadia
Meta Arkadia
Local time: 15:03
English to Indonesian
+ ...
Nice try... Mar 22, 2015

... but the procedure above can - and therefore will - go wrong if there's any extra formatting within a segment.

An example from the Wiki:



Since I don't have the plug-in yet (still downloading the el-cheapo way, at night only), I can't produce the PDF yet, but it's there on the very same Wiki page. Because of formatting, simply extracting the text from the PDF is going to produce misery. If there's formatting within a sentence/segment, like LaTeX and TeX above, the text will have to be copied piece by piece, copying things like LaTeX and TeX as separate segments.

H

[Edited at 2015-03-22 12:56 GMT]


 
RWS Community
RWS Community
United Kingdom
Local time: 10:03
English
You're right... Mar 23, 2015

Erik Freitag wrote:

Thanks for the link - looks promising, but it looks like you'd have to repeat steps 8ff. for every LaTeX command (of which there is a large number if you take the standard commands, and an infinite number if you want to include user-defined commands or packages, which is quite a standard procedure in LaTeX)?



... it is tricky because of the number of commands and particularly the user defined stuff. I couldn't see an obvious place where all this information is available in an easily digestible format but for the specific files I have seen from Vitaly (he emailed me off forum) I resolved it like this... for interest.

I extracted all new lines that started without any code as these seemed to be translatable text in these files and I added the specific lines starting with code that also seemed to have translatable text. So this gives me an import of all the stuff that should be handled plus any inline tags. I did this with a single expression that can be expanded as necessary in the document structure section of the regex filetype:

Opening Pattern
^(?!\w|\\chapter|\\item).*|^

Closing Pattern
$

So here you can see I added \chapter and \item as these also seemed to contain translatable text.
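To see what this pattern pair does, here is a rough Python simulation. It assumes (my reading of the description above, not confirmed Studio behaviour) that the filter extracts whatever lies between the end of the opening match and the end of the line. The helper name `extract` and the sample lines are made up for illustration. Studio uses the .NET regex engine, but these particular constructs should behave the same way in Python.

```python
import re

# Opening pattern from the post; the closing pattern is simply end of line ($).
OPENING = re.compile(r"^(?!\w|\\chapter|\\item).*|^")

def extract(line):
    """Return the part of the line left over after the opening match.

    Lines that start with other code are consumed entirely by `.*`,
    so nothing is left; for the rest only the empty `^` matches.
    """
    match = OPENING.match(line)
    return line[match.end():]

# Made-up sample lines, not taken from the actual files.
sample = [
    r"\documentclass{book}",
    r"\chapter{Introduction}",
    "This sentence is plain text.",
    r"\item First point",
    r"\begin{itemize}",
]

for line in sample:
    print(repr(line), "->", repr(extract(line)))
```

Lines beginning with a word character, `\chapter` or `\item` come through whole; every other line is swallowed by the opening pattern and contributes nothing.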

Then I used the inline tags to protect code in the extracted lines. I needed three placeholder rules for these files:

\\chapter{
}$
\\item

This does a neat job on the three files I saw. But you're right that this won't do them all. I also think that even if we created a filetype to handle all the specified tags (whatever they are?) then you would still need the flexibility to add your own custom tags as well. So tricky as you noted.

But perhaps this approach would allow you to handle most of what you could identify on a scan through the files and then you could always handle any exceptions in exactly the same way you do now, but making sure you added them to your filetype so that over time your filetype became more and more capable, encompassing more variations.

The best solution of course would be to see a thorough specification and then create a proper filetype for these files. Interesting problem though.

Regards

Paul
SDL Community Support


 
RWS Community
RWS Community
United Kingdom
Local time: 10:03
English
Actually on reflection... Mar 23, 2015

... the inline tags could be better handled like this, I think, and then you "might" get away with only having to edit the document structure line by adding any new elements that contain translatable text:

\\\w+
}$
{


All good fun and depends on what the names of these elements look like!
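As a rough illustration of what these three rules would protect, the Python sketch below combines them into one alternation and splits a segment into tags and text. The function name `split_tags` and the sample segments are hypothetical, the braces are escaped for Python, and how Studio really applies the rules may differ.

```python
import re

# The three inline-tag rules from the post, tried in this order.
INLINE = re.compile(r"\\\w+|\}$|\{")

def split_tags(segment):
    """Split a segment into ('tag', ...) and ('text', ...) pieces."""
    pieces, pos = [], 0
    for m in INLINE.finditer(segment):
        if m.start() > pos:
            pieces.append(("text", segment[pos:m.start()]))
        pieces.append(("tag", m.group()))
        pos = m.end()
    if pos < len(segment):
        pieces.append(("text", segment[pos:]))
    return pieces

print(split_tags(r"\chapter{Getting started}"))
print(split_tags(r"\item See \textbf{this} note"))
```

In the second segment the closing brace after "this" stays in the text, because `}$` only matches a brace at the end of the line.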

Regards

Paul
SDL Community Support


 
Dan Lucas
Dan Lucas  Identity Verified
United Kingdom
Local time: 09:03
Member (2014)
Japanese to English
I recommend the simple approach Mar 23, 2015

SDL Community wrote:
All good fun and depends on what the names of these elements look like!

I was mildly obsessed with TeX/LaTeX back in the late 1980s as there wasn't much else out there that would do "real" typesetting back then. I still have the TeX and Metafont books hanging around. Brings it all flooding back! It is an elegant and fascinating system and still widely used in academia.

Having said that, I'd be inclined to suck it all into Studio and just translate the text by overwriting it, as suggested by someone earlier. Once I would have seen this as a challenge, now I just think "Life is too short".

Regards
Dan


 
Vitals
Vitals  Identity Verified
Lithuania
Local time: 11:03
English to Lithuanian
+ ...
TOPIC STARTER
Some help is still needed... Mar 27, 2015

SDL Community wrote:

... the inline tags could be better handled like this, I think, and then you "might" get away with only having to edit the document structure line by adding any new elements that contain translatable text:

\\\w+
}$
{


All good fun and depends on what the names of these elements look like!

Regards

Paul
SDL Community Support


Dear all,

Paul has been extremely helpful with my problem.

I am still struggling with a few files, and did not quite understand this recent post. Paul is away on leave, so he is unavailable.

Can anyone explain how I can implement this:

\\\w+
}$
{


Thank you in advance.

Vitaly


 
RWS Community
RWS Community
United Kingdom
Local time: 10:03
English
I sent you... Mar 27, 2015

... an updated settings file. These last three files you sent me start to make it trickier, but I think I got it. I used these:

Opening Structure
^(?!\w|\\chapter|\\item|\s*\\caption|\s*\w).*|^

Closing Structure
$


Inline tags
\\\w+
}
{
(For this one I also added an "Exclude" in the advanced settings to break the text.)
\\\\

Hope this does it for all your files now.

Regards

Paul
SDL Community Support


 
Nieves Pueyo
Nieves Pueyo
Spain
Local time: 10:03
English to Spanish
+ ...
SDL support on Latex document processing doesn't work for me. Could you help me solve it? Apr 3, 2019

SDL Community wrote:

... by Richard Puschmann that might be useful: http://rpuschmann.jimdo.com/2015/03/20/translate-latex-documents-in-studio-sure/

Regards

Paul
SDL Community Support


Hi,

We are trying to process LaTeX documents within Trados Studio and came across this post solving the issue, but I've tried it and it hasn't worked.

The first problem was that, for these three opening patterns:
\\newstep{
\\note
\\newcommand{\\[a-zA-Z]{0,}}{
Trados Studio "warned" me that you can't use line break characters in opening patterns (but the warning only pops up for these three, I don't know why). Anyway, I deleted the "\\" and then everything seemed fine. I created a project and Trados recognised the file as "translatable", but then I saw in the analysis report that it had only recognised a very small part of the text (around 140 words out of 13000).
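One possible explanation, offered as a guess rather than documented Studio behaviour: all three patterns contain a backslash directly followed by `n` (`\\newstep`, `\\note`, `\\newcommand`), which a simple validator could mistake for the line-break escape `\n`. The Python sketch below uses a hypothetical `looks_like_linebreak` check to illustrate the idea, and shows an equivalent spelling, `[\\]newstep\{`, that avoids the backslash-n sequence while matching the same LaTeX command. Whether Studio accepts that spelling is untested.

```python
import re

def looks_like_linebreak(pattern):
    """Hypothetical naive check: does the pattern text contain backslash + n?"""
    return "\\n" in pattern  # Python literal for the two characters \ and n

patterns = [
    r"\\newstep{",
    r"\\note",
    r"\\newcommand{\\[a-zA-Z]{0,}}{",
    r"\\chapter{",  # made-up contrast case: the command does not start with n
]
for p in patterns:
    print(p, looks_like_linebreak(p))

# Two regexes that match the same LaTeX text, "\newstep{".
original = re.compile(r"\\newstep\{")
rewritten = re.compile(r"[\\]newstep\{")  # no backslash directly before the n

line = r"\newstep{Mix the reagents}"  # made-up sample line
print(bool(original.match(line)), bool(rewritten.match(line)))
```

Deleting the `\\` does silence such a check, but it also changes what the pattern matches, so it is worth trying an equivalent spelling before dropping the backslash.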

Could you please help me solve this issue? This client said he would have plenty of other similar jobs (same format) in the future, and I need to be able to process this type of file in Trados Studio.

Thank you very much in advance,

Nieves


 
Nieves Pueyo
Nieves Pueyo
Spain
Local time: 10:03
English to Spanish
+ ...
SDL support on Latex document processing doesn't work for me. Could you help me solve it? Apr 4, 2019

Hi Paul,

As I said, I tried that solution but it didn't work.

"The first problem was that, for these three opening patterns

\\newstep{
\\note
\\newcommand{\\[a-zA-Z]{0,}}{

Trados Studio "warned" me that you can't use line break characters in opening patterns (but the warning only pops up for these three, I don't know why). Anyway, I deleted the "\\" and then everything seemed fine. I created a project and Trados recognised the file as "translatable", but then I saw in the analysis report that it had only recognised a very small part of the text (around 140 words out of 13000)."

Is there any other way to set that file type structure so that it recognizes all the content and not only a minimum part of it? I could send you one of the files so that you can help me work it out.

Thank you and best regards,

Nieves


 
Filip Látal
Filip Látal
Czech Republic
Local time: 10:03
English to Czech
+ ...
Some more TEX opening patterns to deal with Aug 29, 2023

Hi Paul,
many thanks for helping me get started loading a large LaTeX project into Trados Studio. I'm currently using an opening pattern based on yours:
^(?!\w|\\textsc|\\textbf|\\bfseries|\{?\\large|\\copyright|\\makebox|\\input|\\bibitem|\\emph|\(|\{|\\item|\\verb|\\caption|\\section|\\subsection| *\\item).*|^
I've viewed about 10% of my project and I'm still finding more section tags to add to the above regex.
Just now I came across a section which reads:
··\item \textbf{for cyklus}, jehož syntaxe je
··\begin{verbatim}
··for i=počáteční hodnota:krok:koncová hodnota
······posloupnost příkazů
··end
··\end{verbatim}

Notice the leading spaces (replaced by the · character). Your very useful regex captures the first line beginning with ··\item and ignores the second line beginning with ··\begin, just as intended. But how do I make Trados read lines 3, 4, and 5 that begin with a series of spaces?
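One thing that might be worth trying, borrowed from Paul's pattern of Mar 27, 2015 earlier in this thread: add `\s*\w` to the list inside the negative lookahead, so that lines starting with spaces followed by a word character are also treated as translatable. The Python sketch below uses a shortened version of the opening pattern and the same made-up `extract` logic (text between the end of the opening match and the end of the line); it only demonstrates the regex behaviour and has not been tested in Studio.

```python
import re

# Shortened opening pattern, for illustration only.
WITHOUT = re.compile(r"^(?!\w|\\item| *\\item).*|^")
# Same pattern with the \s*\w alternative from the earlier post added.
WITH = re.compile(r"^(?!\w|\\item| *\\item|\s*\w).*|^")

def extract(pattern, line):
    """Return what is left of the line after the opening match."""
    return line[pattern.match(line).end():]

# The section from the post, with real spaces instead of the dot characters.
lines = [
    r"  \item \textbf{for cyklus}, jehož syntaxe je",
    r"  \begin{verbatim}",
    "  for i=počáteční hodnota:krok:koncová hodnota",
    "      posloupnost příkazů",
    "  end",
    r"  \end{verbatim}",
]

for line in lines:
    print(repr(extract(WITHOUT, line)), "|", repr(extract(WITH, line)))
```

With the extra alternative, the three indented lines are kept (including their leading spaces) while the `\begin` and `\end` lines are still skipped. Note that this also pulls in `end`, which you may not want to translate.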


In relation to LaTeX files, or Regex Delimited Text files in general, I found that I'm unable to define more than one inline tag of the Tag Pair type where the Opening rules are different but the Closing rule is the same. For example, I have the tag \textbf{...}, which is translatable, and the tag \pageref{...}, which is not. Currently I'm using Opening rule \\\w+\{ and Closing rule \}, which leaves the non-translatable tags accessible. Can I work around this limitation? Or am I doing something wrong?
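One idea to try, untested in Studio: define the non-translatable command as a single placeholder whose regex swallows the braces and their content, for example `\\pageref\{[^}]*\}`, alongside the generic tag pair. The Python sketch below only demonstrates the regex logic. The name `classify` is made up, and the assumption that the placeholder rule takes priority over the tag pair is mine, not something the thread confirms.

```python
import re

PLACEHOLDER = r"\\pageref\{[^}]*\}"  # whole command, argument included
PAIR_OPEN = r"\\\w+\{"               # generic opening rule from the post
PAIR_CLOSE = r"\}"                   # generic closing rule from the post

# Earlier alternatives win, which mimics the assumed rule priority.
TOKENS = re.compile(f"({PLACEHOLDER})|({PAIR_OPEN})|({PAIR_CLOSE})")

def classify(segment):
    """Split a segment into text, placeholder, open and close pieces."""
    out, pos = [], 0
    for m in TOKENS.finditer(segment):
        if m.start() > pos:
            out.append(("text", segment[pos:m.start()]))
        kind = ("placeholder", "open", "close")[m.lastindex - 1]
        out.append((kind, m.group()))
        pos = m.end()
    if pos < len(segment):
        out.append(("text", segment[pos:]))
    return out

# Made-up sample segment.
for piece in classify(r"See \textbf{page} \pageref{sec:for}."):
    print(piece)
```

Here the content of `\textbf{...}` stays editable between an opening and a closing tag, while `\pageref{sec:for}` becomes one locked unit.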


 

