OmegaT match insert/replace without tags

Situation

After having translated a complete user manual that you converted from PDF to ODT to be able to work on it in OmegaT, you receive another manual from the same client, but this time it’s a DOCX file. Great! You can start right away, without converting anything. That should be a piece of cake: half of the manual looks almost the same as the one you have just done.

Problem

After starting to work on it, you find out that getting a lot of 95-97% matches would be really awesome, if it weren’t for all those nasty tags that are very different in the source and in the match. And there is no “Insert match without tags” menu item in OmegaT (yet).

Solution

Here’s a little workaround that you can use until this functionality is available in OmegaT.

Bash script for GNU/Linux

#!/bin/bash
# Strip tags from match
# Created by Kos Ivantsov
# Requires:
# keyboard automation software (xdotool, xmacro, xte) - xte used here
# X clipboard command-line tool (xsel, xclip) - xclip used here
# window interaction software (xdotool, wmctrl) - wmctrl used here

case $1 in
replace)
if [ ! -z "$(wmctrl -l|grep OmegaT-)" ] ; then
wmctrl -a "OmegaT-"
xte "keydown Control_L" "key r" "keyup Control_L"
export PREVCLIP=`xclip -o -selection clipboard`
xte "keydown Control_L" "key a" "keyup Control_L"
xte "keydown Control_L" "key x" "keyup Control_L"
else
exit 0
fi

sleep 0.5
xclip -o -selection clipboard |\
sed 's/<[^>]*>//g' |\
tr -d '\n' | xclip -i -selection clipboard

if [ ! -z "$(wmctrl -l|grep OmegaT-)" ] ; then
wmctrl -a "OmegaT-"
xte "keydown Control_L" "key v" "keyup Control_L"
else
exit 0
fi

sleep 8
echo "$PREVCLIP"| xclip -i -selection clipboard
exit 0
;;

insert)
export PREVCLIP=`xclip -o -selection clipboard`
sleep 0.2
echo ""|tr -d '\n'| xclip -i -selection clipboard
if [ ! -z "$(wmctrl -l|grep OmegaT-)" ] ; then
wmctrl -a "OmegaT-"
xte "keydown Control_L" "key x" "keyup Control_L"
echo ""|tr -d '\n'| xclip -i -selection clipboard
sleep 0.1
xte "keydown Control_L" \
"keydown Shift_L" "key Home" "keyup Shift_L" "keyup Control_L"
xte "keydown Control_L" "key x" "keyup Control_L"
sleep 0.2
PREV=`xclip -o -selection clipboard`
echo ""|tr -d '\n'| xclip -i -selection clipboard
else exit 0
fi

if [ ! -z "$(wmctrl -l|grep OmegaT-)" ] ; then
wmctrl -a "OmegaT-"
xte "keydown Control_L"\
    "keydown Shift_L" "key End" "keyup Shift_L" "keyup Control_L"
xte "keydown Control_L" "key x" "keyup Control_L"
sleep 0.2
REST=`xclip -o -selection clipboard`
echo ""|tr -d '\n'| xclip -i -selection clipboard
else exit 0
fi

if [ ! -z "$(wmctrl -l|grep OmegaT-)" ] ; then
wmctrl -a "OmegaT-"
xte "keydown Control_L" "key r" "keyup Control_L"
xte "keydown Control_L" "key a" "keyup Control_L"
xte "keydown Control_L" "key x" "keyup Control_L"
sleep 0.2
MATCH=`xclip -o -selection clipboard |sed 's/<[^>]*>//g'`
echo ""|tr -d '\n'| xclip -i -selection clipboard
else exit 0
fi

if [ ! -z "$(wmctrl -l|grep OmegaT-)" ] ; then
wmctrl -a "OmegaT-"
echo "$PREV" |tr -d '\n'| xclip -i -selection clipboard
xte "keydown Control_L" "key v" "keyup Control_L"
sleep 0.2
else exit 0
fi

if [ ! -z "$(wmctrl -l|grep OmegaT-)" ] ; then
wmctrl -a "OmegaT-"
echo "$MATCH" | tr -d '\n'|xclip -i -selection clipboard
sleep 0.2
xte "keydown Control_L" "key v" "keyup Control_L"
else
exit 0
fi

if [ ! -z "$(wmctrl -l|grep OmegaT-)" ] ; then
wmctrl -a "OmegaT-"
echo "$REST" | tr -d '\n'|xclip -i -selection clipboard
sleep 0.2
xte "keydown Control_L" "key v" "keyup Control_L"
else
exit 0
fi

unset PREV
unset MATCH
unset REST
sleep 3
echo "$PREVCLIP"| tr -d '\n'| xclip -i -selection clipboard
exit 0
;;
esac

It should be saved somewhere under a convenient name (like striptagmatch), made executable, and invoked with one of two parameters: insert or replace. To do that, assign two keyboard shortcuts, one for striptagmatch replace and another for striptagmatch insert (if the script is not saved in a directory listed in $PATH, specify the full path to striptagmatch).
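
Before binding the shortcuts, it may help to confirm that the three tools the script relies on are installed. The following is only a sketch: the function name missing_tools is hypothetical and not part of the original script, and the tool list simply mirrors the ones named in the script header (wmctrl, xte, xclip).

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script):
# prints the name of every given command that cannot be found in $PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The three tools the script above calls.
missing=$(missing_tools wmctrl xte xclip)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing >&2
fi
```

If nothing is printed, all three commands were found and the script should be able to run.
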
In case someone wants to recreate this functionality, here is what the script does:

  1. Saves the current clipboard to $PREVCLIP so that it can be restored once the script has finished.
  2. Checks at several stages whether OmegaT is running and brings its window into focus, so that the text is cut from and pasted into the proper window.
  3. a. When “replace” is executed, it focuses OmegaT, simulates Ctrl+R (insert match, replacing what is in the editor text input field), then Ctrl+A and Ctrl+X (select all and cut). Now the match is in the clipboard.
    b. When “insert” is executed, then after focusing OmegaT it simulates Shift+Ctrl+Home and Ctrl+X (select to the very beginning of the segment and cut) and saves this first part of the segment as $PREV. Then it simulates Shift+Ctrl+End and Ctrl+X (select to the very end of the segment and cut). This last part of the segment is saved as $REST. Finally, everything from step 3a is run.
  4. The content of the clipboard is stripped of tags using sed: sed 's/<[^>]*>//g'
  5. This new clean match is saved as $MATCH
  6. a. In case of “replace”, $MATCH is inserted into OmegaT (after checking that it’s running and bringing it up).
    b. For “insert”, first $PREV gets inserted, then $MATCH, and then $REST.
  7. Finally, the clipboard is restored to what was stored in $PREVCLIP.
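
The tag-stripping step (steps 4 and 5) can be tried on its own, without OmegaT or the clipboard. This is a minimal sketch: the function name strip_tags and the sample segment are made up for illustration, while the sed expression and the tr call are the ones used in the script above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script):
# reads text on standard input, removes everything between "<" and ">",
# and deletes newlines so the result is a single line.
strip_tags() {
    sed 's/<[^>]*>//g' | tr -d '\n'
}

# Illustrative segment; the tag names are examples only.
sample='<f0>Press the <f1>Start</f1> button.</f0>'
printf '%s' "$sample" | strip_tags
echo
# prints: Press the Start button.
```

Note that the expression removes anything that sits between angle brackets, so literal text such as "a <b> c" in a segment would lose its middle part as well.
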

Originally the idea was shared by Jean-Christophe Helary on the OmegaT Yahoo! Group.

The above script can be greatly enhanced and optimized, but hopefully that won’t be necessary once all of this is possible natively in OmegaT.
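
One possible simplification, offered only as a sketch, is to move the window check and the Ctrl+key simulation, both repeated throughout the script, into two small functions. The names focus_omegat and send_ctrl are hypothetical; the wmctrl and xte calls inside them are the same ones the script already uses.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the original script).

# Succeeds and raises the OmegaT window if one is listed by wmctrl;
# fails (non-zero exit status) if no such window is found.
focus_omegat() {
    wmctrl -l | grep -q "OmegaT-" && wmctrl -a "OmegaT-"
}

# Simulates Ctrl+<key> in the focused window, e.g. "send_ctrl v" to paste.
send_ctrl() {
    xte "keydown Control_L" "key $1" "keyup Control_L"
}
```

With these in place, each "if ... wmctrl ... else exit 0 fi" block could shrink to a line such as "focus_omegat || exit 0", followed by calls like "send_ctrl r" or "send_ctrl x".
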

In case you’re a Windows user

Here’s an AutoIt script that removes tags from the text in the editor input field (it can be your own translations with wrong tags, the source text with tags or an inserted match). Kudos to Kerry Swatridge for sharing this.

#Include <Array.au3>
HotKeySet("^@", "StripTags")
while True
    sleep(1000)
WEnd
func StripTags()
    local $str, $fileopen
    if winactive("OmegaT", "") then
    send("^a")
    send("^+c")
    sleep(100)
    $fileopen = fileopen("selection.txt",128)
    $str=fileread($fileopen)
    fileclose($fileopen)
    $str=stringregexpreplace($str,"</?[a-z]{0,3}[0-9]{0,5}/?>","")
    send("^a")
    clipput($str)
    send("^v")
    EndIf
endfunc

Line 2 of the above listing assigns the hotkey Ctrl+@; it can be changed as needed.


But as of now
Good luck!


UPDATE:

All of this can be regarded as obsolete or redundant, since the same functionality can be achieved in a cross-platform manner using Groovy scripts in OmegaT. Here’s the recipe.
