Substitute Template For Each Project

Update: Please download the scripts from the dedicated SF.net project page, where they are maintained. The scripts at the links below might be obsolete (though most likely still working).

Here I have a script that reads a tab-separated file (any number of tabs between the two items). Each line of the file contains the pattern to be found in the first position and what it should be replaced with in the second. This file MUST be named subst_template.txt (well, the name can be changed in the script, so maybe such a loud “must” isn’t really needed). The first pair should start on the first line, there should be no empty lines between the pairs, and after the final pair there should be exactly one empty line. Further down you’ll find an example of such a file.
The file ought to be placed in the OmegaT project’s root. That is intentional, so that one can have a unique set of substitution patterns for each project. For example, I had an English to Ukrainian Christian project where the names of the Bible books needed to be translated using one particular Ukrainian Bible version (Khomenko Bible), while for another project they needed to be taken from another version (Ohiyenko Bible). While the English abbreviations remained the same, the Ukrainian ones needed to be quite different (for instance, “Jn.” was “Йо.” in one, and “Ів.” in the other). So, having a separate substitution pattern file in each project, I could use just one script to get Bible references with proper abbreviations in each of them.
Following are the script and a sample subst_template.txt (headings are clickable, they link to pastebin.com)
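
To illustrate the file format described above, here is a minimal sketch of how one line of the template could be split into its two parts. It runs as a plain Groovy script outside OmegaT; the helper name parsePair is hypothetical and is not part of the original script. It relies on Groovy's tokenize, which treats a run of consecutive tabs as a single separator.

```groovy
// Hypothetical helper (an assumption for illustration, not from the script):
// splits one template line into [pattern, replacement].
def parsePair(String line) {
    // tokenize('\t') drops empty tokens, so several tabs in a row
    // behave like a single separator
    def parts = line.tokenize('\t')
    // Blank or incomplete lines yield null instead of a pair
    parts.size() >= 2 ? [parts[0], parts[1]] : null
}

// Two tabs between the items, as in the third line of the sample file
assert parsePair('Mt\\s(\\d+)\t\tМт. $1') == ['Mt\\s(\\d+)', 'Мт. $1']
assert parsePair('') == null
```

Because tabs are the separator, neither the pattern nor the replacement can contain a literal tab, but spaces inside either part are preserved.
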

  • replace_with_template.groovy
    /*
     *  Substitute with template
     *
     * @author  Kos Ivantsov
     * @date    2013-06-20
     * @version 0.1
     */
    
    import static javax.swing.JOptionPane.*
    import static org.omegat.util.Platform.*
    
    def prop = project.projectProperties
    if (!prop) {
      final def title = 'Replace using substitution file'
      final def msg   = 'Please try again after you open a project.'
      showMessageDialog null, msg, title, INFORMATION_MESSAGE
      return
    }
    
    def folder = prop.projectRoot
    def fileloc = folder+'/subst_template.txt'
    subst_file = new File(fileloc)
    
    if (! subst_file.exists()) {
    	final def title = 'No file' ;
    	final def msg   = 'Substitution file ' + subst_file + ' doesn\'t exist.' ;
    	showMessageDialog null, msg, title, INFORMATION_MESSAGE ;
    	return
    	}
    
    // Read the file once; each usable line is a tab-separated pair
    def lines = subst_file.readLines()
    search_array = []
    replace_array = []
    
    lines.each { line ->
    	def ln = line.tokenize('\t')
    	// Skip blank or malformed lines instead of storing null patterns
    	if (ln.size() < 2) return
    	search_array.add(ln[0])
    	replace_array.add(ln[1])
    	}
    
    // Exclusive range: stays empty when no pairs were read
    def range = 0..<search_array.size()
    
    /*
     * The script can either use source text, replacing the target after all  
     * substitutions, or it can use the text that is already placed as translation,
     * replacing it with itself after substituting
     * To enable desired behaviour, use one of the two following lines (be 
     * sure to have the other commented out) 
     */
    
    //target = editor.currentEntry.getSrcText();
    target = editor.getCurrentTranslation()
    
    for ( i in range) {
    	target = (target =~ search_array[i] ).replaceAll( replace_array[i] )
    	}
    
    editor.replaceEditText(target);
    
  • By default the script works on the text already entered as the translation of the current segment. It applies every substitution specified in the template to that text, one after another, and replaces the translation with the result. Uncommenting line 55 (//target = editor.currentEntry.getSrcText();) and commenting out line 56 (target = editor.getCurrentTranslation()) makes it run the substitutions on the source segment instead; in either case the resulting text replaces the translation.

  • subst_template.txt
    \"([\p{L}\p{Nd}])	«$1
    ([\p{L}\p{Nd}])\"	$1»
    Mt\s(\d+)		Мт. $1
    (\d+([,\.]?\d+)*)\s?€	€$1
    
  • In this example the first line replaces a straight quote sign followed by a letter or digit with the Cyrillic opening quote («) followed by that same character; the second is the reverse of the first, but with the closing quote (»); the third replaces the two English letters Mt followed by a whitespace character and one or more digits with the Ukrainian letters Мт, a period, a space, and then the digits found; and the last line puts the euro sign in front of the digits that used to have the sign behind them.
    The list can easily be expanded to include whatever substitutions one can think of (like comma and period as thousand separators, names of measurement units in the source and target languages, etc.).
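
As a worked example of the sample template, the sketch below applies the four pairs in order, each to the result of the previous one, the same way the script's loop does. It is a standalone Groovy script that does not touch OmegaT; the function names samplePairs and applySample and the test sentence are made up for illustration. The patterns are the ones from subst_template.txt, with backslashes doubled because they are written here as Groovy string literals.

```groovy
// Hypothetical helper: the four sample pairs as [pattern, replacement]
def samplePairs() {
    [
        ['\\"([\\p{L}\\p{Nd}])',            '«$1'],
        ['([\\p{L}\\p{Nd}])\\"',            '$1»'],
        ['Mt\\s(\\d+)',                     'Мт. $1'],
        ['(\\d+([,\\.]?\\d+)*)\\s?€',       '€$1'],
    ]
}

// Hypothetical helper: runs the substitutions sequentially, like the script
def applySample(String text) {
    for (pair in samplePairs()) {
        text = (text =~ pair[0]).replaceAll(pair[1])
    }
    text
}

assert applySample('"Hello" Mt 5 costs 10,50 €') == '«Hello» Мт. 5 costs €10,50'
```

Since every pair works on the output of the one before it, the order of lines in the template can matter when one replacement produces text that a later pattern matches.
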


I hope someone finds it helpful, and if you know of a way to improve it, please share. I’d love to include a way to perform arithmetic operations in the replacement patterns, so one could search for (\d+)(\s?)mile(s?) and replace with ($1*1.6)$2km. Does anyone know how to achieve that?
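
One possible direction for the arithmetic question, sketched here as a standalone Groovy script rather than as part of the template mechanism: Groovy's replaceAll also accepts a closure, which receives the whole match and each captured group and can compute the replacement text. The helper name milesToKm is hypothetical, and the factor 1.6 is simply the rough one from the example above. Wiring this into the template file would need some extra convention for marking a replacement as a calculation, which is not attempted here.

```groovy
// Hypothetical helper: converts "N mile(s)" into kilometres
def milesToKm(String text) {
    text.replaceAll(/(\d+)(\s?)mile(s?)/) { all, n, space, plural ->
        // n is the captured number as a String; 1.6 is a BigDecimal literal
        def km = new BigDecimal(n) * 1.6
        // Drop trailing zeros (8.0 -> 8) and avoid scientific notation
        km.stripTrailingZeros().toPlainString() + space + 'km'
    }
}

assert milesToKm('It is 5 miles away') == 'It is 8 km away'
assert milesToKm('1 mile') == '1.6 km'
```
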


UPDATE:

In this post there’s an enhanced version (with unit conversions and other fancy stuff) of the script plus two others for global search and replace.


But as of now,
Good luck!
