Write all source segments to a file

Update: Please, download scripts from the dedicated SF.net project page where they are maintained. Scripts at the links below might be obsolete (though most likely still working).

Since we started to make OmegaT write stuff to files, let’s try to dump all source segments to one file. I’m pretty sure one can find some use for it.

  • write_source2file.groovy
    /*
     * #Purpose:	Write all source segments to a file
     * #Files:	Writes 'allsource.txt' in the current project's root
     * 
     * @author:	Kos Ivantsov
     * @date:	2013-07-16
     * @version:	0.2
     */
    
    /* change "includefilenames" to anything but 'yes' (with quotes)
     * if you don't need filenames to be included in the file */
    
    def includefilenames = 'no'
    def includerepetitions = 'no'
    
    import static javax.swing.JOptionPane.*
    import static org.omegat.util.Platform.*
    
    // abort if a project is not opened yet
    def prop = project.projectProperties
    if (!prop) {
    	final def title = 'Source to File'
    	final def msg   = 'Please try again after you open a project.'
    	showMessageDialog null, msg, title, INFORMATION_MESSAGE
    	return
    }
    
    def folder = prop.projectRoot+'/script_output'
    def fileloc = folder+'/allsource.txt'
    writefile = new File(fileloc)
    if (! (new File(folder)).exists()) {
    	(new File(folder)).mkdir()
    	}
    
    writefile.write("", 'UTF-8')
    def count = 0
    def uniqline
    
    if (includefilenames == 'yes') {
    	files = project.projectFiles;
    	for (i in 0 ..< files.size())
    	{
    		fi = files[i];
    		marker = "+${'='*fi.filePath.size()}+\n"
    		writefile.append("$marker|$fi.filePath|\n$marker", 'UTF-8')
    		for (j in 0 ..< fi.entries.size())
    		{
    		ste = fi.entries[j];
    		source = ste.getSrcText();
    		writefile.append source +"\n", 'UTF-8'
    		count++;
    		}
    	}
    } else {
    	project.allEntries.each { ste ->
    	source = ste.getSrcText();
    	writefile.append source+"\n",'UTF-8'
    	count++
    		}
    	console.println "$count segments found in all files"
    	if (includerepetitions != 'yes') {
    		count = 0
    		uniqline = writefile.readLines().unique()
    		//console.println uniqline
    		writefile.write("",'UTF-8')
    		uniqline.each {
    		writefile.append "$it\n\n",'UTF8';
    		count++
    				}
    			}
    	}
    
    console.println count +" segments written to "+ writefile
    final def title = 'Source to File'
    final def msg   = count +" segments"+"\n"+"written to \n"+ writefile
    showMessageDialog null, msg, title, INFORMATION_MESSAGE
    return
    

    Once the script is invoked, it’ll create a file named “allsource.txt” in the current project’s root folder, where each segment will be on a new line. It’ll contain all the segments, even the ones that are already translated, and all the repetitions. The script can either just dump all segments into the file, or write out a filename in a box like this
    +====+
    |file|
    +====+

    followed by all the segments that belong to this file, and then a new filename and respective segment, and so on, or just dump all the segments in the order they appear in OmegaT without indicating what files they belong to. This behavior can be triggered by changing line 13. When it says def includefilenames = 'yes', you’ll get filenames written to the allsource.txt, but if you don’t want the filenames, change ‘yes’ to anything else or even leave it empty, making sure you have quotes, i.e. it can say def includefilenames = 'no, thanks' or even def includefilenames = '', but not def includefilenames = no (no quotes in the last example).
    The way the filenames get marked is defined in lines 44, 45.
    If filenames are not included, one can choose whether to include repetitions (line 14). 'yes' means “yes”, anything else, even 'yep', means “no”.

Suggestions, enhancements, bug reports, donations, postcards, invitations to a cup of coffee, feature requests, interesting translation projects with a good pay etc. are always welcome. Criticism isn’t, but will be accepted too.


wordpress visitor

But as of now,
Good luck

SVN status

Here’s a little script that checks current SVN status of various files in an OmegaT team project. May not be awfully useful, but sometimes it can help you prevent or solve SVN sync issues. It doesn’t do anything special, just shows you the status of the project’s project_save.tmx, main writable glossary, current file and the whole project folder. Continue reading

Batch Search and Replace and Selective Pretranslation in OmegaT

In this post I want to share three scripts that can do an extended search and replace in OmegaT project. Search and replace templates for each script are specified in external plain text files located in project’s root folder, so these scripts without any modifications can be used for different projects with different sets of search and replace patterns — the user needs to modify only those plain text files as needed. On top of text modification there is a possibility to do a simple math on what is being found by the script thus enabling the user to have a per project unit converter.
Each script should be accompanied by its own external file located in a subfolder named .ini in the project’s root (details under each script further on). The format of these files is the same for all three:


  • Only one empty line in the file — the very last one
  • Each line consists of tree blocks:
    1. Search pattern (regex aware)
    2. Tab
    3. Replace pattern

So, if you need to replace “Владимир Владимирович” (taking into consideration different cases of Russian nouns) with “the President of Russian Federation“, here’s what you need to specify in the substitution file:
Владимир\p{L}?+\sВладимирович\p{L}?+ the President of Russian Federation Continue reading