Clear #OmegaT volatile and backup files

OmegaT has a great safety feature: it automatically backs up the project memory file (i.e. project_save.tmx) on every project (re)load, and it creates another backup file on every save (if the project has changed since the last load or save). Those backup files can be used in the extremely rare case when something happens to the main project memory and all of the work seems to be lost. It happened to me once when I was only starting to use OmegaT as my main tool, and was I glad this backup feature was thought of!

But this very feature can become a tiny problem, especially in ongoing projects where project_save.tmx keeps growing bigger and bigger. While creating backups is great and very helpful, there’s no routine to remove old backup files. In some of the projects I work on, it isn’t uncommon for project_save.tmx to be a few MB while the omegat folder where that file is located takes up 100 MB or more, and only because of all the backups. With modern disk sizes it’s not a big deal, and it doesn’t degrade OmegaT’s performance a bit, but sometimes there’s a need to make a project slim again (like when you’re going to send it to your colleague or client, or copy it to cloud storage or another computer, or you’re obsessed with keeping everything trim and slim and tidy).

So, anyway, after all these numerous words here’s what I’m getting at. At Sourceforge.net (download link) there’s a script that removes three things: all the backups of project_save.tmx (including the ones created in team projects before performing sync), everything in the target folder (in the projects that need this cleaning the source files usually change, but old target files just keep piling up), and the three TMX files created in the root of the project every time target files are produced.
About a year ago I put together the original version of this script, but this new version downloadable from Sourceforge.net contains a few improvements and can easily be localised if you care to be warned about file deletion in your language.
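
To make the three cleanup targets described above concrete, here is a minimal sketch of the selection logic only. It is not the script from Sourceforge.net. The file-name patterns (a `.bak` suffix after project_save.tmx for backups, and `-omegat`, `-level1`, `-level2` suffixes for the exported TMX files) and the function name `isCleanupCandidate` are assumptions made for illustration, so check them against what your OmegaT version really writes before deleting anything.

```javascript
// Sketch only: decides which project-relative paths a cleanup pass would delete.
// ASSUMPTION: backups look like "omegat/project_save.tmx.<something>.bak".
var BACKUP_RE = /^omegat\/project_save\.tmx\..+\.bak$/;
// ASSUMPTION: exported memories sit in the project root and end in
// -omegat.tmx, -level1.tmx or -level2.tmx.
var EXPORT_RE = /^[^\/]+-(omegat|level1|level2)\.tmx$/;
// Everything below target/ can be regenerated from the source files.
var TARGET_RE = /^target\/.+/;

// Hypothetical helper: true if the path matches one of the three groups.
function isCleanupCandidate(relPath) {
  return BACKUP_RE.test(relPath) ||
         EXPORT_RE.test(relPath) ||
         TARGET_RE.test(relPath);
}

var files = [
  "omegat/project_save.tmx",                  // the live memory: never touched
  "omegat/project_save.tmx.201502110344.bak", // backup
  "target/manual.odt",                        // old target file
  "demo-level1.tmx",                          // exported TMX in the project root
  "tm/legacy-level1.tmx",                     // reference TM: not in the root, kept
  "source/manual.odt"                         // source file: kept
];
var doomed = files.filter(isCleanupCandidate);
console.log(doomed.join("\n"));
```

Keeping the decision in a pure function like this makes it easy to print the list and look it over before any file is actually removed.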

Use it to your heart’s content and at your own risk, and I’ll be very thankful for any questions, suggestions or comments.

DISCLAIMER: If you lose your work without any hope of recovering it because all backups have been deleted by this script, it ain’t my fault. You should back up regularly and not hope that OmegaT will do it for you.


But as of now,
Good luck!

Merging and Splitting Segments in #OmegaT without editing segmentation rules.

One of the complaints OmegaT gets is that it’s impossible to split and merge segments without editing the project’s or the global segmentation rules. There were a few attempts to address the issue, but they required a third-party utility that would edit segmentation.conf. One of the most recent attempts was Dimitry Prihodko’s Merge utility. If I understood it right, Dimitry asked Yu Tang to rework his thingy, and Yu Tang came up with a Groovy script that did all the merging using only OmegaT internals. It wasn’t limited to any OS or dependent on other tools (so much for hard Pascal coding, Dimitry). There was only one minor issue: the script couldn’t be used to split segments. And that’s what I’ve added and what I’m sharing here.

#ISO 9:1995 #Transliteration in #OmegaT

This short announcement might be of some interest to those OmegaT users who work with Cyrillic text. Below you’ll find a script that transliterates the current target or the selection according to the ISO 9 transliteration standard (one of the very few reversible Cyrillic translit systems). The script is a tiny adaptation of the one discussed in the article Translit для JavaScript.

All you need to do is copy it into your scripts folder and run it when there’s something you need transliterated (it can be run multiple times; it’ll toggle the text between Cyrillic and Latin). If the text can’t be transliterated, the script will not change it.

Here’s the link to the script: http://pastebin.com/npXEthmc (download).

//:name=Utils - Translit :description=Transliterate current target or selection
/*******************************************************************************
* @Name   : "translit(a, b)"                         // Name
* @Params :   str  - string to transliterate         // Call parameters
              typ  - [123456]
                   system A = 1-diacritics
                   system B =(2-Belarus;3-Bulgaria;4-Macedonia;5-Russia;6-Ukraine)
                   If typ is negative, transliterate in reverse
* @Descrp : Forward and reverse transliteration      // Description
            per ISO 9, ISO 9:1995 or GOST 7.79-2000, systems A and B
* @ExtURL : ru.wikipedia.org/wiki/ISO_9              // External URL
* #Guid   : {E7088033-479F-47EF-A573-BBF3520F493C}   // GUID
* @Exampl : "example()"                              // Usage example
* GPL applies. No warranties XGuest[11.02.2015/03:44:01] translit [ver.1.0.1]
*******************************************************************************/
var dia = false;
//var loc = java.util.Locale.getDefault().getLanguage();
var prop = project.getProjectProperties();
var ste = editor.currentEntry;
if (editor.selectedText){
	var target = editor.selectedText;
	}else{
	var target = editor.getCurrentTranslation();
	}

var tlcode = prop.getTargetLanguage().getLanguageCode();
var suplang = ["BE", "BG", "MK", "RU", "UK"];

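var isCyrillic = (/[\u0400-\u04ff]+/i).test(target);
// indexOf returns -1 (truthy) for an unlisted language and 0 (falsy) for "BE",
// so compare explicitly; unlisted languages fall back to system A (mode 1).
var langIndex = suplang.indexOf(tlcode);
var transcode = (dia || langIndex < 0) ? 1 : langIndex + 2;
if (!isCyrillic){
	transcode = -transcode;	// Latin input: transliterate back to Cyrillic
	}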

exports = function (str, typ) {
 var func = function (typ) {
 /* Function Expression
  * Helper function.
  *
  * This is the part that needed tidying up.
  *
  * Checks the transliteration direction.
  * Pre-processes the string (rules from GOST).
  * Returns an array of 2 functions:
  *  one that builds the transliteration tables,
  *  and one that post-processes the string (rules from GOST).
  *
  * @param  {Number} typ
  * @return {Array}
  */
  var abs = Math.abs(typ);             // Absolute value of the transliteration mode
  if(typ === abs) {                    // Forward transliteration (Cyrillic to Latin)
   // Transliteration rules (from GOST).
   // "i`" only before consonants in Old Russian and Old Bulgarian
   //  str = str.replace(/(i(?=.[^аеиоуъ\s]+))/ig, "$1`");
   str = str.replace(/(\u0456(?=.[^\u0430\u0435\u0438\u043E\u0443\u044A\s]+))/ig, "$1`");
   return [                            // Return the array of functions
    function (col, row) {              // Build the table and the RegExp
     var chr;                          // Character
     if(chr = col[0] || col[abs]) {    // If there is a character
      trantab[row] = chr;              // Add the character to the mapping object
      regarr.push(row);                // Add it to the RegExp array
     }
    },
    // post-processing function
    function (str) {                   // str - the string being transliterated
    // Transliteration rules (from GOST).
    return str.replace(/i``/ig, "i`"). // "i`" only before consonants in Old Russian and Old Bulgarian
    replace(/((c)z)(?=[ieyj])/ig, "$2");// "cz" becomes the character "c"
    }];
  } else {                             // Reverse transliteration (Latin to Cyrillic)
   str = str.replace(/(c)(?=[ieyj])/ig, "$1z"); // Rule for the "cz" combination
   return [                            // Return the array of functions
    function (col, row) {              // Build the table and the RegExp
     var chr;                          // Character
     if(chr = col[0] || col[abs]) {    // If there is a character
      trantab[chr] = row;              // Add the character to the mapping object
      regarr.push(chr);                // Add it to the RegExp array
     }
    },
   // post-processing function
   function (str) {return str;}];      // nop - empty function
  }
 }(typ);
 var iso9 = {                          // Object describing the standard
   // Name - Cyrillic character
   //   0 - common to all
   //   1 - diacritics         4 - MK|MKD - Macedonia
   //   2 - BY|BLR - Belarus   5 - RU|RUS - Russia
   //   3 - BG|BGR - Bulgaria  6 - UA|UKR - Ukraine
   /*-Name--------0-,-------1--,---2-,---3-,---4-,----5-,---6-*/
 "\u0449": [   "", "\u015D",   "","sth",   "", "shh","shh"], // "щ"
 "\u044F": [   "", "\u00E2", "ya", "ya",   "",  "ya", "ya"], // "я"
 "\u0454": [   "", "\u00EA",   "",   "",   "",    "", "ye"], // "є"
 "\u0463": [   "", "\u011B",   "", "ye",   "",  "ye",   ""], //  yat
 "\u0456": [   "", "\u00EC",  "i", "i`",   "",  "i`",  "i"], // "і" iota
 "\u0457": [   "", "\u00EF",   "",   "",   "",    "", "yi"], // "ї"
 "\u0451": [   "", "\u00EB", "yo",   "",   "",  "yo",   ""], // "ё"
 "\u044E": [   "", "\u00FB", "yu", "yu",   "",  "yu", "yu"], // "ю"
 "\u0436": [ "zh", "\u017E"],                                // "ж"
 "\u0447": [ "ch", "\u010D"],                                // "ч"
 "\u0448": [ "sh", "\u0161"],                                // "ш"
 "\u0473": [   "","f\u0300",   "", "fh",   "",  "fh",   ""], //  fita
 "\u045F": [   "","d\u0302",   "",   "", "dh",    "",   ""], // "џ"
 "\u0491": [   "","g\u0300",   "",   "",   "",    "", "g`"], // "ґ"
 "\u0453": [   "", "\u01F5",   "",   "", "g`",    "",   ""], // "ѓ"
 "\u0455": [   "", "\u1E91",   "",   "", "z`",    "",   ""], // "ѕ"
 "\u045C": [   "", "\u1E31",   "",   "", "k`",    "",   ""], // "ќ"
 "\u0459": [   "","l\u0302",   "",   "", "l`",    "",   ""], // "љ"
 "\u045A": [   "","n\u0302",   "",   "", "n`",    "",   ""], // "њ"
 "\u044D": [   "", "\u00E8", "e`",   "",   "",  "e`",   ""], // "э"
 "\u044A": [   "", "\u02BA",   "", "a`",   "",  "``",   ""], // "ъ"
 "\u044B": [   "",      "y", "y`",   "",   "",  "y`",   ""], // "ы"
 "\u045E": [   "", "\u01D4", "u`",   "",   "",    "",   ""], // "ў"
 "\u046B": [   "", "\u01CE",   "", "o`",   "",    "",   ""], //  yus
 "\u0475": [   "", "\u1EF3",   "", "yh",   "",  "yh",   ""], //  izhitsa
 "\u0446": [ "cz",      "c"],                                // "ц"
 "\u0430": [  "a"],                                          // "а"
 "\u0431": [  "b"],                                          // "б"
 "\u0432": [  "v"],                                          // "в"
 "\u0433": [  "g"],                                          // "г"
 "\u0434": [  "d"],                                          // "д"
 "\u0435": [  "e"],                                          // "е"
 "\u0437": [  "z"],                                          // "з"
 "\u0438": [   "",      "i",   "",  "i",  "i",   "i", "y`"], // "и"
 "\u0439": [   "",      "j",  "j",  "j",   "",   "j",  "j"], // "й"
 "\u043A": [  "k"],                                          // "к"
 "\u043B": [  "l"],                                          // "л"
 "\u043C": [  "m"],                                          // "м"
 "\u043D": [  "n"],                                          // "н"
 "\u043E": [  "o"],                                          // "о"
 "\u043F": [  "p"],                                          // "п"
 "\u0440": [  "r"],                                          // "р"
 "\u0441": [  "s"],                                          // "с"
 "\u0442": [  "t"],                                          // "т"
 "\u0443": [  "u"],                                          // "у"
 "\u0444": [  "f"],                                          // "ф"
 "\u0445": [  "x",      "h"],                                // "х"
 "\u044C": [   "", "\u02B9",  "`",  "`",   "",   "`",  "`"], // "ь"
 "\u0458": [   "","j\u030C",   "",   "",  "j",    "",   ""], // "ј"
 "\u2019": [  "'", "\u02BC"],                                // "’"
 "\u2116": [  "#"]                                           // "№"
  }, regarr = [], trantab = {};
 for(var row in iso9) {func[0](iso9[row], row);} // Build the table and the RegExp array
 return func[1](                       // post-process the string (rules etc.)
  str.replace(                         // Transliteration
  new RegExp(regarr.join("|"), "gi"),  // Build the RegExp from the array
  function (R) {                       // RegExp callback function
   if(                                 // Handle the match with its case taken into account
    R.toLowerCase() === R) {
    return trantab[R];
   } else {
    return trantab[R.toLowerCase()].toUpperCase();
   }
  }));
};


if (! target){
	console.println("Target is empty");
	} else {
	var newtarget = exports(target, transcode)
	if (newtarget == target){
		console.println("Could not transliterate");
		}else{
		if (editor.selectedText){
			editor.insertText(newtarget);
			}else{
			editor.replaceEditText(newtarget);
			}
		console.clear();
		console.println(target + "\n↓\n" + newtarget);
		}
	}

Changing the value of dia on line 16 of the script (var dia = false;) from false to true makes the script transliterate with diacritics (system A).
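
According to the script’s header, the mode number passed to the transliteration function encodes two things: the table (1 for system A with diacritics, 2 to 6 for the system B variants) and the direction (a negative number means Latin back to Cyrillic). The standalone sketch below shows one way such a number could be picked. The helper name `pickMode` is hypothetical, and falling back to system A for languages without a system B table is an assumption, not something the standard prescribes.

```javascript
// Hypothetical helper illustrating the mode selection; not part of the script above.
var SUPPORTED = ["BE", "BG", "MK", "RU", "UK"]; // system B tables 2..6, in this order

function pickMode(text, langCode, useDiacritics) {
  var idx = SUPPORTED.indexOf(langCode);
  // ASSUMPTION: use system A (1) when diacritics are requested
  // or when the language has no system B table.
  var table = (useDiacritics || idx < 0) ? 1 : idx + 2;
  // Any Cyrillic character means "go to Latin" (positive mode);
  // otherwise the text is treated as Latin and converted back (negative mode).
  var hasCyrillic = /[\u0400-\u04ff]/.test(text);
  return hasCyrillic ? table : -table;
}

console.log(pickMode("Київ", "UK", false));  // 6
console.log(pickMode("Kyiv", "UK", false));  // -6
console.log(pickMode("Мінск", "BE", false)); // 2
console.log(pickMode("Мінск", "BE", true));  // 1
```

Note the explicit `idx < 0` comparison: `indexOf` returns 0 for the first entry in the list and -1 for a missing one, so testing its result for truthiness would treat Belarusian as unsupported.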

Voice Input and Translation Work (#Android, #Swype, #AIORemote), Take 2

In this article I’ll talk about using your Android device as a dictation box for your CAT tool or any other program where you need to type on your computer. Before going into the specifics of this little recipe, let’s go over the components in general. You’ll need these ingredients:

  1. Android keyboard that has a voice recognition option.
  2. Android application that works as a remote keyboard for your computer.
  3. Desktop application that receives input from the Android remote app.

There are quite a few nice Android keyboards and Android remote controls out there that would allow you to brew a similar goody, but I’ll describe what I believe is the most efficient mix.

Voice Input in Translation Work (#Linux + Chrome + #OmegaT), Take 1

I was always rather skeptical about using dictation software in my translation work. But recently I read a success story where a person started to use Dragon Naturally Speaking, and it boosted his productivity by an ungodly high percentage. Though it didn’t shake the deep skepticism of a die-hard Linux fanatic whose main target language isn’t supported by the major dictation software vendors, it doesn’t hurt to fool around and try a few things, does it?

As it turns out, one can save quite a few keystrokes by speaking into the cloud, and it can even be used on Linux in OmegaT. Google’s speech recognition supports my target language, several Chromium/Chrome apps and extensions kindly try to make written words out of my utterances, and then it’s up to me how I put it all together to be able to dictate instead of typing.

My working recipe is based on SpeechPad, a new voice notebook for voice input. This little thing can be installed as a Chrome app and can work in the background, putting the recognized pieces into the clipboard. To enable that, tick “Restart on errors” and “Transfer to clipboard”. It’s best to register with this application to be able to add new languages not listed by default (limited to what Google supports), add terms to the custom replacement list (to enable punctuation by voice for some languages, for instance), and do other things. It’s all done in the user’s profile (called “User data” on the main page). When SpeechPad is fired up and listening in the background, you can switch to the app where you need to type (OmegaT in my case), dictate a logical chunk and press Ctrl+V. Some of the repeated mistakes in the text can be fixed with replace_with_template.groovy (see here for details on how to use the script). Or pasting and fixing can be done in one go with the OmegaT script insert_modify_clipboard.groovy (the above link with details still applies, but the substitution template should be named .ini/clipboard_substitution.ini).

I’ve noticed that in Ukrainian the speech gets recognized much better when I chant it (and that’s where my passion for Byzantine rite liturgical chanting comes in real handy, although one of my buddies said that Rammstein-style singing provides similar results). With all of it I did manage to get a productivity boost (and unplanned chanting practice). I’d be happy to hear suggestions on how to improve this recipe or change the ingredients to be able to type less and produce more.
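
The idea of fixing recurring recognition mistakes with a substitution list can be sketched in a few lines. This is only an illustration of the technique, not the code of replace_with_template.groovy or insert_modify_clipboard.groovy, and it does not reproduce their template format. The rules, the sample sentence and the function name `applyRules` are all made up for the example.

```javascript
// Hypothetical rules: [regular expression source, replacement].
// They are applied in order, case-insensitively, to the dictated text.
var rules = [
  ["\\s*\\bcomma\\b", ","],        // spoken punctuation, swallowing the space before it
  ["\\s*\\bfull stop\\b", "."],
  ["\\bomega tea\\b", "OmegaT"]    // a term the recognizer keeps getting wrong
];

function applyRules(text, ruleList) {
  return ruleList.reduce(function (acc, rule) {
    return acc.replace(new RegExp(rule[0], "gi"), rule[1]);
  }, text);
}

var dictated = "open the project in omega tea comma then press save full stop";
console.log(applyRules(dictated, rules));
// open the project in OmegaT, then press save.
```

The order of the rules matters when one replacement can produce text that a later rule matches, so it pays to keep the most specific rules first.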


But as of now,
Good luck

Clickable links in #OmegaT notes and comments

Here’s a GitHub project for an OmegaT plugin that converts URLs in notes and comments into clickable items that open in the default browser. Pretty neat, especially when you’re working in a team project and need to insert references for the editor or another translator.

Clickable links example
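
To give an idea of what turning URLs into clickable items involves, here is a small sketch of the detection step. It is not taken from the plugin, whose actual implementation may work quite differently. The pattern (anything starting with http:// or https:// up to the next whitespace), the trimming of trailing punctuation and the name `findLinks` are assumptions made for the example.

```javascript
// ASSUMPTION: a link starts with http:// or https:// and runs up to whitespace.
var URL_RE = /https?:\/\/[^\s<>"]+/g;

// Hypothetical helper: returns the position and text of every URL in a note.
function findLinks(note) {
  var links = [];
  var m;
  URL_RE.lastIndex = 0; // reset the global regex before reusing it
  while ((m = URL_RE.exec(note)) !== null) {
    // Trailing punctuation usually belongs to the sentence, not to the URL.
    var url = m[0].replace(/[.,;:!?)]+$/, "");
    links.push({ start: m.index, end: m.index + url.length, url: url });
  }
  return links;
}

var note = "Term confirmed, see https://example.org/glossary?id=7. " +
           "Style guide: http://example.org/style";
findLinks(note).forEach(function (link) {
  console.log(link.start + "-" + link.end + " " + link.url);
});
```

With the start and end offsets known, an editor pane can style those ranges as links and hand the URL to the default browser on click.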

To install the plugin, create a folder named LinkBuilder (or whatever sounds good and preferably makes sense) inside the plugins subfolder of either the OmegaT installation folder or the OmegaT settings folder, download the latest release, and unzip it into the newly created LinkBuilder folder. The plugin will be activated when OmegaT restarts (or in a new OmegaT instance).

I don’t know who the author of the plugin is (other than his username at GitHub is hiohiohio), but kudos anyway!!!

Major update to #OmegaT QA Script

Some time ago my monkey approach to programming led me to create a GUI for the QA rules checking script. That was fun, the result was sometimes even usable, but since I don’t really know how to program, I got stuck developing it. OK, a rule or two was added now and then, but that doesn’t really count. But then all of a sudden the spellcheck script in OmegaT got drastically improved, and that meant I could mimic some new ideas. That’s exactly what I did, and here’s the new “QA – Check Rules” script:

Image
