Translation jargon buster

This article covers some of the most common terms used by people working in the translation sector. I’ve also included a few terms used in the interpreting sector.

Terms specifically related to the following categories are labelled as such (just after the entry head):

CAT tool | Interpreting

101% match

CAT tool

See match.

102% match

CAT tool

See match.

A language / B language / C language

Interpreting

Interpreters divide their working languages into A, B and C languages:

  • Their A language is their mother tongue.

  • Their B languages are the languages in which they have a near-native level.

  • Interpreters work into their A and B languages so these are called their ‘active languages’.

  • Their C languages are other languages that they only work from, so these are called their ‘passive languages’.

Traditionally, interpreters at European institutions would only interpret into their A language but this is not an absolute rule. The massive demand for interpreters (especially after the EU expansion of 2004) and budgetary constraints led to greater flexibility.

A to B interpreting is, however, common in the private sector but even here interpreters may limit their offering to consecutive interpreting (to the exclusion of simultaneous interpreting). While C to B interpreting is another possibility, my understanding is that it’s not common.

alignment

CAT tool

The procedure by which a pair of documents, one of which is a translation of the other, is converted into a single bilingual document ready for incorporation into a translation memory.

It involves a segmentation step that breaks up each of the two documents into segments. Segments are moved up or down, split or merged to ensure that every translation unit contains a chunk of text (source segment) and its translation (target segment), no more, no less. CAT tools use their own algorithms to perform alignment but human input is still required to check the automatic alignment and to make any adjustments.

analysis

CAT tool

CAT tools offer users the ability to analyse a document that will be translated. The analysis is always performed with reference to a translation memory and the results give an indication of how much the translator’s workload will be reduced by the translation memory.

The CAT tool begins the analysis by segmenting the document. It then considers each translation unit in turn, looking for the best match — if any — in the translation memory for each source segment. The results of the analysis would look something like this:

See match and repetition to learn what each row represents.

Analysis results (sometimes called statistics) are used in conjunction with the grid to calculate discounts.
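
To make the analysis step more concrete, here is a rough Python sketch of the idea. It rests on several assumptions of my own: the `similarity` function uses `difflib.SequenceMatcher` from the standard library as a stand-in for a CAT tool's proprietary matching algorithm, the match bands in `BANDS` are invented, and the document is assumed to be segmented already.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical fuzzy bands (lower bound in %, label); real tools define their own.
BANDS = [(95, "95-99%"), (85, "85-94%"), (75, "75-84%")]


def similarity(a: str, b: str) -> int:
    """Stand-in similarity score from 0 to 100 (assumption, not a real CAT algorithm)."""
    return int(SequenceMatcher(None, a, b).ratio() * 100)


def analyse(segments: list[str], tm_sources: list[str]) -> Counter:
    """Count the words of a document per match category."""
    words = Counter()
    seen = set()  # source segments already met in this document
    for seg in segments:
        n = len(seg.split())
        best = max((similarity(seg, s) for s in tm_sources), default=0)
        if best == 100:
            words["100%"] += n
        elif seg in seen:
            # Same source segment as an earlier one in the same document.
            words["repetition"] += n
        else:
            label = next((lab for low, lab in BANDS if best >= low), "new")
            words[label] += n
        seen.add(seg)
    return words


tm = ["Take one tablet daily."]
doc = ["Take one tablet daily.", "Store below 25 degrees.", "Store below 25 degrees."]
result = analyse(doc, tm)
```

Note that the first occurrence of a repeated segment is counted like any other segment; only the later occurrences are counted as repetitions.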

B language

See A language / B language / C language.

back-translation

Back-translation refers to the translation of a previously translated text back into the source language. Admittedly this sounds a bit daft but it’s actually a very good way to detect any shifts in meaning that the first translator might have introduced inadvertently.

For example, a document translated from English to French by one translator is then translated from French to English by another translator. Crucially, the FR-EN (back-)translator has no access to the original EN document.

During back-translation, being literal is more important than good style or sounding natural (in other words, the back-translator should know that he is back-translating).

bilingual file

CAT tool

In translator-speak, a bilingual file is a file that has been specifically designed to hold two languages for the purpose of facilitating translation.

Bilingual file formats are at the heart of how CAT tools work. A translation workflow involving a CAT tool can be broken down into the following steps:

1) The source-language text is created in a monolingual file, e.g. MonTexte.docx (Microsoft Word format).

2) The translator imports MonTexte.docx into her CAT tool, which converts it into a bilingual file, e.g. MonTexte.docx.xliff.

3) The translator translates the file in her CAT tool.

4) When she finishes translating, she uses the export function of her CAT tool to create a target-language document in the original format (i.e. that of the source-language document) e.g. MyText.docx.

By definition, proprietary file formats of CAT tools are bilingual. However, these tools can create bilingual versions of common monolingual formats, e.g. bilingual RTF files. These ‘bi-mono’ formats allow people without the corresponding CAT tool to work on bilingual files using more commonly available programs such as Microsoft Word (which supports .rtf files).

A translation workflow involving a CAT tool and an expert reviewer who doesn’t own a CAT tool would look something like this:

1) The source-language text is created as a monolingual file, e.g. MonTexte.docx (Microsoft Word format).

2) The translator imports MonTexte.docx into her CAT tool, which converts it into a bilingual file, e.g. MonTexte.docx.xliff.

3) The translator translates the file in her CAT tool.

4) When she finishes translating, she exports her translation as a bilingual .rtf file, e.g. MonTexte.rtf, which she then emails to the reviewer.

5) The reviewer reviews MonTexte.rtf in Microsoft Word, making any necessary changes, then returns the file to the translator.

6) The translator imports the revised MonTexte.rtf into her CAT tool. During the import, the CAT tool transposes the reviewer’s modifications into MonTexte.docx.xliff (the document that the translator was previously working on).

7) The translator uses the export function of her CAT tool to create a target-language document in the original format (i.e. that of the source-language document) e.g. MyText.docx. Job done!

C language

See A language / B language / C language.

CAT tool

CAT tool

CAT is an acronym of Computer-Aided Translation.

A CAT tool helps humans translate by retrieving chunks of previously translated text stored in a translation memory. This makes the process of translation more efficient because identical chunks of text are not translated over and over again at different times. This process also helps to ensure consistency within a document and between different documents in the same project.

This is how the process works in practice:

Let’s say a translation memory contains this translation unit from a document I translated two years ago: 

As I’m translating a separate document in my CAT tool, I arrive at the following translation unit:

When I click on the target segment to bring it into focus, the CAT tool pulls out the similar translation unit from the translation memory and presents it to me, helpfully highlighting the differences between the source segment I need to translate and the source segment in the translation memory:

At a glance, I can see that I only need to change ‘a target’ to ‘a receptor’. A real time-saver!

A CAT tool can be a wonderful thing. For texts that remain largely unchanged from one version to the next, the benefits are tremendous.

The downside is that a CAT tool imposes its segment-centric approach, and this can be detrimental for certain types of translation.

The ‘retrieving previously translated content from a translation memory’ feature is the core feature of every CAT tool. Other common features are:

  • retrieving previously translated text from a machine translation provider.

  • retrieving previously translated text from a pair of monolingual documents, one of which is a translation of the other (in this case, alignment is done on the fly).

  • retrieving previously translated terms from a term base.

  • performing language quality assurance by flagging potential errors such as empty target segments, target segments that are identical to their source segments, extra spaces at the end of target segments, etc.

  • support for many file formats (via filters). Support for a given file format essentially means that the CAT tool can convert that file into its native bilingual format then convert the bilingual file back into the original file format.

Here are some well-known CAT tools:

For more information, take a look at this blog post↗(1) and the CAT tool comparison page on ProZ↗(2).

It’s good to know that many makers offer a time-limited trial of the fully functional product.

(1) www.polilingua.com | Best CAT Tools for Translation in 2025 | Elena Chiorescu | 28/JAN/2025

(2) ProZ.com’s Translator Software Comparison Tool

CEFR

CEFR stands for ‘Common European Framework of Reference’.

The CEFR was developed by the Council of Europe in 2001 as an international standard for measuring proficiency in a language. There are six levels of proficiency organised into three blocks. From the lowest to the highest proficiency, they are:

  • Basic User

    o A1

    o A2

  • Independent User

    o B1

    o B2

  • Proficient User

    o C1

    o C2

On this page↗, the Council of Europe provides a brief description of the ability at each level.

certified translation

See sworn translation.

clean file

CAT tool

See unclean and clean files.

concordance search

CAT tool

A concordance search is a basic function offered by CAT tools. It is a search for a particular word, phrase or sentence in a translation memory; it is typically carried out on demand (by the translator) when the CAT tool has not found any matches in the translation memory or in a term base for the current source segment.
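
In its simplest form, a concordance search boils down to a text search over the source side of the translation memory. The sketch below assumes, purely for illustration, that the TM is a plain Python list of (source, target) pairs; real CAT tools use indexed databases and usually let you search the target side too.

```python
def concordance(query: str, tm: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return every (source, target) pair whose source contains the query."""
    needle = query.casefold()  # case-insensitive comparison
    return [(src, tgt) for src, tgt in tm if needle in src.casefold()]


# Hypothetical FR-EN translation memory.
tm = [
    ("Tenir hors de portée des enfants.", "Keep out of reach of children."),
    ("Conserver à l'abri de la lumière.", "Protect from light."),
]
hits = concordance("Lumière", tm)
```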

fuzzy match

CAT tool

See match.

grid

CAT tool

If you and your agency client use the same CAT tool, then they may ask you for your discount grid (alternatively, they may hand you the discount grid they use as standard). It’s called a discount grid because in essence you will be providing a discount for text that the CAT tool helps you with.

This is what a grid looks like:

The first column shows different categories of matches between source segments in a document you will be translating and source segments in the translation memory you will be using. (In this particular case, the ‘new’ category corresponds to fuzzy matches below 75%.)

The second column shows what percentage of your per-word translation rate you will be paid for each match category.

Before sending you a document to translate (e.g. MySourceDocument.docx), your client will run an analysis in the CAT tool. This is what the results of an analysis look like:

Although matches are computed segment by segment, it is the words in these segments that are used to apply the grid. So in our present example, the 3 segments with a 75–84% match might consist of:

  • a 77% match for 1 segment with 9 words,

  • a 79% match for 1 segment with 6 words, and

  • an 84% match for 1 segment with 10 words.

The grid is applied to the analysis results to yield the total amount to be paid for the translation. This can be done in two ways:

Method 1

Method 2

The total weighted words might be called something else such as ‘net words’.
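
The arithmetic can be sketched in Python. Everything in this example is hypothetical apart from the 25 words in the 75–84% band: the grid percentages, the other word counts and the per-word rate are invented, and the two functions are only my reading of two equivalent routes (pricing each category separately, or first converting everything into weighted words).

```python
from decimal import Decimal

# Hypothetical grid: match category -> percentage of the full per-word rate.
GRID = {"new": 100, "75-84%": 60, "85-94%": 40, "95-99%": 20, "100%": 10, "repetition": 10}

# Hypothetical analysis results: match category -> word count.
ANALYSIS = {"new": 400, "75-84%": 25, "85-94%": 50, "95-99%": 0, "100%": 100, "repetition": 30}

RATE = Decimal("0.10")  # hypothetical per-word rate


def total_per_category(analysis, grid, rate):
    """Price each category separately, then add up the amounts."""
    return sum(rate * words * grid[cat] / 100 for cat, words in analysis.items())


def weighted_words(analysis, grid):
    """Convert every category into its equivalent number of full-rate words."""
    return sum(Decimal(words * grid[cat]) / 100 for cat, words in analysis.items())


def total_from_weighted_words(analysis, grid, rate):
    """Multiply the total weighted words by the per-word rate."""
    return weighted_words(analysis, grid) * rate
```

With these made-up figures, the 605 words in the document shrink to 448 weighted words, and both routes give the same total of 44.80.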

interpretation

Interpreting

See interpreting or read on for one of my random lexical musings.

When referring to the activity done by interpreters, I am inclined to use the word ‘interpreting’ even though I read and hear ‘interpretation’ being used. Personally I use ‘interpretation’ in the more general, non-language sense e.g. ‘A narrow interpretation of the European Convention on Human Rights’.

I wondered whether ‘interpretation’ was more readily used by non-native speakers or by people outside of the language sector; however, my research didn’t reveal any obvious patterns.

According to my hefty Collins English Dictionary:

  • the 4th sense of the verb ‘to interpret’ is:

    4. (intr) to act as an interpreter; translate orally

  • for the noun ‘interpretation’, 5 senses are given, none of which corresponds to the 4th sense given for the verb ‘to interpret’.

  • ‘interpreting’ is nowhere to be seen.

On this page↗, the European Union uses the two words interchangeably. This makes me think that maybe the use of ‘interpretation’ has been driven by native speakers of other languages due to interference from their first language, e.g. the French interprétation or the Spanish interpretación.

What do you think? Your insights and opinions appreciated!

interpreting

Interpreting

The process by which speech in one language becomes speech in another language.

Not to be confused with translation!

ISO 639

Similarly to the way 3-letter codes are used to uniquely identify airports, 2- or 3-letter codes are used to identify languages.

The codes were standardised by the International Organization for Standardization (ISO) in the ISO 639 standard. Parts 1 and 2 of the standard (ISO 639-1 and ISO 639-2) provide the 2-letter and 3-letter codes, respectively.

I discovered during research for this article that some languages have more than one 3-letter code, one used for bibliographic purposes and the other for terminology purposes.

The codes for my languages are as follows:

Language    ISO 639-1    ISO 639-2
English     EN           ENG
French      FR           FRE (bibliographic) / FRA (terminology)
Greek       EL           GRE (bibliographic) / ELL (terminology)
Spanish     ES           SPA
Catalan     CA           CAT

I’m not keen on splashing out on the actual standard↗ so here’s a handy link to the key information↗ by the US Library of Congress, which is more than sufficient for most of us.

language codes

See ISO 639.

language pair

Translators and other people working in the sector speak about their language pairs all the time. The first language mentioned is the source language, the second one is the target language.

My language pairs (using ISO 639 2-letter codes) are:

FR-EN

EL-EN

ES-EN

localisation

Localising a translation means adapting it to a particular part of the world (or locale). It requires a good understanding of the language, dialect, idioms and culture of the target locale. For example, the Spanish of Spain is different to that of South America; and within South America, there is also remarkable diversity.

The cool kids sometimes say L10N instead (L followed by 10 letters then N).

LQA

CAT tool

LQA stands for ‘language quality assurance’.

After a translation has been completed, LQA is performed as a final check to make sure everything is as it should be.

The LQA tool, either within the CAT tool or an external program, flags potential errors such as:

  • empty target segments,

  • extra spaces,

  • identical source and target segments.

Next, a human goes through the potential errors one by one, either making the corresponding correction or dismissing the error because it’s not actually an error (LQA checks tend to be overzealous).
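
The three checks listed above are simple enough to sketch in Python. This is an illustration only: the function name, the messages and the representation of a document as a list of (source, target) pairs are my own assumptions, and real LQA tools run many more checks.

```python
def lqa_check(units: list[tuple[str, str]]) -> list[tuple[int, str]]:
    """Flag potential errors; a human still decides whether each one is real."""
    issues = []
    for number, (source, target) in enumerate(units, start=1):
        if not target.strip():
            issues.append((number, "empty target segment"))
            continue  # nothing else to check on an empty target
        if target != target.rstrip():
            issues.append((number, "extra space at end of target"))
        if source.strip() == target.strip():
            issues.append((number, "target identical to source"))
    return issues


units = [
    ("Bonjour.", "Hello."),
    ("Dose : 5 mg", ""),
    ("Aspirin", "Aspirin"),
    ("Merci.", "Thank you. "),
]
flags = lqa_check(units)
```

Segment 3 shows why a human has to review the flags: a product name that is the same in both languages is flagged even though it is perfectly correct.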

LSP

LSP stands for ‘language service provider’, a business providing translation and other language-related services.

machine translation

A machine translation (MT) is one in which a human does not intervene between the input and the output. This is what happens, for example, when you visit a free online translation website: you enter your text and the translation is generated almost instantaneously.

The increasing use of machine translation for professional translations has led to a parallel rise in demand for people who provide PEMT services.

MT should not be confused with computer-aided translation (CAT).

match

CAT tool

As you translate a document in your favourite CAT tool, it helps you by continually searching through the translation memory (TM) and presenting you with any matches it finds. Matches are assigned a percentage.

A match may be:

  • an exact or 100% match. The current source segment and the source segment in the TM are identical.

  • a fuzzy match. The match between the current source segment and the source segment in the TM is less than 100% but above a certain threshold called the fuzzy threshold, which is adjustable.

  • a context match. An exact match for the current source segment and also for the previous and/or next source segment. It’s sometimes called a 101% match (exact match for previous or next source segment) or a 102% match (exact match for previous and next source segment).

If the CAT tool does not find a matching segment or only finds poorly matching segments (% below the fuzzy threshold), the source segment is considered new.
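
Here is a minimal Python sketch of how a segment might be classified as exact, fuzzy or new. The percentage is computed with `difflib.SequenceMatcher` from the standard library, which is only a stand-in: each CAT tool has its own scoring algorithm, and the default threshold of 75 is an assumption. Context matches are left out to keep things short.

```python
from difflib import SequenceMatcher


def match_percentage(current: str, tm_source: str) -> int:
    """Stand-in score from 0 to 100 (not the algorithm of any real CAT tool)."""
    return int(SequenceMatcher(None, current, tm_source).ratio() * 100)


def classify(current: str, tm_sources: list[str], fuzzy_threshold: int = 75):
    """Return the match type and the best percentage found in the TM."""
    best = max((match_percentage(current, s) for s in tm_sources), default=0)
    if best == 100:
        return "exact", best
    if best >= fuzzy_threshold:
        return "fuzzy", best
    return "new", best


tm_sources = ["Store below 25 degrees."]
exact = classify("Store below 25 degrees.", tm_sources)
fuzzy = classify("Store below 30 degrees.", tm_sources)
new = classify("Shake well before use.", tm_sources)
```

Raising `fuzzy_threshold` makes the tool stricter, so more segments end up being treated as new.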

See also repetition.

monolingual file

A monolingual file is a file format used by programs that we are all familiar with such as the .docx format of Microsoft Word.

When CAT tools came along, there was a new need to distinguish between monolingual files and bilingual files.

MT

See machine translation.

MTPE

MTPE is the abbreviation of ‘machine translation post-editing’. Although it is less common, I much prefer this term to ‘post-editing machine translation’ (PEMT).

NDA

NDA stands for ‘non-disclosure agreement’. Freelance translators are often asked to sign one.

PEMT

PEMT is the abbreviation of ‘post-editing machine translation’. MTPE (machine translation post-editing) is a synonym.

The ‘post’ in ‘post-edit’ refers to the fact that a human intervenes after the translation has been done by a machine.

In recent years, PEMT has gone mainstream, largely as a result of improvements in the quality of machine translations. Today most, if not all, CAT tools support PEMT through integration with an MT provider.

PM

PM stands for ‘project manager’.

pre-translation

CAT tool

Pre-translation is the translation done by a CAT tool before the human starts working on it. The chunks of translated text are sourced from a translation memory or a machine translation provider.

QA

QA stands for ‘quality assurance’.

See LQA.

repetition

CAT tool

In CAT tool jargon, a repetition is a match in which the translation unit has the same source segment as another translation unit in the same document.

A repetition should not be confused with a 100% match, in which the matching source segment was found in an external resource such as a translation memory.

segment, segmentation

CAT tool

Segmentation is the process by which a CAT tool breaks down the text of a document into much smaller chunks, segments, which are typically a single sentence but may be part of a sentence or even a single word or number (such as a value in a table cell).
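
As a very naive illustration of sentence-level segmentation, the Python sketch below splits text after a full stop, exclamation mark or question mark that is followed by whitespace and a capital letter. This rule is my own simplification; real CAT tools use configurable rule sets (often exchanged in the SRX format) that also cope with abbreviations, numbers and table cells.

```python
import re

# Naive rule: break after ., ! or ? when followed by whitespace and a capital letter.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def segment(text: str) -> list[str]:
    """Split a paragraph into sentence-like segments."""
    return [s for s in _BOUNDARY.split(text.strip()) if s]


segments = segment("Take one tablet daily. Do not exceed the dose! Ask your doctor.")
```

A rule this simple would wrongly split after an abbreviation such as ‘Dr.’ followed by a name, which is exactly the kind of case that proper segmentation rules handle.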

Segments are paired up in translation units. The two segments in a translation unit are the source segment and the target segment, which correspond, respectively, to a chunk of text in the source language and its translation, or to an empty space if the source segment has not yet been translated.

Here is a translation unit. The target side is still empty because the source segment hasn’t been translated yet.

sight translation

The process by which text in one language is conveyed orally in another language. Interpreters are sometimes asked to perform sight translation during an assignment, most typically in a healthcare setting or a courtroom.

See translation.

SME

SME stands for ‘subject matter expert’ (or ‘subject matter expertise’).

sworn translator / sworn translation

To be recognised and accepted as valid, translated documents submitted to government bodies or private entities are subject to the requirements laid down by that government or entity.

In common with some (not all) other countries, Spain has established a system of centrally approved translators. These are sworn translators (traductor jurado/traductora jurada). A sworn translation is one that has been done by a sworn translator and is accompanied by a photocopy of the original document (the photocopy must be a hard copy, not an electronic file) and a certificate in which the translator attests that the translation is faithful to the original, along with his/her credentials and seal. To become a sworn translator in Spain, you must pass a series of exams, after which you are appointed by the Ministry of Foreign Affairs for a specific language pair (which is either to or from Spanish).

The system in other countries is very different. In the UK, for example, there is no system of pre-approved translators. The requirements for ‘official’ translations vary widely. It may be sufficient for the translator to declare in writing with a signature that the translation is an accurate translation (a certified translation). In these cases, my understanding is that usually the translator needs to be a member of a recognised translation body such as the Institute of Translation and Interpreting (ITI).

In short, there is no single standardised procedure for rendering a translation ‘official’. In fact, the opposite holds true: there are many procedures, ranging from straightforward and relatively inexpensive to complex and costly in time and money. The golden rule is therefore to find out exactly what is required from whomever needs the translation.

T&I

T&I stands for ‘translators and interpreters’. This shorthand is sometimes used in forums and on social media. I have also noticed how some interpreters refer to themselves as ‘terps’ but I’m not sure how widespread this term is.

TB

TB stands for ‘term base’.

TBX

CAT tool

TBX stands for ‘TermBase eXchange’.

As the name suggests, it was developed as a vendor-neutral term base format allowing users of a given CAT tool to use term bases created by another CAT tool, and vice versa.

I think all commonly used CAT tools can export and import .tbx files:

  • the export procedure converts a TB from the native format (e.g. the .sdltb format for Studio) to the .tbx format.

  • the import procedure converts a TB in the .tbx format to a TB in the native format.

TEP

TEP describes the typical workflow for translations: Translation, Editing, Proofreading.

term base

CAT tool

A term base, often abbreviated to TB, is a database containing terminology data. TBs are used by CAT tools to present to you previously translated terms as you translate.

Each entry in a TB contains, for each language, an identifier of that language and the term in that language. A TB may contain more than two languages.

Beyond the terms, each entry can store other useful data such as:

  • when the term for a particular language was added or modified, and by whom,

  • linguistic notes such as the gender of nouns (masculine/feminine/neuter), the part of speech (noun, verb, pronoun, …), etc.,

  • usage notes (whether a particular term is preferred, permitted or prohibited).


term extraction

CAT tool

Building up a term base from scratch is a laborious process that can be made easier by using the automated term extraction feature of CAT tools. Terms are extracted from translation memories and from pairs of documents in which each document in a pair is a translation of the other (you have to tell the CAT tool which document in each pair is the source document and which is the target document).

TM

CAT tool

See translation memory.

TMX

CAT tool

TMX stands for ‘Translation Memory eXchange’.

As the name suggests, it was developed as a vendor-neutral translation memory format allowing users of a given CAT tool to use translation memories created by another CAT tool, and vice versa.

Most, if not all, CAT tools used today can export and import .tmx files:

  • the export procedure converts a TM from the native format (e.g. the .sdltm format for Studio) to a TM in the .tmx format,

  • the import procedure converts a TM in the .tmx format to a TM in the native format.
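
Because TMX is plain XML, it can be read with any XML library. The Python sketch below parses a tiny hand-written TMX 1.4 sample with the standard library; the sample content, the `creationtool` value and the `read_tmx` helper are made up for illustration, and the files exported by a real CAT tool carry far more metadata.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample (hypothetical content).
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="hypothetical-tool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="none" adminlang="en"
          srclang="fr" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="fr"><seg>Tenir hors de portée des enfants.</seg></tuv>
      <tuv xml:lang="en"><seg>Keep out of reach of children.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx(xml_text: str) -> list[dict[str, str]]:
    """Return one {language code: segment text} dictionary per translation unit."""
    root = ET.fromstring(xml_text)
    units = []
    for tu in root.iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")})
    return units


units = read_tmx(SAMPLE_TMX)
```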

transcreation

‘Transcreation’ is a portmanteau of the words ‘translation’ and ‘creation’.

Although some translators object to this term, I think it has every reason to exist because some types of document, typically those used for marketing purposes, require more creativity. The translation of taglines for a collection of high-end beauty products makes different demands on the translator’s brain than the translation of a plastic manufacturer’s technical data sheets.

I think the real issue is whether translators are adequately compensated for the creative part.

transcription

The process by which speech becomes writing.

translation

The process by which text (i.e. writing) in one language becomes text in another language.

Translation should not be confused with interpreting!

This diagram should make clear the distinction between various language tasks including translation and interpreting:

NOTE: That’s Greek by the way. The word οδός appears in English words such as cathode (literally ‘down-street’) and anode (‘up-street’).

translational research

I’ve included this term even though it has nothing to do with linguistics.

‘Translational’ in ‘translational research’ and similar-sounding terms like ‘translational medicine’ or ‘translational science’ is about translating basic research findings into something that directly benefits human health, pithily summarised in the phrase ‘bench to bedside’.

transliteration

The process by which text is rendered in a different script on the basis of how it sounds.

One well-established, standardised system of transliteration is the Hanyu Pinyin system, which is used to render Mandarin Chinese in the Latin script.

So the following ideogram becomes mèi in Pinyin (which is a whole lot easier to enter on my keyboard!).

translation memory (TM)

CAT tool

A database used by CAT tools for the storage of translations. The fundamental unit of translation memories is the translation unit. As you translate, the TM is the primary location where a CAT tool will look to see if the text you are translating has already been translated.

A translation memory is a bilingual file. As far as I know, TMs don't hold more than two languages.

translation unit (TU)

CAT tool

The translation unit (TU) is the fundamental unit of translation memories.

Each TU consists of a source-language segment and a target-language segment. Before the translation is done, the target segment is empty.

Here is a TU:

Here is another one (which hasn’t been translated yet):

These TUs are grossly oversimplified. In reality, TUs also store a lot of other useful data such as when the segment was added or modified and by whom, the match percentage, etc.
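
For the programmers among you, a TU can be pictured as a small record. The Python sketch below is purely illustrative: the field names are my own invention and do not correspond to the internal format of any particular CAT tool.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TranslationUnit:
    """A source segment, its translation and some metadata (hypothetical fields)."""
    source: str
    target: str = ""                       # empty until the segment is translated
    created_by: Optional[str] = None
    modified_at: Optional[datetime] = None
    match_percentage: Optional[int] = None

    @property
    def is_translated(self) -> bool:
        return bool(self.target.strip())


tu = TranslationUnit(source="Merci.")
before = tu.is_translated
tu.target = "Thank you."
after = tu.is_translated
```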

unclean and clean files

CAT tool

Essentially:

  • an unclean file is a bilingual document,

  • and a clean file is the translation in the monolingual format, i.e. the format in which the document to be translated was originally provided.

If you’re wondering why the term ‘clean’ is used, keep on reading.

Today, CAT tools provide a standalone translation environment. The modern workflow (as described in bilingual files) therefore involves conversion of files from everyday formats such as .docx (Microsoft Word files) into bilingual formats such as .sdlxliff or .mqxliff.

In the 2000s, one of the popular CAT tools was SDL Trados. With some file types, the translator could work in Microsoft Word on specially marked up Word documents:

Source: Translator’s Workbench User Guide (2007, SDL plc)

This translation environment was basically Word with some extra commands added in by SDL Trados. As you worked on the document, source and target segments would be presented to you in turn in coloured boxes. After completing the translation, the final result would look like a standard Word document but it would contain hidden source segments and tags that could be easily revealed. It was ‘unclean’. So before the document could be delivered to the client, it had to undergo an additional cleaning step to remove everything that SDL Trados had added. The command was actually called ‘Clean’.

The modern equivalent of ‘Clean’ (i.e. for creating the final file to be delivered to the client) is ‘Save File as Target’ or something similar. As far as I know, cleaning metaphors are no longer used.

weighted words

CAT tool

See grid.

XLIFF

CAT tool

XLIFF stands for ‘XML Localisation Interchange File Format’.

The impetus behind the creation of the XLIFF format was to provide a standardised bilingual format that would permit users of different CAT tools to work on the same document.

Many CAT tool makers have their own variants of the XLIFF format, e.g. .mqxliff used by memoQ or .sdlxliff used by Studio. The different ‘flavours’ of XLIFF are not directly interoperable but interoperability is achieved by using the XLIFF format as an intermediate format during conversions.
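
To give an idea of what a bilingual XLIFF file contains, here is a Python sketch that reads a tiny hand-written XLIFF 1.2 sample using the standard library. The sample and the `read_xliff` helper are assumptions made for illustration; vendor flavours add their own elements and attributes on top of this basic structure.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 documents.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hand-written sample (hypothetical content); the second unit is not yet translated.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="MonTexte.docx" source-language="fr" target-language="en"
        datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Bonjour.</source>
        <target>Hello.</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Merci.</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def read_xliff(xml_text: str) -> list[tuple[str, str, str]]:
    """Return (id, source, target) for each unit; target is '' if missing."""
    root = ET.fromstring(xml_text)
    return [
        (
            unit.get("id"),
            unit.findtext("x:source", namespaces=NS),
            unit.findtext("x:target", default="", namespaces=NS),
        )
        for unit in root.iterfind(".//x:trans-unit", NS)
    ]


rows = read_xliff(SAMPLE_XLIFF)
```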

medical text matters
Your medical expert for language services


Services

Translation

I can take on most translation projects in the medical field in the following language combinations:

  • French to English,

  • Greek to English,

  • Spanish to English.

Editing services

If you have a text that is already in English, I can — depending on how finalised it is — either proofread it or edit it to make it publication-ready. Editing is for texts that still need substantial changes to become publication-ready. Proofreading should be the very last check before a text is published. As a proofreader, I check for errors at the level of individual characters and words such as missing punctuation or misspelt words.

If you’re not sure whether your document requires editing or proofreading, send it to me so I can evaluate it — I’ll also provide a quote and delivery date at no cost and no obligation to you.

Other services

As a freelancer, I’m not boxed in by rigid or narrowly defined job descriptions. This means that in addition to the language services mentioned above, I have done other types of work, including subtitling, copywriting, writing scientific article abstracts and, in relation to scientific conferences, reviewing slideshows and designing posters.

Experience (as freelancer)

FIELDS: Medicine | Surgery | Nursing.

PRODUCTS: Traditional small-molecule drugs | Biologicals | Blood-derived products | Medical devices | Imaging devices.

THERAPEUTIC FIELDS: Oncology | Cardiovascular diseases | Endocrinology esp. diabetes and obesity | Haematology | Genetics and rare diseases.

Experience (as employee)

Journalist covering the pharmaceutical sector (2y).

Proofreader / editorial coordinator for a medical publisher (3y).

Writer / editor for a European portal on rare diseases (1y).

Medical writer / translator for a pharmaceutical industry service provider (3y).

Technical writer for a vendor of preclinical research tools (3y).

Let’s talk!

Contact me to discuss your project or to get a free quote.

I can be contacted via:

● the contact form on this website,

● email at either of these addresses:

andytheo.mtm (aτ) gmail.com

contact (aτ) medical-text-matters.com