Roeland Hofkens, Chief Product & Technology Officer, LanguageWire
Large Language Models (LLMs) have taken the world by storm.
Some of these models, such as OpenAI’s GPT-4 and Google’s PaLM 2, have been trained on multilingual data sets and should – at least in theory – also be very capable at machine translation tasks.
But is this really the case? How do we unlock the full potential of Large Language Models for machine translation? In this technical deep dive, we will look at how LLMs work in the context of machine translation and how they can be integrated into a Translation Management System (TMS).
Most current commercial machine translation tools, such as Google Translate, are based on neural models with a transformer architecture. These models are purpose-built for one task: machine translation. Out of the box, they already perform very well on generic content. However, in specialised business contexts, they might miss the right vocabulary or use a suboptimal style.
Therefore, it is useful to customise these models with additional business data by training them to recognise your personalised terms and phrases. Using various customisation techniques, the model ‘learns’ to use your business’s tone of voice and terminology, which then produces better machine translation results.
Large Language Models are usually also based on transformer architectures. However, compared to the Neural Machine Translation (NMT) models in the previous section, they are trained on much larger bodies of text and contain more model parameters: billions, versus a few hundred million in single-task, bilingual NMT models. This makes LLMs more flexible and ‘smarter’ when it comes to interpreting user instructions or ‘prompts’. This new technology opens up many new possibilities for model customisation with business data. Because this approach is so powerful, I prefer to speak of “personalisation” instead of “customisation”. Let’s explore how this personalisation works.
When using LLMs, there are essentially two approaches to fine-tuning the model so that it produces better quality at inference time, the moment when it generates its response.
Let’s first investigate parameter tuning.
Updating the parameters of an LLM can be a daunting task. Remember that even small LLMs have billions of parameters. Updating all of them is computationally very expensive and usually beyond the reach of a typical consumer; the cost and complexity are simply too high.
For machine translation purposes, we will normally start with an instruction-tuned LLM. This is a model that has been fine-tuned to be more helpful and to follow instructions, rather than simply predicting the next word. After tuning, the model performs better on a variety of tasks such as summarisation, classification, and machine translation. We will provide more information about which model to pick in future blog posts in this series.
Instruction-tuned LLMs are a good starting point for further, customer-specific optimisations. Using an approach called Parameter-Efficient Fine-Tuning (PEFT), we can fine-tune an instruction-tuned model with customer data in a shorter, more cost-effective way.
At LanguageWire, our preferred PEFT method is LoRA (Low-Rank Adaptation), which usually involves updating approximately 1.4–2.0% of the model weights. This keeps the customisation effort reasonable while remaining surprisingly effective: the authors of the LoRA paper conclude that LoRA can even outperform full fine-tuning of all the model parameters.
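To give a feel for how lightweight this is in practice, here is a minimal LoRA sketch, assuming the Hugging Face transformers and peft libraries. The model name, rank, and target modules are illustrative placeholders rather than our production configuration.

```python
# Minimal LoRA sketch, assuming the Hugging Face transformers and peft
# libraries. The model id, rank (r) and target_modules are illustrative
# placeholders, not LanguageWire's production configuration.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_model_name = "an-instruction-tuned-llm"  # hypothetical model id

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Low-rank adapter matrices are injected into the attention projections;
# only these adapter weights are trained, the base model stays frozen.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # depends on the base model architecture
)

model = get_peft_model(model, lora_config)
# Typically reports a trainable fraction in the low single-digit percent range.
model.print_trainable_parameters()
```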
For the best results from this method, we need to have access to a large amount of high-quality training data with matching source and target texts. If you have already built up a sizeable translation memory, then it can likely be used for this purpose. The LanguageWire AI team is constantly working to identify the ideal size of the translation memory for LoRA tuning.
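As an illustration of how a translation memory could feed such a fine-tuning run, the sketch below converts the translation units of a TMX export into simple prompt/completion records. The file name, language pair, and prompt template are assumptions made for this example, not our actual pipeline.

```python
# Sketch: turn the translation units of a TMX export into prompt/completion
# records (JSONL) for fine-tuning. File name, language pair and prompt
# template are illustrative assumptions.
import json
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def tmx_to_records(tmx_path, src_lang="en", tgt_lang="de"):
    """Extract aligned source/target segments from a TMX file."""
    records = []
    for tu in ET.parse(tmx_path).iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()[:2]
            seg = tuv.find("seg")
            if lang and seg is not None and seg.text:
                segments[lang] = seg.text.strip()
        if src_lang in segments and tgt_lang in segments:
            records.append({
                "prompt": f"Translate from {src_lang} to {tgt_lang}:\n{segments[src_lang]}",
                "completion": segments[tgt_lang],
            })
    return records

if __name__ == "__main__":
    with open("train.jsonl", "w", encoding="utf-8") as f:
        for record in tmx_to_records("translation_memory.tmx"):
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```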
Now let’s move on to the second approach, in-context or few-shot learning.
In-context learning is a method where the model learns on the fly from a small number of examples introduced by a specially crafted prompt. It is also known as few-shot learning.
In the context of machine translation, few-shot learning works by prepending a small set of curated example translations (matching source and target segments, typically drawn from the translation memory) to the prompt, followed by the new source text to be translated.
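To make this concrete, here is a minimal sketch of how such a prompt could be assembled. The instruction wording, language pair, and example sentences are purely illustrative.

```python
# Sketch: assemble a few-shot machine translation prompt from a handful of
# relevant translation memory matches. Instruction wording, language pair
# and example sentences are purely illustrative.
def build_few_shot_prompt(examples, source_text, src_lang="English", tgt_lang="German"):
    """examples: list of (source, target) pairs, ideally 3-5 relevant TM matches."""
    lines = [
        f"Translate the following text from {src_lang} to {tgt_lang}, "
        f"matching the terminology and tone of voice of the examples."
    ]
    for src, tgt in examples[:5]:  # more than ~5 examples rarely helps
        lines.append(f"{src_lang}: {src}\n{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {source_text}\n{tgt_lang}:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt(
    examples=[
        ("Click 'Save' to store your changes.",
         "Klicken Sie auf 'Speichern', um Ihre Änderungen zu speichern."),
        ("The warranty does not cover water damage.",
         "Die Garantie deckt keine Wasserschäden ab."),
    ],
    source_text="Contact our support team for further assistance.",
)
print(prompt)
```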
Few-shot learning for MT has a positive impact on fluency, tone of voice, and terminology compliance. It needs only a handful of examples to work, at most three to five. In fact, quality does not improve with larger sample sizes, so there is no benefit in packing your entire translation memory into a single prompt. Experiments have shown that LLMs do not handle very large prompt contexts well and that the quality of the results may even deteriorate.
By combining the benefits of LoRA and few-shot learning, we can implement powerful optimisations in the Large Language Model that ultimately lead to hyper-personalised, top-quality machine translation.
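To give an idea of how the two techniques can meet at inference time, the sketch below loads a hypothetical customer-specific LoRA adapter on top of a frozen base model and sends it a few-shot style prompt, again assuming the Hugging Face transformers and peft libraries. The model id, adapter path, and the prompt itself are placeholders.

```python
# Sketch: inference with a customer-specific LoRA adapter on top of a frozen
# base model, driven by a few-shot prompt. Model id, adapter path and prompt
# are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "an-instruction-tuned-llm"  # hypothetical model id
adapter_path = "adapters/customer-x-mt"       # hypothetical LoRA adapter location

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base_model, adapter_path)  # applies the customer's LoRA weights

# In practice this prompt would be produced by the few-shot helper sketched earlier.
prompt = (
    "Translate the following text from English to German, matching the examples.\n\n"
    "English: Click 'Save' to store your changes.\n"
    "German: Klicken Sie auf 'Speichern', um Ihre Änderungen zu speichern.\n\n"
    "English: Contact our support team for further assistance.\n"
    "German:"
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, i.e. the translation itself.
translation = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(translation)
```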
None of these techniques would work without a large set of high-quality, up-to-date bilingual text corpora in various language pairs. Your Translation Memories are an ideal source for this data set.
However, before it can be used, several important aspects must be considered, such as the quality of the segments and how up to date they are.
If you use the LanguageWire platform, the automated Translation Memory Management module takes care of these aspects for you and no manual action is needed.
If you have an existing external translation memory that you would like to use with our platform and machine translation services, our engineers can make this possible. LanguageWire engineers have created import APIs, cleanup scripts, and Language Quality Assessment Tools to help you make the most of your most valuable linguistic asset.
So how do we bring all this together for a typical translation project? Let’s look at an example.
LanguageWire offers a solution that is fully integrated into our technology ecosystem. This is demonstrated in high-level steps in Figure 1 below.
In this example, we have taken a simple workflow where a customer wants to translate PDF or office files. The user simply uploads the content files using the LanguageWire project portal. From there on, everything is orchestrated automatically.
FIGURE 1: A simple translation project in the existing LanguageWire Platform
In the second example, we focus on the pre-translation step, using machine translation based on LLM technology. As illustrated in Figure 2 below, the customer’s linguistic data plays a central role.
FIGURE 2: A translation example using a Large Language Model with a mixture of LoRA customisations and optimised in-context learning prompts.
When our specially crafted prompt is handled by the LLM, the custom weights in the LoRA module will contribute to a top-quality machine translation output. Once completed, this output then travels automatically to the next step in the process. Typically, this would be a post-editing task with a human expert in the loop for maximum final quality.
In a nutshell: our customers can expect even better machine translation. The MT can adapt automatically to varying contexts, for example different business verticals, and align with the expected tone of voice and choice of words of that vertical.
Not only will this reduce the cost of post-editing, it will also increase the delivery speed of translations. It also opens up a broader scope for using MT output directly, without human experts in the loop.
As we mentioned before, Large Language Models are very flexible. The LanguageWire AI team is investigating lots of other areas that could benefit from LLM technology.
We are currently researching:
Automated Language Quality Assessment. The LLM could check a human expert’s translation or another model’s machine translation output and give it a quality score; a rough sketch of how such a score could be requested is shown after this list. This could substantially reduce the cost of proofreading. The underlying Machine Translation Quality Estimation (MTQE) technology can also be applied to other use cases.
Content authoring assistants. Using a combination of PEFT with LoRA and few-shot learning, we can personalise the LLM to focus on content creation tasks. A customer could provide keywords and metadata, allowing the model to generate text that uses a business-customised tone of voice and word choices.
Further customisation of the LLM with data from termbases.
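As a rough illustration of the quality assessment idea above, the sketch below builds a simple scoring prompt and parses a numeric score from the model’s reply. The prompt wording, the 0–100 scale, and the ask_llm callable are assumptions for illustration only.

```python
# Sketch: LLM-based quality estimation for a translation. The scoring prompt,
# the 0-100 scale and the ask_llm callable are illustrative assumptions.
import re
from typing import Callable

def estimate_quality(source: str, translation: str, ask_llm: Callable[[str], str]) -> int:
    """Ask an LLM to rate a translation and parse the numeric score from its reply."""
    prompt = (
        "You are a translation quality rater. Rate the translation of the source text "
        "on a scale from 0 (unusable) to 100 (publication-ready). "
        "Respond with the number only.\n\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        "Score:"
    )
    reply = ask_llm(prompt)
    match = re.search(r"\d+", reply)
    return min(100, int(match.group())) if match else 0

# Example with a stubbed model client; a real client would call the LLM here.
score = estimate_quality(
    "The warranty does not cover water damage.",
    "Die Garantie deckt keine Wasserschäden ab.",
    ask_llm=lambda prompt: "92",
)
print(score)
```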
And there is lots more to come. Stay tuned for more blog posts in the coming weeks about AI and the future of LLMs at LanguageWire.