October 8, 2024
Our article translation workflow with the OpenAI API
3 min read
At Premier Octet, we enjoy sharing our technical articles with as many people as possible. To reach an international audience, we have set up a custom script that allows us to automatically translate our articles from French into English, hassle-free. Once translated, a banner appears at the top of the article to indicate that the article is available in English (as seen above this paragraph).
In this article, I will take you through our process.
Our structure
Our website, and therefore this blog, uses the Next.js framework. Our blog articles are stored as static MDX files, a format which combines markdown and React components.
Each French article is stored in the /src/pages/blog directory, and English articles are stored in /src/pages/blog/en:
src/
  pages/
    blog/
      nouveautes-react-19.mdx
      en/
        nouveautes-react-19.mdx
Next.js routing simply follows the structure of the folders.
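To make the mapping concrete, here is a minimal sketch of how a file path under src/pages translates into a URL. The routeFromPagePath helper is hypothetical and written only for illustration. It is not part of Next.js, which performs this mapping internally.

```typescript
// Hypothetical helper (not a Next.js API): mirrors how a file under
// src/pages maps to a URL path with folder-based routing.
const routeFromPagePath = (filePath: string): string => {
  const route = filePath
    .replace(/^\/?src\/pages/, '') // drop the pages root
    .replace(/\.(mdx|md|tsx|ts|jsx|js)$/, '') // drop the page extension
    .replace(/\/index$/, '') // index files map to their folder
  return route === '' ? '/' : route
}

console.log(routeFromPagePath('src/pages/blog/nouveautes-react-19.mdx'))
// → /blog/nouveautes-react-19
console.log(routeFromPagePath('src/pages/blog/en/nouveautes-react-19.mdx'))
// → /blog/en/nouveautes-react-19
```

The English article therefore ends up under /blog/en/ without any extra routing configuration.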
Our translation workflow
To translate our French articles into English automatically, we wrote a script. A single command creates a new English page from the French content:
yarn post:translate src/pages/blog/my-article.mdx
The command is set up in our package.json file:
{
  "scripts": {
    "post:translate": "tsx scripts/translate.ts"
  }
}
Our script, translate.ts, runs a "self-executing" async function that expects the path of the MDX file to translate as an argument:
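;(async () => {
  // The first CLI argument is the path of the MDX file to translate
  const mdxPath = process.argv[2]
  if (!mdxPath) {
    console.error('❌ no path')
    // Report the failure to the shell with a non-zero exit code
    process.exitCode = 1
    return
  }
  await translate(mdxPath)
})()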
Let's now see how the translate function works.
Reading the MDX file
We start by reading the full content of the MDX file:
import fs from 'node:fs/promises'

export const translate = async (mdxPath: string) => {
  // Read the whole French article as a UTF-8 string
  const content = await fs.readFile(mdxPath, 'utf-8')
}
Calling GPT for translation
Once the file is read, the script instantiates the OpenAI client using the API key stored in our environment variables. Then, it sends a request to translate the content:
import OpenAI from 'openai' // at the top of translate.ts

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
})
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'Translate the provided MDX text from French to English...' },
    { role: 'user', content },
  ],
  stream: true,
})
With stream: true, the API returns the translation incrementally, chunk by chunk, which is ideal for real-time feedback.
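To show what consuming such a stream looks like without calling the API, here is a small sketch. The Chunk type, fakeStream, and collect are assumptions made for this illustration. Chunk only models the one field the script reads, and fakeStream stands in for the real OpenAI stream.

```typescript
// Minimal shape of a streamed chunk: only the field the loop reads.
type Chunk = { choices: { delta?: { content?: string | null } }[] }

// Stand-in for the OpenAI stream (assumption: for illustration only).
async function* fakeStream(pieces: string[]): AsyncGenerator<Chunk> {
  for (const piece of pieces) {
    yield { choices: [{ delta: { content: piece } }] }
  }
  // A chunk may carry no text at all, so the consumer must tolerate that.
  yield { choices: [{ delta: {} }] }
}

// Concatenate the text fragments as they arrive.
const collect = async (stream: AsyncIterable<Chunk>): Promise<string> => {
  let text = ''
  for await (const part of stream) {
    text += part.choices[0]?.delta?.content ?? ''
  }
  return text
}

collect(fakeStream(['Hello', ', ', 'world'])).then((text) => console.log(text))
// → Hello, world
```

Because each fragment is available as soon as it is yielded, the consumer can update a display on every iteration instead of waiting for the full response.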
Progress feedback
For a fairly large article, the translation can take several seconds, so we added a progress bar.
The bar advances with each new chunk of translated text, using the ratio of translated length to source length as an estimate:
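import ProgressBar from 'progress' // at the top of translate.ts

let translatedContent = ''
const bar = new ProgressBar('🇺🇸 Translating [:bar] :percent', { total: 100 })
let currentProgression = 0
for await (const part of stream) {
  translatedContent += part.choices[0]?.delta?.content || ''
  // Estimate progress from the length ratio, capped at 100%
  const percent = Math.min(100, Math.floor((translatedContent.length * 100) / content.length))
  if (percent > currentProgression) {
    // Advance by the full difference so the bar never lags behind
    bar.tick(percent - currentProgression)
    currentProgression = percent
  }
}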
We're using the npm progress package to handle the progress bar rendering.
This allows us to track the progress of the translation in real time:
$ tsx scripts/translate.ts src/pages/blog/our-workflow-for-translating-our-articles-with-gpt.mdx
🇺🇸 Translating [░░░░░░----------------------------------------] 13%
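The percentage shown is only an estimate, since the English text rarely has the same length as the French source. The sketch below isolates that arithmetic in a hypothetical estimatePercent helper, which is not part of our script, and clamps the result so an English text longer than the source cannot push the bar past 100%.

```typescript
// Hypothetical helper: estimated completion as a whole percentage,
// clamped to 0..100 because the translation can be longer than the source.
const estimatePercent = (translatedLength: number, sourceLength: number): number => {
  if (sourceLength <= 0) return 100 // nothing to translate
  const raw = Math.floor((translatedLength * 100) / sourceLength)
  return Math.min(100, Math.max(0, raw))
}

console.log(estimatePercent(1300, 10000)) // → 13
console.log(estimatePercent(12000, 10000)) // → 100
```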
Adjusting the translated content
Once the translation is finished, the script makes a few adjustments. It updates the BlogPost component import path, since the English file sits one directory deeper, and adds the lang: en attribute to the frontmatter:
const updatedContent = translatedContent
  .replace('../../components/Layout/BlogPost', '../../../components/Layout/BlogPost')
  .replace(/^(---\s*\n)/, '$1lang: en\n')
These adjustments make sure the English version is correctly configured and matches the directory structure.
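To see the effect of those two replacements, here is a sketch that applies them to a tiny sample document. The adjustTranslatedMdx wrapper and the sample content (including its title field) are assumptions made for this example. Only the two replace calls come from our script.

```typescript
// Hypothetical wrapper around the two replacements shown above.
const adjustTranslatedMdx = (mdx: string): string =>
  mdx
    // The English file lives one folder deeper, so add one "../"
    .replace('../../components/Layout/BlogPost', '../../../components/Layout/BlogPost')
    // Insert "lang: en" right after the opening frontmatter delimiter
    .replace(/^(---\s*\n)/, '$1lang: en\n')

// Made-up sample article, for illustration only.
const sample = [
  '---',
  'title: Hello',
  '---',
  "import BlogPost from '../../components/Layout/BlogPost'",
  '',
  'Some text.',
].join('\n')

console.log(adjustTranslatedMdx(sample))
// → ---
//   lang: en
//   title: Hello
//   ---
//   import BlogPost from '../../../components/Layout/BlogPost'
//
//   Some text.
```

Note that the frontmatter pattern is anchored to the very start of the text, so it only matches when the translated output begins with the --- delimiter.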
Saving the translated file
Finally, the translated file is stored in an en subdirectory, a clean way to separate the French and English versions:
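import path from 'node:path' // at the top of translate.ts

// Place the English file in an en/ folder next to the French one
const enPath = path.join(path.dirname(mdxPath), 'en', path.basename(mdxPath))
await fs.mkdir(path.dirname(enPath), { recursive: true })
await fs.writeFile(enPath, updatedContent)
console.log(`✅ Translation saved to: ${enPath}`)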
Once saved, a confirmation message lets us know that everything went according to plan.
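As a quick check of the path logic, the sketch below computes the destination for the example command used earlier. The englishPathFor helper is hypothetical, and it uses path.posix (rather than plain path as in our script) only so that the printed result is the same on every operating system.

```typescript
import path from 'node:path'

// Hypothetical helper: same computation as in the script, but with
// path.posix so the output does not depend on the operating system.
const englishPathFor = (mdxPath: string): string =>
  path.posix.join(path.posix.dirname(mdxPath), 'en', path.posix.basename(mdxPath))

console.log(englishPathFor('src/pages/blog/my-article.mdx'))
// → src/pages/blog/en/my-article.mdx
```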
Thanks to this automation, we manage our multilingual articles easily. Translation becomes a quick and smooth process, letting us focus on creating content while reaching a wider audience.
And because we all love inception moments, check out this article translated by the very same script right here!
Finally, you can find the complete script on this gist.
👋