October 8, 2024

Our article translation workflow with the OpenAI API

3 min read

🇫🇷 This post is also available in French

At Premier Octet, we like to share our technical articles with as many people as possible. To reach an international audience, we wrote a custom script that automatically translates our articles from French into English. Once an article is translated, a banner at the top of the page indicates that it is available in the other language (as seen above this paragraph).

In this article, I will take you through our process.

Our structure

Our website, and therefore this blog, uses the Next.js framework. Our blog articles are stored as static MDX files, a format that combines Markdown and React components.

Each French article is stored in the /src/pages/blog directory, and English articles are stored in /src/pages/blog/en:

src/
  pages/
    blog/
      nouveautes-react-19.mdx
      en/
        nouveautes-react-19.mdx

Next.js routing follows the folder structure, so the English version of an article is served under /blog/en/.
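To make the mapping concrete, here is a minimal sketch of how a file under src/pages corresponds to a URL. The routeFromPagePath helper is hypothetical and written for illustration only: Next.js performs this resolution internally, and the sketch ignores dynamic segments and assumes MDX is enabled as a page extension.

```typescript
// Hypothetical helper: approximates how the Next.js pages router derives a URL
// from a file path. Not part of Next.js or of our script.
const PAGES_ROOT = 'src/pages'

function routeFromPagePath(filePath: string): string {
  const relative = filePath.startsWith(PAGES_ROOT + '/')
    ? filePath.slice(PAGES_ROOT.length)
    : '/' + filePath
  // Drop the page extension, then collapse a trailing /index into its folder
  const withoutExt = relative.replace(/\.(mdx|md|tsx|ts|jsx|js)$/, '')
  const route = withoutExt.replace(/\/index$/, '')
  return route === '' ? '/' : route
}

console.log(routeFromPagePath('src/pages/blog/nouveautes-react-19.mdx'))
// → /blog/nouveautes-react-19
console.log(routeFromPagePath('src/pages/blog/en/nouveautes-react-19.mdx'))
// → /blog/en/nouveautes-react-19
```

Both versions keep the same slug, which makes it easy to link one to the other.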

Our translation workflow

To translate our French articles into English, we wrote a script. Running the following command creates a new English page from the French content:

yarn post:translate src/pages/blog/my-article.mdx

The command is set up in our package.json file:

{
  "scripts": {
    "post:translate": "tsx scripts/translate.ts"
  }
}

Our script, translate.ts, runs an immediately invoked async function that expects the path of the MDX file to translate as its first argument:

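;(async () => {
  // argv[0] is the runtime, argv[1] the script, argv[2] the first CLI argument
  const mdxPath = process.argv[2]

  if (!mdxPath) {
    console.error('❌ no path')
    process.exitCode = 1 // report the failure to the calling shell
    return
  }

  await translate(mdxPath)
})()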
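The guard above only checks that an argument is present. As a sketch of a slightly stricter check (our script does not do this, and parseMdxArg is a hypothetical name), the argument could also be validated by extension before any API call is made:

```typescript
// Hypothetical stricter validation of the CLI argument; not in the original script.
type ParsedArg = { ok: true; mdxPath: string } | { ok: false; error: string }

function parseMdxArg(argv: string[]): ParsedArg {
  // argv[0] is the runtime, argv[1] the script path, argv[2] the first user argument
  const mdxPath = argv[2]
  if (!mdxPath) return { ok: false, error: 'no path' }
  if (!mdxPath.endsWith('.mdx')) return { ok: false, error: `not an MDX file: ${mdxPath}` }
  return { ok: true, mdxPath }
}

const parsed = parseMdxArg(['node', 'scripts/translate.ts', 'src/pages/blog/my-article.mdx'])
console.log(parsed)
```

Returning a result object rather than throwing keeps the entry point free to decide how to report the error.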

Let's now see how the translate function works.

Reading the MDX file

We start by reading the full content of the MDX file:

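import fs from 'node:fs/promises'

export const translate = async (mdxPath: string) => {
  // Read the whole French article (frontmatter included) as a UTF-8 string
  const content = await fs.readFile(mdxPath, 'utf-8')
}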

Calling GPT for translation

Once the file is read, the script instantiates the OpenAI client using the API key stored in our environment variables. Then, it sends a request to translate the content:

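import OpenAI from 'openai'

const openai = new OpenAI({
  // The key is read from the environment, never hard-coded
  apiKey: process.env.OPENAI_API_KEY,
})

const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    // The system prompt carries the translation instructions
    { role: 'system', content: 'Translate the provided MDX text from French to English...' },
    // The user message is the raw French MDX
    { role: 'user', content },
  ],
  // Receive the answer chunk by chunk instead of in one response
  stream: true,
})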

With stream: true, the API returns the translation incrementally, chunk by chunk, instead of in a single response once generation is finished. This is what makes real-time feedback possible.
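To illustrate what consuming such a stream looks like without calling the API, here is a minimal sketch that uses a fake async generator in place of the OpenAI stream. The fakeStream and collect names are hypothetical, and the Chunk type models only the choices[0].delta.content field that our script reads from each part.

```typescript
// Minimal stand-in for the chunks yielded by a streamed chat completion.
// Only the fields read here are modeled; real chunks carry more.
type Chunk = { choices: { delta: { content?: string | null } }[] }

// Hypothetical fake stream, used so the sketch runs without an API key
async function* fakeStream(pieces: string[]): AsyncGenerator<Chunk> {
  for (const piece of pieces) {
    yield { choices: [{ delta: { content: piece } }] }
  }
  // A chunk may carry no content at all, hence the fallback in collect()
  yield { choices: [{ delta: {} }] }
}

// Concatenate the content of every chunk as it arrives
async function collect(stream: AsyncIterable<Chunk>): Promise<string> {
  let text = ''
  for await (const part of stream) {
    text += part.choices[0]?.delta?.content || ''
  }
  return text
}

collect(fakeStream(['Hello', ', ', 'world'])).then((text) => console.log(text))
// → Hello, world
```

Because the loop only relies on the async iterable protocol, the same code can be exercised in tests with a fake stream and in production with the real one.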

Progress feedback

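For a fairly long article, the translation can take several seconds, so we added a progress bar.

The bar updates with each new chunk of translated text. Since the final length of the translation is not known in advance, progress is estimated by comparing the length of the text translated so far with the length of the French source: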

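let translatedContent = ''
const bar = new ProgressBar('🇺🇸  Translating [:bar] :percent', { total: 100 })
let currentProgression = 0

for await (const part of stream) {
  translatedContent += part.choices[0]?.delta?.content || ''
  // Estimate progress from the length ratio, capped at 100 because the
  // translation can end up longer than the source
  const percent = Math.min(100, Math.floor((translatedContent.length * 100) / content.length))

  if (percent > currentProgression) {
    // A single chunk can move the estimate by more than one point
    bar.tick(percent - currentProgression)
    currentProgression = percent
  }
}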

We use the progress package from npm to render the bar.

This allows us to track the progress of the translation in real time:

$ tsx scripts/translate.ts src/pages/blog/our-workflow-for-translating-our-articles-with-gpt.mdx
🇺🇸  Translating [░░░░░░----------------------------------------] 13%
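A length-ratio estimate has two edge cases worth spelling out: the English text can be longer than the French source, which pushes the ratio past 100, and a single chunk can advance the estimate by more than one point. The sketch below isolates that arithmetic in a hypothetical progressStep helper (not part of our script) that clamps the percentage and returns how many ticks to apply to a 100-step bar:

```typescript
// Hypothetical helper: turns text lengths into a clamped percentage and the
// number of ticks needed to move a 100-step bar from `current` to it.
function progressStep(translatedLength: number, sourceLength: number, current: number) {
  // Guard against an empty source, then cap at 100 since a translation
  // can be longer than the original
  const raw = sourceLength > 0 ? Math.floor((translatedLength * 100) / sourceLength) : 100
  const percent = Math.min(100, raw)
  // Never move backwards
  const ticks = Math.max(0, percent - current)
  return { percent: Math.max(percent, current), ticks }
}

console.log(progressStep(130, 1000, 0)) // { percent: 13, ticks: 13 }
console.log(progressStep(1200, 1000, 13)) // { percent: 100, ticks: 87 }
```

The returned ticks value is what would be passed to bar.tick(), and percent becomes the new current value for the next chunk.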

Adjusting the translated content

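Once the translation is finished, the script makes a few adjustments. Because the English file sits one directory deeper than the French one, the relative import path of the BlogPost component needs an extra ../. The script also adds the lang: en attribute to the frontmatter: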

const updatedContent = translatedContent
  .replace('../../components/Layout/BlogPost', '../../../components/Layout/BlogPost')
  .replace(/^(---\s*\n)/, '$1lang: en\n')

With these changes, the relative import resolves correctly from the en subdirectory and the page is flagged as English.
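As a worked example, the sketch below applies the same two replacements to a small sample. The adjustForEnglish wrapper and the sample content are invented for the illustration.

```typescript
// Same two replacements as in the script, wrapped in a hypothetical helper
function adjustForEnglish(mdx: string): string {
  return mdx
    // The English file lives one folder deeper, so the import needs one more '../'
    .replace('../../components/Layout/BlogPost', '../../../components/Layout/BlogPost')
    // Insert `lang: en` right after the opening frontmatter delimiter
    .replace(/^(---\s*\n)/, '$1lang: en\n')
}

// Sample input invented for the example
const sample = [
  '---',
  'title: What is new in React 19',
  '---',
  '',
  "import BlogPost from '../../components/Layout/BlogPost'",
].join('\n')

console.log(adjustForEnglish(sample))
```

The output starts with the opening --- followed by lang: en, and the import now points to '../../../components/Layout/BlogPost'. Note that a string pattern passed to replace only touches the first occurrence, which is enough here since the component is imported once.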

Saving the translated file

Finally, the translated file is written to an en subdirectory next to the French source, which keeps the two versions cleanly separated:

const enPath = path.join(path.dirname(mdxPath), 'en', path.basename(mdxPath))
await fs.mkdir(path.dirname(enPath), { recursive: true })
await fs.writeFile(enPath, updatedContent)
console.log(`✅ Translation saved to: ${enPath}`)

Once saved, a confirmation message lets us know that everything went according to plan.
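To show what the path computation produces, here is a small sketch with a hypothetical englishPathFor helper. It uses path.posix, an assumption made so that the example prints the same result on every operating system, whereas the script relies on the platform default.

```typescript
import * as path from 'node:path'

// Hypothetical helper mirroring the path computation of the script.
// path.posix is used so the result is identical on every operating system.
function englishPathFor(mdxPath: string): string {
  return path.posix.join(path.posix.dirname(mdxPath), 'en', path.posix.basename(mdxPath))
}

console.log(englishPathFor('src/pages/blog/my-article.mdx'))
// → src/pages/blog/en/my-article.mdx
```

The file name is preserved, so the French and English versions of an article always share the same slug.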

Thanks to this automation, we manage our multilingual articles easily. Translation becomes a quick and smooth process, letting us focus on creating content while reaching a wider audience.

And because we all love inception moments, check out this article translated by the very same script right here!

Finally, you can find the complete script on this gist.

👋
