
Comrade AI: OpenAI’s ‘o1’ Reasoning Model Mysteriously ‘Thinks’ in Chinese


OpenAI’s recently released reasoning AI model dubbed “o1” has exhibited the curious behavior of sometimes “thinking” in Chinese while solving problems, even when questions are posed in English.

TechCrunch reports that the AI community has been abuzz after noticing a strange phenomenon with OpenAI’s first “reasoning” AI model called o1. Released just recently, o1 is designed to arrive at solutions to problems through a series of reasoning steps. However, users have observed that the model will occasionally switch to performing some of these reasoning steps in Chinese, Persian, or other languages before providing the final answer in English.

This peculiar behavior has left many scratching their heads, as the language switch seems to occur randomly, even in cases where the entire conversation with o1 has been conducted in English. “Why did [o1] randomly start thinking in Chinese?” a user pondered on social media platform X, formerly known as Twitter. “No part of the conversation (5+ messages) was in Chinese.”

OpenAI has remained tight-lipped about this issue, offering no explanation or acknowledgment of o1’s language inconsistencies. In the absence of an official statement, AI experts have put forward several theories to explain this odd behavior.

One hypothesis, supported by Hugging Face CEO Clément Delangue and Google DeepMind researcher Ted Xiao, suggests that the datasets used to train reasoning models like o1 contain a significant amount of Chinese characters. Xiao claimed that companies such as OpenAI and Anthropic utilize third-party data labeling services based in China for expert-level reasoning data related to science, math, and coding. He believes that o1’s tendency to switch to Chinese is a result of “Chinese linguistic influence on reasoning” stemming from these data providers.

However, not all experts are convinced by this theory. They argue that o1 is just as likely to switch to other languages like Hindi or Thai while working out a solution, rather than exclusively favoring Chinese. Instead, they propose that o1 and other reasoning models may simply be using the languages they find most efficient for achieving their objectives, or even hallucinating the language switches.

Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, told TechCrunch, “The model doesn’t know what language is, or that languages are different. It’s all just text to it.” He explained that models do not process words directly but instead operate on tokens, which can represent whole words, syllables, or even individual characters. Tokenization can also introduce biases: many tokenizers assume, for example, that a space marks the start of a new word, even though not every language uses spaces to separate words.
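Guzdial’s point about tokens can be made concrete with a short, illustrative sketch. The snippet below uses OpenAI’s open-source tiktoken library and its public cl100k_base encoding as a stand-in (o1’s actual tokenizer has not been published, so this is an assumption for illustration only): an English sentence and its Chinese equivalent both reduce to flat sequences of token IDs, with the English text splitting along spaces into word-like pieces and the Chinese text splitting into character- and byte-level fragments.

```python
# A minimal sketch of the tokenization point Guzdial describes, using OpenAI's
# open-source tiktoken library (pip install tiktoken). The cl100k_base encoding
# ships with GPT-4-class models; o1's actual tokenizer is not public, so this
# is purely illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "English": "Why did the model start thinking in Chinese?",
    "Chinese": "模型为什么开始用中文思考？",  # the same question, with no spaces between words
}

for label, text in samples.items():
    token_ids = enc.encode(text)
    # Inspect the raw bytes behind each token to see how the text was split.
    pieces = [enc.decode_single_token_bytes(t) for t in token_ids]
    print(f"{label}: {len(token_ids)} tokens")
    print("  ", pieces)

# Typical output: the English sentence splits into space-prefixed word tokens,
# while the Chinese sentence splits into per-character or per-byte fragments.
# Either way, the model only ever sees the integer IDs, not a "language".
```

Whatever the script is fed, both sentences come out as nothing more than lists of integers, which is the sense in which, as Guzdial puts it, “it’s all just text” to the model.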

Tiezhen Wang, a software engineer at Hugging Face, echoes Guzdial’s sentiment, suggesting that reasoning models’ language inconsistencies may be explained by associations made during training. “By embracing every linguistic nuance, we expand the model’s worldview and allow it to learn from the full spectrum of human knowledge,” Wang wrote on X.

Luca Soldaini, a research scientist at the nonprofit Allen Institute for AI, cautions that the opaqueness of these models makes it impossible to know for certain what is causing this behavior. “It’s one of the many cases for why transparency in how AI systems are built is fundamental,” they told TechCrunch.

Read more at TechCrunch here.

Lucas Nolan is a reporter for Breitbart News covering issues of free speech and online censorship.

January 18th 2025