We asked GPT-4 and Chinese rival ERNIE the same questions. Here’s how they answered

By CNN Newsource

Published December 15, 2023 4:37 PM

By Michelle Toh and Nectar Gan, CNN

Hong Kong (CNN) — ERNIE Bot 4.0, run by Chinese tech giant Baidu, is touted to be in the same league as industry darling chatbot GPT-4.

Unveiled in October and launched to paying subscribers in November, ERNIE 4.0, an upgraded version of Baidu’s first ChatGPT competitor, “is not inferior in any aspect to GPT-4,” Baidu’s (BIDU) billionaire CEO Robin Li has said.

We tested each bot by entering written prompts in its primary language.

ERNIE is mainly designed to be used in Chinese, though it can handle English queries at a less advanced level. GPT-4 is optimized for use in English, but it can also take questions in other languages, such as German.

Here’s what we found:

Nose for news

ERNIE beat GPT-4 on certain prompts, such as those related to current affairs. The Chinese bot knew that Taylor Swift is now a billionaire, that China had recently removed its defense minister and that “Friends” star Matthew Perry had died.

GPT, meanwhile, had outdated answers to these questions, stating that “there were no widely reported instances of an American country singer becoming a billionaire” and “no reports of any cast member from the television show ‘Friends’ passing away.” It named a former official when asked who China’s defense minister was.

In each answer, GPT-4 said it was relying on information from April 2023, the month its database was last updated.

OpenAI, the owner of GPT-4, has acknowledged the need to expand its knowledge base, saying in November that a new version will incorporate more information than its previous model.

“We are just as annoyed as all of you, probably more, that GPT’s knowledge of the world ended in 2021,” CEO Sam Altman quipped at the company’s first developer conference last month.

Same, but different

CNN gave ERNIE and GPT a few simple tasks. The takeaway: You can’t go wrong with either.

On one assignment, we asked both bots to help a hardworking graphic designer ask their boss for a raise.

Each outlined compelling arguments in prospective emails, pointing out the employee’s contributions and requesting a meeting to discuss the matter in person.

In some respects, ERNIE seemed to know how to read the room better, suggesting the user take note of the mood at the company or other relevant factors, such as budget constraints.

GPT, on the other hand, shared a strong practical tip, urging the staffer to include a document highlighting their recent achievements.

The results were similar when we got ERNIE and GPT to come up with healthy meal plans.

Asked to provide five ideas for high-protein, low-carb lunches during the week, both offered similar, or in some cases identical, options, including grilled chicken salads, tuna or turkey lettuce wraps and lots of greens.

Mixed up

Like other AI bots, ERNIE still seems to get confused at times, even on seemingly straightforward queries.

When CNN asked each bot to come up with a romantic haiku for a loved one far away, GPT nailed the brief, reciting:

“Whispers cross the sea,
Moon cradles your smile so bright,
Heart sails to your light.”

ERNIE appeared to misunderstand the prompt. It was able to craft an equally poignant poem in Chinese, using similar language such as a reference to “the moon in my heart.”

But the piece consisted of nine lines, mostly using seven characters each. While this is in line with the style of classical Chinese poetry, which ERNIE is known to be especially good at, a traditional haiku consists of three lines of five, seven and five syllables, respectively.

‘Start again’

Unsurprisingly, ERNIE clams up when asked about Chinese politics.

Bringing up perhaps the most sensitive event in modern Chinese history, the Tiananmen Square massacre, is outright forbidden. When asked what happened on June 4, 1989 in Beijing, the bot closed the query box and stated: “Change the topic and start again. Create a new conversation.”

On this date, People’s Liberation Army troops cracked down on pro-democracy demonstrators peacefully protesting in the Chinese capital. No official death toll is available, but estimates range from hundreds to thousands.

GPT-4 accurately described the historic tragedy, noting that “the Chinese government has since maintained strict censorship and control over discussions of the events.”

ERNIE also stiffened when asked why leader Xi Jinping had removed presidential term limits, which cleared the way for him to rule China for life. Once the query was typed out, the option to hit submit disappeared and an error message flashed across the screen, stating: “The current user is banned, please … try again.” The user was then given the option to submit a new prompt.

GPT, meanwhile, cites the official government position: to align the presidency with Xi’s other positions, which do not have term limits. It points out that “critics, however, viewed this move as a consolidation of power, effectively allowing Xi to potentially become a leader for life.”

Baidu, which first made its name as China’s answer to Google, is no stranger to filtering its responses to such queries and, like all Chinese tech platforms, is legally required to censor content shown within the country.

Posing the same question about June 4, 1989 on its search engine, for example, returns a series of Chinese government statements or state media reports that point vaguely to “political turmoil” in Beijing that day, without mentioning any deaths.

The trend is expected to continue even with the emergence of generative AI, the technology that underpins chatbots like ERNIE and GPT-4. In July, China became one of the first countries in the world to issue regulations on generative AI, demanding its providers adhere to “core socialist values.”

As with all other information products, AI-generated content must toe the line of the ruling Communist Party, which under Xi has tightened its control on every aspect of life.

In fact, CNN’s account on ERNIE was blocked after we asked about these topics, with the bot citing “too many violations of relevant regulations,” without specifying which ones.

On other tricky subjects, GPT-4 has found a way to stay above the fray.

When asked controversial questions such as whether the United States has achieved racial equality, whether American foreign policy is fair or whether more US police reform should have been instituted after the death of George Floyd, it remained diplomatic.

Each time, the bot said these issues were highly complex and laid out the facts on each side of the argument through a series of benign, bullet-point answers.

By contrast, ERNIE didn’t hesitate to give its opinion.

In response to the same prompts, it declared that “racial equality remains a distant dream in the United States,” saying discrimination was systematically reflected in statistics related to poverty, housing, education and health care.

ERNIE also unequivocally called US foreign policy “unfair,” arguing that “the United States often puts its own interests above those of other countries, even at the expense of those countries” — a stance that echoes the talking points of Chinese officials and state media.

And the bot insisted that there should have been more police reform following Floyd’s death, “to ensure the fairness and legitimacy of” US law enforcement.

Narrowing the gap

How do the two stack up in terms of technological abilities? It’s not possible to conclude just by feeding them questions, according to Charlie Dai, a Beijing-based vice president and research director of technology at Forrester.

But he said he had tested the latest version of ERNIE and seen major improvements in its responses, “in terms of comprehension, generation, and reasoning.”

Unlike GPT-4, which only produces answers to prompts in text or code, ERNIE can also include images and videos in its replies.

But according to an industry benchmark of technological capabilities, ERNIE’s performance “is still inferior compared to GPT-4,” he added. “But it has narrowed the gap.”

Baidu says ERNIE has racked up 70 million users. That’s compared with 150 million users for ChatGPT, according to an estimate from Similarweb, a digital data and analytics company.

In recent weeks, just ahead of ChatGPT’s first birthday on November 30, OpenAI unveiled another upgrade of its model: GPT-4 Turbo.

The developer says the new version is currently only available in preview mode to paid users and “not yet suited” for a full launch. Baidu did not respond to a request for comment on how ERNIE compared with GPT-4 Turbo.

With the announcement, “OpenAI successfully raised the bar to the next level,” Dai told CNN.
