How Gulf-developed large language models like Jais are bringing Arabic into the AI mainstream

Updated 09 October 2023
  • ChatGPT understands inquiries in Arabic, but answers can sound unnatural or fail to convey the right message
  • Now homegrown LLMs can capture linguistic nuances and even comprehend dialects and cultural references

DUBAI: When ChatGPT made its debut last year, the artificial intelligence program caused a global sensation, as users found themselves communicating with a machine that could pass as another human being.

However, the enthusiasm among techies in the Arab world was somewhat diminished by ChatGPT’s limited grasp of Arabic, in part the result of the language’s complexity, diacritical markings, inflection system and regional dialects.

Although ChatGPT, which is based on a large language model, or LLM, can understand inquiries in Arabic and is able to translate, especially when using Modern Standard Arabic, answers can come across as unnatural, while literal translations do not always convey the right message.

That is why Jais, an LLM designed to support Arabic, was unveiled in August, bringing one of the world’s most widely spoken, though occasionally overlooked, languages into the AI mainstream.

Jais, a name that recalls the UAE’s highest peak in Ras Al-Khaimah, is the brainchild of a team of academics and engineers who embarked on the project because they felt too few LLMs were credibly multilingual.





Downloadable on the machine learning platform Hugging Face, Jais is the result of a collaboration between Cerebras Systems, Mohamed bin Zayed University of Artificial Intelligence, or MBZUAI, and a subsidiary of the Abu Dhabi-based G42 called Inception.

“It is vital that large language models are developed for languages other than English to ensure that innovation is accessible to everyone,” Andy Jackson, CEO of Inception, told Arab News.

“A quality Arabic LLM is critical for all sectors, businesses and organizations, as well as individuals. Innovation thrives when we collaborate, and Jais sets a new standard for AI advancement in the Middle East, ensuring that the Arabic language, with its depth and heritage, finds its voice within the AI landscape.

“Jais demonstrates our commitment to excellence, and our dedication to democratizing AI and promoting innovation.”

LLMs are machine learning models that use deep learning algorithms to process and understand natural human language. They are trained on large amounts of text data to learn patterns in the language.

These programs, which are rapidly proliferating in the wake of ChatGPT’s success, are capable of generating text on a seemingly endless array of subjects, producing everything from academic papers to poetry.

What is especially impressive about them is their ability to produce convincingly human-like responses to questions in almost any language, including programming languages.

But in order to make those languages sound convincing, native-speaking human programmers are often required to provide a critical layer of context and understanding that can enhance accuracy and reliability.

“Jais is purpose-built for the Arabic language and excels in capturing its intricacies and nuances, ensuring highly accurate and contextually relevant responses — a distinct advantage over general-purpose models,” said Jackson.





“This specialization is a pivotal development, opening up opportunities for governments, industries, and individuals across the Arab world to tap into the potential of generative AI.”

Currently considered among the foremost Arabic LLMs, Jais is a 13-billion-parameter model trained on a newly developed 395-billion-token dataset comprising 116 billion Arabic tokens and 279 billion English tokens. Training was carried out on Condor Galaxy, one of the largest cloud AI supercomputers in the world, which G42 and Cerebras launched in July.

“Jais was born in Abu Dhabi and offers more than 400 million Arabic speakers the opportunity to harness the potential of generative AI,” Preslav Nakov, professor and deputy department chair of Natural Language Processing at MBZUAI, told Arab News.

“It will facilitate and expedite innovation, highlighting Abu Dhabi’s leading position as a hub for AI, innovation, culture preservation and international collaboration.”

As an open-source model, Jais is expected to engage scientists, academics and developers to accelerate the growth of an Arabic-language AI ecosystem. It could also serve as a model for other languages now underrepresented in mainstream AI.

FASTFACTS

• Large language models, or LLMs, are a type of AI that can mimic human intelligence.

• Arabic is spoken by 400m people, but accounts for 1 percent of total global online content.

• Jais was created by Cerebras, MBZUAI, and a subsidiary of G42 called Inception.

“Jais outperforms existing Arabic models by a sizable margin,” said Nakov. “It is also competitive with English models of similar size despite being trained on significantly less English data.

“This exciting result shows that the model’s English component learned from the Arabic data and vice versa, opening a new era in LLM development and training.”

In Jais’s development, significant attention was devoted to pre-processing Arabic text, enhancing support for the language’s unique features, including its writing style and word order.

Jais also maintains a balanced focus on Arabic and English data for optimal performance, offering a marked improvement over models with a limited Arabic text presence.

Its developers say Jais, unlike other models, captures linguistic nuances and even comprehends various Arabic dialects and cultural references.

“Jais facilitates faster customization for specific Arabic-focused use cases and addresses data ownership concerns by being based in the UAE, offering a reassuring solution for local enterprises,” said Inception CEO Jackson.





The UAE’s Ministry of Foreign Affairs and Ministry of Industry and Advanced Technology, Abu Dhabi’s National Oil Company and Department of Health, Etihad Airways, First Abu Dhabi Bank, and global technology group e& are planning to use Jais and to offer insights that can enhance the model and its applications across their industries.

Given the strong digital transformation efforts by several of the Arab Gulf governments, accompanied by huge investments in high-tech industries and homegrown tech startups, AI programs that are responsive to the Arabic language could widen access to a transformational new technology and challenge the monopoly of a clutch of Silicon Valley companies.

Last month, the Technology Innovation Institute, an Emirati research center in Abu Dhabi, released Falcon 180B, an open-source AI model. Established in 2020, TII released Falcon 40B, the first version of its flagship open-source AI model, in May this year, after unveiling Noor, an Arabic-based AI model, last year.

According to a report in The Economist magazine, TII is the applied-research arm of the Advanced Technology Research Council, a government agency that employs an 800-strong multinational staff working on subjects from biotechnology and robotics to quantum computing.

“We are entering the game to disrupt the core players,” Faisal Al-Bannai, secretary-general of the ATRC, told The Economist, adding that TII will build new proprietary models and applications catering for specific fields such as medicine and law.

For its part, Saudi Arabia launched its National Strategy for Data and Artificial Intelligence in October 2020, aiming to become a global leader in the field as it seeks to attract $20 billion in foreign and local investments by 2030.

The Kingdom is also determined to future-proof its workforce, initially by training and developing a pool of 20,000 AI and data specialists. In May this year, Deloitte’s AI Institute was officially launched at the Experience Analytics conference in Riyadh.

Just last week Saudi Arabia launched a National Olympiad for Programming and Artificial Intelligence open to all middle- and high-school pupils. An estimated 300,000 students will be selected from 3 million participants for training in programming and AI, according to media reports.





The initiative is a collaboration between the Saudi Data and Artificial Intelligence Authority, the Ministry of Education, and the King Abdulaziz and His Companions Foundation for Giftedness and Creativity (Mawhiba).

Saudi Arabia’s adoption of digitalization and emerging technologies is forecast to contribute about 2.4 percent to its gross domestic product by 2030, according to a recent report by global consultancy firm PwC.

The PwC report added that AI’s contribution to the Saudi economy is expected to grow by an average of 31.3 percent a year between 2018 and 2030.

“AI is developing rapidly, and its impact will be felt more and more across all sectors and areas of life,” said MBZUAI’s Nakov. “In this context, it is vital that the Arab world has access to an advanced LLM that can be adapted and utilized across all sectors.

“The rapid advancement of AI means that organizations that fail to adapt and start using AI sooner rather than later will be left behind, which makes it even more essential for the Arab world to have access to quality LLMs.”

Beyond its business applications, however, a crucial aspect of a program such as Jais is its ability to champion neglected languages, preserve them in a fast-changing economy, and promote digital inclusivity.

Although Arabic is an official language in 22 countries and is partly spoken in 11 others, it accounts for just 1 percent of total global online content, according to Jais’s creators. The hope is that the advent of AI and the automation of rapid translation will be a game changer.

By placing the language at the forefront of the AI revolution, Jais and its successors could help to maintain Arabic’s global prominence and its distinctive cultural significance in the digital age.


China’s Xi meets Egyptian leader El-Sisi in Beijing

  • Top of the agenda will be the war between Israel and Hamas, which Xi has called for an “international peace conference” to resolve
  • Several Arab leaders are this week visiting Beijing, which is seeking to present a “common voice” on the conflict between Israel and Hamas
Beijing: President Xi Jinping welcomed Egyptian counterpart Abdel Fattah El-Sisi to Beijing on Wednesday, as the Chinese capital hosts a number of Arab dignitaries for a forum it hopes will deepen ties with the region.
Several Arab leaders are this week visiting Beijing, which is seeking to present a “common voice” on the conflict between Israel and Hamas and improve cooperation.
Xi met El-Sisi in a grand ceremony outside Beijing’s Great Hall of the People on Wednesday afternoon, state media footage showed, with the national anthems of both countries blaring out.
Cairo has said the two will discuss “regional and international issues of common interest.”
“Discussions will tackle ways to forge closer bilateral relations and to unlock broader prospects for cooperation in an array of fields,” the Egyptian presidency said.
Beijing has sought to build closer ties with Arab states in recent years, and last year brokered a detente between Tehran and its long-time foe Saudi Arabia.
It has also historically been sympathetic to the Palestinian cause and supportive of a two-state solution to the Israeli-Palestinian conflict.
And Beijing last month hosted rival Palestinian groups Hamas and Fatah for “in-depth and candid talks on promoting intra-Palestinian reconciliation.”
United Arab Emirates President Sheikh Mohamed bin Zayed Al Nahyan, as well as a host of other regional leaders and diplomats, is also among the delegates attending the forum.
Xi is set to deliver a keynote speech at the opening ceremony on Thursday, Beijing has said, aimed at building “common consensus” between China and Arab states.
Top of the agenda will be the war between Israel and Hamas, which Xi has called for an “international peace conference” to resolve.
China sees a “strategic opportunity to boost its reputation and standing in the Arab world” by framing its efforts to end that conflict against US inaction, Ahmed Aboudouh, an associate fellow with the Chatham House Middle East and North Africa Programme, told AFP.
“This, in turn, serves Beijing’s focus on undermining the US’s credibility and influence in the region,” he said.
“The longer the war, the easier for China to pursue this objective,” he added.
On Tuesday, Foreign Minister Wang Yi met with counterparts from Yemen and Sudan in Beijing, saying he hoped to “strengthen solidarity and coordination” with the Arab world.
With his Yemeni counterpart Shayea Mohsen Al-Zindani, he also raised China’s concerns over disruptive attacks on Red Sea shipping by Iran-backed Houthi forces acting in solidarity with Hamas.
“China calls for an end to the harassment of civilian vessels and to ensure the safety of waterways in the Red Sea,” state news agency Xinhua quoted him as saying.

Three Israeli soldiers killed in combat in southern Gaza, military says

  • Israeli forces have kept up their offensive in Rafah, defying an order from the International Court of Justice
JERUSALEM: The Israeli military said three soldiers had been killed in combat in southern Gaza on Wednesday, as it pressed ahead with its offensive in Rafah.
Three more soldiers were badly wounded in the same incident, the military said, though it provided no further details. Israel’s public broadcaster Kan radio said they were injured by an explosive device set off in a building in Rafah.
Defying an order from the International Court of Justice, Israeli forces have kept up their offensive in Rafah, where they aim to root out the last major intact formations of Hamas fighters and rescue hostages.
International unease over Israel’s three-week-old Rafah offensive has turned to outrage since an airstrike on Sunday set off a blaze in a tent camp in a western district of the city, killing at least 45 people.
Israel said it had been targeting two senior Hamas operatives and had not intended to cause civilian casualties. Prime Minister Benjamin Netanyahu said that “something unfortunately went tragically wrong.”
The Israeli military said it was investigating the possibility that munitions stored near a compound targeted by Sunday’s airstrike may have ignited.
Israel told around one million Palestinian civilians displaced by the almost eight-month-old war to evacuate from Rafah before launching its incursion in early May. Around that many have fled the city since then, according to the UN agency for Palestinian refugees, UNRWA.
On Tuesday, the United States, Israel’s closest ally, reiterated its opposition to a major Israeli ground offensive in Rafah but said it did not believe such an operation was under way.

Syrians in Lebanon fear unprecedented restrictions, deportations

  • Lebanon remains home to the largest refugee population per capita in the world: roughly 1.5 million Syrians
  • Some five million Syrian refugees have spilled out of Syria into neighboring countries, while millions more are displaced within Syria

BEKAA VALLEY: The soldiers came before daybreak, singling out the Syrian men without residence permits from the tattered camp in Lebanon’s Bekaa Valley. As toddlers wailed around them, Mona, a Syrian refugee in Lebanon for a decade, watched Lebanese troops shuffle her brother onto a truck headed for the Syrian border.
Thirteen years since Syria’s conflict broke out, Lebanon remains home to the largest refugee population per capita in the world: roughly 1.5 million Syrians — half of whom are refugees formally registered with the United Nations refugee agency UNHCR — in a country of approximately 4 million Lebanese.
They are among some five million Syrian refugees who spilled out of Syria into neighboring countries, while millions more are displaced within Syria. Donor countries in Brussels this week pledged less funding for Syria aid than last year.
With Lebanon struggling to cope with an economic meltdown that has crushed livelihoods and most public services, its chronically underfunded security forces and typically divided politicians now agree on one thing: Syrians must be sent home.
Employers have been urged to stop hiring Syrians for menial jobs. Municipalities have issued new curfews and have even evicted Syrian tenants, two humanitarian sources told Reuters. At least one township in northern Lebanon has shuttered an informal camp, sending Syrians scattering, the sources said.
Lebanese security forces issued a new directive this month shrinking the number of categories through which Syrians can apply for residency — frightening many who would no longer qualify for legal status and now face possible deportation.
Lebanon has organized voluntary returns for Syrians, through which 300 traveled home in May. But more than 400 have also been summarily deported by the Lebanese army, two humanitarian sources told Reuters, caught in camp raids or at checkpoints set up to identify Syrians without legal residency.
They are automatically driven across the border, refugees and humanitarian workers say, fueling concerns about rights violations, forced military conscription or arbitrary detention.
Mona, who asked to change her name in fear of Lebanese authorities, said her brother was told to register with Syria’s army reserves upon his entry. Fearing a similar fate, the rest of the camp’s men no longer venture out.
“None of the men can pick up their kids from school, or go to the market to get things for the house. They can’t go to any government institutions, or hospital, or court,” Mona said.
She must now care for her brother’s children, who were not deported, through an informal job she has at a nearby factory. She works at night to evade checkpoints along her commute.
‘Wrong and not sustainable’
Lebanon has deported refugees in the past, and political parties have long insisted parts of Syria are safe enough for large-scale refugee returns.
But in April, the killing of a local Lebanese party official blamed on Syrians touched off a concentrated campaign of anti-refugee sentiment.
Hate speech flourished online, with more than 50 percent of the online conversation about refugees in Lebanon focused on deporting them and another 20 percent referring to Syrians as an “existential threat,” said Lebanese research firm InflueAnswers.
The tensions have extended to international institutions. Lebanon’s foreign minister has pressured UNHCR’s representative to rescind a request to halt the new restrictions and lawmakers slammed a one billion euro aid package from the European Union as a “bribe” to keep hosting refugees.
“This money that the EU is sending to the Syrians, let them send it to Syria,” said Roy Hadchiti, a media representative for the Free Patriotic Movement, speaking at an anti-refugee rally organized by the conservative Christian party.
He, like a growing number of Lebanese, complained that Syrian refugees received more aid than desperate Lebanese. “Go see them in the camps — they have solar panels, while Lebanese can’t even afford a private generator subscription,” he said.
The UN still considers Syria unsafe for large-scale returns and said rising anti-refugee rhetoric is alarming.
“I am very concerned because it can result in... forced returns, which are both wrong and not sustainable,” UNHCR head Filippo Grandi told Reuters.
“I understand the frustrations in host countries — but please don’t fuel it further.”
Zeina, a Syrian refugee who also asked her name be changed, said her husband’s deportation last month left her with no work or legal status in an increasingly hostile Lebanese town.
Returning has its own dangers: her children were born in Lebanon and do not have Syrian ID cards, and her home in Homs province remains in ruins since a 2012 government strike that forced her to flee.
“Even now, when I think of those days, and I think of my parents or anyone else going back, they can’t. The house is flattened. What kind of return is that?” she said.


Palestinian militants release video of Israeli hostage alive in Gaza

Updated 29 May 2024

GAZA STRIP: Palestinian militant group Islamic Jihad released a video on Tuesday showing an Israeli hostage alive and held in the Gaza Strip.

The captive, identified by Israeli media as Sasha Trupanov, 28, is seen speaking in Hebrew in the 30-second clip.

The Hostages and Missing Families Forum campaign group identified him as Alexander (Sasha) Trupanov, and called on the Israeli authorities to secure the release of all captives held in Gaza.

It was unclear when the footage, in which he is seen wearing a T-shirt, was taken.

Trupanov, a Russian-Israeli dual national, was captured on October 7 from Kibbutz Nir Oz along with his mother, grandmother and girlfriend.

The three women were freed during a truce between Hamas and Israel at the end of November, which led to the release of 105 hostages.

“Seeing my Sasha on television today is very heartening, but it also breaks my heart that he has been in captivity for such a long time,” said his mother, Yelena Trupanov, in a short message published by the families’ forum.

Israel’s government has instructed its negotiating team to continue talks with mediators to secure a deal for the release of the hostages, but no new round of talks has begun.

“The Israeli government must give a significant mandate to the negotiating team, which will be able to lead to a deal for the return of all the hostages — the living to rehabilitation and the murdered to burial,” a families’ forum statement said after the release of Trupanov’s video.

Trupanov’s father was killed in the October 7 attack on southern Israel, which resulted in the deaths of 1,189 people, mostly civilians, according to an AFP tally based on Israeli official figures.

Militants also took 252 hostages, 121 of whom remain in Gaza, including 37 the army says are dead.

Israel’s retaliatory offensive has killed at least 36,096 people in Gaza, mostly civilians, according to the Hamas-run territory’s health ministry.


Algeria to present UN resolution on end to Rafah ‘killing’

Updated 29 May 2024

UNITED NATIONS: Algeria will present a draft UN resolution calling for an end to “the killing” in Rafah as Israel attacks Hamas fighters in the crowded Gaza city, its ambassador said Tuesday after a Security Council meeting.

Defying pressure from the United States and other Western countries, Israel has been conducting military operations in Rafah, which is packed with people who have fled fighting elsewhere in Gaza.

An Israeli strike Sunday killed 45 people at a tent camp for displaced people, said the Hamas-run health ministry in Gaza, drawing a chorus of international condemnation.

“It will be a short text, a decisive text, to stop the killing in Rafah,” Ambassador Amar Bendjama told reporters.

It was Algeria that requested Tuesday’s urgent meeting of the council after the Sunday strike.

A civil defense official in Gaza said another Israeli strike on a displacement camp west of Rafah on Tuesday killed at least 21 more people.

The Algerian ambassador did not say when he hoped the resolution might be put to a vote.

“We hope that it could be done as quickly as possible because life is in the balance,” said Chinese ambassador Fu Cong, expressing hope for a vote this week.

“It’s high time for this council to take action. This is a matter of life and death. This is a matter of emergency,” the French ambassador Nicolas de Riviere said before the council meeting.

The council has struggled to find a unified voice since the war broke out with the October 7 Hamas attack on Israel, followed by Israel’s retaliatory campaign.

After passing two resolutions centered on the need for humanitarian aid to people in Gaza, in March the council passed a resolution calling for an immediate ceasefire — an appeal that had been blocked several times before by the United States, Israel’s main ally.

Washington, increasingly frustrated with how Israel is waging the war and its mounting civilian death toll, finally allowed that resolution to pass by abstaining from voting.

But the White House said Tuesday that Israel’s offensive in Rafah had not amounted to the type of full-scale operation that would breach President Joe Biden’s “red lines,” and said it had no plans to change its policy toward Israel.

Asked about the new Algerian draft resolution, US Ambassador Linda Thomas-Greenfield said, “we’re waiting to see it and then we’ll react to it.”