Listen to this Post

A New Chapter for Britain’s Oldest Living Languages
Celtic languages — Welsh, Cornish, Irish, and Scottish Gaelic — are among the U.K.’s oldest surviving tongues, each carrying centuries of cultural history. Yet, as English dominates public life, these languages face the risk of being sidelined. To combat this, the UK-LLM sovereign AI initiative has launched a bold project: training an AI model on NVIDIA Nemotron to reason in both English and Welsh.
Welsh, spoken today by around 850,000 people, is at the heart of this initiative. The AI’s purpose is not just technical — it’s cultural. By enabling reasoning in Welsh, the model could transform how public services like healthcare, education, and legal systems are delivered, making them fully accessible in the language people live by.
UK Prime Minister Keir Starmer highlighted the effort, saying this step ensures every region of the country can benefit from AI without sacrificing cultural heritage. Trained on the Isambard-AI supercomputer in Bristol, the system represents both technological advancement and cultural preservation.
The project, originally launched in 2023 as BritLLM and led by University College London, has already developed two U.K. language models. The new Welsh model, built with Bangor University and NVIDIA, aligns with the Cymraeg 2050 initiative, which aims for one million Welsh speakers by mid-century.
Through AI cloud provider Nscale, developers will soon gain API access to the model, enabling practical applications ranging from bilingual chatbots to automatic translations for healthcare, education, retail, broadcasting, and hospitality.
Language experts like Gruffudd Prys from Bangor University stress that AI could make Welsh a “living, breathing language” that evolves with the times. This includes helping second-language learners acquire Welsh more easily while giving native speakers new opportunities to refine their skills.
The long-term vision goes beyond Wales. Using the same methodology, UK-LLM intends to build AI models for Cornish, Irish, Scots, and Scottish Gaelic, while also collaborating with international partners to develop tools for African and Southeast Asian languages.
Technically, the Welsh model builds on Nemotron’s open-source foundation, utilizing the 49-billion-parameter Llama Nemotron Super and 9-billion-parameter Nemotron Nano. Since Welsh training data is limited compared to English or Spanish, the team used NVIDIA NIM microservices and DeepSeek-R1 to translate over 30 million English-language data entries into Welsh. The process was accelerated on hundreds of NVIDIA GH200 Grace Hopper Superchips through Isambard-AI, backed by \$280 million (USD 225 million GBP) in government investment.
Bangor University provided essential linguistic validation, ensuring the AI correctly captured complex nuances of Welsh grammar, such as consonant mutations depending on word context.
Once publicly released, the Welsh datasets and models will support research, enterprise applications, and public services, keeping accessibility at the forefront. As Prys noted: “It’s one thing to have AI capability in Welsh, but another to make it open and accessible to all.”
Ultimately, this initiative is about more than preserving a language — it’s about ensuring that heritage and modern technology grow together, positioning AI not as a threat to cultural identity, but as a safeguard for it.
What Undercode Say:
This initiative represents a pivotal moment in cultural technology. For centuries, smaller languages like Welsh have fought against decline, often relegated to ceremonial or academic use. Now, with AI, they’re entering the digital mainstream.
The strategic importance of this move cannot be overstated. In an increasingly digitized society, languages that lack a technological presence risk fading into obscurity. By embedding Welsh into the architecture of sovereign AI, the UK is ensuring that speaking Welsh remains practical, relevant, and accessible.
From a policy perspective, this aligns neatly with Cymraeg 2050, which has been criticized as overly ambitious without proper infrastructure. AI integration could provide the missing link: education tools for children, learning aids for adults, and streamlined bilingual services that make Welsh not just symbolic, but functional.
Economically, the availability of Welsh-language AI could empower local businesses, especially in tourism, retail, and media. A restaurant in Cardiff could deploy a bilingual chatbot for reservations, while Welsh broadcasters could automate subtitle generation in both languages. This reduces barriers for companies and boosts cultural pride.
Culturally, the initiative flips the script on how minority languages are usually treated. Instead of being preserved in a museum-like fashion, Welsh is being placed in the cutting edge of AI innovation. This makes the language more attractive to younger generations, who may have previously viewed it as outdated or irrelevant.
On a global level, this project sends a signal: minority languages are not doomed to extinction in the digital age. By leveraging sovereign AI infrastructure, the UK is building a replicable model that could be applied from Basque to Zulu. It shows that with sufficient investment, even languages with smaller datasets can thrive in AI ecosystems.
However, challenges remain. Training data gaps create the risk of biased or inaccurate translations, particularly in sensitive contexts like healthcare or law. This is where the role of Bangor University becomes crucial — human oversight ensures the technology respects linguistic and cultural integrity.
The political dimension is also worth noting. By investing in Welsh, the government is making a statement about unity and inclusivity within the UK’s diverse nations. It is also subtly countering critiques that AI development often reflects only English-language dominance, leaving others behind.
From a technological perspective, Nemotron’s open-source framework is a clever choice. Open weights and datasets allow transparency and adaptability, encouraging developers to build upon the foundation rather than rely on closed corporate ecosystems.
In essence, this project represents a marriage of heritage and high-tech. It demonstrates that AI does not have to erase identity — it can protect it, amplify it, and even give it new life. The Welsh model may well become a case study in how digital transformation can serve culture, not consume it.
🔍 Fact Checker Results
✅ The Cymraeg 2050 initiative indeed aims for one million Welsh speakers by 2050.
✅ Isambard-AI is currently the UK’s most powerful supercomputer, backed by \$280 million USD (£225M GBP).
✅ Welsh has around 850,000 speakers today, according to official census and language-use reports.
📊 Prediction
In the next decade, Welsh will become a fully digital-first language, appearing in AI-driven education apps, smart assistants, and public services. By 2035, expect similar models for Irish, Cornish, and Scottish Gaelic, with Wales serving as the blueprint. Globally, this model could spark a renaissance for endangered languages, making AI the unexpected guardian of cultural survival.
🕵️📝✔️Let’s dive deep and fact‑check.
References:
Reported By: blogs.nvidia.com
Extra Source Hub:
https://stackoverflow.com
Wikipedia
OpenAi & Undercode AI
Image Source:
Unsplash
Undercode AI DI v2
🔐JOIN OUR CYBER WORLD [ CVE News • HackMonitor • UndercodeNews ]
📢 Follow UndercodeNews & Stay Tuned:
𝕏 formerly Twitter 🐦 | @ Threads | 🔗 Linkedin | 🦋BlueSky | 🐘Mastodon




