Hanoi, Vietnam – FPT Corporation and NVIDIA have announced the release of the Nemotron-Personas-Vietnam dataset, an open-source resource designed to support sovereign AI development in Vietnam and across Southeast Asia.
The dataset, which is available for commercial use, is intended to provide developers, researchers, and enterprises with access to an auditable and localised data resource for building AI systems that better reflect Vietnam’s language, culture, workforce, and economic conditions.
According to the companies, the dataset expands NVIDIA’s Nemotron ecosystem, which includes models, datasets, evaluation resources, and NVIDIA NeMo libraries. The ecosystem is designed to help developers customise, evaluate, and deploy AI systems for specific local use cases.
The collaboration combines NVIDIA’s AI development framework and synthetic data generation technologies with FPT’s local expertise and infrastructure. NVIDIA contributed the Nemotron-Personas methodology, the NeMo Data Designer synthetic data library, and its open model framework. FPT provided validation methodologies, data infrastructure, and AI research capabilities through several business units, including FPT Smart Cloud, the Quantum AI and Cyber Security Institute, and FPT DC5.
The Nemotron-Personas methodology is designed to create population-scale synthetic datasets that are auditable and grounded in demographic data. The Vietnam-specific dataset applies this approach to represent the country’s linguistic diversity, demographics, and labour characteristics.
According to the companies, the Nemotron-Personas-Vietnam dataset contains 900,000 synthetic personas based on Vietnam’s latest official statistics and geographic structure. Each record includes 31 fields covering persona information, attributes, contextual data, and a unique identifier. The dataset is available through Hugging Face and is compatible with NVIDIA NeMo libraries for AI development tasks including data curation, model fine-tuning, post-training, and deployment.
“FPT believes that sovereign AI must be built from the ground up to reflect local language, culture, and economic realities. The Nemotron-Personas-Vietnam dataset represents our commitment to making localised AI development openly accessible for every innovator building AI solutions for Vietnam and the broader region,” said Associate Professor Dr. Ngo Xuan Bach, Director of AI Product Center, FPT Smart Cloud, and Director of the Quantum AI & Cyber Security Institute, FPT Corporation.
The announcement also forms part of FPT’s broader sovereign AI strategy. The company said it is pursuing a “Build Your Own AI” approach centred on three layers: NVIDIA-accelerated GPU cloud services for AI model training and inference, AI deployment platforms, and AI applications for businesses and institutions.
The companies said these components are intended to form a complete sovereign AI stack that supports the development and deployment of localised AI systems within national or regional boundaries, an approach they said could be replicated across Southeast Asia.

