Alibaba introduces latest enhancements to its ‘Qwen 2.5’ visual-language series

by Azunta Gaviola - 4 weeks ago


Beijing, China – Chinese tech company Alibaba has announced the launch of Qwen2.5-VL, an upgraded version of its earlier visual-language model, Qwen2-VL.

According to the company, the multimodal model is open source, comes in sizes ranging from 3 billion to 72 billion parameters, and includes both base and instruction-tuned variants.

The Qwen2.5-VL-72B-Instruct model is also available on the Qwen Chat platform, and the entire Qwen2.5-VL series is hosted on Hugging Face and Alibaba’s ModelScope.

In terms of capabilities, Qwen2.5-VL can interpret complex visual elements, including text, diagrams, charts, graphics, and image structures. It can also understand videos longer than an hour and answer questions about them, pinpointing specific segments down to the second.

In addition, the model can generate structured outputs, such as JSON, enabling the automatic extraction and organisation of data from invoices, forms, and tables. This capability can streamline processes in the finance and legal sectors.
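To illustrate how an application might consume such structured output, the Python sketch below parses and validates a JSON string of the kind a vision-language model could return for an invoice. The field names, the `Invoice` class, and the sample string are hypothetical and chosen only for illustration; they are not taken from Alibaba's documentation, and no model is actually called.

```python
import json
from dataclasses import dataclass


@dataclass
class Invoice:
    # Hypothetical fields; a real extraction schema would depend on the documents.
    invoice_number: str
    total: float
    currency: str


# Expected type for each required field (assumed schema).
REQUIRED_FIELDS = {"invoice_number": str, "total": (int, float), "currency": str}


def parse_invoice(model_output: str) -> Invoice:
    """Parse a JSON string returned by a model and check it against the schema."""
    data = json.loads(model_output)  # raises a ValueError subclass on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {name}")
    return Invoice(
        invoice_number=data["invoice_number"],
        total=float(data["total"]),
        currency=data["currency"],
    )


# Stand-in for text a model might produce from a scanned invoice.
sample_output = '{"invoice_number": "INV-001", "total": 1250.5, "currency": "USD"}'
invoice = parse_invoice(sample_output)
```

Validating the output before it reaches downstream systems is a common precaution, since a generated JSON string may be malformed or incomplete.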

Qwen2.5-VL can also act as a visual agent that carries out tasks on computers and mobile devices, such as checking the weather or booking flights, by directing tools.

The flagship model, Qwen2.5-VL-72B-Instruct, has been evaluated on a series of benchmarks covering document and diagram reading, general visual question answering, college-level math, video understanding, and visual agent tasks.

On the technical side, researchers have improved the model’s multimodal capabilities by implementing dynamic resolution and frame rate training for enhanced video understanding. They have also introduced a visual encoder that integrates window attention within a dynamic Vision Transformer (ViT) framework to accelerate both training and inference.
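As a rough illustration of why window attention can speed up a vision encoder, the Python sketch below counts the pairwise attention scores needed for an image split into a grid of patches, first with full attention and then with attention restricted to non-overlapping windows. The grid and window sizes are illustrative assumptions, not Qwen2.5-VL's actual configuration, and the function names are hypothetical.

```python
def partition_into_windows(height, width, window):
    """Group the patch indices of a height x width grid into non-overlapping windows."""
    windows = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            # Windows at the right and bottom edges may be smaller than window x window.
            cells = [
                row * width + col
                for row in range(top, min(top + window, height))
                for col in range(left, min(left + window, width))
            ]
            windows.append(cells)
    return windows


def attention_pairs_full(height, width):
    """Every patch attends to every patch: (number of patches) squared."""
    patches = height * width
    return patches * patches


def attention_pairs_windowed(height, width, window):
    """Each patch attends only to the patches inside its own window."""
    return sum(len(cells) ** 2 for cells in partition_into_windows(height, width, window))
```

Under these assumed numbers, a 32 by 32 patch grid needs 1,048,576 pairwise scores with full attention and 65,536 with 8 by 8 windows, a 16-fold reduction. The saving grows with image size, because the windowed count scales linearly with the number of patches while the full count scales quadratically.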

These changes make the model suitable for a wide range of multimodal applications across different fields.

Alibaba has also released Qwen2.5-1M, the latest version of its Qwen large language model. The open-source release is distinguished by its long-context capability, handling inputs of up to 1 million tokens.

The release includes two instruction-tuned models, Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M, with 7 billion and 14 billion parameters respectively. Both models are available on Hugging Face.

The company has also published on GitHub a corresponding inference framework optimised for long-context processing. The framework is designed to help developers deploy the Qwen2.5-1M series more cost-effectively.

Using techniques such as length extrapolation and sparse attention, the framework can process 1-million-token inputs 3 to 7 times faster than traditional approaches, making it a more efficient option for building applications that require long-context processing.
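To give a sense of why sparse attention helps at this scale, the Python sketch below compares the number of query-key pairs in dense causal attention with a generic sparse pattern in which each token attends only to a recent local window plus a few initial tokens. This pattern is a common textbook example and is an assumption made here for illustration; the article does not describe which sparse attention scheme Alibaba's framework uses, and the function and parameter names are hypothetical.

```python
def dense_causal_pairs(length):
    """Dense causal attention: token i attends to tokens 0..i."""
    return length * (length + 1) // 2


def sparse_causal_pairs(length, local, sinks):
    """Sparse causal attention: token i attends to its last `local` tokens
    (itself included) plus the first `sinks` tokens of the sequence."""
    total = 0
    for i in range(length):
        local_start = max(0, i - local + 1)
        local_count = i - local_start + 1
        # Only count initial tokens that the local window does not already cover.
        sink_count = min(sinks, local_start)
        total += local_count + sink_count
    return total
```

Comparing `sparse_causal_pairs(1_000_000, local=4096, sinks=64)` with `dense_causal_pairs(1_000_000)` shows how sharply the pair count drops under these assumed settings. A lower pair count does not translate directly into wall-clock speed, so this sketch should not be read as reproducing the reported 3 to 7 times figure.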

Alibaba also recently announced Qwen2.5-Max, an AI model that it claims surpasses several top AI systems on key performance benchmarks. The model is now accessible to developers through Alibaba Cloud services and Alibaba’s conversational AI platform, Qwen Chat.
