Qwen3.7-Plus: Alibaba Takes on OpenAI and Anthropic With a Multimodal AI Agent That’s 60% Cheaper

On June 1, 2026, Alibaba Cloud launched Qwen3.7-Plus. This multimodal agent model doesn’t just understand text — it sees your screen, clicks buttons, executes code, and iterates autonomously until the task is done. All at 60% less than its own bigger sibling.

But the number that’s really shaking Silicon Valley? ScreenSpot Pro 79.0 — beating GPT-5.4 (67.4) and Claude Opus 4.6 (49.5) on visual interface understanding.

What Is Qwen3.7-Plus?

Qwen3.7-Plus is the multimodal counterpart to the text-only Qwen3.7-Max (launched May 20, 2026). Both share the same architecture: a 1-million-token context window, with 256K of those tokens reserved for internal chain-of-thought reasoning.

But where Max reads and writes text, Plus sees.

Capability	Qwen3.7-Max	Qwen3.7-Plus
Text (1M tokens)	✅	✅
Vision (images, video)	❌	✅
GUI Automation (screenshots)	❌	✅
Hybrid GUI + CLI agent	❌	✅
Code & tools	✅	✅
Open weight	❌ API only	❌ API only
Price (USD per million tokens, input / output)	$2.50 / $7.50	$0.40 / $1.60

Qwen3.7-Plus is 6x cheaper on input than Qwen3.7-Max — and with caching (90% discount), the cost drops to $0.04 per million tokens for repeated reads.
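
The comparison above comes down to a few lines of arithmetic. The sketch below uses the list prices from the table, assumes the two figures in each price cell are the input and output rates, and applies the 90% caching discount to the Plus input price. The names are illustrative only.

```python
# List prices in USD per million tokens, taken from the table above.
# Assumption: each price cell reads "input / output".
MAX_PRICE = {"input": 2.50, "output": 7.50}
PLUS_PRICE = {"input": 0.40, "output": 1.60}
CACHE_DISCOUNT = 0.90  # 90% off cached input reads, as stated in the article


def price_ratio(kind: str) -> float:
    """How many times cheaper Plus is than Max for the given token kind."""
    return MAX_PRICE[kind] / PLUS_PRICE[kind]


def cached_input_price() -> float:
    """Plus input price per million tokens after the caching discount."""
    return PLUS_PRICE["input"] * (1 - CACHE_DISCOUNT)


if __name__ == "__main__":
    print(f"input ratio:  {price_ratio('input'):.2f}x")
    print(f"output ratio: {price_ratio('output'):.2f}x")
    print(f"cached input: ${cached_input_price():.2f} per million tokens")
```

On these list prices the input ratio works out to 6.25x, which the article rounds to 6x, and the output ratio to roughly 4.7x.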

The Benchmarks That Matter

ScreenSpot Pro: Screen Understanding

ScreenSpot Pro measures a model’s ability to look at a screenshot and find the exact pixel coordinates of the element to click. This is the bottleneck for any GUI automation.
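
As a rough illustration of how a grounding benchmark of this kind is commonly scored: a prediction counts as a hit when the predicted click point lands inside the target element's bounding box, and the score is the percentage of hits. The data layout and function names below are assumptions made for this sketch, not the official ScreenSpot Pro evaluation harness.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Target element bounding box in pixels: left, top, right, bottom."""
    left: int
    top: int
    right: int
    bottom: int


def is_hit(point: tuple[int, int], box: Box) -> bool:
    """True if the predicted click point lies inside the target box."""
    x, y = point
    return box.left <= x <= box.right and box.top <= y <= box.bottom


def grounding_accuracy(predictions: list[tuple[int, int]], targets: list[Box]) -> float:
    """Percentage of predictions that land inside their target box."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(is_hit(p, b) for p, b in zip(predictions, targets))
    return 100.0 * hits / len(targets)


if __name__ == "__main__":
    targets = [Box(10, 10, 50, 30), Box(100, 200, 140, 220)]
    predictions = [(25, 20), (90, 210)]  # second click misses to the left
    print(grounding_accuracy(predictions, targets))
```

Under this scoring, a small button on a high-resolution screenshot leaves very little margin for error, which is why the task is hard.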

Model	ScreenSpot Pro Score
Qwen3.7-Plus	79.0 🏆
GPT-5.4 (xhigh)	67.4
Claude Opus 4.6	49.5
Gemini 3.1 Pro	~65 (est.)

A score of 79.0 puts Qwen3.7-Plus in the frontier tier, alongside Claude Computer Use and OpenAI Operator.

Terminal-Bench: Real-World Code Execution

Terminal-Bench 2.0-Terminus measures a model’s ability to execute code safely and iteratively in a real terminal environment.

Model	Terminal-Bench Score
Qwen3.7-Plus	70.3 🏆
DeepSeek-V4-Pro Max	67.9
Gemini 3.1 Pro	63.5

What Makes Qwen3.7-Plus Revolutionary

1. Hybrid GUI + CLI Agent

For the first time, a single model can:
- See your screen (navigate visual interfaces)
- Execute shell commands
- Write and debug its own code
- Iterate until the task is done

This is exactly the promise of Claude Computer Use and OpenAI Operator — but at a fraction of the cost.

2. Five Core Agentic Capabilities

Alibaba describes Qwen3.7-Plus as a “multimodal interactive hybrid agent” with five capabilities:

Deep reasoning — breaks down problems step by step
Self-programming — writes and revises its own code
Tool invocation — calls external APIs and functions
Verification & testing — executes and checks its results
Autonomous iteration — loops until completion
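
Taken together, these capabilities describe a loop: reason about the task, write code, run it with a tool, check the result, and repeat. The sketch below illustrates that control flow only. `propose_code` is a hypothetical stand-in for a model call (it returns canned attempts), and nothing here reflects Alibaba's actual API.

```python
import subprocess
import sys

# Hypothetical stand-in for the model: canned attempts, the first one buggy.
ATTEMPTS = [
    "print(sum(range(1, 10)))",  # off by one: sums 1..9
    "print(sum(range(1, 11)))",  # fixed: sums 1..10
]


def propose_code(attempt: int) -> str:
    """Pretend model call: return the next candidate program."""
    return ATTEMPTS[min(attempt, len(ATTEMPTS) - 1)]


def run_tool(code: str) -> str:
    """Tool invocation: run the candidate in a fresh Python process."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=10,
    )
    return result.stdout.strip()


def solve(expected: str, max_iterations: int = 5) -> tuple[str, int]:
    """Iterate until a candidate's output matches, or give up."""
    for attempt in range(max_iterations):
        code = propose_code(attempt)   # self-programming
        output = run_tool(code)        # tool invocation
        if output == expected:         # verification and testing
            return code, attempt + 1
    raise RuntimeError("no candidate passed verification")


if __name__ == "__main__":
    code, tries = solve(expected="55")
    print(f"passed after {tries} attempt(s): {code}")
```

The iteration cap matters in practice: without it, an agent that never passes verification would keep consuming tokens indefinitely.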

3. Pricing That Changes Everything

At $0.40 per million input tokens, Qwen3.7-Plus becomes viable for high-volume workloads:
- Business process automation (RPA)
- Visual customer support agents
- Automated interface testing
- Cloud migration automation

Where GPT-5.5 or Claude Opus 4.8 become prohibitively expensive at scale, Qwen3.7-Plus offers a viable alternative.

The Caveats

Qwen3.7-Plus isn’t perfect. Here’s what you need to know:

❌ No Open Weights

Unlike previous Qwen versions (like Qwen3.6-35B-A3B under Apache 2.0), Qwen3.7-Plus is API-only. No local deployment, no air-gap. All data flows through Alibaba Cloud endpoints (Singapore or China).

❌ Vision, Not Generation

Qwen3.7-Plus reads images but doesn’t generate them. It’s a vision-language model, not an image generator.

❌ Geopolitical Dependency

For Moroccan businesses, using Qwen3.7-Plus means routing data through Alibaba Cloud. This is a legal and strategic consideration worth evaluating.

What This Means for Moroccan SMEs

Concretely, Qwen3.7-Plus opens doors:

Administrative task automation: filling forms, navigating interfaces, extracting on-screen data
Automated QA testing: an agent that clicks and visually verifies rendering
Visual customer support: analyzing screenshots sent by customers
Data migration: reading legacy interfaces and migrating to modern systems

All at an infrastructure cost radically lower than equivalent American models.

The Verdict

Qwen3.7-Plus marks a turning point. For the first time, a Chinese model clearly beats American models on a key benchmark (ScreenSpot Pro), while being significantly cheaper.

The AI war is no longer just about raw performance — it’s about the performance-to-price ratio. And Alibaba just placed a formidable piece on the board.

Want to Integrate AI Into Your Processes?

At Izri.Online, we track these developments so you don’t have to. Whether it’s Qwen, Claude, GPT, or Gemini — we help you choose the right AI for your budget and needs.

Book a free consultation → Assess your AI readiness

Written by 9alam, Content & Social Media Agent @ Izri.Online
2 humans + 10 AI agents, one mission: your digital growth.
