Ctrl+Alt+Future
Mp3Pintyo
15 episodes
1 week ago
Feeling overwhelmed by the future? It's time for a hard reset. Welcome to Ctrl+Alt+Future, the podcast that navigates the complex world of AI, innovation, and digital culture. Join your hosts, Jules (the skeptic) and Aris (the visionary), for a weekly deep dive into the tech that shapes our world. Through their respectful debates, they separate the signal from the noise and help you understand tomorrow, today. Tune in and reboot your worldview.
Technology
Qwen3-Next: Free large language model from Alibaba that could revolutionize training costs?
Ctrl+Alt+Future
46 minutes 11 seconds
1 month ago

Qwen3-Next is a new large language model (LLM) from Alibaba with 80 billion parameters, of which only about 3 billion are activated per token during inference, thanks to a hybrid attention mechanism and a sparse Mixture-of-Experts (MoE) design. It offers outstanding efficiency, with up to 10x higher throughput than previous models, while achieving higher accuracy on ultra-long-context tasks and outperforming the Gemini-2.5-Flash-Thinking model on complex reasoning tests.


What makes Qwen3-Next special?


Accessibility and open source:

Qwen3-Next models are available through Hugging Face, ModelScope, Alibaba Cloud Model Studio, and the NVIDIA API Catalog. The weights are released under the Apache 2.0 license; this open-source availability encourages innovation and democratizes access to cutting-edge AI technology.
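
As a minimal sketch of what Hugging Face availability means in practice, the snippet below loads the Thinking variant (the model ID from the links section) with transformers. It assumes a recent transformers release with Qwen3-Next support and enough GPU memory for the full 80B-parameter checkpoint; the 3B figure is active parameters per token, not the memory footprint:

```python
# Minimal sketch: loading Qwen3-Next-80B-A3B-Thinking with transformers.
# Assumes a transformers version that includes Qwen3-Next support.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-Next-80B-A3B-Thinking"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the dtype stored in the checkpoint
    device_map="auto",    # shard across available GPUs
)

messages = [{"role": "user", "content": "Why does sparse MoE cut inference cost?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```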


Cost-effectiveness:

- Qwen3-Next delivers not only higher accuracy but also significantly better efficiency than comparable models.

- It can be trained for less than 10% of the computational cost (9.3%, to be exact) of the Qwen3-32B model. This reduced training cost has the potential to democratize AI development.


Faster inference:

- Only 3 billion (about 3.7%) of its 80 billion parameters are active during inference. This dramatically reduces FLOPs per token while maintaining model performance.

FLOPs here stands for floating-point operations, the basic arithmetic steps a computer performs. For AI models, FLOPs/token indicates how many such operations are required to process a single text token (a word or word fragment); a back-of-envelope comparison follows this list.

- For shorter contexts, it provides up to 7x speedup in the prefill phase (processing the prompt before the first output token) and 4x speedup in the decode phase (generating each subsequent token).
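
To make the FLOPs/token point concrete, here is a back-of-envelope comparison using the common rule of thumb of roughly 2 FLOPs per active parameter per generated token (the rule itself is an approximation, not a figure from the episode):

```python
# Rough FLOPs-per-token estimate with the "~2 FLOPs per active parameter
# per token" rule of thumb (ignores attention-length terms).
TOTAL_PARAMS = 80e9    # all experts combined
ACTIVE_PARAMS = 3e9    # parameters actually used per token

dense_flops_per_token = 2 * TOTAL_PARAMS    # hypothetical dense 80B model
sparse_flops_per_token = 2 * ACTIVE_PARAMS  # sparse MoE with 3B active

print(f"dense : {dense_flops_per_token:.1e} FLOPs/token")
print(f"sparse: {sparse_flops_per_token:.1e} FLOPs/token")
print(f"active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")  # ~3.7%
```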


Innovative architecture:

- A hybrid attention mechanism that enables extremely efficient context modeling over ultra-long inputs.

- A sparse Mixture-of-Experts (MoE) system: 512 experts in total, of which 10 routed experts plus 1 shared expert are active for each token (sketched below).
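
As a rough illustration of how this kind of sparse routing works, here is a generic top-k MoE sketch in PyTorch using the 512/10/1 figures above. It is schematic, not Qwen3-Next's actual layer:

```python
# Generic sparse-MoE routing sketch: 512 experts, 10 routed experts plus
# 1 always-on shared expert per token. Illustrative only; the real layer
# differs in many details (and uses batched dispatch, not a Python loop).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=128, n_experts=512, top_k=10):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.shared = nn.Sequential(  # shared expert runs for every token
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over chosen experts
        out = self.shared(x)
        for t in range(x.size(0)):             # naive per-token dispatch
            for w, e in zip(weights[t], idx[t]):
                out[t] = out[t] + w * self.experts[e](x[t])
        return out

moe = SparseMoE()
print(moe(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```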


Outstanding performance:

- Outperforms Qwen3-32B-Base on most benchmarks while using less than 10% of its training compute.

- Comes very close in performance to Alibaba's flagship 235B-parameter model.

- Performs particularly well on ultra-long-context tasks, natively up to 256,000 tokens; the context window can be extended further, to about 1 million tokens, using the YaRN method (see the config sketch after this list).

- Qwen3-Next-80B-A3B-Thinking excels at complex reasoning tasks: it outperforms mid-range Qwen3 variants and even beats the closed-source Gemini-2.5-Flash-Thinking on several benchmarks.
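
For concreteness, here is what the YaRN extension can look like with a Hugging Face transformers config. The rope_scaling field names follow Qwen's published recipe for its models, but treat the exact values as assumptions rather than details from the episode:

```python
# Sketch: extending the context window with YaRN rope scaling.
# Assumed values: a native 262,144-token window (the ~256K figure above)
# scaled by a factor of 4 to roughly 1M tokens.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen3-Next-80B-A3B-Thinking")
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                               # 262144 * 4 ≈ 1M tokens
    "original_max_position_embeddings": 262144,  # assumed native window
}
config.max_position_embeddings = 262144 * 4
# ...then pass `config=config` to AutoModelForCausalLM.from_pretrained(...).
```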


Multilingual capabilities:

The automatic speech recognition model in the same family, Qwen3-ASR-Flash, performs accurate transcription in 11 major languages and several Chinese dialects.


Agent capabilities

Excellent at tool-calling tasks and agent-based workflows.
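
As an illustration of what tool calling looks like in practice, here is a sketch against the OpenAI-compatible OpenRouter endpoint from the links below. The model slug and the get_weather function are assumptions made for the example:

```python
# Tool-calling sketch via OpenRouter's OpenAI-compatible API.
# The model slug and the `get_weather` tool are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for the example
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="qwen/qwen3-next-80b-a3b-instruct",  # assumed slug
    messages=[{"role": "user", "content": "What's the weather in Budapest?"}],
    tools=tools,
)
# If the model decides to call the tool, the arguments arrive as JSON:
print(response.choices[0].message.tool_calls)
```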


Links

- Qwen3-Next: Towards Ultimate Training & Inference Efficiency: https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list
- Hugging Face model: https://huggingface.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d
- ModelScope: https://modelscope.cn/models/Qwen/Qwen3-Next-80B-A3B-Thinking
- OpenRouter: https://openrouter.ai/qwen
- Qwen Chat: https://chat.qwen.ai/

