Cosmos on PAI系列一：PAI-Model Gallery云上一键部署NVIDIA Cosmos Reason-1

时间：2025-06-09 17:07

摘要 NVIDIA Cosmos 是一个世界基础模型（WFMs, world foundation models）开发平台，用于推动物理 AI 的发展，包含先进的视觉标记器、护栏以及加速视频数据处理工具管线。它专为加速智能驾驶汽车和机器人领域的合成数据生成、AI 模型训练与评估而设计。本篇文章

NVIDIA Cosmos 是一个世界基础模型（WFMs, world foundation models）开发平台，用于推动物理 AI 的发展，包含先进的视觉标记器、护栏以及加速视频数据处理工具管线。它专为加速智能驾驶汽车和机器人领域的合成数据生成、AI 模型训练与评估而设计。

本篇文章介绍 Cosmos 最新世界基础模型 Cosmos Reason-1 如何在阿里云人工智能平台PAI上进行快速部署使用。

Cosmos Reason-1 模型简介

Cosmos Reason-1 是一款可完全定制的多模态 AI 推理模型，它专门为理解运动、物体交互以及时空关系而构建。基于思维链（Chain-of-thought, CoT）推理，Cosmos Reason-1 模型可以解读视觉输入、根据给定的提示词预测结果、并奖励最佳决策。

该模型基于真实世界的物理规律实现推理，从而生成清晰且能够感知上下文环境的自然语言回复。Cosmos Reason-1 能够通过充当判别器或对海量视觉数据进行标注，从而增强合成数据管理能力。

Cosmos Reason-1-7B 基于 Qwen2.5-VL 使用物理常识和具身推理数据进行后训练，并使用了监督微调（SFT）和强化学习（RL）技术。

更多关于 Cosmos Reason-1 模型的介绍，您可访问：

NVIDIA Research 官网：https://research.nvidia.com/labs/dir/cosmos-reason1/

NVIDIA Cosmos 官网：https://www.nvidia.cn/ai/cosmos/

NVIDIA Cosmos 开发者官网：https://developer.nvidia.cn/cosmos

PAI-Model Gallery 简介

阿里云 PAI-Model Gallery 已同步接入 Cosmos Reason-1 模型，提供企业级部署方案。

PAI-Model Gallery 是阿里云人工智能平台 PAI 的产品组件，它集成了国内外 AI 开源社区中优质的预训练模型，涵盖了 LLM、AIGC、CV、NLP 等各个领域。通过 PAI 对这些模型的适配，用户可以以零代码方式实现从训练到部署再到推理的全过程，简化了模型的开发流程，为开发者和企业用户带来了更快、更高效、更便捷的 AI 开发和应用体验。

PAI-Model Gallery 访问地址：https://pai.console.aliyun.com/#/quick-start/models

✅ 零代码一键部署

✅ 自动适配云资源

✅ 部署后开箱即用API

✅ 全流程运维托管

✅ 企业级安全数据不出域

PAI 一键部署 Cosmos Reason-1

极简流程立即体验

在 PAI-Model Gallery 模型广场找到 Cosmos Reason-1-7B 模型，或通过链接直达该模型：https://pai.console.aliyun.com/?regionId=cn-beijing#/quick-start/models/Cosmos-Reason1-7B/intro

在模型详情页右上角点击「部署」，在选择计算资源后，即可一键完成模型的云上部署。

部署成功后，在服务页面可以点击“查看调用信息”获取调用的 Endpoint 和 Token，想了解服务调用方式可以点击预训练模型链接，返回模型介绍页查看调用方式说明。

使用推理服务：您可以使用 API 调用模型服务，也可以使用 PAI 平台提供的 WebUI 界面与模型交互。

模型实测

我们使用NVIDIA官网提供的视频案例来测试模型。视频中一个人正在往杯子中倒牛奶，我们给模型的问题是“视频中的人将牛奶倒入杯中，下一步最合理的动作是什么？” 模型的回答展示出模型考虑到了人物下一步动作的各种可能性，甚至考虑到了倒完牛奶放好奶瓶后，用勺子搅拌牛奶和咖啡，最后模型在进行了一番深入思考后得出了结论，下一步是“将奶瓶放回到台面上”。

从这个例子可以看出，模型准确地“理解”了视频中人的动作并给出了正确的预测，这种对运动、物体交互以及时空关系的理解及预测能力，当前一般的语言模型是难以做到的，而且模型输出了完整的思维链，让我们能清楚地知道模型的整个思考过程。

下面附上完整的视频、输入问题、模型输出结果，供大家参考。

视频：

点击链接观看视频：https://research.nvidia.com/labs/dir/cosmos-reason1/

视频来源于NVIDIA

Question：The overall goal is "The agent pours milk into the cup". The agent in the video is currently performing one subtask out of many to complete this instruction. For the agent in the video, what is the most plausible next immediate subtask?

模型输出：The agent is currently pouring milk into the cup. The next immediate subtask after pouring would logically be to stop pouring once the desired amount is achieved. Then, the agent would need to place the milk bottle back on the countertop. After that, they might close the bottle's cap to prevent spills. Finally, they could stir the mixture with the spoon to mix the milk and coffee thoroughly. However, since the current action is pouring, the very next step after pouring would be to stop pouring and set the bottle down. But the question asks for the most plausible next immediate subtask. Since the pouring is ongoing, the next step after completing the pour would be to place the milk bottle back on the countertop. Alternatively, if the pouring is still in progress, maybe adjusting the angle to ensure all milk is poured. But given the context, the agent is actively pouring, so once that's done, placing the bottle down makes sense. However, the answer might be to stop pouring and set the bottle aside. But the exact next step after pouring would be to place the bottle back. Alternatively, maybe the agent needs to check if the cup is full enough. But the most logical next step after pouring is to stop and place the bottle down. So the answer is likely "Place the milk bottle back on the countertop."

Coming Soon：NVIDIA Cosmos Reason-1 tools 模型后训练最佳实践

通过上面的例子我们已经可以看出 Cosmos Reason-1 模型的强大之处，此外，NVIDIA 本次除了开源 Cosmos Reason-1 模型，也开放了 Cosmos Reason-1 tools，包括模型后训练脚本（SFT + RL）。这就意味着，用户可以根据自身数据定制自己的 Cosmos Reason-1 模型，相信这会吸引众多开发者和企业定制自己的 physical AI 模型。

我们也第一时间测试了 Cosmos Reason-1 tools 的性能表现。在 Qwen2.5-32B-Instruct 模型 + gsm8k数据集（Batch size = 2,048）组合上进行后训练测试，相比开源框架verl，Cosmos Reason-1 tools 在小规模集群上实测有1-2倍的性能优势。

PAI平台将在近期集成 Cosmos Reason-1 tools 模型后训练能力，欢迎您持续关注。

更多模型支持

除 Cosmos Reason-1 模型外，PAI-Model Gallery 持续提供开源社区热门模型快速部署、训练、评测实践。

云上一键部署 Qwen3 全尺寸模型：https://mp.weixin.qq.com/s/20xc0g-h6XLqKiAuSUNaIg

DeepSeek-R1 推理性能优化版：https://mp.weixin.qq.com/s/yMR4CAfW08SLO3qfd4FLhg

推理性能提升，相同时延约束下，吞吐能提升492%；相同吞吐情况下，首token时延直降86%，token间时延直降69%。

DeepSeek-R1 满血版：https://mp.weixin.qq.com/s/jp2QE84RdYig_x1I4cxa_Q

联系我们

欢迎各位小伙伴持续关注使用 PAI-Model Gallery，平台会不断上线 SOTA 模型，如果您有任何模型需求，也可以联系我们。您可通过搜索钉钉群号（79680024618），加入PAI-Model Gallery用户交流群。

来源：作者：admin

Cosmos on PAI系列一：PAI-Model Gallery云上

热点标签

Cosmos on PAI 聚首香港文国红主席荣任几九部委发布电子凭阿里云人 DistilQwen-Thoug PAI Mo AOTOS 来学网是正规机构靠谱从世界屋脊到世界

24小时热点排行

热点图片

有梦当燃！“中国	阿维塔012限量联
“高管说消保”—