diff --git a/applications/OneFormer/Inference_with_OneFormer.ipynb b/applications/OneFormer/Inference_with_OneFormer.ipynb
new file mode 100644
index 000000000..911306d4c
--- /dev/null
+++ b/applications/OneFormer/Inference_with_OneFormer.ipynb
@@ -0,0 +1,458 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "f58b9dcf",
+   "metadata": {},
+   "source": [
+    "# 推理与OneFormer：通用图像分割\n",
+    "原论文：https://arxiv.org/abs/2211.06220\n",
+    "OneFormer在Mask2Former框架中集成了一个文本模块，以在各自的子任务（实例、语义或panoptic）上约束模型。这样可以得到更准确的结果，但代价是增加了延迟。"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7a37a1f9",
+   "metadata": {},
+   "source": [
+    "## 设置环境\n",
+    "Mindspore 2.5.0\n",
+    "\n",
+    "Mindnlp   0.4.0\n",
+    "\n",
+    "python    3.9.0"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c6085611",
+   "metadata": {},
+   "source": [
+    "## 图像加载\n",
+    "\n",
+    "接下来，我们加载一个我们想要执行推理的图像。这里我们加载熟悉的猫图像，这是COCO数据集的一部分。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bf9de159-3e3c-4ef1-b54c-ab2623fe8526",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from PIL import Image\n",
+    "import requests\n",
+    "\n",
+    "url = 'http://images.cocodataset.org/val2017/000000039769.jpg'\n",
+    "image = Image.open(requests.get(url, stream=True).raw)\n",
+    "image"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7863d8a8",
+   "metadata": {},
+   "source": [
+    "## 为模型准备图像\n",
+    "\n",
+    "我们可以使用处理器准备图像。OneFormer利用了一个处理器，它内部由一个图像处理器（用于图像模态）和一个标记器（用于文本模态）组成。OneFormer实际上是一个多模态模型，因为它结合了图像和文本来解决图像分割。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3e936c0d-d6cd-426d-bf9e-e05dff42e145",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from mindnlp.transformers import AutoProcessor\n",
+    "\n",
+    "# the Auto API loads a OneFormerProcessor for us, based on the checkpoint\n",
+    "processor = AutoProcessor.from_pretrained(\"shi-labs/oneformer_coco_swin_large\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5f159b71-4f30-43b9-834d-e53cb6c20153",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# prepare image for the model\n",
+    "panoptic_inputs = processor(images=image, task_inputs=[\"panoptic\"], return_tensors=\"ms\")\n",
+    "for k,v in panoptic_inputs.items():\n",
+    "  print(k,v.shape)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c3b95c44",
+   "metadata": {},
+   "source": [
+    "可以看到，这个模型有一个额外的“task_inputs”，这是MaskFormer和Mask2Former所没有的。这些文本输入允许模型区分实例/语义/全景分割。\n",
+    "\n",
+    "\n",
+    "\n",
+    "我们可以将任务输入解码回文本："
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f74a380a-7245-4c3b-bd4c-ee65481b8eb5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "processor.tokenizer.batch_decode(panoptic_inputs.task_inputs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "11d76bcc",
+   "metadata": {},
+   "source": [
+    "## 加载模型\n",
+    "\n",
+    "\n",
+    "\n",
+    "接下来，让我们从mindnlp/transformers加载一个模型。在这里，我们用一个swing -large的主干加载OneFormer模型，该主干是在COCO数据集上训练的。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "c53c4c55-17ea-4be2-a761-ec793413a9bc",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from mindnlp.transformers import AutoModelForUniversalSegmentation\n",
+    "\n",
+    "model = AutoModelForUniversalSegmentation.from_pretrained(\"shi-labs/oneformer_coco_swin_large\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8dcdf3b9",
+   "metadata": {},
+   "source": [
+    "## 前向传播\n",
+    "\n",
+    "\n",
+    "\n",
+    "mindnlp中的前向传播是这样完成的："
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e5399787-05ac-4e55-95e8-bc712e5efc1c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from mindnlp.core import ops, no_grad\n",
+    "\n",
+    "# forward pass\n",
+    "with no_grad():\n",
+    "  outputs = model(**panoptic_inputs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b00c6dbf",
+   "metadata": {},
+   "source": [
+    "# 可视化\n",
+    "\n",
+    "\n",
+    "\n",
+    "接下来，我们可以对原始输出进行后处理，并将预测可视化。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e740410c-7081-407a-9a06-03e159f14e04",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "panoptic_segmentation = processor.post_process_panoptic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]\n",
+    "print(panoptic_segmentation.keys())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b36ee706-a401-4d39-87ea-63d77fcd299a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from collections import defaultdict\n",
+    "import matplotlib.pyplot as plt\n",
+    "from matplotlib import cm\n",
+    "import matplotlib.patches as mpatches\n",
+    "import numpy as np\n",
+    "from mindspore import Tensor\n",
+    "\n",
+    "def draw_panoptic_segmentation(segmentation, segments_info):\n",
+    "\n",
+    "    if isinstance(segmentation, Tensor):\n",
+    "        segmentation_np = segmentation.asnumpy()\n",
+    "    else:\n",
+    "        segmentation_np = np.array(segmentation)\n",
+    "    \n",
+    "    if not np.issubdtype(segmentation_np.dtype, np.integer):\n",
+    "        segmentation_np = segmentation_np.astype(np.int32)\n",
+    "    \n",
+    "    # Get the maximum segment ID using numpy\n",
+    "    max_segment = np.max(segmentation_np)\n",
+    "    viridis = cm.get_cmap('viridis', max_segment + 1)  \n",
+    "    \n",
+    "    fig, ax = plt.subplots()\n",
+    "    ax.imshow(segmentation_np)\n",
+    "    \n",
+    "    instances_counter = defaultdict(int)\n",
+    "    handles = []\n",
+    "    \n",
+    "    for segment in segments_info:\n",
+    "        segment_id = segment['id']\n",
+    "        segment_label_id = segment['label_id']\n",
+    "        segment_label = model.config.id2label[segment_label_id]  \n",
+    "        label = f\"{segment_label}-{instances_counter[segment_label_id]}\"\n",
+    "        instances_counter[segment_label_id] += 1\n",
+    "        color = viridis(segment_id)\n",
+    "        handles.append(mpatches.Patch(color=color, label=label))\n",
+    "    \n",
+    "    ax.legend(handles=handles)\n",
+    "    plt.savefig('cats_panoptic.png')\n",
+    "draw_panoptic_segmentation(**panoptic_segmentation)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f24c019a",
+   "metadata": {},
+   "source": [
+    "可以看出，该模型能够正确区分两只不同的猫以及两个不同的遥控器。"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8acb48fc",
+   "metadata": {},
+   "source": [
+    "## 推理：语义分割\n",
+    "我们还可以使用相同的模型对猫咪图像进行语义分割！我们只需要更改任务输入（即模型的文本输入），将其改为“此任务为语义”。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c5496e90-37f4-4fe7-a8db-521b60f2ea37",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# prepare image for the model\n",
+    "semantic_inputs = processor(images=image, task_inputs=[\"semantic\"], return_tensors=\"ms\")\n",
+    "for k,v in semantic_inputs.items():\n",
+    "  print(k,v.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "id": "ab5497ee-1c8d-4740-bf22-65d2de5eac2c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# forward pass\n",
+    "with no_grad():\n",
+    "  outputs = model(**semantic_inputs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6cd6bb98",
+   "metadata": {},
+   "source": [
+    "让我们对结果进行后处理并可视化："
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b0400d2e-d921-422b-b8d9-b08f872251f7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "semantic_segmentation = processor.post_process_semantic_segmentation(outputs)[0]\n",
+    "semantic_segmentation.shape"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "73b6a7f7-aaf9-408e-bcf9-b21dbf38e937",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import matplotlib.pyplot as plt\n",
+    "import matplotlib.patches as mpatches\n",
+    "from matplotlib.colors import ListedColormap, LinearSegmentedColormap\n",
+    "from matplotlib import cm\n",
+    "\n",
+    "\n",
+    "def draw_semantic_segmentation(segmentation):\n",
+    "\n",
+    "    if not isinstance(segmentation, np.ndarray):\n",
+    "        segmentation = np.array(segmentation)\n",
+    "    \n",
+    "    segmentation = segmentation.astype(np.int32)\n",
+    "    \n",
+    "    max_label = np.max(segmentation)  \n",
+    "    viridis = cm.get_cmap('viridis', max_label)\n",
+    "    \n",
+    "    labels_ids = np.unique(segmentation).tolist()\n",
+    "    \n",
+    "    fig, ax = plt.subplots()\n",
+    "    ax.imshow(segmentation, cmap=viridis) \n",
+    "    handles = []\n",
+    "    \n",
+    "    for label_id in labels_ids:\n",
+    "        label = model.config.id2label[label_id]\n",
+    "        color = viridis(label_id / max_label)  \n",
+    "        handles.append(mpatches.Patch(color=color, label=label))\n",
+    "    \n",
+    "    ax.legend(handles=handles)\n",
+    "\n",
+    "draw_semantic_segmentation(semantic_segmentation)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a65f1ace",
+   "metadata": {},
+   "source": [
+    "可以看到，在语义分割中，不会区分单个实例（可数的事物，如猫咪或遥控器）。相反，只会为“猫咪”类别等生成一个单一的掩码。"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "78bdb604",
+   "metadata": {},
+   "source": [
+    "## 推理：实例分割\n",
+    "\n",
+    "同样，我们可以使用相同的模型进行实例分割，我们只需要更改文本输入即可。"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0ce50c61-ef57-4773-979f-a115f847b0a2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# prepare image for the model\n",
+    "instance_inputs = processor(images=image, task_inputs=[\"instance\"], return_tensors=\"ms\")\n",
+    "for k,v in instance_inputs.items():\n",
+    "  print(k,v.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "id": "08d35d25-df61-4025-8a75-32828818f0e8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# forward pass\n",
+    "with no_grad():\n",
+    "  outputs = model(**instance_inputs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "06b9ecbe",
+   "metadata": {},
+   "source": [
+    "让我们对结果进行后处理并可视化："
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f1049e52-8fc0-45c6-9ee9-4d4b80d9dd2d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "instance_segmentation = processor.post_process_instance_segmentation(outputs)[0]\n",
+    "instance_segmentation.keys()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0155c6e6-add1-44ac-9845-f80a2fe6e063",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from collections import defaultdict\n",
+    "import matplotlib.pyplot as plt\n",
+    "from matplotlib import cm\n",
+    "import matplotlib.patches as mpatches\n",
+    "import numpy as np  # 确保导入 numpy\n",
+    "\n",
+    "def draw_instance_segmentation(segmentation, segments_info):\n",
+    "    # 转换数据类型（如果是张量或 object 类型）\n",
+    "    if hasattr(segmentation, 'asnumpy'):  # 处理 MindSpore 张量\n",
+    "        segmentation = segmentation.asnumpy()\n",
+    "    segmentation = np.array(segmentation, dtype=np.int32)  # 强制转换为 int32\n",
+    "    \n",
+    "    # 获取颜色映射\n",
+    "    max_segment_id = np.max(segmentation)  # 使用 NumPy 的 max\n",
+    "    viridis = cm.get_cmap('viridis', max_segment_id)\n",
+    "    \n",
+    "    fig, ax = plt.subplots()\n",
+    "    ax.imshow(segmentation)  # 现在 segmentation 是数值类型\n",
+    "    \n",
+    "    instances_counter = defaultdict(int)\n",
+    "    handles = []\n",
+    "    for segment in segments_info:\n",
+    "        segment_id = segment['id']\n",
+    "        segment_label_id = segment['label_id']\n",
+    "        segment_label = model.config.id2label[segment_label_id]\n",
+    "        label = f\"{segment_label}-{instances_counter[segment_label_id]}\"\n",
+    "        instances_counter[segment_label_id] += 1\n",
+    "        color = viridis(segment_id)\n",
+    "        handles.append(mpatches.Patch(color=color, label=label))\n",
+    "    \n",
+    "    ax.legend(handles=handles)\n",
+    "    plt.savefig('cats_panoptic.png')\n",
+    "\n",
+    "# 调用函数（确保 instance_segmentation 包含正确的键）\n",
+    "draw_instance_segmentation(**instance_segmentation)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "MindSpore",
+   "language": "python",
+   "name": "mindspore"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.10"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}