API Reference

OCR Document Intelligence Service

shell

POST https://api.scnet.cn/api/llm/v1/ocrdoc/submit

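1. Overview

The Scnet OCR Document Intelligence Service supports asynchronous submission of OCR recognition tasks and retrieval of their results for multiple document types. It is suited to high-volume document processing.

Core workflow: (1) Task submission: call the submit endpoint with the URL of the file to process and receive a task ID; (2) Status query: poll the result endpoint for the task status; (3) Result retrieval: once the task succeeds, the response contains download URLs for the recognition result files.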

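2. Task Submission API

2.1 Endpoint

Item	Value
URL	`POST /api/llm/v1/ocrdoc/submit`
Content-Type	`application/json`
Authentication	`Authorization: Bearer <token>`

2.2 Request parameters

Header parameters

Name	Type	Required	Example
Content-Type	string	Yes	application/json
Authorization	string	Yes	Bearer `<API Key>`

Body parameters

Name	Type	Required	Description
file_url	string	Yes	Publicly accessible download URL of the file to process (see the file upload reference for how to obtain an upload address)
ocr_type	string	No	Recognition type (currently only DOC_PARING)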
2.3 Request body example

json

{
    "file_url": "https://oss.ksai.scnet.cn:58043/ocr/doc/2135155845..."
}

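2.4 Response parameters

Parameter	Type	Description
code	String	Status code (`"0"` in the successful example)
msg	String	Result description
data	Object	Response payload; wraps `output` and `request_id` in the successful example
output	Object	Task submission result
task_status	String	Task status (pending: waiting to run; running: in progress; succeeded: completed successfully; failed: failed; unknown: task does not exist or status is unknown)
task_id	String	Unique task identifier, used for subsequent result queries
request_id	String	Unique request identifier

2.5 Response examples

Successful response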

json

{
  "code": "0",
  "msg": "",
  "data": {
    "output": {
      "task_status": "pending",
      "task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx"
    },
    "request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx"
  }
}

Failed response

json

{
  "code": "10011",
  "msg": "Burst rate limit exceeded for model"
}
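
The two response shapes above can be handled with one small helper. This is a minimal sketch, assuming every response is a JSON object whose `code` is `"0"` on success and that the task ID sits at `data.output.task_id`, as in the successful example; `OcrApiError` and `extract_task_id` are hypothetical names, not part of the service.

```python
from typing import Any, Dict


class OcrApiError(Exception):
    """Hypothetical exception for a response whose code is not "0"."""

    def __init__(self, code: str, msg: str) -> None:
        super().__init__(f"[{code}] {msg}")
        self.code = code
        self.msg = msg


def extract_task_id(envelope: Dict[str, Any]) -> str:
    """Return data.output.task_id, or raise OcrApiError for an error response."""
    code = str(envelope.get("code", ""))
    if code != "0":
        # Failed responses carry only code and msg, so stop before touching data.
        raise OcrApiError(code, envelope.get("msg", ""))
    return envelope["data"]["output"]["task_id"]
```

Checking `code` before reading `data` avoids a `KeyError` on the failed response, which has no `data` field.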

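3. Task Status Query API

3.1 Endpoint

Item	Value
URL	`POST /api/llm/v1/ocrdoc/result`
Content-Type	`application/json`
Authentication	`Authorization: Bearer <token>`

3.2 Request parameters

Header parameters

Name	Type	Required	Example
Authorization	string	Yes	Bearer `<API Key>`

Body parameters

Name	Type	Required	Description
task_ids	array	Yes	List of task IDs

3.3 Request body example

json

{
  "task_ids": [
    "2056706028668284929",
    "2056703208598626305"
  ]
}

3.4 Response parameters

Parameter	Type	Description
code	String	Status code
msg	String	Result description
data	Array	Query results, one element per task
request_id	String	Unique request identifier
output	Object	Task result
task_id	String	Unique task identifier
task_status	String	Task status
submit_time	String	Task submission time
end_time	String	Task end time (returned on success or failure)
results	Array	Download URLs of the recognition result files (returned on success)
error_code	String	Error code (returned on failure)
error_message	String	Error message (returned on failure)

3.5 Response examples

Task succeeded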

json

{
  "code": "0",
  "msg": "success",
  "data": [
    {
      "output": {
        "results": [
          "https://minio.fanhualuomu.top:8088/long-document-parsing/longDocumentParsing/results/2026/05/19/2056703208598626305/013_result_2056703208598626305.json?response-content-disposition=attachment%3B%20filename%3D%22013_result_2056703208598626305.json%22%3B%20filename%2A%3DUTF-8%27%27013_result_2056703208598626305.json&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=peihaojie%2F20260519%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20260519T114740Z&X-Amz-Expires=43200&X-Amz-SignedHeaders=host&X-Amz-Signature=d414e3ff2f9074b036dac1855dad18a67792f3dcb2d2930d9f3605a4c90b111f"
        ],
        "task_id": "2056703208598626305",
        "task_status": "succeeded",
        "submit_time": "2026-05-19 19:47:11",
        "end_time": "2026-05-19 19:47:40"
      },
      "usage": {
        "image_count": 1
      },
      "request_id": "5e726f4f7d518259"
    },
    {
      "output": {
        "results": [
          "https://minio.fanhualuomu.top:8088/long-document-parsing/longDocumentParsing/results/2026/05/19/2056706028668284929/014_result_2056706028668284929.json?response-content-disposition=attachment%3B%20filename%3D%22014_result_2056706028668284929.json%22%3B%20filename%2A%3DUTF-8%27%27014_result_2056706028668284929.json&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=peihaojie%2F20260519%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20260519T115855Z&X-Amz-Expires=43200&X-Amz-SignedHeaders=host&X-Amz-Signature=89707d665c259c91d00bedc289e4d20922b3b8936548d793d8b0e20b4e38e751"
        ],
        "task_id": "2056706028668284929",
        "task_status": "succeeded",
        "submit_time": "2026-05-19 19:58:24",
        "end_time": "2026-05-19 19:58:55"
      },
      "usage": {
        "image_count": 1
      },
      "request_id": "5e726f4f7d518259"
    }
  ]
}
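
The download URLs in this example carry `X-Amz-Date` and `X-Amz-Expires` query parameters, which is the shape of an S3-style presigned URL. Assuming that holds for all result URLs (the source does not state it), the sketch below computes when such a link stops working, so a client can download before then or query the task again. `presigned_url_expiry` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_url_expiry(url: str) -> datetime:
    """Return the UTC time at which a presigned URL expires.

    Assumes the query string holds X-Amz-Date (UTC, YYYYMMDDTHHMMSSZ) and
    X-Amz-Expires (lifetime in seconds), as in the example URLs.
    """
    query = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
```

For the first URL in the example (`X-Amz-Date=20260519T114740Z`, `X-Amz-Expires=43200`), this yields 2026-05-19 23:47:40 UTC, twelve hours after signing.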

Task in progress

json

{
  "code": "0",
  "msg": "",
  "data": [
    {
      "request_id": "8ae698ba-df2d-966c-abcf-xxxxxx",
      "output": {
        "task_id": "e56d806f-76f9-4037-aefa-xxxxxx",
        "task_status": "running",
        "submit_time": "2026-04-20 19:33:50.425"
      }
    }
  ]
}

Task failed

json

{
  "code": "0",
  "msg": "",
  "data": [
    {
      "request_id": "c61fe158-c0de-40f0-b4d9-964625119ba4",
      "output": {
        "task_id": "86ecf553-d340-4e21-xxxxxxxxx",
        "task_status": "failed",
        "submit_time": "2025-11-11 11:46:28.116",
        "end_time": "2025-11-11 11:46:28.255",
        "error_code": "10011",
        "error_message": "Burst rate limit exceeded for model xxx"
      }
    }
  ]
}
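
Because one query can cover several tasks, it can help to split the `data` array by status before acting on it. The following is a minimal sketch based only on the fields shown in the examples above; `group_by_status` and `result_urls` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_status(envelope: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each task_status to the task IDs that currently have it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in envelope.get("data", []):
        output = entry["output"]
        groups[output["task_status"]].append(output["task_id"])
    return dict(groups)


def result_urls(envelope: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each succeeded task ID to its list of result download URLs."""
    urls: Dict[str, List[str]] = {}
    for entry in envelope.get("data", []):
        output = entry["output"]
        if output["task_status"] == "succeeded":
            urls[output["task_id"]] = output.get("results", [])
    return urls
```

A polling loop could then re-query only the IDs listed under `pending` and `running`.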

4. Request Examples

4.1 Task submission cURL example

shell

curl --location 'https://api.scnet.cn/api/llm/v1/ocrdoc/submit' \
--header 'Authorization: Bearer <API Key>' \
--header 'Content-Type: application/json' \
--data '{
    "file_url": "https://oss.ksai.scnet.cn:58043/ocr/doc/xxxxxx"
}'

4.2 Task status query cURL example

shell

curl --location 'https://api.scnet.cn/api/llm/v1/ocrdoc/result' \
--header 'Authorization: Bearer <API Key>' \
--header 'Content-Type: application/json' \
--data '{
    "task_ids": ["2056706028668284929","2056703208598626305"]
}'

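4.3 Python example

python

import time

import requests

API_KEY = "<API Key>"
BASE_URL = "https://api.scnet.cn/api/llm/v1/ocrdoc"
POLL_INTERVAL_SECONDS = 10

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# Submit the task. The body field is `file_url` (see section 2.2).
submit_payload = {
    "file_url": "https://oss.ksai.scnet.cn:58043/ocr/doc/xxxxxx"
}
response = requests.post(
    f"{BASE_URL}/submit", json=submit_payload, headers=headers, timeout=30
)
submit_result = response.json()
print("Submit result:", submit_result)

# A failed submission has a non-"0" code and no `data` field.
if submit_result.get("code") != "0":
    raise RuntimeError(f"Submit failed: {submit_result}")

task_id = submit_result["data"]["output"]["task_id"]

# Poll the task status. The result endpoint is a POST that takes a list of task IDs.
while True:
    result_response = requests.post(
        f"{BASE_URL}/result",
        json={"task_ids": [task_id]},
        headers=headers,
        timeout=30,
    )
    result = result_response.json()
    if result.get("code") != "0":
        raise RuntimeError(f"Query failed: {result}")

    # `data` is a list with one entry per queried task.
    output = result["data"][0]["output"]
    task_status = output["task_status"]

    if task_status == "succeeded":
        print("Task succeeded:", output["results"])
        break
    elif task_status in ("failed", "unknown"):
        # "unknown" means the task does not exist, so polling further will not help.
        print("Task failed:", output)
        break
    else:
        print(f"Task in progress, checking again in {POLL_INTERVAL_SECONDS} seconds...")
        time.sleep(POLL_INTERVAL_SECONDS)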

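4.4 Go example

go

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"time"
)

const (
	baseURL     = "https://api.scnet.cn/api/llm/v1/ocrdoc"
	bearerToken = "<API Key>"
)

// taskEntry mirrors the object that wraps `output` in both responses.
type taskEntry struct {
	Output struct {
		TaskID     string `json:"task_id"`
		TaskStatus string `json:"task_status"`
	} `json:"output"`
}

// postJSON sends payload as a JSON body and returns the raw response body.
func postJSON(client *http.Client, url string, payload interface{}) ([]byte, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+bearerToken)
	req.Header.Set("Content-Type", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

func main() {
	client := &http.Client{Timeout: 30 * time.Second}

	// Submit the task. The body field is `file_url`.
	raw, err := postJSON(client, baseURL+"/submit", map[string]string{
		"file_url": "https://oss.ksai.scnet.cn:58043/ocr/doc/xxxxxx",
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Submit result: %s\n", raw)

	var submit struct {
		Code string    `json:"code"`
		Msg  string    `json:"msg"`
		Data taskEntry `json:"data"`
	}
	if err := json.Unmarshal(raw, &submit); err != nil {
		log.Fatal(err)
	}
	if submit.Code != "0" {
		log.Fatalf("submit failed: [%s] %s", submit.Code, submit.Msg)
	}
	taskID := submit.Data.Output.TaskID

	// Poll the task status. The result endpoint is a POST that takes a list of task IDs.
	for {
		raw, err := postJSON(client, baseURL+"/result", map[string][]string{
			"task_ids": {taskID},
		})
		if err != nil {
			log.Fatal(err)
		}

		var result struct {
			Code string      `json:"code"`
			Data []taskEntry `json:"data"`
		}
		if err := json.Unmarshal(raw, &result); err != nil {
			log.Fatal(err)
		}
		if result.Code != "0" || len(result.Data) == 0 {
			log.Fatalf("query failed: %s", raw)
		}

		switch status := result.Data[0].Output.TaskStatus; status {
		case "succeeded":
			fmt.Printf("Task succeeded: %s\n", raw)
			return
		case "failed", "unknown":
			// "unknown" means the task does not exist, so stop polling.
			fmt.Printf("Task failed (%s): %s\n", status, raw)
			return
		default:
			fmt.Println("Task in progress...")
			time.Sleep(10 * time.Second)
		}
	}
}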

4.5 Node.js example

javascript

const API_KEY = '<API Key>';
const BASE_URL = 'https://api.scnet.cn/api/llm/v1/ocrdoc';
const POLL_INTERVAL_MS = 10000;

// POST a JSON body and return the parsed response.
// Throws on HTTP errors and on API responses whose code is not "0".
async function postJson(path, body) {
    const response = await fetch(`${BASE_URL}${path}`, {
        method: 'POST',
        headers: {
            'Authorization': `Bearer ${API_KEY}`,
            'Content-Type': 'application/json'
        },
        body: JSON.stringify(body)
    });

    if (!response.ok) {
        throw new Error(`Request to ${path} failed [status: ${response.status}]: ${await response.text()}`);
    }

    const result = await response.json();
    if (result.code !== '0') {
        throw new Error(`API error [${result.code}]: ${result.msg}`);
    }
    return result;
}

async function submitTask() {
    // The body field is `file_url`.
    const submitResult = await postJson('/submit', {
        file_url: 'https://oss.ksai.scnet.cn:58043/ocr/doc/xxxxxx'
    });
    console.log('Submit result:', submitResult);
    return submitResult.data.output.task_id;
}

async function pollResult(taskId) {
    while (true) {
        // The result endpoint is a POST that takes a list of task IDs.
        const result = await postJson('/result', { task_ids: [taskId] });
        // `data` is an array with one entry per queried task.
        const output = result.data[0].output;

        if (output.task_status === 'succeeded') {
            console.log('Task succeeded:', output.results);
            return output;
        } else if (output.task_status === 'failed' || output.task_status === 'unknown') {
            // "unknown" means the task does not exist, so stop polling.
            console.log('Task failed:', output);
            return output;
        } else {
            console.log('Task in progress...');
            await new Promise(resolve => setTimeout(resolve, POLL_INTERVAL_MS));
        }
    }
}

async function main() {
    try {
        const taskId = await submitTask();
        await pollResult(taskId);
    } catch (error) {
        console.error('OCR Document Intelligence Service call failed:', error.message);
    }
}

main();

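5. Result File Fields

5.1 Overall data structure

The result data is organized hierarchically: Task Response -> Documents (document-level array) -> Datas (page-level array) -> Blocks (layout-block array) -> fine-grained elements (Lines/Cells/Sub-blocks).

At the page level, two outputs are produced in parallel:

blocks (structured layout blocks): decoupled layout elements with spatial coordinates, suited to scenarios that need custom rendering or further processing of individual elements.
md (whole-page Markdown for LLMs): Markdown text with the layout reconstructed and the reading order restored, usable directly for LLM training, fine-tuning, or RAG vector retrieval.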

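5.2 Response parameter reference

5.2.1 Page-level parameters (`datas` array elements)

Parameter	Type	Required	Description
`rotate_angle`	int	Yes	Page rotation angle, measured clockwise. Only meaningful when document orientation correction is enabled. Correction is disabled by default, in which case the value is `0`.
`height`	int	Yes	Actual height of the document image received and processed by the model service (unit: pixels).
`width`	int	Yes	Actual width of the document image received and processed by the model service (unit: pixels).
`blocks`	array	Yes	[Core] Layout analysis results for the page: all logical and visual blocks segmented from it.
`md`	object	Yes	[Additional] Aggregated Markdown content for the whole page and the mapping of its associated static resources.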

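5.2.2 Layout block parameters (`blocks` array elements)

Each block represents one independent region of the page (for example a paragraph, a chart, a table, or a formula).

Parameter	Type	Applies to	Description
`id`	int	All blocks	Unique ID of the layout block. Note: because post-processing merges and optimizes some fragmented blocks, IDs are increasing but not consecutive.
`bbox`	array[int]	All blocks	Absolute two-point coordinates of the block, as `[top-left X, top-left Y, bottom-right X, bottom-right Y]`.
`type`	string	All blocks	Primary category of the block. Possible values: `text`, `chart`, `formula`, `footer`, `header`, `image`, `seal`, `table`.
`sub_type`	string	All blocks	Secondary category of the block, giving finer-grained semantics (see the block classification system, Type & Sub_type).
`sub_img`	string	Visual/specific blocks	Cropped image of the block region, Base64-encoded, in JPG format. Returned only for secondary categories such as `algorithm`, `chart`, `footer_image`, `header_image`, and `image`.
`lines`	array	Text-bearing blocks	OCR content. Present in blocks that contain text, such as `text` (primary), `footer` (secondary), `footnote` (secondary), `header` (secondary), and `formula_number` (secondary).
`latex`	string	Formula blocks	LaTeX source of a display formula. Present only in `display_formula` (secondary category) blocks.
`html`	string	Table blocks	HTML table markup produced after layout reconstruction. Supports row spans, column spans, and complex nested structures. Present only in `table` blocks.
`cells`	array	Table blocks	Per-cell information for the table. Present only in `table` blocks.
`blocks`	array	Seal blocks	Nested collection of seal elements. Present only in `seal` blocks.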
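
As an illustration of how these fields combine, the sketch below pulls readable text out of one page. It assumes the `blocks` array is already in reading order and that each `lines` element has a `text` field. The reading order is not stated in the source. `block_text` and `page_text` are hypothetical helper names, and skipping headers and footers is just one possible choice.

```python
from typing import Any, Dict, Iterable


def block_text(block: Dict[str, Any]) -> str:
    """Best-effort text for one block: OCR lines, else LaTeX, else HTML, else ''."""
    if block.get("lines"):
        return "\n".join(line["text"] for line in block["lines"])
    return block.get("latex") or block.get("html") or ""


def page_text(page: Dict[str, Any], skip_types: Iterable[str] = ("header", "footer")) -> str:
    """Join the text of all blocks on a page, skipping the given primary types."""
    skipped = set(skip_types)
    parts = [
        block_text(block)
        for block in page.get("blocks", [])
        if block.get("type") not in skipped
    ]
    # Blocks without any textual field (for example plain images) yield '' and are dropped.
    return "\n\n".join(part for part in parts if part)
```

When the goal is simply LLM-ready text, the page-level `md` object already provides a reconstructed Markdown version; a block-level pass like this is mainly useful when blocks need to be filtered or located by `bbox`.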

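5.2.3 Fine-grained child element parameters

A. Text line object (elements of `lines`)

Parameter	Type	Description
`bbox`	array[int]	Absolute two-point bounding box of the text line, as `[top-left X, top-left Y, bottom-right X, bottom-right Y]`.
`polygon`	array[array[int]]	Polygon vertex coordinates of the text line, as `[[x1,y1], [x2,y2], [x3,y3], [x4,y4]]`, which fits skewed or curved text.
`text`	string	Plain text content after OCR recognition and correction.
`score`	float	Overall confidence of the text recognition result. Range: `0.0` to `1.0`.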

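B. Cell object (elements of `cells`)

Parameter	Type	Description
`id`	int	Sequence number of the cell within the current table.
`bbox`	array	Two-point bounding box of the cell.
`polygon`	array	Polygon boundary of the cell.
`rows`	array[int]	Rows occupied by the cell, as a `[start, end]` pair of row indices. For example, `[0, 0]` means the cell occupies only row 0; `[0, 1]` means it spans rows 0 to 1 (a merged cell).
`cols`	array[int]	Columns occupied by the cell, as a `[start, end]` pair of column indices. For example, `[1, 3]` means the cell spans columns 1 to 3.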
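
The `rows` and `cols` pairs are enough to rebuild the table grid. The sketch below reads the examples in the table as zero-based, inclusive ranges, which is an interpretation rather than a stated rule, and derives each cell's span and an occupancy grid. All function names are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

Cell = Dict[str, Any]


def cell_span(cell: Cell) -> Tuple[int, int]:
    """Return (rowspan, colspan) for a cell with inclusive [start, end] ranges."""
    row_start, row_end = cell["rows"]
    col_start, col_end = cell["cols"]
    return row_end - row_start + 1, col_end - col_start + 1


def table_shape(cells: List[Cell]) -> Tuple[int, int]:
    """Return (row count, column count) implied by the largest end indices."""
    n_rows = max(cell["rows"][1] for cell in cells) + 1
    n_cols = max(cell["cols"][1] for cell in cells) + 1
    return n_rows, n_cols


def occupancy_grid(cells: List[Cell]) -> List[List[Optional[int]]]:
    """Build a grid in which each position holds the id of the cell covering it."""
    n_rows, n_cols = table_shape(cells)
    grid: List[List[Optional[int]]] = [[None] * n_cols for _ in range(n_rows)]
    for cell in cells:
        for row in range(cell["rows"][0], cell["rows"][1] + 1):
            for col in range(cell["cols"][0], cell["cols"][1] + 1):
                grid[row][col] = cell["id"]
    return grid
```

A cell with `rows` of `[0, 1]` and `cols` of `[1, 3]` therefore has a rowspan of 2 and a colspan of 3. Positions left as `None` in the grid indicate that no cell claimed them.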

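C. Seal collection object (elements of `seal` -> `blocks`)

Parameter	Type	Description
`id`	int	Sequence number of the seal.
`class_id`	int	Seal shape class ID. Mapping: `0`: round seal (`circle_stamp`), `1`: oval seal (`oval_stamp`), `2`: rectangular seal (`rectangle_stamp`), `3`: other (`other`), `4`: personal seal (`personal_stamp`), `5`: triangular seal (`triangele_stamp`), `6`: diamond seal (`rhombus_stamp`).
`class_name`	string	English name of the seal shape class (for example `circle_stamp`, `oval_stamp`).
`bbox`	array	Bounding box of the seal.
`score`	float	Overall confidence score of the seal detection and classification.
`is_rotated`	bool	Whether the seal is rotated.
`color_type`	string	Seal color class. Possible values: `"红色"` (red), `"蓝色"` (blue), `"黑色"` (black), `"其他"` (other).
`sub_img`	string	Cropped image of the seal region, Base64-encoded, in JPG format.
`lines`	array	Text recognition results inside the seal (each element contains `text`, `score`, and `polygon`).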
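
The class mapping above translates directly into a lookup table. The sketch below summarizes the seals nested in one `seal` block. The `min_score` threshold of 0.5 is an arbitrary assumption for illustration, `describe_seals` is a hypothetical name, and the class names are spelled exactly as listed in the table.

```python
from typing import Any, Dict, List

# class_id -> class_name, copied from the table above (spelling unchanged).
SEAL_CLASS_NAMES = {
    0: "circle_stamp",
    1: "oval_stamp",
    2: "rectangle_stamp",
    3: "other",
    4: "personal_stamp",
    5: "triangele_stamp",
    6: "rhombus_stamp",
}


def describe_seals(seal_block: Dict[str, Any], min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Summarize shape, color, and text of each seal scoring at least min_score."""
    summaries = []
    for seal in seal_block.get("blocks", []):
        if seal.get("score", 0.0) < min_score:
            continue
        # Prefer the class_name sent by the service; fall back to the id mapping.
        shape = seal.get("class_name") or SEAL_CLASS_NAMES.get(seal.get("class_id"), "other")
        text = "".join(line["text"] for line in seal.get("lines", []))
        summaries.append({"shape": shape, "color": seal.get("color_type"), "text": text})
    return summaries
```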

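5.2.4 Markdown aggregation parameters (`md` object)

Parameter	Type	Description
`markdown_content`	string	Complete Markdown text reconstructed for the whole page. Visual elements among the layout blocks (such as `algorithm`, `chart`, and `image`) are converted into standard image tags at the corresponding position in the text (for example `![](images/image_id_sid.jpg)`).
`images`	array	List of all images/visual blocks referenced by the Markdown of the current page.

`images` array element details:

id: int - Sequence number of the static resource file.
block_index: int - Index of the source block in the outermost blocks array. Use it to trace the image back to its absolute coordinates.
seal_block_index: int - If the image comes from a seal, its index within the seal collection.
format: string - Image output format, always JPG.
path: string - Relative path prefix used in the Markdown text, `images/` by default.
name: string - File name of the image as referenced in the Markdown (for example image_id_sid.jpg).
type: string - Primary category of the source block.
sub_type: string - Secondary category of the source block.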
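
To render `markdown_content` locally, the referenced image files have to be materialized from the Base64 `sub_img` data. The sketch below shows one way to resolve each `images` entry to its bytes. It assumes that `seal_block_index` is absent or `None` for non-seal images, and that a seal image's data lives in the nested seal element. The source confirms neither detail, and `markdown_image_files` is a hypothetical name.

```python
import base64
from typing import Any, Dict


def markdown_image_files(page: Dict[str, Any]) -> Dict[str, bytes]:
    """Map each relative image path used in the Markdown to its decoded JPG bytes."""
    files: Dict[str, bytes] = {}
    for image in page["md"].get("images", []):
        block = page["blocks"][image["block_index"]]
        seal_index = image.get("seal_block_index")
        # Assumption: for seal images the crop is stored on the nested seal element.
        if block.get("type") == "seal" and seal_index is not None:
            block = block["blocks"][seal_index]
        sub_img = block.get("sub_img")
        if sub_img:
            files[image["path"] + image["name"]] = base64.b64decode(sub_img)
    return files
```

Writing each entry to disk under the directory that holds the Markdown file should make the `![](images/...)` tags resolve.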

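5.3 Block classification system (Type & Sub_type)

The service uses a two-level classification: each document element gets a primary type and a finer-grained secondary type.

Primary category (`type`)	Secondary category (`sub_type`)	Meaning and typical use
text (plain text)	`abstract`	Abstract region of a paper, white paper, or patent.
	`content`	Navigation text that appears in large, multi-level table-of-contents blocks.
	`doc_title`	Main title of the document. Usually has the largest font size on the first page of the document or of a chapter.
	`text`	Ordinary body text paragraph, the most common type.
	`vertical_text`	Vertically typeset text, as found in ancient books, certain official documents, posters, or some Asian layout styles.
	`paragraph_title`	Paragraph or subsection heading (distinct from `doc_title`), used to mark structural levels.
	`reference`	Outer frame of the reference list (the list container).
	`reference_content`	Individual entry in the reference list (a list item).
	`aside_text`	Sidebar text. Usually located at the page margin, providing supplementary information, a reading guide, or further reading.
	`figure_title`	Title or caption of an image (`image`), chart (`chart`), or table (`table`).
	`vision_footnote`	Small-print note or data-source statement at the bottom of an image, chart, or table.
	`number`	Page number.
chart (chart)	`chart`	Block containing data visualization elements, such as bar charts, line charts, pie charts, and scatter plots.
formula (formula)	`display_formula`	Mathematical, physical, or chemical formula that occupies one or more full lines, typically set in a larger font and centered.
	`formula_number`	Formula number placed to the right of a `display_formula`, used for in-text references (for example `(1-1)`).
	`inline_formula`	Short formula or mathematical symbol embedded in a line of body text.
footer (footer)	`footer`	Standard footer text. Usually contains a copyright notice, conference name, and similar information.
	`footer_image`	Icon or narrow decorative image in the footer area.
	`footnote`	Page-level academic or explanatory footnote. Usually located at the very bottom of the page, supplementing specific terms in the body text.
header (header)	`header`	Standard header text, such as a journal name or the current chapter name.
	`header_image`	Logo, rule line, or institutional emblem in the header area.
image (image)	`image`	Standard illustration such as a photograph, hand drawing, or structural diagram.
	`algorithm`	Algorithm pseudocode block or flowchart with a code-block-like layout.
table (table)	`table`	Standard row-and-column data table, carrying both OCR text and cell structure.
seal (seal)	`seal`	Physical seal region in contracts, invoices, and official documents. Includes shape, color, rotation, and the recognized text inside the seal.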
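
One use of the secondary categories is to recover a document outline from the title blocks. The following is a minimal sketch. Mapping `doc_title` to level 1 and `paragraph_title` to level 2 is an assumption made for illustration, since the classification does not distinguish deeper heading levels, and `outline` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Assumed heading levels for the two title-like secondary categories.
HEADING_LEVELS = {"doc_title": 1, "paragraph_title": 2}


def outline(blocks: List[Dict[str, Any]]) -> List[str]:
    """Return Markdown-style heading lines for the title blocks, in input order."""
    headings = []
    for block in blocks:
        level = HEADING_LEVELS.get(block.get("sub_type"))
        if level is None:
            continue
        title = " ".join(line["text"] for line in block.get("lines", []))
        headings.append("#" * level + " " + title)
    return headings
```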

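6. Task Statuses

Status	Description
pending	Task submitted and waiting to be processed
running	Task is being processed
succeeded	Task completed successfully
failed	Task failed
unknown	Task does not exist or its status is unknown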
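
A polling client mainly needs to know whether a status is final. The sketch below models the five statuses as an enum. Treating `unknown` as terminal, so that a loop does not wait forever on a task that does not exist, is a design assumption rather than documented behavior, and `TaskStatus` and `parse_status` are hypothetical names.

```python
from enum import Enum


class TaskStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    UNKNOWN = "unknown"

    @property
    def is_terminal(self) -> bool:
        """True when further polling is not expected to change the status."""
        return self in (TaskStatus.SUCCEEDED, TaskStatus.FAILED, TaskStatus.UNKNOWN)


def parse_status(raw: str) -> TaskStatus:
    """Convert a task_status string, mapping unrecognized values to UNKNOWN."""
    try:
        return TaskStatus(raw)
    except ValueError:
        return TaskStatus.UNKNOWN
```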

7. Error Codes

Error code	Description
10001	Unknown error
10002	Unsupported modal type xxx
10003	Unsupported provider xxx
10004	Unsupported model xxx
10005	Model xxx not found
10006	Task not found
10007	Concurrency conflict for request, please try again later
10008	Provider xxx process error
10009	Model xxx route failed
10010	Illegal content detected by content approval
10011	Burst rate limit exceeded for model xxx
10012	An system error has occurred, please try again later
10013	Parameter illegal
10014	Incorrect API key provided
10015	Task timeout, please try again later
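
Several of these messages ask the caller to try again later. The sketch below classifies codes as retryable and produces a capped exponential backoff schedule. The retryable set is an assumption: it contains the three codes whose message says "try again later" plus the rate-limit code 10011, and the service does not document which errors are safe to retry. The delay values are illustrative only.

```python
from typing import List

# Assumed retryable codes: "try again later" messages plus the burst rate limit.
RETRYABLE_CODES = {"10007", "10011", "10012", "10015"}


def is_retryable(code: str) -> bool:
    """Return True when the error code is in the assumed retryable set."""
    return str(code) in RETRYABLE_CODES


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Return delays in seconds that double per attempt and never exceed cap."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(attempts)]
```

With the defaults, `backoff_delays(5)` gives `[1.0, 2.0, 4.0, 8.0, 16.0]`. Errors such as 10013 (Parameter illegal) or 10014 (Incorrect API key provided) are left out of the set because repeating the same request would presumably fail the same way.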
