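API Description
OCR Document Intelligence Service
shell
POST https://api.scnet.cn/api/llm/v1/ocrdoc/submit
1. Overview
The Scnet OCR Document Intelligence Service handles asynchronous OCR task submission and result queries for multiple document types, and is suited to high-volume document processing.
Core workflow:
- Task submission: call the submit endpoint with the URL of the file to process and receive a task ID.
- Status query: poll the result endpoint with the task ID to get the task status and processing result.
- Result retrieval: once the task succeeds, the response contains download URLs from which the recognition result files can be fetched.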
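The last step of the workflow, fetching a result file from its download URL, can be sketched in Python with the standard library only. The helper names `result_filename` and `download_result`, the fallback file name, and the 60-second timeout are assumptions of this sketch, not part of the service. The example result URLs in section 3 carry signed query parameters, so the file name is taken from the URL path only.

```python
import os
import posixpath
import urllib.request
from urllib.parse import unquote, urlsplit


def result_filename(result_url: str) -> str:
    """Derive a local file name from the path part of a result URL."""
    path = urlsplit(result_url).path  # the query string (signature etc.) is dropped
    name = posixpath.basename(unquote(path))
    return name or "ocr_result"  # fallback name is an assumption of this sketch


def download_result(result_url: str, dest_dir: str = ".") -> str:
    """Download one result file into dest_dir and return the local path."""
    dest = os.path.join(dest_dir, result_filename(result_url))
    with urllib.request.urlopen(result_url, timeout=60) as resp, open(dest, "wb") as fh:
        fh.write(resp.read())
    return dest
```

Calling `download_result(url)` for each entry of a succeeded task's `results` list would store the files in the current directory.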
2. Task Submission API
2.1 Endpoint
| Item | Value |
|---|---|
| URL | POST /api/llm/v1/ocrdoc/submit |
| Content-Type | application/json |
| Authentication | Authorization: Bearer <token> |
2.2 Request Parameters
Header parameters
| Name | Type | Required | Example |
|---|---|---|---|
| Content-Type | string | 是 | application/json |
| Authorization | string | 是 | Bearer <API Key> |
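Body parameters
| Name | Type | Required | Description |
|---|---|---|---|
| file_url | string | Yes | Publicly accessible download URL of the file to process (to obtain a file upload address, see the related documentation) |
| ocr_type | string | No | Recognition type (currently only DOC_PARING is available) |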
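The body parameters above can be assembled with a small helper. This is a minimal sketch: `build_submit_payload` is a hypothetical name, and the client-side URL check is an assumption added for illustration; the service itself decides whether the file is actually reachable.

```python
from typing import Optional
from urllib.parse import urlsplit


def build_submit_payload(file_url: str, ocr_type: Optional[str] = None) -> dict:
    """Build the JSON body for POST /api/llm/v1/ocrdoc/submit."""
    parts = urlsplit(file_url)
    # Sanity check only (assumption of this sketch): require an absolute http(s) URL.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"file_url must be an absolute http(s) URL: {file_url!r}")
    payload = {"file_url": file_url}
    if ocr_type is not None:  # optional field, omitted when not set
        payload["ocr_type"] = ocr_type
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post`.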
2.3 Request Body Example
json
{
"file_url": "https://oss.ksai.scnet.cn:58043/ocr/doc/2135155845..."
}
2.4 Response Parameters
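| Parameter | Type | Description |
|---|---|---|
| code | String | Status code |
| msg | String | Result description |
| data.output | Object | Task submission result |
| data.output.task_status | String | Task status (pending: waiting to run; running: in progress; succeeded: completed successfully; failed: failed; unknown: task does not exist or is in an unknown state) |
| data.output.task_id | String | Unique task identifier, used for subsequent result queries |
| data.request_id | String | Unique request identifier |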
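A caller usually needs only the task ID from a successful submission, or the error details otherwise. The sketch below assumes the two response shapes shown in the examples of section 2.5 (a `data.output` object on success, a top-level `error` object on failure); `OcrApiError` and `parse_submit_response` are hypothetical names introduced for illustration.

```python
from typing import Any, Dict


class OcrApiError(Exception):
    """Raised when a response carries an `error` object (hypothetical helper type)."""

    def __init__(self, code: str, type_: str, message: str) -> None:
        super().__init__(f"{type_} ({code}): {message}")
        self.code, self.type, self.message = code, type_, message


def parse_submit_response(body: Dict[str, Any]) -> str:
    """Return the task_id from a submit response, or raise OcrApiError."""
    if "error" in body:
        err = body["error"]
        raise OcrApiError(
            str(err.get("code", "")),
            err.get("type", "unknown_error"),
            err.get("message", ""),
        )
    try:
        return body["data"]["output"]["task_id"]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"unexpected submit response: {body!r}") from exc
```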
2.5 Response Examples
Success response
json
{
"code": "200",
"msg": "",
"data": {
"output": {
"task_status": "pending",
"task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx"
},
"request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx"
}
}
Failure response
json
{
"request_id": "xxxx",
"error": {
"code": "404",
"type": "model_not_found",
"message": "Model xxx not found"
}
}
3. Task Status Query API
3.1 Endpoint
| Item | Value |
|---|---|
| URL | POST /api/llm/v1/ocrdoc/result |
| Authentication | Authorization: Bearer <token> |
3.2 Request Parameters
Header parameters
| Name | Type | Required | Example |
|---|---|---|---|
| Authorization | string | Yes | Bearer <API Key> |
Body parameters
| Name | Type | Required | Description |
|---|---|---|---|
| task_ids | array | Yes | List of task IDs |
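Because `task_ids` is a list, one request can check several tasks at once. The sketch below builds the request body and maps each task ID to its status; it assumes the response shape shown in the response examples of this section (a `data` list whose entries hold an `output` object). `build_result_payload` and `statuses_by_task` are hypothetical names, and rejecting an empty list is an assumption based on the field being required.

```python
from typing import Any, Dict, Iterable, List


def build_result_payload(task_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Build the JSON body for POST /api/llm/v1/ocrdoc/result."""
    ids = [str(t) for t in task_ids]
    if not ids:
        raise ValueError("task_ids must contain at least one task ID")
    return {"task_ids": ids}


def statuses_by_task(body: Dict[str, Any]) -> Dict[str, str]:
    """Map each task_id in a result response to its task_status."""
    return {
        item["output"]["task_id"]: item["output"]["task_status"]
        for item in body.get("data", [])
    }
```

Tasks whose status is still `pending` or `running` can then be queried again in the next polling round.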
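3.3 Request Body Example
json
{
"task_ids": [
"2056706028668284929","2056703208598626305"
]
}
3.4 Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | String | Status code |
| msg | String | Result description |
| data[].request_id | String | Unique request identifier |
| data[].output | Object | Task result |
| data[].output.task_id | String | Unique task identifier |
| data[].output.task_status | String | Task status |
| data[].output.submit_time | String | Task submission time |
| data[].output.end_time | String | Task end time (returned on success or failure) |
| data[].output.results | Array | Download URLs of the recognition result files (returned on success) |
| data[].output.error_code | String | Error code (returned on failure) |
| data[].output.error_message | String | Error message (returned on failure) |
3.5 Response Examples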
Task succeeded
json
{
"code": "0",
"msg": "success",
"data": [
{
"output": {
"results": [
"https://minio.fanhualuomu.top:8088/long-document-parsing/longDocumentParsing/results/2026/05/19/2056703208598626305/013_result_2056703208598626305.json?response-content-disposition=attachment%3B%20filename%3D%22013_result_2056703208598626305.json%22%3B%20filename%2A%3DUTF-8%27%27013_result_2056703208598626305.json&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=peihaojie%2F20260519%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20260519T114740Z&X-Amz-Expires=43200&X-Amz-SignedHeaders=host&X-Amz-Signature=d414e3ff2f9074b036dac1855dad18a67792f3dcb2d2930d9f3605a4c90b111f"
],
"task_id": "2056703208598626305",
"task_status": "succeeded",
"submit_time": "2026-05-19 19:47:11",
"end_time": "2026-05-19 19:47:40"
},
"usage": {
"image_count": 1
},
"request_id": "5e726f4f7d518259"
},
{
"output": {
"results": [
"https://minio.fanhualuomu.top:8088/long-document-parsing/longDocumentParsing/results/2026/05/19/2056706028668284929/014_result_2056706028668284929.json?response-content-disposition=attachment%3B%20filename%3D%22014_result_2056706028668284929.json%22%3B%20filename%2A%3DUTF-8%27%27014_result_2056706028668284929.json&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=peihaojie%2F20260519%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20260519T115855Z&X-Amz-Expires=43200&X-Amz-SignedHeaders=host&X-Amz-Signature=89707d665c259c91d00bedc289e4d20922b3b8936548d793d8b0e20b4e38e751"
],
"task_id": "2056706028668284929",
"task_status": "succeeded",
"submit_time": "2026-05-19 19:58:24",
"end_time": "2026-05-19 19:58:55"
},
"usage": {
"image_count": 1
},
"request_id": "5e726f4f7d518259"
}
]
}
Task in progress
json
{
"code": "200",
"msg": "",
"data": [
{
"request_id": "8ae698ba-df2d-966c-abcf-xxxxxx",
"output": {
"task_id": "e56d806f-76f9-4037-aefa-xxxxxx",
"task_status": "running",
"submit_time": "2026-04-20 19:33:50.425"
}
}
]
}
Task failed
json
{
"code": "200",
"msg": "",
"data": [
{
"request_id": "c61fe158-c0de-40f0-b4d9-964625119ba4",
"output": {
"task_id": "86ecf553-d340-4e21-xxxxxxxxx",
"task_status": "failed",
"submit_time": "2025-11-11 11:46:28.116",
"end_time": "2025-11-11 11:46:28.255",
"error_code": "limit_burst_rate",
"error_message": "Burst rate limit exceeded for model xxx"
}
}
]
}
4. Request Examples
4.1 Task Submission cURL Example
shell
curl --location 'https://api.scnet.cn/api/llm/v1/ocrdoc/submit' \
--header 'Authorization: Bearer <API Key>' \
--header 'Content-Type: application/json' \
--data '{
"file_url": "https://oss.ksai.scnet.cn:58043/ocr/doc/xxxxxx"
}'
4.2 Task Status Query cURL Example
shell
curl --location 'https://api.scnet.cn/api/llm/v1/ocrdoc/result' \
--header 'Authorization: Bearer <API Key>' \
--header 'Content-Type: application/json' \
--data '{
"task_ids": ["2056706028668284929","2056703208598626305"]
}'
4.3 Python Request Example
python
import time

import requests

API_KEY = "<API Key>"
BASE_URL = "https://api.scnet.cn/api/llm/v1/ocrdoc"
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# Submit the task; the body field is file_url, as listed in section 2.2
submit_payload = {
    "file_url": "https://oss.ksai.scnet.cn:58043/ocr/doc/xxxxxx"
}
response = requests.post(f"{BASE_URL}/submit", json=submit_payload, headers=headers)
submit_result = response.json()
print("Submit result:", submit_result)
if "error" in submit_result:
    raise SystemExit("Task submission failed")
# The task ID is nested under data.output, as in the success response of section 2.5
task_id = submit_result["data"]["output"]["task_id"]

# Poll the task status: POST /result with a task_ids list
while True:
    result_response = requests.post(
        f"{BASE_URL}/result",
        json={"task_ids": [task_id]},
        headers=headers,
    )
    result = result_response.json()
    # data is a list with one entry per requested task ID
    task_status = result["data"][0]["output"]["task_status"]
    if task_status == "succeeded":
        print("Task succeeded:", result)
        break
    elif task_status in ("failed", "unknown"):
        print("Task failed or not found:", result)
        break
    else:
        print("Task in progress, checking again in 10 seconds...")
        time.sleep(10)
4.4 Go Request Example
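go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"time"
)

const (
	baseURL     = "https://api.scnet.cn/api/llm/v1/ocrdoc"
	bearerToken = "<API Key>"
)

// taskOutput holds the fields of "output" that this example reads.
type taskOutput struct {
	TaskID     string `json:"task_id"`
	TaskStatus string `json:"task_status"`
}

// postJSON sends payload as JSON to baseURL+path and returns the raw response body.
func postJSON(client *http.Client, path string, payload interface{}) ([]byte, error) {
	bodyBytes, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+path, bytes.NewReader(bodyBytes))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+bearerToken)
	req.Header.Set("Content-Type", "application/json")
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

func main() {
	client := &http.Client{Timeout: 30 * time.Second}

	// Submit the task; the body field is file_url, as listed in section 2.2.
	respBody, err := postJSON(client, "/submit", map[string]string{
		"file_url": "https://oss.ksai.scnet.cn:58043/ocr/doc/xxxxxx",
	})
	if err != nil {
		log.Fatalf("submit request failed: %v", err)
	}
	fmt.Printf("Submit result: %s\n", respBody)
	var submitResult struct {
		Data struct {
			Output taskOutput `json:"output"`
		} `json:"data"`
	}
	if err := json.Unmarshal(respBody, &submitResult); err != nil {
		log.Fatalf("cannot parse submit response: %v", err)
	}
	taskID := submitResult.Data.Output.TaskID
	if taskID == "" {
		log.Fatalf("no task_id in submit response")
	}

	// Poll the task status: POST /result with a task_ids list.
	for {
		respBody, err = postJSON(client, "/result", map[string][]string{"task_ids": {taskID}})
		if err != nil {
			log.Fatalf("result request failed: %v", err)
		}
		var result struct {
			Data []struct {
				Output taskOutput `json:"output"`
			} `json:"data"`
		}
		if err := json.Unmarshal(respBody, &result); err != nil {
			log.Fatalf("cannot parse result response: %v", err)
		}
		if len(result.Data) == 0 {
			log.Fatalf("empty data in result response: %s", respBody)
		}
		switch result.Data[0].Output.TaskStatus {
		case "succeeded":
			fmt.Printf("Task succeeded: %s\n", respBody)
			return
		case "failed", "unknown":
			fmt.Printf("Task failed or not found: %s\n", respBody)
			return
		default:
			fmt.Println("Task in progress...")
			time.Sleep(10 * time.Second)
		}
	}
}
4.5 Node.js Request Example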
javascript
const API_KEY = '<API Key>';
const BASE_URL = 'https://api.scnet.cn/api/llm/v1/ocrdoc';
const HEADERS = {
  'Authorization': `Bearer ${API_KEY}`,
  'Content-Type': 'application/json'
};

async function submitTask() {
  const response = await fetch(`${BASE_URL}/submit`, {
    method: 'POST',
    headers: HEADERS,
    // The body field is file_url, as listed in section 2.2
    body: JSON.stringify({
      file_url: 'https://oss.ksai.scnet.cn:58043/ocr/doc/xxxxxx'
    })
  });
  if (!response.ok) {
    throw new Error(`Task submission failed [status: ${response.status}]: ${await response.text()}`);
  }
  const submitResult = await response.json();
  console.log('Submit result:', submitResult);
  if (submitResult.error) {
    throw new Error(`Task submission failed: ${submitResult.error.message}`);
  }
  // The task ID is nested under data.output, as in the success response of section 2.5
  return submitResult.data.output.task_id;
}

async function pollResult(taskId) {
  while (true) {
    // The result endpoint is a POST that takes a task_ids list
    const response = await fetch(`${BASE_URL}/result`, {
      method: 'POST',
      headers: HEADERS,
      body: JSON.stringify({ task_ids: [taskId] })
    });
    if (!response.ok) {
      throw new Error(`Task query failed [status: ${response.status}]: ${await response.text()}`);
    }
    const result = await response.json();
    // data holds one entry per requested task ID
    const taskStatus = result.data[0].output.task_status;
    if (taskStatus === 'succeeded') {
      console.log('Task succeeded:', result);
      return result;
    } else if (taskStatus === 'failed' || taskStatus === 'unknown') {
      console.log('Task failed or not found:', result);
      return result;
    } else {
      console.log('Task in progress, checking again in 10 seconds...');
      await new Promise(resolve => setTimeout(resolve, 10000));
    }
  }
}

async function main() {
  try {
    const taskId = await submitTask();
    await pollResult(taskId);
  } catch (error) {
    console.error('OCR Document Intelligence Service call failed:', error.message);
  }
}

main();
5. Task Status Description
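| Status | Description |
|---|---|
| pending | Task submitted, waiting to be processed |
| running | Task is being processed |
| succeeded | Task completed successfully |
| failed | Task processing failed |
| unknown | Task does not exist or is in an unknown state |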
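Polling code has to decide which of the statuses above end the wait. The following is a minimal sketch rather than a prescribed client: `TERMINAL_STATUSES`, `is_terminal`, and `wait_for_task` are hypothetical names, treating `unknown` as terminal is an assumption (it also covers a task that does not exist), and the 10-second interval and 60-attempt limit are illustrative defaults. The status lookup is passed in as a callable so the loop can be exercised without network access.

```python
import time
from typing import Callable

# Statuses after which polling stops. Including "unknown" is an assumption
# of this sketch, since that status also covers a nonexistent task.
TERMINAL_STATUSES = frozenset({"succeeded", "failed", "unknown"})


def is_terminal(task_status: str) -> bool:
    """Return True when no further polling is useful for this status."""
    return task_status in TERMINAL_STATUSES


def wait_for_task(
    fetch_status: Callable[[], str],
    interval: float = 10.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status until it returns a terminal status or attempts run out."""
    for _ in range(max_attempts):
        status = fetch_status()
        if is_terminal(status):
            return status
        sleep(interval)
    raise TimeoutError(f"task not finished after {max_attempts} polls")
```

In a real client, `fetch_status` would wrap a call to the result endpoint and return the `task_status` of the task being watched.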
6. Error Codes
| Error Code | Description |
|---|---|
| unknown_error | Unknown error |
| modal_type_not_supported | Unsupported modal type xxx |
| provider_not_supported | Unsupported provider xxx |
| model_not_supported | Unsupported model xxx |
| model_not_found | Model xxx not found |
| request_concurrency_conflict | Concurrency conflict for request, please try again later |
| provider_error | Provider xxx process error |
| model_route_failed | Model xxx route failed |
| content_illegal | Illegal content detected by content approval |
| limit_burst_rate | Burst rate limit exceeded for model xxx |
| task_not_found | Task not found |
| InvalidParameter | Parameter illegal |
| SystemError | An system error has occurred, please try again later |
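A client may want to separate errors worth retrying from those that need a changed request. The sketch below is one possible classification, not documented service behavior: `RETRY_LATER_CODES` contains the two codes whose descriptions in the table say "please try again later", while treating `limit_burst_rate` as transient is an assumption. All names are hypothetical.

```python
# Codes whose descriptions in the table above say "please try again later".
RETRY_LATER_CODES = frozenset({"request_concurrency_conflict", "SystemError"})

# Assumption of this sketch: a burst rate limit clears on its own after a pause.
ASSUMED_TRANSIENT_CODES = frozenset({"limit_burst_rate"})


def is_retryable(error_code: str) -> bool:
    """Return True when resubmitting the same request later may succeed."""
    return error_code in RETRY_LATER_CODES or error_code in ASSUMED_TRANSIENT_CODES
```

For example, a failed task with `error_code` set to `limit_burst_rate` would be resubmitted after a delay, while `InvalidParameter` would be reported to the caller.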