The Perception-Decision-Action Loop

2026-03-08

This is the second article in the series. It walks through the agent's core working loop: perception → decision → action → feedback.

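Agent Loop Overview

At the heart of an agent is a continuously running loop, usually called the Agent Loop and often described in terms of the OODA loop (Observe-Orient-Decide-Act). The loop defines how the agent interacts with its environment and its users.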

┌────────────────────────────────────────────────┐
│                                                │
│    ┌────────────┐                              │
│    │ Perception │ ←── environment/user input   │
│    └─────┬──────┘                              │
│          │                                     │
│          ▼                                     │
│    ┌────────────┐                              │
│    │  Decision  │ ←── internal state/memory    │
│    └─────┬──────┘                              │
│          │                                     │
│          ▼                                     │
│    ┌────────────┐                              │
│    │   Action   │ ──→ environment/user output  │
│    └─────┬──────┘                              │
│          │                                     │
│          ▼                                     │
│    ┌────────────┐                              │
│    │  Feedback  │ ──→ update state/memory      │
│    └────────────┘                              │
│                                                │
└────────────────────────────────────────────────┘
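
Before looking at each stage in detail, the four stages can be sketched as one small function. This is only an illustrative skeleton under assumptions: `run_loop`, `toy_decide`, and the "stop when `decide` returns `None`" convention are hypothetical names and choices made for this sketch, not part of any framework.

```python
from typing import Callable, Optional

def run_loop(
    perceive: Callable[[str], str],
    decide: Callable[[str, dict], Optional[str]],
    act: Callable[[str], str],
    first_input: str,
    max_steps: int = 10,
) -> dict:
    """Run perceive -> decide -> act -> feedback until decide returns None."""
    state = {"history": []}              # internal state / memory
    observation = perceive(first_input)  # perception
    for _ in range(max_steps):           # hard cap so the loop always ends
        action = decide(observation, state)  # decision
        if action is None:               # nothing left to do
            break
        outcome = act(action)            # action
        # feedback: record what happened so later decisions can use it
        state["history"].append((observation, action, outcome))
        observation = perceive(outcome)  # the outcome becomes the next input
    return state

# Toy usage: "decide" counts down to zero, "act" just echoes the action.
def toy_decide(observation: str, state: dict) -> Optional[str]:
    n = int(observation)
    return None if n <= 0 else str(n - 1)

state = run_loop(perceive=str.strip, decide=toy_decide,
                 act=lambda a: a, first_input=" 3 ")
print(len(state["history"]))  # 3
```

The `max_steps` cap is a design choice of this sketch: it guarantees termination even when the decision step never signals that it is done.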

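1. Perception

Perception is how the agent obtains information about its environment. In an LLM agent, perception typically covers:

Input types

Type	Example	Handling
User message	"Check the weather for me"	Text understanding
System event	Scheduled trigger, webhook	Event parsing
Environment state	File change, database update	State monitoring
Tool result	API response, search result	Result parsing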
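
One way to picture the "handling" column is a dispatch table that routes each input type to its own parser. The sketch below covers three of the four types; the parser names, the `PARSERS` table, and the assumption that system events arrive as JSON strings are all hypothetical choices for illustration.

```python
import json
from typing import Callable, Dict

def parse_user_message(raw: str) -> dict:
    """User text: trim whitespace and keep it as-is for the LLM."""
    return {"kind": "user_message", "text": raw.strip()}

def parse_system_event(raw: str) -> dict:
    """System event: assumed here to be a JSON object with an 'event' field."""
    payload = json.loads(raw)
    return {"kind": "system_event",
            "event": payload.get("event", "unknown"),
            "payload": payload}

def parse_tool_result(raw: str) -> dict:
    """Tool result: try JSON first, otherwise keep the raw text."""
    try:
        return {"kind": "tool_result", "data": json.loads(raw)}
    except json.JSONDecodeError:
        return {"kind": "tool_result", "data": {"raw": raw}}

# Hypothetical routing table: input type -> parser
PARSERS: Dict[str, Callable[[str], dict]] = {
    "user_message": parse_user_message,
    "system_event": parse_system_event,
    "tool_result": parse_tool_result,
}

def normalize(input_type: str, raw: str) -> dict:
    """Route a raw input to the parser registered for its type."""
    parser = PARSERS.get(input_type)
    if parser is None:
        return {"kind": "unknown", "raw": raw}
    return parser(raw)

print(normalize("user_message", "  check the weather  "))
print(normalize("system_event", '{"event": "webhook", "id": 7}'))
print(normalize("tool_result", "not json"))
```

Keeping the routing in a table means a new input type only needs a new parser and one new entry.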

Code example

import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class Perception:
    """Structured representation of one perceived input."""
    type: str           # input type: user_message, system_event, tool_result
    content: str        # raw content
    metadata: dict      # metadata (timestamp, source, etc.)

class PerceptionModule:
    """Perception module."""

    def perceive(self, raw_input: dict) -> Perception:
        """Convert a raw input dict into a structured Perception."""
        return Perception(
            type=raw_input.get("type", "unknown"),
            content=raw_input.get("content", ""),
            metadata={
                "timestamp": raw_input.get("timestamp"),
                "source": raw_input.get("source"),
                "session_id": raw_input.get("session_id"),
            },
        )

    def parse_tool_result(self, result: str) -> dict:
        """Parse a tool's return value, falling back to the raw string."""
        try:
            return json.loads(result)
        except json.JSONDecodeError:
            # Not valid JSON: keep the raw text so nothing is lost
            return {"raw": result}

# Usage example
perception_module = PerceptionModule()
perception = perception_module.perceive({
    "type": "user_message",
    "content": "Analyze this financial report for me",
    "timestamp": "2026-03-08T22:00:00Z"
})
print(perception.content)  # prints: Analyze this financial report for me

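2. Decision

Decision is the core of the agent: given the perceived input and the internal state, it chooses the next action. In an LLM agent, this step is usually carried out by the large language model.

What gets decided

Intent understanding - what does the user want to do?
Task decomposition - which steps are needed?
Tool selection - which tools should be called?
Parameter extraction - what arguments should the call receive?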

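Code example

import json
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class ActionType(Enum):
    """Kinds of action the agent can take."""
    REPLY = "reply"              # reply directly
    CALL_TOOL = "call_tool"      # call a tool
    ASK_CLARIFY = "ask_clarify"  # ask the user to clarify
    PLAN = "plan"                # draw up a plan

@dataclass
class Decision:
    """Result of a decision."""
    action_type: ActionType
    content: Optional[str]      # reply text
    tool_name: Optional[str]    # tool name
    tool_args: Optional[dict]   # tool arguments
    reasoning: Optional[str]    # reasoning trace

class DecisionModule:
    """Decision module backed by an LLM."""

    def __init__(self, llm_client):
        self.llm = llm_client
        self.tools = {
            "search": "Search the web",
            "calculator": "Do arithmetic",
            "weather": "Look up the weather"
        }

    def decide(self, perception: Perception, context: dict) -> Decision:
        """Make a decision from the perception and the context."""

        # Build the prompt (context is not used in this simplified version)
        prompt = f"""
You are an intelligent assistant. Decide the next action based on the user input.

User input: {perception.content}
Available tools: {list(self.tools.keys())}

Analyze the user's intent and return the decision as a JSON object:
1. action_type: reply/call_tool/ask_clarify
2. If a tool call is needed, set tool_name and tool_args
3. If replying directly, set content
"""

        # Call the LLM (mocked here for simplicity)
        response = self._call_llm(prompt)

        return self._parse_decision(response)

    def _call_llm(self, prompt: str) -> str:
        """Call the LLM (mock)."""
        # A real application would call OpenAI, Claude, etc. here
        return '{"action_type": "call_tool", "tool_name": "weather", "tool_args": {"city": "Beijing"}}'

    def _parse_decision(self, response: str) -> Decision:
        """Parse the LLM response into a Decision object."""
        try:
            data = json.loads(response)
            action_type = ActionType(data["action_type"])
        except (json.JSONDecodeError, KeyError, ValueError):
            # Fallback chosen for this sketch: treat unparseable output as a plain reply
            return Decision(ActionType.REPLY, response, None, None, "unparseable LLM response")
        return Decision(
            action_type=action_type,
            content=data.get("content"),
            tool_name=data.get("tool_name"),
            tool_args=data.get("tool_args"),
            reasoning=data.get("reasoning")
        )

# Usage example
# decision_module = DecisionModule(llm_client=openai_client)
# decision = decision_module.decide(perception, context={})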
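
Tool selection and parameter extraction are the steps where LLM output most often goes wrong, so it can help to check a proposed tool call before acting on it. The following is a minimal sketch under assumptions: `TOOL_SPECS` and `validate_tool_call` are hypothetical names, and the "required argument names" format is invented for this example rather than taken from any real tool-calling API.

```python
import json
from typing import Optional

# Hypothetical registry: tool name -> names of required arguments
TOOL_SPECS = {
    "search": ["query"],
    "calculator": ["expression"],
    "weather": ["city"],
}

def validate_tool_call(raw_response: str) -> Optional[str]:
    """Return None if the response is a usable tool call, else an error message."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        return f"response is not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return "response must be a JSON object"

    name = data.get("tool_name")
    if not isinstance(name, str) or name not in TOOL_SPECS:
        return f"unknown tool: {name!r}"

    args = data.get("tool_args")
    if not isinstance(args, dict):
        return "tool_args must be an object"

    missing = [key for key in TOOL_SPECS[name] if key not in args]
    if missing:
        return f"missing arguments for {name}: {missing}"
    return None

ok = '{"action_type": "call_tool", "tool_name": "weather", "tool_args": {"city": "Beijing"}}'
bad = '{"action_type": "call_tool", "tool_name": "weather", "tool_args": {}}'
print(validate_tool_call(ok))   # None
print(validate_tool_call(bad))  # missing arguments for weather: ['city']
```

A returned error message could be handed back to the model as a new input so that it can correct the call, instead of failing inside the tool.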

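3. Action

Action is the step in which the agent carries out its decision. An action can be:

Replying to the user directly
Calling an external tool or API
Updating internal state
Triggering other events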

Code example

import asyncio
from typing import Any

class ActionModule:
    """Action module."""

    def __init__(self):
        # Map each tool name to the coroutine that implements it
        self.tool_handlers = {
            "search": self._search,
            "calculator": self._calculate,
            "weather": self._get_weather
        }

    async def execute(self, decision: Decision) -> Any:
        """Execute a decision."""

        if decision.action_type == ActionType.REPLY:
            return {"type": "reply", "content": decision.content}

        elif decision.action_type == ActionType.CALL_TOOL:
            result = await self._call_tool(
                decision.tool_name,
                decision.tool_args
            )
            return {"type": "tool_result", "content": result}

        elif decision.action_type == ActionType.ASK_CLARIFY:
            return {"type": "clarify", "content": decision.content}

        return {"type": "unknown"}

    async def _call_tool(self, tool_name: str, args: dict) -> Any:
        """Call a tool by name."""
        handler = self.tool_handlers.get(tool_name)
        if handler:
            # tool_args may be None when the LLM omits it
            return await handler(args or {})
        return f"Error: unknown tool: {tool_name}"

    async def _search(self, args: dict) -> str:
        """Search tool."""
        query = args.get("query", "")
        # A real implementation would call a search API here
        return f"Search results: {query}"

    async def _calculate(self, args: dict) -> float:
        """Calculator tool."""
        expression = args.get("expression", "0")
        # WARNING: eval runs arbitrary Python code and is used here for the demo only.
        # Never pass untrusted or LLM-generated text to eval in a real application.
        return eval(expression)

    async def _get_weather(self, args: dict) -> str:
        """Weather tool."""
        city = args.get("city", "Beijing")
        # A real implementation would call a weather API here
        return f"{city}: sunny, 25°C"

# Usage example
async def main():
    action_module = ActionModule()
    decision = Decision(
        action_type=ActionType.CALL_TOOL,
        content=None,
        tool_name="weather",
        tool_args={"city": "Shanghai"},
        reasoning=None
    )
    result = await action_module.execute(decision)
    print(result)  # prints: {'type': 'tool_result', 'content': 'Shanghai: sunny, 25°C'}

# asyncio.run(main())
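
The `_calculate` handler above relies on `eval`, which will execute any Python expression it is given. One safer alternative is to parse the expression with the standard `ast` module and evaluate only the node types you explicitly allow. The sketch below is a minimal version of that idea; `safe_calculate` is a hypothetical name, and the choice to support only `+ - * /` on numeric literals is an assumption made to keep the example short.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_calculate(expression: str) -> float:
    """Evaluate + - * / arithmetic on numeric literals; reject everything else."""

    def _eval(node: ast.AST):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return float(_eval(ast.parse(expression, mode="eval")))

print(safe_calculate("2 * (3 + 4) - 1"))  # 13.0

try:
    safe_calculate("__import__('os').system('ls')")
except ValueError as exc:
    print("rejected:", exc)
```

Exponentiation is left out on purpose, since an expression such as `9 ** 9 ** 9` can consume a great deal of time and memory. Malformed input raises `SyntaxError` from `ast.parse`, and division by zero still raises `ZeroDivisionError`, so a caller would want to catch those as well.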

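4. Feedback

Feedback is a key part of how an agent learns. With feedback, the agent can:

Update its internal state
Adjust its decision strategy
Accumulate experience and knowledge

Feedback types

Type	Source	Effect
Explicit feedback	User ratings, corrections	Direct improvement
Implicit feedback	User behavior, dwell time	Indirect optimization
Environment feedback	Tool execution results	State update
Self-feedback	Reflection mechanism	Self-improvement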

Code example

@dataclass
class Feedback:
    """Feedback data structure."""
    type: str           # explicit, implicit, environment, self (or "none")
    content: Any        # feedback content
    score: Optional[float] = None  # score (optional)

class FeedbackModule:
    """Feedback module."""

    def __init__(self):
        self.feedback_history = []

    def process(self, action_result: dict, user_response: Optional[str] = None) -> Feedback:
        """Turn an action result into feedback."""

        # Environment feedback: did the tool call succeed or fail?
        if action_result["type"] == "tool_result":
            content = action_result["content"]
            feedback = Feedback(
                type="environment",
                content=content,
                # Crude heuristic: any result mentioning "error" counts as a failure
                score=1.0 if "error" not in str(content).lower() else 0.0
            )

        # Explicit feedback: the user's evaluation
        elif user_response:
            score = self._extract_score(user_response)
            feedback = Feedback(
                type="explicit",
                content=user_response,
                score=score
            )

        else:
            feedback = Feedback(type="none", content=None)

        self.feedback_history.append(feedback)
        return feedback

    def _extract_score(self, response: str) -> float:
        """Extract a score from the user's reply."""
        # Naive keyword sentiment check (a real application could use a model)
        positive_words = ["good", "thanks", "great", "perfect"]
        negative_words = ["not good", "wrong", "doesn't work", "bad"]
        text = response.lower()

        # Check negative phrases first: "not good" also contains "good"
        for word in negative_words:
            if word in text:
                return 0.0
        for word in positive_words:
            if word in text:
                return 1.0
        return 0.5  # neutral
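
The `feedback_history` list above is appended to but never read. One possible way to use it for "adjusting the decision strategy" is to track an average score per tool and flag tools that keep failing. This is a sketch under assumptions: `tool_success_rates`, the `(tool_name, score)` record format, and the 0.8 threshold are all hypothetical choices for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def tool_success_rates(history: List[Tuple[str, float]]) -> Dict[str, float]:
    """Average feedback score per tool from (tool_name, score) records."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for tool, score in history:
        totals[tool] += score
        counts[tool] += 1
    return {tool: totals[tool] / counts[tool] for tool in totals}

# Hypothetical records collected from environment feedback
history = [
    ("weather", 1.0),
    ("weather", 0.0),
    ("search", 1.0),
    ("weather", 1.0),
]

rates = tool_success_rates(history)
print(rates)

# Tools below the (arbitrary) threshold could be deprioritized or
# mentioned in the prompt so the model is warned about them.
flagged = [tool for tool, rate in rates.items() if rate < 0.8]
print(flagged)  # ['weather']
```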

Complete Agent Loop Implementation

import asyncio
from typing import Optional

class AgentLoop:
    """The complete agent loop."""

    def __init__(self, llm_client):
        self.perception = PerceptionModule()
        self.decision = DecisionModule(llm_client)
        self.action = ActionModule()
        self.feedback = FeedbackModule()

        self.context = {}  # internal context
        self.running = False

    async def run(self, initial_input: dict):
        """Run the agent loop."""
        self.running = True
        current_input = initial_input

        while self.running:
            # 1. Perception
            p = self.perception.perceive(current_input)
            print(f"[perception] {p.content}")

            # 2. Decision
            d = self.decision.decide(p, self.context)
            print(f"[decision] {d.action_type.value}")

            # 3. Action
            result = await self.action.execute(d)
            print(f"[action] {result}")

            # 4. Feedback
            f = self.feedback.process(result)
            print(f"[feedback] score={f.score}")

            # 5. Update the context
            self._update_context(p, d, result, f)

            # 6. Decide whether to continue
            if self._should_stop(d, result):
                break

            # A real application would wait here (possibly asynchronously)
            # for the next input and assign it to current_input.
            break  # single-pass example

        self.running = False

    def _update_context(self, perception, decision, result, feedback):
        """Update the internal context."""
        self.context["last_perception"] = perception
        self.context["last_decision"] = decision
        self.context["last_result"] = result
        self.context["last_feedback"] = feedback

    def _should_stop(self, decision, result) -> bool:
        """Decide whether the loop should stop."""
        # A direct reply is treated as the final answer, so stop the loop
        return decision.action_type == ActionType.REPLY

# Usage example
async def main():
    # agent = AgentLoop(llm_client=openai_client)
    # await agent.run({
    #     "type": "user_message",
    #     "content": "What's the weather like in Beijing today?"
    # })
    pass

# asyncio.run(main())
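
The `run` method above exits after a single pass. In a multi-step agent, the tool result is usually fed back in as the next input until the model produces a final reply, with a step limit as a safety net. The sketch below shows that shape in isolation. It is self-contained and does not use the classes above: `stub_decide`, `stub_tool`, the `"tool:"` prefix convention, and `max_steps` are hypothetical stand-ins for the LLM call and a real tool.

```python
import asyncio
from typing import List, Tuple

def stub_decide(observation: str) -> Tuple[str, str]:
    """Stand-in for the LLM: call a tool first, reply once a tool result is seen."""
    if observation.startswith("tool:"):
        return ("reply", f"Answer based on {observation[5:]}")
    return ("call_tool", observation)

async def stub_tool(query: str) -> str:
    """Stand-in for a real tool call."""
    await asyncio.sleep(0)  # placeholder for real I/O
    return f"tool:result for '{query}'"

async def run_multi_step(first_input: str, max_steps: int = 5) -> List[Tuple[str, str]]:
    """Loop until a final reply or until max_steps is reached."""
    trace: List[Tuple[str, str]] = []
    observation = first_input
    for _ in range(max_steps):
        kind, payload = stub_decide(observation)
        trace.append((kind, payload))
        if kind == "reply":      # final answer: stop the loop
            break
        # Feed the tool result back in as the next observation
        observation = await stub_tool(payload)
    return trace

trace = asyncio.run(run_multi_step("weather in Beijing"))
for step in trace:
    print(step)
```

The step limit matters in practice: with the mock `_call_llm` shown earlier, which always asks for a tool call, an unbounded loop would never terminate.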

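Summary

The Agent Loop is the agent's core working loop
Perception gathers information from the environment, decision chooses an action, action carries it out, and feedback drives learning
In an LLM agent, decisions are usually made by the large language model, and actions include tool calls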

Next article: LLM as the Agent's Brain - a closer look at how large language models drive agents
