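Tutorial: Fine-tuning Agents

Let's walk through a quick example of optimizing the language model weights (i.e., fine-tuning) inside a DSPy module that represents a ReAct agent. The agent plays a game whose tasks take up to 50 steps.

Install dependencies and download data

Install the latest DSPy via pip install -U --pre dspy and follow along. This tutorial uses the AlfWorld dataset, which depends on DSPy 2.6.0 (pre-release).

You will also need the following dependencies: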

> pip install -U alfworld==0.3.5 multiprocess
> alfworld-download

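Recommended: Set up MLflow Tracing to understand what's happening under the hood

MLflow DSPy Integration

MLflow is an LLMOps tool that integrates natively with DSPy and offers explainability and experiment tracking. In this tutorial, you can use MLflow to visualize prompts and optimization progress as traces, which makes DSPy's behavior easier to understand. Setting it up takes the four steps below.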

Install MLflow

%pip install "mlflow>=2.20"

Start the MLflow UI in a separate terminal

mlflow ui --port 5000

Connect the notebook to MLflow

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("DSPy")

Enable tracing.

mlflow.dspy.autolog()

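To learn more about the integration, see the MLflow DSPy documentation.

Set up the language models

Our goal is to enable gpt-4o-mini to play the AlfWorld household game proficiently, without hand-editing string prompts or example trajectories.

Though it is not strictly necessary, we will make our job a little easier by using the larger gpt-4o for prompt optimization and fine-tuning, and then building our small gpt-4o-mini agent from it.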

In [1]

import dspy

gpt4o_mini = dspy.LM('gpt-4o-mini-2024-07-18')
gpt4o = dspy.LM('openai/gpt-4o')
dspy.configure(experimental=True)

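Let's load 200 training and 200 development tasks from AlfWorld. The dataset is much larger, but a small number of examples keeps this tutorial's run time to 1-2 hours, including fine-tuning.

With just 100 training tasks, we'll teach 4o-mini to go from about 15% (it can barely play the game) to about 72%. If you use 500 tasks and retain the demonstrations during fine-tuning, you can easily push that to 82%.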

In [2]

from dspy.datasets.alfworld import AlfWorld

alfworld = AlfWorld()
trainset, devset = alfworld.trainset[:200], alfworld.devset[-200:]
len(trainset), len(devset)

Out[2]

(200, 200)

Before we proceed, let's look at an example of this task.

In [3]

example = trainset[0]

with alfworld.POOL.session() as env:
    task, info = env.init(**example.inputs())

print(task)

-= Welcome to TextWorld, ALFRED! =-

You are in the middle of a room. Looking quickly around you, you see a countertop 1, a drawer 8, a drawer 7, a drawer 6, a drawer 5, a drawer 4, a drawer 3, a drawer 2, a drawer 1, a garbagecan 1, a handtowelholder 1, a sinkbasin 2, a sinkbasin 1, a toilet 1, a toiletpaperhanger 1, and a towelholder 1.

Your task is to: put a clean soapbar in garbagecan.

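Defining the Agent program

The agent is a fairly simple dspy.Module with a single sub-module called self.react.

This sub-module takes the definition of a specific task, its previous trajectory, and the list of possible_actions it can take. It responds with just the next action.

In the forward method, we initialize an environment for the given task idx. Then we loop up to self.max_iters times, repeatedly calling the self.react module to take the next action.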
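
The loop described above can be tried without any language model or AlfWorld install. The following is a minimal sketch under stated assumptions: ToyEnv, run_episode, and scripted_policy are hypothetical stand-ins invented for illustration (they are not part of DSPy or AlfWorld), and the toy environment accepts exactly one winning action. It shows the one subtle point of the loop, namely that "think:" actions are recorded in the trajectory but never reach the environment.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical stand-in for an AlfWorld session: the task is won once the goal action is taken."""
    goal_action: str = "move soapbar 1 to garbagecan 1"
    steps_taken: int = 0

    def init(self):
        return "Your task is to: put a soapbar in garbagecan."

    def step(self, action):
        self.steps_taken += 1
        done = action == self.goal_action
        obs = "You move the soapbar 1 to the garbagecan 1." if done else "Nothing happens."
        return obs, int(done), done


def run_episode(policy, env, max_iters=50):
    """Ask the policy for an action, record it, and step the environment until done."""
    task = env.init()
    trajectory, reward = [], 0
    for _ in range(max_iters):
        action = policy(task, "\n".join(trajectory))
        trajectory.append(f"> {action}")
        if action.startswith("think:"):
            # Thoughts are logged but never sent to the environment.
            trajectory.append("OK.")
            continue
        obs, reward, done = env.step(action)
        trajectory.append(obs)
        if done:
            break
    return trajectory, reward


def scripted_policy(task, trajectory_text):
    """Hypothetical policy that replays a fixed plan, one action per call."""
    plan = ["think: I need the soapbar first.", "look", "move soapbar 1 to garbagecan 1"]
    already_taken = trajectory_text.count("> ")
    return plan[min(already_taken, len(plan) - 1)]


env = ToyEnv()
trajectory, reward = run_episode(scripted_policy, env)
print("\n".join(trajectory))
print("success:", reward, "| environment steps:", env.steps_taken)
```

The policy takes three actions, but the environment only sees two of them, because the thought is answered with "OK." locally. Thoughts still count against max_iters, since each one consumes a loop iteration.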

In [4]

class Agent(dspy.Module):
    def __init__(self, max_iters=50, verbose=False):
        super().__init__()
        self.max_iters = max_iters
        self.verbose = verbose
        # One predictor maps the task, the trajectory so far, and the allowed actions to the next action.
        self.react = dspy.Predict("task, trajectory, possible_actions: list[str] -> action")

    def forward(self, idx):
        with alfworld.POOL.session() as env:
            trajectory = []
            task, info = env.init(idx)
            if self.verbose:
                print(f"Task: {task}")

            for _ in range(self.max_iters):
                trajectory_ = "\n".join(trajectory)
                # Offer the environment's admissible commands plus a free-form "think" pseudo-action.
                possible_actions = info["admissible_commands"][0] + ["think: ${...thoughts...}"]
                prediction = self.react(task=task, trajectory=trajectory_, possible_actions=possible_actions)
                trajectory.append(f"> {prediction.action}")

                # Thoughts are recorded in the trajectory but never sent to the environment.
                if prediction.action.startswith("think:"):
                    trajectory.append("OK.")
                    continue

                obs, reward, done, info = env.step(prediction.action)
                # step() returns sequences here; take the first (and only) element of each.
                obs, reward, done = obs[0], reward[0], done[0]
                trajectory.append(obs)

                if self.verbose:
                    print("\n".join(trajectory[-2:]))

                if done:
                    break

        assert reward == int(info["won"][0]), (reward, info["won"][0])
        # The keyword "trajecotry" (sic) is left as-is so that it matches the recorded outputs.
        return dspy.Prediction(trajecotry=trajectory, success=reward)

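Aside: If you wanted to include instructions for your agent...

Above, we opted to keep the agent extremely simple, without even a short instruction that describes the task.

In principle, you can copy a short definition of the AlfWorld task (based on Yao et al., 2022) and use it as the instruction for your agent. This is not strictly necessary, but it helps illustrate the role instructions play in DSPy: they are not there to coerce the model into a particular behavior, but to describe the fundamentals of the task in a direct, human-readable way.

If you want to do that, simply replace this: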

self.react = dspy.Predict("task, trajectory, possible_actions: list[str] -> action")

with this:

INSTRUCTIONS = """
Interact with a simulated household to achieve a high-level goal. Make sure to plan, track subgoals,
determine likely locations for common household items (e.g. desklamps will likely be on desks, shelfs, or dressers),
and explore systematically (e.g. check all desks one by one for desklamp).
""".strip()

self.react = dspy.Predict(dspy.Signature("task, trajectory, possible_actions: list[str] -> action", INSTRUCTIONS))

Zero-shot evaluation

Now, let's try this simple program before doing any optimization work.

In [5]

agent_4o = Agent()
agent_4o.set_lm(gpt4o)
agent_4o.verbose = True

agent_4o(**example.inputs())

Task: -= Welcome to TextWorld, ALFRED! =-

You are in the middle of a room. Looking quickly around you, you see a countertop 1, a drawer 8, a drawer 7, a drawer 6, a drawer 5, a drawer 4, a drawer 3, a drawer 2, a drawer 1, a garbagecan 1, a handtowelholder 1, a sinkbasin 2, a sinkbasin 1, a toilet 1, a toiletpaperhanger 1, and a towelholder 1.

Your task is to: put a clean soapbar in garbagecan.
> go to countertop 1
You arrive at countertop 1. On the countertop 1, you see a candle 1, a soapbar 1, a soapbottle 2, a soapbottle 1, and a spraybottle 1.
> take soapbar 1 from countertop 1
You pick up the soapbar 1 from the countertop 1.
> go to garbagecan 1
You arrive at garbagecan 1. On the garbagecan 1, you see nothing.
> move soapbar 1 to garbagecan 1
You move the soapbar 1 to the garbagecan 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> take soapbar 1 from garbagecan 1
You pick up the soapbar 1 from the garbagecan 1.
> move soapbar 1 to garbagecan 1
You move the soapbar 1 to the garbagecan 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> look
You are facing the garbagecan 1. Next to it, you see nothing.
> examine garbagecan 1
On the garbagecan 1, you see a soapbar 1.
> look
You are facing the garbagecan 1. Next to it, you see nothing.

Out[5]

Prediction(
trajecotry=['> go to countertop 1', 'You arrive at countertop 1. On the countertop 1, you see a candle 1, a soapbar 1, a soapbottle 2, a soapbottle 1, and a spraybottle 1.', '> take soapbar 1 from countertop 1', 'You pick up the soapbar 1 from the countertop 1.', '> go to garbagecan 1', 'You arrive at garbagecan 1. On the garbagecan 1, you see nothing.', '> move soapbar 1 to garbagecan 1', 'You move the soapbar 1 to the garbagecan 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> take soapbar 1 from garbagecan 1', 'You pick up the soapbar 1 from the garbagecan 1.', '> move soapbar 1 to garbagecan 1', 'You move the soapbar 1 to the garbagecan 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. 
Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.', '> examine garbagecan 1', 'On the garbagecan 1, you see a soapbar 1.', '> look', 'You are facing the garbagecan 1. Next to it, you see nothing.'],
success=0
)

Okay, in this case it couldn't solve this example! Now, let's check the average quality of 4o and 4o-mini.
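
The failed trajectory above is dominated by repeated look and examine actions. As a small diagnostic aid, the sketch below counts how often each action occurs in a trajectory of the same format. The helper action_counts is a hypothetical name introduced here, and the excerpt is a shortened sample rather than the full log.

```python
from collections import Counter


def action_counts(trajectory):
    """Count how often each action (a line starting with '> ') appears in a trajectory."""
    actions = [line[2:] for line in trajectory if line.startswith("> ")]
    return Counter(actions)


# A short excerpt in the same format as the trajectory above (not the full log).
excerpt = [
    "> move soapbar 1 to garbagecan 1",
    "You move the soapbar 1 to the garbagecan 1.",
    "> examine garbagecan 1",
    "On the garbagecan 1, you see a soapbar 1.",
    "> look",
    "You are facing the garbagecan 1. Next to it, you see nothing.",
    "> look",
    "You are facing the garbagecan 1. Next to it, you see nothing.",
]

counts = action_counts(excerpt)
most_common_action, times = counts.most_common(1)[0]
print(most_common_action, times)
```

Applied to a full Prediction, the same helper could be called on the list stored in its trajectory field to spot episodes where the agent is stuck in a loop.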

In [6]

metric = lambda x, y, trace=None: y.success
evaluate = dspy.Evaluate(devset=devset, metric=metric, display_progress=True, num_threads=16)

Tracking evaluation results in an MLflow Experiment

To track and visualize evaluation results over time, you can record them in an MLflow Experiment.

import mlflow

with mlflow.start_run(run_name="agent_evaluation"):
    evaluate = dspy.Evaluate(
        devset=devset,
        metric=metric,
        num_threads=16,
        display_progress=True,
        # To record the outputs and detailed scores to MLflow
        return_all_scores=True,
        return_outputs=True,
    )

    # Evaluate the program as usual
    aggregated_score, outputs, all_scores = evaluate(agent_4o)

    # Log the aggregated score
    mlflow.log_metric("success_rate", aggregated_score)
    # Log the detailed evaluation results as a table
    mlflow.log_table(
        {
            "Idx": [example.idx for example in devset],
            "Result": outputs,
            "Success": all_scores,
        },
        artifact_file="eval_results.json",
    )

To learn more about the integration, see the MLflow DSPy documentation.

In [7]

agent_4o.verbose = False
evaluate(agent_4o)

Average Metric: 115.00 / 200 (57.5%): 100%|██████████| 200/200 [06:14<00:00,  1.87s/it]

2024/12/28 11:10:25 INFO dspy.evaluate.evaluate: Average Metric: 115 / 200 (57.5%)

Out[7]

57.5

In [8]

agent_4o_mini = Agent()
agent_4o_mini.set_lm(gpt4o_mini)

evaluate(agent_4o_mini)

Average Metric: 30.00 / 200 (15.0%): 100%|██████████| 200/200 [08:33<00:00,  2.57s/it]

2024/12/28 11:18:59 INFO dspy.evaluate.evaluate: Average Metric: 30 / 200 (15.0%)

Out[8]

15.0

Out of the box, on this task, 4o is decent (58% success rate) while 4o-mini struggles (15% success rate).
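
These percentages follow directly from the logged counts (115 of 200 and 30 of 200). The sketch below reproduces the arithmetic with stand-in prediction objects. It assumes only what the metric above relies on, namely a success attribute on each prediction; success_rate is a hypothetical helper and does not reflect how dspy.Evaluate is implemented.

```python
from types import SimpleNamespace


def metric(example, prediction, trace=None):
    """Same shape as the metric used above: read the prediction's success flag."""
    return prediction.success


def success_rate(predictions):
    """Average the metric over predictions and express it as a percentage."""
    scores = [metric(None, p) for p in predictions]
    return 100 * sum(scores) / len(scores)


# Stand-in predictions matching the logged counts: 115/200 wins for 4o, 30/200 for 4o-mini.
preds_4o = [SimpleNamespace(success=1)] * 115 + [SimpleNamespace(success=0)] * 85
preds_4o_mini = [SimpleNamespace(success=1)] * 30 + [SimpleNamespace(success=0)] * 170

print(success_rate(preds_4o))       # 57.5
print(success_rate(preds_4o_mini))  # 15.0
```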

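Let's apply the following strategy:

First, we will optimize the prompts for gpt-4o in a lightweight way.
Then we will use this prompt-optimized agent as a teacher to fine-tune gpt-4o-mini on the task. This will raise its quality from 15% to about 72% (or 82% if you use 500 trainset examples).

Prompt-optimizing GPT-4o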

In [9]

optimizer = dspy.MIPROv2(metric=metric, auto="light", num_threads=16, prompt_model=gpt4o)

config = dict(max_bootstrapped_demos=1, max_labeled_demos=0, minibatch_size=40)
optimized_4o = optimizer.compile(agent_4o, trainset=trainset, **config, requires_permission_to_run=False)

Fine-tuning GPT-4o-mini

For fine-tuning, we need a teacher program (optimized_4o above) and a student program derived from it (student_4o_mini below).
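
Conceptually, a teacher-student setup like this runs the teacher on the training tasks, keeps the episodes that the metric accepts, and turns the teacher's calls from those episodes into training examples for the student. The sketch below illustrates that idea in plain Python. It is an illustration under assumptions, not DSPy's implementation: collect_finetune_data, toy_teacher, and the prompt/completion record format are all hypothetical.

```python
def collect_finetune_data(teacher, trainset, metric):
    """Run the teacher on each example and keep only the calls from successful episodes."""
    records = []
    for example in trainset:
        calls, outcome = teacher(example)
        if not metric(example, outcome):
            continue  # failed episodes contribute no training data
        for prompt, completion in calls:
            records.append({"prompt": prompt, "completion": completion})
    return records


def toy_teacher(example):
    """Hypothetical teacher: returns its (prompt, completion) calls and an outcome dict."""
    calls = [(f"task={example['task']} step={i}", f"action-{i}") for i in range(2)]
    return calls, {"success": example["solvable"]}


toy_trainset = [
    {"task": "put soapbar in garbagecan", "solvable": 1},
    {"task": "put box on dresser", "solvable": 0},
]
data = collect_finetune_data(toy_teacher, toy_trainset, lambda ex, out: out["success"])
print(len(data))
```

Only the two calls from the successful episode survive the filter, which is why a stronger teacher tends to yield more, and cleaner, training data for the student.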

In [10]

student_4o_mini = optimized_4o.deepcopy()
student_4o_mini.set_lm(gpt4o_mini)
# student_4o_mini.react.demos = []  # you can optionally reset the demos

In [11]

optimizer = dspy.BootstrapFinetune(metric=metric, num_threads=16)
finetuned_4o_mini = optimizer.compile(student_4o_mini, teacher=optimized_4o, trainset=trainset)

Evaluate the fine-tuned GPT-4o-mini agent

In [12]

evaluate(finetuned_4o_mini)

Average Metric: 143.00 / 200 (71.5%): 100%|██████████| 200/200 [03:15<00:00,  1.05it/s]

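Having done all this optimization, let's save our program so we can use it later! This also keeps a reference to the fine-tuned model, as long as it continues to exist under the same identifier on the provider side.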

In [13]

finetuned_4o_mini.save('finetuned_4o_mini_001.pkl')

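Saving programs in an MLflow Experiment

In addition to saving the program to a local file, you can track it in MLflow for better reproducibility and collaboration.

Dependency management: MLflow automatically saves the frozen environment metadata along with the program to ensure reproducibility.
Experiment tracking: With MLflow, you can track the program's performance and cost along with the program itself.
Collaboration: You can share the program and results with your team members by sharing the MLflow experiment.

To save the program in MLflow, run the following code: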

import mlflow

# Start an MLflow Run and save the program
with mlflow.start_run(run_name="optimized"):
    model_info = mlflow.dspy.log_model(
        finetuned_4o_mini,
        artifact_path="model", # Any name to save the program in MLflow
    )

# Load the program back from MLflow
loaded = mlflow.dspy.load_model(model_info.model_uri)

To learn more about the integration, see the MLflow DSPy documentation.

Now, let's check out one task using our fine-tuned agent program!

In [14]

finetuned_4o_mini.verbose = True
finetuned_4o_mini(**devset[0].inputs())

Task: -= Welcome to TextWorld, ALFRED! =-

You are in the middle of a room. Looking quickly around you, you see a armchair 1, a cabinet 1, a drawer 21, a drawer 20, a drawer 19, a drawer 18, a drawer 17, a drawer 16, a drawer 15, a drawer 14, a drawer 13, a drawer 12, a drawer 11, a drawer 10, a drawer 9, a drawer 8, a drawer 7, a drawer 6, a drawer 5, a drawer 4, a drawer 3, a drawer 2, a drawer 1, a dresser 1, a garbagecan 1, a sidetable 5, a sidetable 4, a sidetable 3, a sidetable 2, a sidetable 1, and a sofa 1.

Your task is to: put some box on dresser.
> go to cabinet 1
You arrive at cabinet 1. On the cabinet 1, you see nothing.
> go to dresser 1
You arrive at dresser 1. On the dresser 1, you see a book 1, a newspaper 1, a remotecontrol 1, a statue 3, and a television 1.
> look
You are facing the dresser 1. Next to it, you see nothing.
> go to sidetable 1
You arrive at sidetable 1. On the sidetable 1, you see a cellphone 1, and a desklamp 1.
> go to sidetable 2
You arrive at sidetable 2. On the sidetable 2, you see a box 2.
> take box 2 from sidetable 2
You pick up the box 2 from the sidetable 2.
> go to dresser 1
You arrive at dresser 1. On the dresser 1, you see a book 1, a newspaper 1, a remotecontrol 1, a statue 3, and a television 1.
> move box 2 to dresser 1
You move the box 2 to the dresser 1.

Out[14]

Prediction(
    trajecotry=['> go to cabinet 1', 'You arrive at cabinet 1. On the cabinet 1, you see nothing.', '> go to dresser 1', 'You arrive at dresser 1. On the dresser 1, you see a book 1, a newspaper 1, a remotecontrol 1, a statue 3, and a television 1.', '> look', 'You are facing the dresser 1. Next to it, you see nothing.', '> go to sidetable 1', 'You arrive at sidetable 1. On the sidetable 1, you see a cellphone 1, and a desklamp 1.', '> go to sidetable 2', 'You arrive at sidetable 2. On the sidetable 2, you see a box 2.', '> take box 2 from sidetable 2', 'You pick up the box 2 from the sidetable 2.', '> go to dresser 1', 'You arrive at dresser 1. On the dresser 1, you see a book 1, a newspaper 1, a remotecontrol 1, a statue 3, and a television 1.', '> move box 2 to dresser 1', 'You move the box 2 to the dresser 1.'],
    success=1
)

If you want to load and use the agent program, you can do so as follows.

In [16]

loaded = Agent()
loaded.load('finetuned_4o_mini_001.pkl')