
Support Thinking part #1142

Status: Draft · wants to merge 42 commits into main

Conversation


@Kludex Kludex commented Mar 16, 2025

@Kludex Kludex marked this pull request as draft March 16, 2025 13:41

github-actions bot commented Mar 16, 2025

Docs Preview

commit: 2e31e26
Preview URL: https://2577b205-pydantic-ai-previews.pydantic.workers.dev

@Kludex Kludex force-pushed the support-thinking branch from 800710b to edffe70 Compare March 22, 2025 10:36
@Kludex
Member Author

Kludex commented Mar 27, 2025

I would like to support OpenAI responses API before working on this PR.

@sam-cobo

Any progress on this or any release plan?

@Kludex
Member Author

Kludex commented Apr 20, 2025

I'll release this in a few days.

@Wh1isper
Contributor

Very nice PR. I was wondering if we could split it into two parts and release the `ThinkingPart` first, so developers can implement it for their own models.

For example, deepseek-r1 uses `reasoning_content` to return the full reasoning content; this doesn't seem to be covered by the current implementation.

https://api-docs.deepseek.com/api/create-chat-completion
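As a sketch of what provider-level support for this field could look like, the DeepSeek `reasoning_content` field might be mapped to parts roughly as below. The `ThinkingPart`/`TextPart` dataclasses are simplified stand-ins for the library's message classes, and `parts_from_deepseek_message` is a hypothetical helper, not part of this PR:

```python
from dataclasses import dataclass


@dataclass
class ThinkingPart:
    content: str


@dataclass
class TextPart:
    content: str


def parts_from_deepseek_message(message: dict) -> list:
    """Map a DeepSeek chat-completion message to response parts.

    Per the DeepSeek docs, `reasoning_content` carries the model's reasoning
    and `content` carries the final answer.
    """
    parts: list = []
    reasoning = message.get('reasoning_content')
    if reasoning:
        parts.append(ThinkingPart(content=reasoning))
    answer = message.get('content')
    if answer:
        parts.append(TextPart(content=answer))
    return parts
```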

@Kludex Kludex marked this pull request as ready for review April 22, 2025 13:52
@Kludex
Member Author

Kludex commented Apr 25, 2025

I think `signature` should only be available for Anthropic. I think it's confusing to convert `id` to `signature`.

Contributor

hyperlint-ai bot commented Apr 25, 2025

PR Change Summary

Introduced support for the Thinking part, enhancing the reasoning capabilities of the model.

  • Added a new section on Thinking in the documentation
  • Closed multiple related issues regarding the Thinking feature
  • Included streaming support for the Thinking part

Added Files

  • docs/thinking.md

How can I customize these reviews?

Check out the Hyperlint AI Reviewer docs for more information on how to customize the review.

If you just want to ignore it on this PR, you can add the hyperlint-ignore label to the PR. Future changes won't trigger a Hyperlint review.

Note specifically for link checks, we only check the first 30 links in a file and we cache the results for several hours (for instance, if you just added a page, you might experience this). Our recommendation is to add hyperlint-ignore to the PR to ignore the link check for this PR.

@@ -419,6 +426,9 @@ async def _map_messages(
for item in m.parts:
    if isinstance(item, TextPart):
        content.append({'text': item.content})
    elif isinstance(item, ThinkingPart):
        # NOTE: We don't pass the thinking part to Bedrock since it raises an error.


Does this mean that models won't be able to see their previous thinking if we try to pass previous messages in?

Member Author

I think we need to use the <think> tags here.

@DouweM DouweM marked this pull request as draft April 30, 2025 20:56
@aristideubertas

@Kludex is this PR ready to be merged or is more work to be done?

@yoadsn

yoadsn commented May 21, 2025

Thank you for the hard work on this.
I realize this is WIP and super involved/complex.

But for Gemini (and other models?) you can still pay for thinking even if no "thinking parts" are in the response.
Cost calculation should therefore include those tokens, which are priced differently.
With Gemini, for example, the total token count includes the thinking tokens, so you can infer them, but this seems fragile.

For example:

# Pricing
## gemini-2.5-flash-preview-04-17
## Input $0.15/1M
## Output $0.6/1M, Thinking $3.5/1M
total_usage = result.usage()

display(total_usage)
total_cost_usd = total_usage.request_tokens * 0.15 / 1e6 + total_usage.response_tokens * 0.6 / 1e6
print(f"total_cost using request+response tokens:{total_cost_usd}")

assumed_thinking_token_amount = total_usage.total_tokens - (total_usage.request_tokens + total_usage.response_tokens)
print(f"assumed_thinking_token_amount:{assumed_thinking_token_amount}")
thinking_output_cost_usd = assumed_thinking_token_amount * 0.35 / 1e6

total_cost_usd = total_cost_usd + thinking_output_cost_usd
print(f"total_cost using request+response+thinking tokens:{total_cost_usd}")

Output:

total_cost using request+response tokens:0.00115425
assumed_thinking_token_amount:2118
total_cost using request+response+thinking tokens:0.0018955499999999998

Is it worth creating a small PR just to reflect this on the Usage "details", even just for Gemini? Or too messy to merge later? I think I can get this done quite quickly.

[`ThinkingPart`][pydantic_ai.messages.ThinkingPart] into [`TextPart`][pydantic_ai.messages.TextPart]s using the
`"<think>"` tag.

If you want to proper emit thinking parts, you'd need to use the
Contributor

@dmontagu dmontagu Jun 10, 2025


Suggested change
If you want to proper emit thinking parts, you'd need to use the
If you want to emit proper thinking parts, you'd need to use the

or

Suggested change
If you want to proper emit thinking parts, you'd need to use the
If you want to properly emit thinking parts, you'd need to use the

Unlike other providers, Anthropic sends back a signature in the thinking part. This signature
is used to verify that the thinking part was not tampered with.

To enable the thinking part, use the `anthropic_thinking` field on the
Contributor


Probably worth discussing compatibility with other models — does this mean you can't hand non-anthropic thinking parts to an anthropic model? And vice versa?


Thinking or reasoning is the process of using a model's capabilities to reason about a task.

This capability is usually not enabled by default. It depends on the model.
Contributor


Suggested change
This capability is usually not enabled by default. It depends on the model.
This capability is usually not enabled by default, and how to enable it depends on the model.

Comment on lines +264 to +265
# NOTE: There's no way to reach this part of the code, since we don't generate ThinkingPart on TestModel.
pass # pragma: no cover
Contributor


Suggested change
# NOTE: There's no way to reach this part of the code, since we don't generate ThinkingPart on TestModel.
pass  # pragma: no cover
assert False, "This should be unreachable, as we don't generate ThinkingPart on TestModel."

Comment on lines +935 to +937
from rich.pretty import pprint

pprint(chunk)
Contributor


Suggested change
from rich.pretty import pprint
pprint(chunk)

Comment on lines +19 to +29
while START_THINK_TAG in content:
    before_think, content = content.split(START_THINK_TAG, 1)
    if before_think.strip():
        parts.append(TextPart(content=before_think))
    if END_THINK_TAG in content:
        think_content, content = content.split(END_THINK_TAG, 1)
        parts.append(ThinkingPart(content=think_content))
    else:
        # We lose the `<think>` tag, but it shouldn't matter.
        parts.append(TextPart(content=content))
        content = ''
Contributor


Suggested change
while START_THINK_TAG in content:
    before_think, content = content.split(START_THINK_TAG, 1)
    if before_think.strip():
        parts.append(TextPart(content=before_think))
    if END_THINK_TAG in content:
        think_content, content = content.split(END_THINK_TAG, 1)
        parts.append(ThinkingPart(content=think_content))
    else:
        # We lose the `<think>` tag, but it shouldn't matter.
        parts.append(TextPart(content=content))
        content = ''
start_index = content.find(START_THINK_TAG)
while start_index >= 0:
    before_think, content = content[:start_index], content[start_index + len(START_THINK_TAG):]
    if before_think.strip():
        parts.append(TextPart(content=before_think))
    end_index = content.find(END_THINK_TAG)
    if end_index >= 0:
        think_content, content = content[:end_index], content[end_index + len(END_THINK_TAG):]
        parts.append(ThinkingPart(content=think_content))
    else:
        # We lose the `<think>` tag, but it shouldn't matter.
        parts.append(TextPart(content=content))
        content = ''
    start_index = content.find(START_THINK_TAG)

This does a single pass over the string 🤷‍♂️. Also, I see you check for `.strip()`; should we be applying `.strip()` to the contents of the `TextPart` / `ThinkingPart`s?
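Wrapped into a standalone function so it runs on its own (with stub part classes and trailing-text handling added; the function name is invented for illustration), the single-pass variant looks like:

```python
from dataclasses import dataclass

START_THINK_TAG = '<think>'
END_THINK_TAG = '</think>'


@dataclass
class TextPart:
    content: str


@dataclass
class ThinkingPart:
    content: str


def split_think_tags(content: str) -> list:
    """Split raw model output into text and thinking parts on <think> tags."""
    parts: list = []
    start_index = content.find(START_THINK_TAG)
    while start_index >= 0:
        before_think, content = content[:start_index], content[start_index + len(START_THINK_TAG):]
        if before_think.strip():
            parts.append(TextPart(content=before_think))
        end_index = content.find(END_THINK_TAG)
        if end_index >= 0:
            think_content, content = content[:end_index], content[end_index + len(END_THINK_TAG):]
            parts.append(ThinkingPart(content=think_content))
        else:
            # Unterminated tag: we lose the `<think>` marker, but it shouldn't matter.
            parts.append(TextPart(content=content))
            content = ''
        start_index = content.find(START_THINK_TAG)
    if content.strip():
        parts.append(TextPart(content=content))
    return parts
```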

warnings.warn(
    'PydanticAI currently does not handle redacted thinking blocks. '
    'If you have a suggestion on how we should handle them, please open an issue.',
    UserWarning,
)
Contributor


Should this really be a UserWarning? Not sure what's going on to generate these but it feels to me like we should emit something easier to selectively suppress. But maybe this is good if it brings attention to the issue if anyone is using it..
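A common pattern for making a warning selectively suppressible is a dedicated `UserWarning` subclass. A hypothetical sketch (the class and function names are invented, not from the PR):

```python
import warnings


class RedactedThinkingWarning(UserWarning):
    """Raised when a redacted thinking block is encountered and dropped."""


def handle_redacted_thinking() -> None:
    # Same message as the PR, but with a dedicated category so users can
    # filter this specific warning without silencing all UserWarnings.
    warnings.warn(
        'PydanticAI currently does not handle redacted thinking blocks. '
        'If you have a suggestion on how we should handle them, please open an issue.',
        RedactedThinkingWarning,
        stacklevel=2,
    )


# Users who don't care can then silence just this warning:
warnings.filterwarnings('ignore', category=RedactedThinkingWarning)
```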

Contributor

@dmontagu dmontagu left a comment


The one main point of feedback I have is that I think we should provide a convenient way to receive thinking parts in the responses (for the sake of observability, possibly even showing to users, etc.), but to also exclude them from subsequent requests for the sake of reducing token usage. Like, my guess is that in practice including the thinking parts doesn't usually improve subsequent chat completions (at least not by much), but does dramatically increase input token usage in longer conversations with a reasoning model. To be clear, I think there are times when you do want to include the thoughts in subsequent messages, so I don't think we should make that impossible, I just think there should be a way to configure it, probably via a kwarg to the agent or similar. (I'm happy for the default behavior to be whatever we think is best, I just think the option to disable it should exist..)

Otherwise, other than the comments I've added, the PR is looking pretty good

@amiyapatanaik

> The one main point of feedback I have is that I think we should provide a convenient way to receive thinking parts in the responses (for the sake of observability, possibly even showing to users, etc.), but to also exclude them from subsequent requests for the sake of reducing token usage. Like, my guess is that in practice including the thinking parts doesn't usually improve subsequent chat completions (at least not by much), but does dramatically increase input token usage in longer conversations with a reasoning model. To be clear, I think there are times when you do want to include the thoughts in subsequent messages, so I don't think we should make that impossible, I just think there should be a way to configure it, probably via a kwarg to the agent or similar. (I'm happy for the default behavior to be whatever we think is best, I just think the option to disable it should exist..)
>
> Otherwise, other than the comments I've added, the PR is looking pretty good

Excellent points. I do think most applications would like to access the thoughts as a separate part in the response, but would surely like to ignore them in the history, due both to cost and to the chance that a long sequence of thoughts in the chat history distracts the agent.

That being said, I'm eagerly waiting for the PR to be accepted.
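The configurable exclusion being discussed could be as simple as a history filter applied before each request. A minimal sketch with stand-in types (the real message classes, and whatever kwarg eventually exposes this, would differ):

```python
from dataclasses import dataclass


@dataclass
class TextPart:
    content: str


@dataclass
class ThinkingPart:
    content: str


@dataclass
class ModelMessage:
    parts: list


def drop_thinking_parts(history: list) -> list:
    """Return a copy of the history with ThinkingPart entries removed.

    This keeps thoughts available for observability in the original response
    while avoiding re-sending (and re-billing) them as input tokens.
    """
    return [
        ModelMessage(parts=[p for p in m.parts if not isinstance(p, ThinkingPart)])
        for m in history
    ]
```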

Development

Successfully merging this pull request may close these issues.

  • Reasoning response support
  • How to include thoughts from the Gemini model in Pydantic output?
  • Disable thinking mode in Gemini 2.5 flash
9 participants