
fix thinking for gemini models #1113


Merged
merged 3 commits into from
Jun 11, 2025

Conversation

narengogi
Collaborator

@narengogi narengogi commented May 27, 2025

closes #1112

testing done:

  1. tested the Vertex and Google providers with thinking enabled and disabled, in both streaming and non-streaming modes
  2. verified that caching is working as intended while streaming

example payload:

{
    "model": "gemini-2.5-flash-preview-04-17",
    "max_tokens": 1000,
    "thinking": {
        "budget_tokens": 100,
        "type": "enabled"
    },
    "stream": false,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "if tomatina tomatino has fathered tralleliala trallela, and batatina batata is tomatinas sister, how is she related to trallelia?"
                }
            ]
        }
    ]
}
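The Anthropic-style `thinking` block in the payload above is translated into Gemini's `thinking_config` on the generation config. A minimal sketch of that mapping, following the behavior described in this PR (the helper name `transformThinkingConfig` and the `ThinkingParam` type are illustrative, not the gateway's actual identifiers):

```typescript
// Hypothetical sketch of the thinking-param mapping; per this PR,
// include_thoughts is set only when thinking is enabled and a
// budget_tokens value is supplied.
interface ThinkingParam {
  type: 'enabled' | 'disabled';
  budget_tokens?: number;
}

function transformThinkingConfig(thinking?: ThinkingParam) {
  if (thinking?.type !== 'enabled' || thinking.budget_tokens === undefined) {
    return undefined;
  }
  return {
    thinking_config: {
      include_thoughts: true,
      thinking_budget: thinking.budget_tokens,
    },
  };
}
```

With the example payload, this would produce `{ thinking_config: { include_thoughts: true, thinking_budget: 100 } }`.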

@narengogi narengogi requested a review from b4s36t4 May 27, 2025 07:49

matter-code-review bot commented May 27, 2025

Code Quality bug fix

Description

Summary by MatterAI

🔄 What Changed

This pull request refactors the handling of 'thinking' (chain-of-thought) messages for Gemini models across both Google Vertex AI and Google Generative AI providers. Key changes include:

  • Token Count: Introduced thoughtsTokenCount in usageMetadata and mapped it to completion_tokens_details.reasoning_tokens for accurate token usage reporting.
  • Content Parsing: Modified GoogleChatCompleteResponseTransform and GoogleChatCompleteStreamChunkTransform to correctly parse and structure thought and text parts from the model's response into a new content_blocks array. This array is included in the message when strictOpenAiCompliance is false, providing a more granular representation of the content.
  • Configuration: Adjusted the transformGenerationConfig function to precisely control the include_thoughts parameter based on params.thinking.type being 'enabled' and the presence of budget_tokens.
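The content-parsing change can be sketched as follows. This is an illustrative reconstruction of the behavior the summary describes, not the gateway's actual code; the helper name `buildContentBlocks` is invented:

```typescript
// Hypothetical sketch: split Gemini response parts into an OpenAI-style
// `content` string plus a `content_blocks` array, surfacing the latter
// only when strict OpenAI compliance is off.
interface GeminiPart {
  text?: string;
  thought?: boolean; // true when the part is a reasoning ("thinking") step
}

type ContentBlock =
  | { type: 'thinking'; thinking: string }
  | { type: 'text'; text: string };

function buildContentBlocks(
  parts: GeminiPart[],
  strictOpenAiCompliance: boolean
): { content: string; content_blocks?: ContentBlock[] } {
  const contentBlocks: ContentBlock[] = [];
  let content = '';
  for (const part of parts) {
    if (part.text === undefined) continue;
    if (part.thought) {
      contentBlocks.push({ type: 'thinking', thinking: part.text });
    } else {
      contentBlocks.push({ type: 'text', text: part.text });
      content += part.text; // only non-thought text reaches `content`
    }
  }
  return strictOpenAiCompliance
    ? { content }
    : { content, content_blocks: contentBlocks };
}
```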

🔍 Impact of the Change

This fix ensures that 'thinking' messages from Gemini models are correctly processed and displayed, separating them from the main content. It improves the accuracy of token usage reporting for reasoning steps and enhances the flexibility of the API response by providing structured content blocks when strict OpenAI compliance is not required. This directly addresses the issue of incorrect thinking output handling.
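The usage-reporting side of the fix amounts to mapping Gemini's `usageMetadata` fields onto the OpenAI-style `usage` object. A minimal sketch, assuming the field names from the public generateContent response (the helper name `toOpenAiUsage` is illustrative):

```typescript
// Hypothetical sketch: Gemini's thoughtsTokenCount is surfaced as
// completion_tokens_details.reasoning_tokens, as described in this PR.
interface GeminiUsageMetadata {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
  totalTokenCount?: number;
  thoughtsTokenCount?: number;
}

function toOpenAiUsage(meta: GeminiUsageMetadata) {
  return {
    prompt_tokens: meta.promptTokenCount ?? 0,
    completion_tokens: meta.candidatesTokenCount ?? 0,
    total_tokens: meta.totalTokenCount ?? 0,
    // Emit the details object only when the model actually reported
    // reasoning tokens.
    ...(meta.thoughtsTokenCount !== undefined && {
      completion_tokens_details: {
        reasoning_tokens: meta.thoughtsTokenCount,
      },
    }),
  };
}
```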

📁 Total Files Changed

4 files were changed in this pull request.

🧪 Test Added

Manual testing was performed to verify the changes:

  1. Provider Testing: Both Google Vertex AI and Google Generative AI providers were tested with the thinking feature enabled and disabled.
  2. Mode Testing: The functionality was verified in both streaming and non-streaming modes.
  3. Caching Verification: Caching behavior was confirmed to be working as intended during streaming operations.

🔒Security Vulnerabilities

No security vulnerabilities were detected in the changes.

Motivation

Closes #1112, which addresses a bug related to the incorrect handling of thinking output for Gemini models.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Testing

Screenshots (if applicable)

N/A

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Related Issues

#1112

Tip

Quality Recommendations

  1. Consider adding specific unit tests for the content_blocks generation logic, covering various combinations of thought and text parts, and the impact of strictOpenAiCompliance on the final content and content_blocks structure. This would ensure robustness for future changes.

  2. The change in VertexAnthropicChatCompleteConfig.model.transform from (params: Params) => {} to () => {} is a minor cleanup. While harmless as it returns undefined, ensure this aligns with any broader pattern for function signatures in ProviderConfig.

Sequence Diagram

sequenceDiagram
    participant Client
    participant GatewayAPI as Gateway API (/chat/completions)
    participant GoogleVertexAIProvider as Google Vertex AI Provider
    participant GoogleProvider as Google Provider
    participant GoogleVertexAIAPI as Google Vertex AI API
    participant GoogleGenerativeAIAPI as Google Generative AI API

    Client->>GatewayAPI: POST /chat/completions (params: {..., thinking: {budget_tokens, type}, ...})
    GatewayAPI->>GoogleVertexAIProvider: chatComplete(params)
    GatewayAPI->>GoogleProvider: chatComplete(params)

    GoogleVertexAIProvider->>GoogleVertexAIProvider: transformGenerationConfig(params)
    GoogleProvider->>GoogleProvider: transformGenerationConfig(params)

    alt For Google Vertex AI
        GoogleVertexAIProvider->>GoogleVertexAIAPI: generateContent(generationConfig: {thinking_config: {include_thoughts, thinking_budget}})
        GoogleVertexAIAPI-->>GoogleVertexAIProvider: response (usageMetadata: {thoughtsTokenCount}, candidates: [{content: {parts: [{text, thought}, {functionCall}]}}])
        GoogleVertexAIProvider->>GoogleVertexAIProvider: GoogleChatCompleteResponseTransform(response)
        GoogleVertexAIProvider->>GoogleVertexAIProvider: GoogleChatCompleteStreamChunkTransform(parsedChunk)
    end

    alt For Google Generative AI
        GoogleProvider->>GoogleGenerativeAIAPI: generateContent(generationConfig: {thinking_config: {include_thoughts, thinking_budget}})
        GoogleGenerativeAIAPI-->>GoogleProvider: response (usageMetadata: {thoughtsTokenCount}, candidates: [{content: {parts: [{text, thought}, {functionCall}]}}])
        GoogleProvider->>GoogleProvider: GoogleChatCompleteResponseTransform(response)
        GoogleProvider->>GoogleProvider: GoogleChatCompleteStreamChunkTransform(parsedChunk)
    end

    GoogleChatCompleteResponseTransform->>GoogleChatCompleteResponseTransform: Process content parts
    GoogleChatCompleteResponseTransform->>GoogleChatCompleteResponseTransform: Identify 'thought' and 'text' parts
    GoogleChatCompleteResponseTransform->>GoogleChatCompleteResponseTransform: Construct 'contentBlocks' array ({type: 'thinking', thinking: part.text} or {type: 'text', text: part.text})
    GoogleChatCompleteResponseTransform->>GoogleChatCompleteResponseTransform: Add 'content_blocks' to message if !strictOpenAiCompliance
    GoogleChatCompleteResponseTransform->>GoogleChatCompleteResponseTransform: Map tool_calls
    GoogleChatCompleteResponseTransform->>GoogleChatCompleteResponseTransform: Update usage with 'thoughtsTokenCount' in 'completion_tokens_details.reasoning_tokens'

    GoogleChatCompleteStreamChunkTransform->>GoogleChatCompleteStreamChunkTransform: Process content parts in stream
    GoogleChatCompleteStreamChunkTransform->>GoogleChatCompleteStreamChunkTransform: Identify 'thought' and 'text' parts
    GoogleChatCompleteStreamChunkTransform->>GoogleChatCompleteStreamChunkTransform: Construct 'contentBlocks' array ({index, delta: {thinking}} or {index, delta: {text}})
    GoogleChatCompleteStreamChunkTransform->>GoogleChatCompleteStreamChunkTransform: Add 'content_blocks' to message if !strictOpenAiCompliance
    GoogleChatCompleteStreamChunkTransform->>GoogleChatCompleteStreamChunkTransform: Update usage with 'thoughtsTokenCount' in 'completion_tokens_details.reasoning_tokens'

    GoogleVertexAIProvider-->>GatewayAPI: Transformed Response
    GoogleProvider-->>GatewayAPI: Transformed Response
    GatewayAPI-->>Client: Final API Response
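The streaming path in the diagram above can be sketched in the same spirit: each part of a chunk becomes a delta-style content block. This is an illustrative reconstruction (the function name `toDeltaBlocks` is invented, and in the real transform the `index` would likely track block position across the whole stream rather than within one chunk):

```typescript
// Hypothetical sketch: map a stream chunk's parts to
// { index, delta: { thinking } } or { index, delta: { text } } blocks.
interface StreamPart {
  text?: string;
  thought?: boolean;
}

type DeltaBlock =
  | { index: number; delta: { thinking: string } }
  | { index: number; delta: { text: string } };

function toDeltaBlocks(parts: StreamPart[]): DeltaBlock[] {
  return parts
    .filter((p): p is StreamPart & { text: string } => p.text !== undefined)
    .map((p, index) =>
      p.thought
        ? { index, delta: { thinking: p.text } }
        : { index, delta: { text: p.text } }
    );
}
```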


Important

PR Review Skipped

PR review skipped as per the configuration setting. Run a review manually by commenting /matter review

💡Tips to use Matter AI

Command List

  • /matter summary: Generate AI Summary for the PR
  • /matter review: Generate AI Reviews for the latest commit in the PR
  • /matter review-full: Generate AI Reviews for the complete PR
  • /matter release-notes: Generate AI release-notes for the PR
  • /matter : Chat with your PR with Matter AI Agent
  • /matter remember : Generate AI memories for the PR
  • /matter explain: Get an explanation of the PR
  • /matter help: Show the list of available commands and documentation
  • Need help? Join our Discord server: https://discord.gg/fJU5DvanU3

@narengogi narengogi requested a review from csgulati09 May 27, 2025 07:52
csgulati09
csgulati09 previously approved these changes May 27, 2025

@narengogi narengogi requested a review from b4s36t4 June 10, 2025 13:34

@VisargD VisargD merged commit fb39359 into Portkey-AI:main Jun 11, 2025
2 checks passed

Successfully merging this pull request may close these issues.

Vertex thinking differs from the standardized thinking response from the gateway