Python: Allow Kernel Functions from Prompt for image and audio content #11403

eavanvalkenburg · 2025-04-07T12:53:21Z

Motivation and Context

I noticed that although the input handling and prompt rendering already match what image and audio generation need, kernel functions created from prompts did not support those services.

This PR adds that support, with two samples. It unlocks the following scenarios:

Running text-to-speech pipelines with fixed intro/outro statements
Creating function calls for image generation with a limited scope and many fixed elements

Description

Adds a get_image_content method to the TextToImageClientBase class
Adds the option to select a TextToImage or TextToAudio client in the service selector (non-streaming only)
Adds branches for those client types in KernelFunctionFromPrompt._invoke_internal
Adds handling of the resulting output as a FunctionResult
Adds a sample for each
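
To make the branching described above more concrete, here is a minimal, self-contained sketch of the idea: render the prompt once, then dispatch on the type of the selected service and wrap the output in a result object. Every class and function in it is a hypothetical stand-in written for illustration; none of it is the Semantic Kernel code from this PR. Only the name get_image_content comes from the description above. The audio method name, the `str.format` templating, and the fake clients are assumptions.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


# Hypothetical stand-ins for content and result types (not the SK classes).
@dataclass
class ImageContent:
    uri: str


@dataclass
class AudioContent:
    data: bytes


@dataclass
class FunctionResult:
    value: object


class TextToImageClient(ABC):
    @abstractmethod
    async def get_image_content(self, prompt: str) -> ImageContent: ...


class TextToAudioClient(ABC):
    # Assumed method name, chosen to mirror get_image_content.
    @abstractmethod
    async def get_audio_content(self, prompt: str) -> AudioContent: ...


class FakeImageClient(TextToImageClient):
    async def get_image_content(self, prompt: str) -> ImageContent:
        # A real client would call an image model; this just echoes the prompt.
        return ImageContent(uri=f"fake://image?prompt={prompt}")


class FakeAudioClient(TextToAudioClient):
    async def get_audio_content(self, prompt: str) -> AudioContent:
        # A real client would return synthesized speech bytes.
        return AudioContent(data=prompt.encode("utf-8"))


async def invoke_prompt(service: object, template: str, **arguments: str) -> FunctionResult:
    """Render the prompt, then branch on the service type (non-streaming only)."""
    prompt = template.format(**arguments)
    if isinstance(service, TextToImageClient):
        return FunctionResult(value=await service.get_image_content(prompt))
    if isinstance(service, TextToAudioClient):
        return FunctionResult(value=await service.get_audio_content(prompt))
    raise TypeError(f"Unsupported service type: {type(service).__name__}")
```

Under these assumptions, a text-to-speech pipeline with a fixed intro and outro reduces to a call such as `asyncio.run(invoke_prompt(FakeAudioClient(), "Welcome. {input} Goodbye.", input="Today's news."))`, where only the `input` argument varies between invocations.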

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the SK Contribution Guidelines and the pre-submission formatting script raises no violations
All unit tests pass, and I have added new tests where possible
I didn't break anyone 😄

markwallace-microsoft · 2025-04-07T12:56:20Z

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
3418	5 💤	0 ❌	0 🔥	1m 43s ⏱️

python/semantic_kernel/connectors/ai/open_ai/services/open_ai_text_to_image_base.py

python/semantic_kernel/connectors/ai/text_to_image_client_base.py

eavanvalkenburg requested a review from a team as a code owner April 7, 2025 12:53

markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Apr 7, 2025

eavanvalkenburg force-pushed the prompt_for_image_and_audio branch from 6df5e18 to 76fbcb0 Compare April 7, 2025 14:08

moonbox3 approved these changes Apr 8, 2025

TaoChenOSU reviewed Apr 8, 2025

python/semantic_kernel/connectors/ai/open_ai/services/open_ai_text_to_image_base.py (outdated)

python/semantic_kernel/connectors/ai/text_to_image_client_base.py (outdated)

eavanvalkenburg force-pushed the prompt_for_image_and_audio branch from ee1ddf8 to 81da09e Compare April 9, 2025 07:04

TaoChenOSU approved these changes Apr 9, 2025

eavanvalkenburg force-pushed the prompt_for_image_and_audio branch from 79d0810 to 2d42291 Compare April 10, 2025 08:36

eavanvalkenburg added 5 commits April 10, 2025 14:38

allow prompts to return images or audio

04e03b3

fixed tests

8dab791

small fix in typing

084c2ba

updated methods

278a205

fix

a9d735b

eavanvalkenburg force-pushed the prompt_for_image_and_audio branch from 2d42291 to a9d735b Compare April 10, 2025 12:38

moonbox3 approved these changes Apr 10, 2025

moonbox3 added this pull request to the merge queue Apr 10, 2025

Merged via the queue into microsoft:main with commit 97fb2c3 Apr 10, 2025
28 checks passed
