This repository contains a sample app that demonstrates how to build voice agents using the Agents SDK and Python. The backend is a FastAPI server that exposes a WebSocket endpoint. The front-end is a Next.js app that connects to that WebSocket server.
Features:
- Multi-turn conversation handling
- Push-to-talk audio mode
- Function calling
- Streaming responses & tool calls
This app is meant to be used as a starting point to build a conversational assistant that you can customize to your needs.
Prerequisites:
- OpenAI API key
  - If you're new to the OpenAI API, sign up for an account.
  - Follow the Quickstart to retrieve your API key.
- Node.js and npm
- uv installed on your system
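Before moving on to setup, it can help to confirm that the command-line tools listed above are actually on your PATH. The following is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and the executable names `node`, `npm`, and `uv` are assumptions about a typical installation.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names are assumptions based on the prerequisites list;
# adjust them if your installation uses different names.
REQUIRED_TOOLS = ("node", "npm", "uv")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out in tests; in normal use the default `shutil.which` is enough.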
- Set the OpenAI API key:
  There are two options:
  - Set the OPENAI_API_KEY environment variable globally on your system.
  - Set the OPENAI_API_KEY environment variable in the project: create a .env file at the root of the project and add the following line (see .env.example for reference):
        OPENAI_API_KEY=<your_api_key>
- Clone the repository:
      git clone https://github.com/openai/openai-voice-agent-sdk-sample.git
      cd openai-voice-agent-sdk-sample/
- Install dependencies:
  You need to install the dependencies for both the front-end and the server. To do this, run the following in the project root:
      make sync
- Run the app:
      make serve
  The app will be available at http://localhost:3000.
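If the app fails to start, a missing API key is a common first thing to rule out. The sketch below checks both options from the API key step: the process environment first, then a .env file in the current directory. It is an independent check, not how the sample app itself loads its configuration; the function names `parse_dotenv` and `find_api_key`, the lookup order, and the simple `KEY=VALUE` parsing rules are all assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments.

    This is a deliberately small parser (hypothetical helper); it does not
    handle multi-line values, `export` prefixes, or variable expansion.
    """
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_api_key(env_path: Path = Path(".env")) -> Optional[str]:
    """Look in the environment first, then in the .env file (assumed order)."""
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key
    if env_path.is_file():
        file_values = parse_dotenv(env_path.read_text(encoding="utf-8"))
        return file_values.get("OPENAI_API_KEY") or None
    return None


if __name__ == "__main__":
    # Report only whether a key was found; never print the key itself.
    if find_api_key():
        print("OPENAI_API_KEY is set.")
    else:
        print("OPENAI_API_KEY not found in the environment or in .env.")
```

Run it from the project root so that the relative `.env` path points at the file described in the API key step.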
You are welcome to open issues or submit PRs to improve this app. However, please note that we may not review all suggestions.
This project is licensed under the MIT License. See the LICENSE file for details.