'response_format' field in OpenAI image creation request does not match OpenAI API spec #1910

Closed
@Ephex2

LocalAI version:
v2.11.0-aio-cpu

Environment, CPU architecture, OS, and Version:
OS: Linux myBox 5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Docker version: 26.0.0
CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @2.80 GHz

Describe the bug
response_format is incorrectly typed in LocalAI's OpenAI-compatible image requests.

The OpenAI API spec says that the response_format property should be a string:

response_format - string or null / Optional / Defaults to url

The format in which the generated images are returned. Must be one of url or b64_json. URLs are only valid for 60 minutes after the image has been generated.

However, the LocalAI repo types this field as a struct with a Type property, which is a string. This is defined in openai.go at lines 102-106 in commit 801b481.


To Reproduce

Instead of providing a body like:

{
    ...
    "response_format": "url",
    ...
}

which is supported by OpenAI, we must provide a request body of the form:

{
    ...
    "response_format": {"type": "url"},
    ...
}

This breaks image creation calls from clients that use OpenAI's request models. Example error:

curl http://localhost:8080/v1/images/generations  -H "Content-Type: application/json" -d '{
    "prompt": "A cute baby sea otter",
    "model": "stablediffusion",
    "n":1,
    "response_format": "url",
    "size": "256x256",
    "user": "go-gpt-cli"
}'
{"error":{"code":500,"message":"failed reading parameters from request:failed parsing request body: json: cannot unmarshal string into Go struct field OpenAIRequest.response_format of type schema.ChatCompletionResponseFormat","type":""}}
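The error above is the standard encoding/json type-mismatch failure. A minimal sketch that reproduces it outside LocalAI follows. The type names come from the error message and the Type property from the description above, but the struct is trimmed to two fields, and the decode helper and exact field layout are assumptions, not a copy of openai.go:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChatCompletionResponseFormat mirrors the struct shape described in this
// issue: an object with a single string "type" property.
type ChatCompletionResponseFormat struct {
	Type string `json:"type"`
}

// OpenAIRequest is a trimmed, assumed stand-in for the real request type.
// Only the fields needed to show the mismatch are included.
type OpenAIRequest struct {
	Prompt         string                       `json:"prompt"`
	ResponseFormat ChatCompletionResponseFormat `json:"response_format"`
}

// decode is a hypothetical helper that parses a JSON request body.
func decode(body string) (OpenAIRequest, error) {
	var req OpenAIRequest
	err := json.Unmarshal([]byte(body), &req)
	return req, err
}

func main() {
	// Spec-compliant body: response_format is a plain string.
	// The decoder rejects it because the field is a struct.
	_, err := decode(`{"prompt": "A cute baby sea otter", "response_format": "url"}`)
	fmt.Println("string form error:", err)

	// Object body: decodes without error, but OpenAI itself rejects this shape.
	req, err := decode(`{"prompt": "A cute baby sea otter", "response_format": {"type": "url"}}`)
	fmt.Println("object form:", req.ResponseFormat.Type, err)
}
```

The string form fails with a *json.UnmarshalTypeError, which is what LocalAI wraps in the 500 response shown above.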

As shown below, the modified response_format works locally, but the same request would not work with OpenAI:

curl http://localhost:8080/v1/images/generations  -H "Content-Type: application/json" -d '{
    "prompt": "A cute baby sea otter",
    "model": "stablediffusion",
    "n":1,
    "response_format": {"type": "url"},
    "size": "256x256",
    "user": "go-gpt-cli"
}'
{"created":1711557449,"id":"a6b91e5d-f117-462e-b2b6-d12bba52a2b5","data":[{"embedding":null,"index":0,"url":"http://localhost:8080/generated-images/b64449955181.png"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

Here is the above request failing when sent to OpenAI (model changed to dall-e-2). It succeeds once response_format is changed to a string, as shown under Expected behavior:

curl https://api.openai.com/v1/images/generations  -H "Content-Type: application/json" -H "Authorization: Bearer $SECRET" -d '{
    "prompt": "A cute baby sea otter",
    "model": "dall-e-2",
    "n":1,
    "response_format": {"type": "url"},
    "size": "256x256",
    "user": "test"
}'
{
  "error": {
    "code": null,
    "message": "{'type': 'url'} is not of type 'string' - 'response_format'",
    "param": null,
    "type": "invalid_request_error"
  }
}

Expected behavior

When calling OpenAI's API with response_format set to the string "url", the request works:

curl https://api.openai.com/v1/images/generations  -H "Content-Type: application/json" -H "Authorization: Bearer $SECRET" -d '{
    "prompt": "A cute baby sea otter",
    "model": "dall-e-2",
    "n":1,
    "response_format": "url",
    "size": "256x256",
    "user": "test"
}'
{
  "created": 1711557884,
  "data": [
    {
      "url": "https://something.blob.core.windows.net/private/...redacted..."
    }
  ]
}

Ideally, this same behavior would be achieved with LocalAI's API models.


Additional context
I don't know whether this would impact existing users of LocalAI, or the backend, but I believe the struct should be modified to match the OpenAI API specification, i.e., the ResponseFormat field in the OpenAIRequest type should be changed to type string.

Ideally, the omitempty tag would be added to the field as well.
