-
Notifications
You must be signed in to change notification settings - Fork 602
feature: inference profiles for bedrock #1118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: inference profiles for bedrock #1118
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR adds support for AWS Bedrock inference profiles, allowing the system to work with inference profile ARNs by extracting the underlying foundation model. The implementation is generally good with proper caching and error handling, but I've identified a few improvements that could enhance the code quality and reliability.
src/providers/bedrock/utils.ts
Outdated
try { | ||
const getFromCacheByKey = c.get('getFromCacheByKey'); | ||
const putInCacheWithValue = c.get('putInCacheWithValue'); | ||
const cacheKey = `bedrock-inference-profile-${inferenceProfileIdentifier}`; | ||
const cachedFoundationModel = getFromCacheByKey | ||
? await getFromCacheByKey(env(c), cacheKey) | ||
: null; | ||
if (cachedFoundationModel) { | ||
//update ttl, dont't await the result | ||
putInCacheWithValue(env(c), cacheKey, cachedFoundationModel, 56400); | ||
return cachedFoundationModel; | ||
} | ||
|
||
const inferenceProfile = await getInferenceProfile( | ||
inferenceProfileIdentifier || '', | ||
providerOptions.awsRegion || '', | ||
providerOptions.awsAccessKeyId || '', | ||
providerOptions.awsSecretAccessKey || '', | ||
providerOptions.awsSessionToken || '' | ||
); | ||
|
||
// modelArn is always like arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2:1 | ||
const foundationModel = inferenceProfile?.models?.[0]?.modelArn | ||
?.split('/') | ||
?.pop(); | ||
putInCacheWithValue(env(c), cacheKey, foundationModel, 56400); | ||
return foundationModel; | ||
} catch (error) { | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Code Refactor
Issue: The error handling in getFoundationModelFromInferenceProfile silently returns null for any error, which could hide important issues.
Fix: Add more specific error handling and logging to help with debugging.
Impact: Improves troubleshooting and error visibility when inference profile resolution fails.
try { | |
const getFromCacheByKey = c.get('getFromCacheByKey'); | |
const putInCacheWithValue = c.get('putInCacheWithValue'); | |
const cacheKey = `bedrock-inference-profile-${inferenceProfileIdentifier}`; | |
const cachedFoundationModel = getFromCacheByKey | |
? await getFromCacheByKey(env(c), cacheKey) | |
: null; | |
if (cachedFoundationModel) { | |
//update ttl, dont't await the result | |
putInCacheWithValue(env(c), cacheKey, cachedFoundationModel, 56400); | |
return cachedFoundationModel; | |
} | |
const inferenceProfile = await getInferenceProfile( | |
inferenceProfileIdentifier || '', | |
providerOptions.awsRegion || '', | |
providerOptions.awsAccessKeyId || '', | |
providerOptions.awsSecretAccessKey || '', | |
providerOptions.awsSessionToken || '' | |
); | |
// modelArn is always like arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2:1 | |
const foundationModel = inferenceProfile?.models?.[0]?.modelArn | |
?.split('/') | |
?.pop(); | |
putInCacheWithValue(env(c), cacheKey, foundationModel, 56400); | |
return foundationModel; | |
} catch (error) { | |
return null; | |
} | |
try { | |
const getFromCacheByKey = c.get('getFromCacheByKey'); | |
const putInCacheWithValue = c.get('putInCacheWithValue'); | |
const cacheKey = `bedrock-inference-profile-${inferenceProfileIdentifier}`; | |
const cachedFoundationModel = getFromCacheByKey | |
? await getFromCacheByKey(env(c), cacheKey) | |
: null; | |
if (cachedFoundationModel) { | |
//update ttl, dont't await the result | |
putInCacheWithValue(env(c), cacheKey, cachedFoundationModel, 56400); | |
return cachedFoundationModel; | |
} | |
const inferenceProfile = await getInferenceProfile( | |
inferenceProfileIdentifier || '', | |
providerOptions.awsRegion || '', | |
providerOptions.awsAccessKeyId || '', | |
providerOptions.awsSecretAccessKey || '', | |
providerOptions.awsSessionToken || '' | |
); | |
// modelArn is always like arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2:1 | |
const foundationModel = inferenceProfile?.models?.[0]?.modelArn | |
?.split('/') | |
?.pop(); | |
if (!foundationModel) { | |
console.warn(`No foundation model found in inference profile: ${inferenceProfileIdentifier}`); | |
return null; | |
} | |
putInCacheWithValue(env(c), cacheKey, foundationModel, 56400); | |
return foundationModel; | |
} catch (error) { | |
console.error(`Error resolving foundation model from inference profile ${inferenceProfileIdentifier}:`, error); | |
return null; | |
} |
const foundationModel = model.includes('foundation-model/') | ||
? model.split('/').pop() | ||
: await getFoundationModelFromInferenceProfile( | ||
c, | ||
model, | ||
providerOptions | ||
); | ||
if (foundationModel) { | ||
params.foundationModel = foundationModel; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Code Refactor
Issue: The code doesn't handle the case where foundationModel extraction fails but still attempts to use it.
Fix: Add a check to ensure foundationModel is defined before setting it in params.
Impact: Prevents potential undefined values from being used in the model parameter.
const foundationModel = model.includes('foundation-model/') | |
? model.split('/').pop() | |
: await getFoundationModelFromInferenceProfile( | |
c, | |
model, | |
providerOptions | |
); | |
if (foundationModel) { | |
params.foundationModel = foundationModel; | |
} | |
const foundationModel = model.includes('foundation-model/') | |
? model.split('/').pop() | |
: await getFoundationModelFromInferenceProfile( | |
c, | |
model, | |
providerOptions | |
); | |
if (foundationModel && foundationModel.length > 0) { | |
params.foundationModel = foundationModel; | |
} |
export const getInferenceProfile = async ( | ||
inferenceProfileIdentifier: string, | ||
awsRegion: string, | ||
awsAccessKeyId: string, | ||
awsSecretAccessKey: string, | ||
awsSessionToken?: string | ||
) => { | ||
const url = `https://bedrock.${awsRegion}.amazonaws.com/inference-profiles/${encodeURIComponent(decodeURIComponent(inferenceProfileIdentifier))}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔒 Security Issue Fix
Issue: The getInferenceProfile function doesn't validate the inferenceProfileIdentifier before using it in the URL, which could potentially lead to URL manipulation issues.
Fix: Add validation to ensure the inferenceProfileIdentifier is a valid ARN format before using it.
Impact: Prevents potential security issues related to URL manipulation.
export const getInferenceProfile = async ( | |
inferenceProfileIdentifier: string, | |
awsRegion: string, | |
awsAccessKeyId: string, | |
awsSecretAccessKey: string, | |
awsSessionToken?: string | |
) => { | |
const url = `https://bedrock.${awsRegion}.amazonaws.com/inference-profiles/${encodeURIComponent(decodeURIComponent(inferenceProfileIdentifier))}`; | |
export const getInferenceProfile = async ( | |
inferenceProfileIdentifier: string, | |
awsRegion: string, | |
awsAccessKeyId: string, | |
awsSecretAccessKey: string, | |
awsSessionToken?: string | |
) => { | |
if (!inferenceProfileIdentifier || !inferenceProfileIdentifier.startsWith('arn:aws')) { | |
throw new Error('Invalid inference profile identifier format'); | |
} | |
const url = `https://bedrock.${awsRegion}.amazonaws.com/inference-profiles/${encodeURIComponent(decodeURIComponent(inferenceProfileIdentifier))}`; |
Important PR Review SkippedPR review skipped as per the configuration setting. Run a manually review by commenting /matter review 💡Tips to use Matter AICommand List
|
remove redundant cache write and check if cache function is available before invoking it
Important PR Review SkippedPR review skipped as per the configuration setting. Run a manually review by commenting /matter review 💡Tips to use Matter AICommand List
|
Important PR Review SkippedPR review skipped as per the configuration setting. Run a manually review by commenting /matter review 💡Tips to use Matter AICommand List
|
Important PR Review SkippedPR review skipped as per the configuration setting. Run a manually review by commenting /matter review 💡Tips to use Matter AICommand List
|
more on bedrock inference profiles:
https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles.html
Testing done:
arn:aws:bedrock:us-east-1:517194595696:application-inference-profile/s529qz7ddy06
(both URI encoded and as a regular string), verified that cost calculation is working as intendedanthropic.claude-3-haiku-20240307-v1:0
Guide
Note
inference profiles are immutable
arn:aws:bedrock:us-east-1:517194595696:application-inference-profile/s529qz7ddy06
)snippets for testing