Add Bedrock embedding support via InvokeModel API#677

Open
cgmoore120 wants to merge 18 commits into
crmne:mainfrom
cgmoore120:add-bedrock-embedding-support

@cgmoore120

Contributor

Summary

  • Implements embedding support for AWS Bedrock using the InvokeModel API, enabling RubyLLM.embed to work natively with Bedrock embedding models (e.g. Amazon Titan Text Embeddings V2)
  • Adds Bedrock::Embeddings module following the same pattern as other providers (embedding_url, render_embedding_payload, parse_embedding_response)
  • Overrides embed on the Bedrock provider to handle SigV4 request signing and per-text invocation, since InvokeModel accepts one input at a time
  • Adds amazon.titan-embed-text-v2:0 to the embedding test matrix
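
For readers unfamiliar with the provider pattern, the two request-building hooks named above might look roughly like the sketch below. This is not the PR's code: the module name, method signatures, and keyword arguments are assumptions made for illustration. The `/model/{modelId}/invoke` path and the `inputText` and `dimensions` body fields follow the public Bedrock InvokeModel and Titan Text Embeddings V2 documentation.

```ruby
require 'json'

# Hypothetical stand-in for Bedrock::Embeddings; signatures are assumed.
module BedrockEmbeddingsSketch
  module_function

  # InvokeModel path for a given model id. The id is interpolated as-is
  # here; a real client may need to URL-encode the ':' in the id.
  def embedding_url(model:)
    "/model/#{model}/invoke"
  end

  # Titan Text Embeddings V2 takes exactly one inputText per request.
  # `dimensions` is only sent when the caller asks for it.
  def render_embedding_payload(text, dimensions: nil)
    payload = { inputText: text }
    payload[:dimensions] = dimensions if dimensions
    payload
  end
end

puts BedrockEmbeddingsSketch.embedding_url(model: 'amazon.titan-embed-text-v2:0')
puts JSON.generate(BedrockEmbeddingsSketch.render_embedding_payload('hello', dimensions: 512))
```

The sketch stops at building the request; signing and sending it are outside its scope.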

Context

This was previously attempted in #393 (model registry only), which was closed with guidance to combine registry + provider support in a single PR. This PR does that.

Bedrock embedding models are already discovered and registered by the existing Bedrock::Models module (with the embedding → embeddings modality normalization), but the provider had no embed implementation: calling RubyLLM.embed with a Bedrock model would hit the base Provider#embed, which lacks SigV4 signing.

Design decisions

  • Per-text invocation: Unlike OpenAI/Gemini which accept batch input, Bedrock InvokeModel takes a single inputText. The override maps over an array of texts, making individual signed requests for each. This matches the API contract.
  • SigV4 signing: Follows the same pattern as stream_response — signs headers inline against @connection.post rather than going through signed_post (which is coupled to chat completion via api_payload/parse_completion_response).
  • Custom dimensions skip: Titan V2 only supports dimensions of 256, 512, or 1024, so the shared spec's 768 dimension tests are skipped for Bedrock, consistent with how Mistral and Azure handle unsupported dimension values.
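
The per-text invocation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: `embed_each` is a hypothetical helper, and the block passed to it stands in for one SigV4-signed InvokeModel request that returns the parsed response body. The `embedding` and `inputTextTokenCount` keys are taken from the Titan Text Embeddings response format.

```ruby
# Hypothetical helper: `invoke` stands in for one signed InvokeModel call
# and must return the parsed response Hash for a single text.
def embed_each(input, &invoke)
  texts = Array(input)
  vectors = texts.map { |text| invoke.call(text).fetch('embedding') }
  # Mirror the input shape: a String yields one vector, an Array yields
  # an array of vectors (even when it holds a single text).
  input.is_a?(Array) ? vectors : vectors.first
end

# Stub transport for illustration only; no network call or signing happens.
request_count = 0
fake_invoke = lambda do |text|
  request_count += 1
  { 'embedding' => [text.length.to_f, 0.0], 'inputTextTokenCount' => 1 }
end

single = embed_each('hi', &fake_invoke)
batch  = embed_each(%w[a bb ccc], &fake_invoke)

puts single.inspect
puts batch.length
puts request_count
```

One consequence of this design is that an array of N texts costs N round trips, so callers embedding large batches should expect latency to scale with input size.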

Test plan

  • Verify RubyLLM.embed("text", model: "amazon.titan-embed-text-v2:0", provider: :bedrock) returns valid embedding vectors
  • Verify batch embedding with array input returns correct number of vectors
  • Verify single-string array input returns array-wrapped result (consistency test)
  • Confirm existing provider embedding tests remain green
  • Confirm rubocop passes on all changed files

cgmoore120 and others added 4 commits March 12, 2026 10:45
Implements embedding support for AWS Bedrock using the InvokeModel API,
enabling `RubyLLM.embed` to work natively with Bedrock embedding models
such as Amazon Titan Text Embeddings V2.

- Add Bedrock::Embeddings module with embedding_url,
  render_embedding_payload, and parse_embedding_response
- Override embed in Bedrock provider to handle SigV4 request signing
  and per-text invocation (InvokeModel accepts one input at a time)
- Add amazon.titan-embed-text-v2:0 to EMBEDDING_MODELS test list
- Skip custom dimension tests for Bedrock (Titan V2 only supports
  256, 512, or 1024)
@cgmoore120

Contributor Author

The Rubocop failure comes from upstream, not from this PR. I suspect any other PR that updates against main will start failing too.

@acoffman

acoffman commented Apr 2, 2026

We would definitely use this if available, thanks for working on it!

@cgmoore120

Contributor Author

@crmne there's appetite for this. can we get it in?

@travisbell

travisbell commented Apr 6, 2026

Ya, I've had a fork of RubyLLM with these same updates for the past 2 months. Would be lovely to see it in main. I was just lazy and never created a PR. Happy that @cgmoore120 was a better contributor than me. 😄

@travisbell

@crmne Any chance this could be merged?

@sheredega303

@crmne any chance this could be merged soon?
it would save some annoying workarounds =)

…ng-support

# Conflicts:
#	lib/ruby_llm/providers/bedrock.rb
@codecov

codecov Bot commented Jun 16, 2026

Codecov Report

❌ Patch coverage is 96.15385% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.90%. Comparing base (80fe294) to head (5cb3599).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/ruby_llm/providers/bedrock/embeddings.rb | 91.66% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #677      +/-   ##
==========================================
+ Coverage   82.88%   82.90%   +0.02%     
==========================================
  Files         142      143       +1     
  Lines        6574     6600      +26     
  Branches     1148     1150       +2     
==========================================
+ Hits         5449     5472      +23     
- Misses        663      664       +1     
- Partials      462      464       +2     

The Bedrock embedding integration specs were failing in CI with
VCR::Errors::UnhandledHTTPRequestError because no cassettes were ever
recorded for amazon.titan-embed-text-v2:0. Record them so the embedding
suite passes without live AWS credentials.
cgmoore120 force-pushed the add-bedrock-embedding-support branch from 2b6c9c4 to 99d21f4 on June 16, 2026 21:30
Bedrock overrides embed at the provider level (it needs per-request
SigV4 signing and cannot batch, so it issues one InvokeModel request
per text and parses the responses inline). That means the base
parse_embedding_response hook is never called, leaving it as dead code.
Drop it; embedding_url and render_embedding_payload remain and are used.
cgmoore120 force-pushed the add-bedrock-embedding-support branch from b3836a2 to 9ee2a63 on June 16, 2026 23:31
The `.logger` spec reset `RubyLLM.@config`/`@logger` to nil in its
`after` hook instead of restoring the originals. Because the AR
integration sets `config.model_registry_source` only once (in
ActsAs.included at load time), a nil-ed config is rebuilt without that
source, so any later spec in the same parallel test-queue worker that
relies on the database-backed model registry (acts_as_model registry
integration) silently falls back to the JSON registry and fails.

Use an around hook that saves and restores the original config/logger so
the global state is left untouched. This is order-dependent and only
surfaced intermittently under the parallel queue.
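
The save-and-restore idea behind this fix can be shown without RSpec. The sketch below is illustrative only: `FakeLib` and `preserving_ivars` are hypothetical names, and the real spec operates on RubyLLM's own `@config` and `@logger`. It shows why resetting to nil loses a setting that was applied once at load time, and how restoring the original object avoids that.

```ruby
# Hypothetical stand-in for a library with lazily built global state.
module FakeLib
  class << self
    def config
      # A rebuilt default knows nothing about settings applied at load time.
      @config ||= { registry_source: :json }
    end
  end
end

# Save the named instance variables, run the block, then restore them,
# even if the block raises.
def preserving_ivars(target, *names)
  saved = names.to_h { |name| [name, target.instance_variable_get(name)] }
  yield
ensure
  saved.each { |name, value| target.instance_variable_set(name, value) }
end

FakeLib.config[:registry_source] = :database # applied once, at load time

preserving_ivars(FakeLib, :@config, :@logger) do
  FakeLib.instance_variable_set(:@config, nil) # what the spec body does
end

puts FakeLib.config[:registry_source] # the load-time setting survives
```

In an RSpec suite the same helper could plausibly be wired in as `around { |example| preserving_ivars(RubyLLM, :@config, :@logger) { example.run } }`, though the PR's actual hook may be written differently.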
@cgmoore120

Contributor Author

This also rolls in #815 to address some test flakiness that was encountered trying to get this branch back to green

@travisbell

Woohoo! Does that mean it's close to being merged?

@cgmoore120

Contributor Author

> Woohoo! Does that mean it's close to being merged?

we can hope
