Add Bedrock embedding support via InvokeModel API #677
Implements embedding support for AWS Bedrock using the InvokeModel API, enabling `RubyLLM.embed` to work natively with Bedrock embedding models such as Amazon Titan Text Embeddings V2.

- Add `Bedrock::Embeddings` module with `embedding_url`, `render_embedding_payload`, and `parse_embedding_response`
- Override `embed` in the Bedrock provider to handle SigV4 request signing and per-text invocation (InvokeModel accepts one input at a time)
- Add `amazon.titan-embed-text-v2:0` to the `EMBEDDING_MODELS` test list
- Skip custom dimension tests for Bedrock (Titan V2 only supports 256, 512, or 1024)
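For readers unfamiliar with the InvokeModel request shape, here is a minimal sketch of what the two request-side helpers might look like. The module name `BedrockEmbeddingsSketch` and the method signatures are assumptions for illustration and are not taken from the PR's diff; only the `inputText` and `dimensions` fields and the `/model/{modelId}/invoke` path come from the Bedrock API itself.

```ruby
require 'json'

# Hypothetical stand-in for the PR's Bedrock::Embeddings module.
# Names and signatures are assumptions, not the merged code.
module BedrockEmbeddingsSketch
  module_function

  # InvokeModel is addressed per model: POST /model/{modelId}/invoke.
  # URL-encoding of the model id is omitted in this sketch.
  def embedding_url(model:)
    "model/#{model}/invoke"
  end

  # Titan Text Embeddings V2 takes a single inputText per request.
  # `dimensions` is optional (256, 512, or 1024 per the commit message).
  def render_embedding_payload(text, dimensions: nil)
    payload = { inputText: text }
    payload[:dimensions] = dimensions if dimensions
    payload
  end
end

url = BedrockEmbeddingsSketch.embedding_url(model: 'amazon.titan-embed-text-v2:0')
body = JSON.generate(
  BedrockEmbeddingsSketch.render_embedding_payload('hello', dimensions: 512)
)
```

The payload carries exactly one text, which is why the provider-level `embed` override has to loop rather than batch.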
|
Rubocop failure was upstream from this - I suspect any other PRs that update against main will start failing |
|
We would definitely use this if available, thanks for working on it! |
|
@crmne there's appetite for this. can we get it in? |
|
Ya, I've had a fork of RubyLLM with these same updates for the past 2 months. Would be lovely to see it in main. I was just lazy and never created a PR. Happy that @cgmoore120 was better contributor than me. 😄 |
|
@crmne Any chance this could be merged? |
|
@crmne any chance this could be merged soon? |
…ng-support # Conflicts: # lib/ruby_llm/providers/bedrock.rb
Codecov Report❌ Patch coverage is
Additional details and impacted files

Coverage Diff

| | main | #677 | +/- |
|---|---|---|---|
| Coverage | 82.88% | 82.90% | +0.02% |
| Files | 142 | 143 | +1 |
| Lines | 6574 | 6600 | +26 |
| Branches | 1148 | 1150 | +2 |
| Hits | 5449 | 5472 | +23 |
| Misses | 663 | 664 | +1 |
| Partials | 462 | 464 | +2 |
|
The Bedrock embedding integration specs were failing in CI with VCR::Errors::UnhandledHTTPRequestError because no cassettes were ever recorded for amazon.titan-embed-text-v2:0. Record them so the embedding suite passes without live AWS credentials.
2b6c9c4 to 99d21f4
Bedrock overrides embed at the provider level (it needs per-request SigV4 signing and cannot batch, so it issues one InvokeModel request per text and parses the responses inline). That means the base parse_embedding_response hook is never called, leaving it as dead code. Drop it; embedding_url and render_embedding_payload remain and are used.
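The per-text loop with inline parsing described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `embed_each` is a hypothetical helper, and the `invoke` lambda stands in for one SigV4-signed InvokeModel POST. The `embedding` and `inputTextTokenCount` response fields are the ones Titan embedding models return.

```ruby
require 'json'

# Hypothetical sketch: `invoke` stands in for one signed InvokeModel POST
# and returns the raw JSON response body. It is not RubyLLM's API.
def embed_each(text, invoke:)
  texts = Array(text)
  vectors = texts.map do |t|
    body = JSON.parse(invoke.call(JSON.generate({ inputText: t })))
    body.fetch('embedding') # parsed inline, no separate parse hook needed
  end
  # Mirror the input shape: one string in, one vector out.
  text.is_a?(Array) ? vectors : vectors.first
end

# Fake transport that records calls and returns a Titan-shaped response.
calls = []
fake_invoke = lambda do |payload|
  calls << payload
  JSON.generate({ 'embedding' => [0.1, 0.2], 'inputTextTokenCount' => 1 })
end

single = embed_each('a', invoke: fake_invoke)
batch = embed_each(%w[b c], invoke: fake_invoke)
```

Because each response is unpacked inside the loop, a separate response-parsing hook has nothing left to do, which matches the reasoning for dropping it.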
b3836a2 to 9ee2a63
The `.logger` spec reset `RubyLLM.@config`/`@logger` to nil in its `after` hook instead of restoring the originals. Because the AR integration sets `config.model_registry_source` only once (in ActsAs.included at load time), a nil-ed config is rebuilt without that source, so any later spec in the same parallel test-queue worker that relies on the database-backed model registry (acts_as_model registry integration) silently falls back to the JSON registry and fails. Use an around hook that saves and restores the original config/logger so the global state is left untouched. This is order-dependent and only surfaced intermittently under the parallel queue.
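The save-and-restore idea can be shown without RSpec. In this plain-Ruby sketch, `FakeLLM` and `preserving_ivars` are hypothetical names invented for illustration; `FakeLLM` only imitates a library that memoizes global config in a module-level instance variable.

```ruby
# Hypothetical stand-in for a library with memoized global state.
module FakeLLM
  class << self
    def config
      @config ||= { registry: :json } # rebuilt with defaults when nil
    end
  end
end

# Save the named ivars, run the block, then put the originals back,
# even if the block raises.
def preserving_ivars(target, *names)
  saved = names.to_h { |n| [n, target.instance_variable_get(n)] }
  yield
ensure
  saved&.each { |n, v| target.instance_variable_set(n, v) }
end

FakeLLM.config[:registry] = :database # configured once, at load time
original = FakeLLM.config

preserving_ivars(FakeLLM, :@config) do
  FakeLLM.instance_variable_set(:@config, nil) # what the spec body does
  FakeLLM.config                               # rebuilt, registry is :json here
end

restored = FakeLLM.config.equal?(original)
```

In an RSpec suite the same helper would presumably be called from an `around` hook that wraps `example.run`, so later examples in the same worker still see the load-time configuration.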
|
This also rolls in #815 to address some test flakiness that was encountered trying to get this branch back to green |
|
Woohoo! Does that mean it's close to being merged? |
we can hope |
Summary
- Implements embedding support for AWS Bedrock using the InvokeModel API, enabling `RubyLLM.embed` to work natively with Bedrock embedding models (e.g. Amazon Titan Text Embeddings V2)
- Adds a `Bedrock::Embeddings` module following the same pattern as other providers (`embedding_url`, `render_embedding_payload`, `parse_embedding_response`)
- Overrides `embed` on the Bedrock provider to handle SigV4 request signing and per-text invocation, since InvokeModel accepts one input at a time
- Adds `amazon.titan-embed-text-v2:0` to the embedding test matrix

Context
This was previously attempted in #393 (model registry only), which was closed with guidance to combine registry + provider support in a single PR. This PR does that.
Bedrock embedding models are already discovered and registered by the existing `Bedrock::Models` module (with the `embedding` → `embeddings` modality normalization), but the provider had no `embed` implementation: calling `RubyLLM.embed` with a Bedrock model would hit the base `Provider#embed`, which lacks SigV4 signing.

Design decisions
- InvokeModel accepts a single `inputText` per request. The override maps over an array of texts, making an individual signed request for each. This matches the API contract.
- `stream_response`: signs headers inline against `@connection.post` rather than going through `signed_post` (which is coupled to chat completion via `api_payload`/`parse_completion_response`).
- The `768` dimension tests are skipped for Bedrock, consistent with how Mistral and Azure handle unsupported dimension values.

Test plan
- `RubyLLM.embed("text", model: "amazon.titan-embed-text-v2:0", provider: :bedrock)` returns valid embedding vectors