feat(services): add S3CompatibleImageFileStorage (first cloud impl of ImageFileStorageBase) #9182
Open
goanpeca wants to merge 1 commit into
Conversation
Pull request overview
Adds an opt-in S3-compatible image file storage backend that implements ImageFileStorageBase, parallel to the existing on-disk implementation. The backend is selected via a new storage_backend config field defaulting to "disk", so behavior is unchanged for existing users. boto3 is reused (already a transitive dep), with conveniences for AWS S3, Backblaze B2, and any S3-compatible store.
Changes:
- New `S3CompatibleImageFileStorage` plus user-metadata-based workflow/graph lookups and a synthetic `s3://` `get_path` stub.
- New `storage_backend`, `s3_bucket`, `s3_endpoint_url` config fields and dispatch in `dependencies.initialize()`.
- Documentation pages (mkdocs and Starlight) plus unit tests against a hand-rolled fake S3 client.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| invokeai/app/services/image_files/image_files_s3.py | New S3-compatible image storage implementation. |
| invokeai/app/services/config/config_default.py | Adds storage_backend, s3_bucket, s3_endpoint_url settings. |
| invokeai/app/api/dependencies.py | Selects disk vs. S3 image storage at startup. |
| tests/app/services/image_files/test_image_files_s3.py | Unit tests using a fake S3 client. |
| tests/test_config.py | Tests for the new storage_backend config field. |
| docs/src/content/docs/configuration/object-storage.mdx | Starlight docs page for the new backend. |
| docs-old/configuration/object-storage.md | mkdocs docs page mirror. |
| mkdocs.yml | Adds the new docs page to navigation. |
Comment on lines +182 to +186
```python
def get_path(self, image_name: str, thumbnail: bool = False) -> Path:
    # Synthetic s3:// path; callers needing a real filesystem path should
    # migrate to a presigned-URL service.
    key = self._object_key(image_name, thumbnail=thumbnail)
    return Path(f"s3://{self._bucket}/{key}")
```
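One caveat for anyone consuming the synthetic value: `pathlib` collapses repeated slashes, so `str()` of the returned path comes back as `s3:/<bucket>/<key>` rather than `s3://...`. A hypothetical helper (not part of the PR) that recovers the bucket and key while tolerating both forms:

```python
from pathlib import PurePosixPath


def split_synthetic_s3_path(path: str) -> tuple[str, str]:
    # Hypothetical helper (not in the PR): recover (bucket, key) from the
    # synthetic path, tolerating both "s3://" and the "s3:/" form that
    # str(Path(...)) produces after pathlib collapses the double slash.
    rest = path.removeprefix("s3://").removeprefix("s3:/")
    bucket, _, key = rest.partition("/")
    return bucket, key


# pathlib collapses "//" -> "/", so round-tripping through a Path loses a slash:
assert str(PurePosixPath("s3://my-bucket/images/abc.png")) == "s3:/my-bucket/images/abc.png"
assert split_synthetic_s3_path("s3://my-bucket/images/abc.png") == ("my-bucket", "images/abc.png")
assert split_synthetic_s3_path("s3:/my-bucket/images/abc.png") == ("my-bucket", "images/abc.png")
```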
```python
image.info = info_dict

png_buffer = io.BytesIO()
image.save(png_buffer, format="PNG", pnginfo=pnginfo)
```
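For context on what the save path above accomplishes, here is a minimal standalone sketch (not the PR's exact code) of embedding text metadata in a PNG written to an in-memory buffer and reading it back; the key name `invokeai_workflow` is illustrative:

```python
import io

from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Attach a text chunk to the PNG; InvokeAI uses this mechanism to round-trip
# workflow/graph JSON with the image. Key name here is illustrative only.
pnginfo = PngInfo()
pnginfo.add_text("invokeai_workflow", '{"nodes": []}')

image = Image.new("RGB", (4, 4), color="black")
png_buffer = io.BytesIO()
image.save(png_buffer, format="PNG", pnginfo=pnginfo)

# Reading the buffer back exposes the text chunks via the .text mapping.
png_buffer.seek(0)
reloaded = Image.open(png_buffer)
assert reloaded.text["invokeai_workflow"] == '{"nodes": []}'
```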
Comment on lines +132 to +136
```
<bucket>/
  images/<image-name>.png
  thumbnails/<image-name>.thumbnail.webp
```
Comment on lines +164 to +167
```python
# STORAGE
storage_backend: Literal["disk", "s3"] = Field(default="disk", description='Backend for storing generated images. "disk" uses the local filesystem; "s3" uses any S3-compatible object store (AWS S3, Backblaze B2, etc.).')
s3_bucket: Optional[str] = Field(default=None, description='Bucket name for the s3 storage backend. Required when storage_backend="s3".')
s3_endpoint_url: Optional[str] = Field(default=None, description='Endpoint URL for the s3 storage backend. Leave unset to talk to AWS S3; set to a provider-specific URL (e.g. https://s3.us-west-004.backblazeb2.com for Backblaze B2) for any other S3-compatible store.')
```
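Assuming these fields, an opt-in `invokeai.yaml` fragment might look like the following; the bucket name and endpoint are illustrative values, not defaults:

```yaml
# invokeai.yaml (illustrative values)
storage_backend: s3
s3_bucket: my-invoke-images
s3_endpoint_url: https://s3.us-west-004.backblazeb2.com  # omit to use AWS S3
```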
Comment on lines +237 to +255
```python
def _safe_meta(value: str) -> str:
    """Encode arbitrary unicode metadata for safe transport as S3 user-metadata."""
    try:
        value.encode("ascii")
        return value
    except UnicodeEncodeError:
        return json.dumps({"__b64__": True, "v": value.encode("utf-8").hex()})


def _unsafe_meta(value: str) -> str:
    """Inverse of `_safe_meta`."""
    if value.startswith("{") and "__b64__" in value:
        try:
            payload = json.loads(value)
            if isinstance(payload, dict) and payload.get("__b64__"):
                return bytes.fromhex(payload["v"]).decode("utf-8")
        except (ValueError, KeyError):
            pass
    return value
```
Summary
First concrete cloud impl of `ImageFileStorageBase`. The ABC was introduced in #1650 with cloud storage explicitly named as the design rationale ("if someone wants to use cloud storage for their images, they should be able to replace the image storage service easily."), but no implementation has ever landed.

`S3CompatibleImageFileStorage` lives at `invokeai/app/services/image_files/image_files_s3.py`, parallel to `image_files_disk.py`. It works against AWS S3 by default and any S3-compatible store (Backblaze B2, MinIO, etc.) when `s3_endpoint_url` is set. The backend is selected via a new opt-in `storage_backend: Literal["disk", "s3"]` field on `InvokeAIAppConfig`, defaulting to `"disk"`, so behavior is unchanged for existing users. No new runtime dep; boto3 is already pulled in by the `download` service.

Related Issues / Discussions
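The opt-in selection described above can be sketched roughly as follows. This is a hypothetical standalone model of the dispatch, not the PR's actual `dependencies.initialize()` code; `AppConfig` and `select_image_storage` are illustrative stand-ins:

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class AppConfig:
    # Stand-in for the new InvokeAIAppConfig fields; defaults mirror the PR.
    storage_backend: Literal["disk", "s3"] = "disk"
    s3_bucket: Optional[str] = None
    s3_endpoint_url: Optional[str] = None


def select_image_storage(config: AppConfig) -> str:
    # Hypothetical dispatch: default stays "disk" so existing users see no change.
    if config.storage_backend == "s3":
        if not config.s3_bucket:
            raise ValueError('s3_bucket is required when storage_backend="s3"')
        return f"s3:{config.s3_bucket}"
    return "disk"


assert select_image_storage(AppConfig()) == "disk"
assert select_image_storage(AppConfig(storage_backend="s3", s3_bucket="invoke-images")) == "s3:invoke-images"
```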
Open Questions
Opening as draft so these can be discussed against concrete code. Same questions as #9121:

- `DiskImageFileStorage` keeps an in-process LRU; the S3 impl is stateless. Ship the cache now, or v1 without and follow up after profiling?
- `pil_compress_level`: disk reads it from config at save time; S3 currently uses PIL's default. Thread it through?
- `get_path` contract: typed `Path`, but S3 has no real path. We return a synthetic `s3://...`; the long-term answer is presigned URLs. OK as a stub?
- `_FakeS3Client` injected via `client=` (no new dep). Want moto instead?
- `S3CompatibleObjectSerializer` (latents) and `S3PresignedUrlService` (frontend direct-fetch). Keep them as separate PRs?
Generate an image, confirm gallery preview + workflow round-trip, confirm `images/<name>.png` and `thumbnails/<name>.webp` land in the bucket, then delete from the gallery and confirm both keys disappear. Smoke-tested against AWS `us-east-1` and Backblaze B2 `us-west-004`.

Unit tests: `uv run pytest tests/app/services/image_files/test_image_files_s3.py tests/test_config.py -v` (13 tests).

Checklist