Client ckpt save#196
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the ability to specify a custom save_dir when adding adapters to models, allowing users to control where model weights and metadata are stored. The changes span the client generator, server-side handlers, and the checkpoint management system, which now includes a pointer mechanism to track metadata across different filesystem locations. My feedback focuses on addressing a potential directory traversal vulnerability by validating client-provided adapter names and removing redundant path resolution logic in the checkpoint manager to improve code clarity.
| try: | ||
| resolved_save_dir = _resolve_client_save_dir(body.save_dir).as_posix() if body.save_dir else None | ||
| except ValueError as exc: | ||
| raise HTTPException(status_code=400, detail=str(exc)) |
There was a problem hiding this comment.
The adapter_name provided by the client should be validated to prevent directory traversal attacks. Since this name is used to construct filesystem paths on the server (e.g., in training_run_manager.save), it's critical to ensure it doesn't contain malicious sequences like .. or absolute paths. Using the existing validate_user_path helper is recommended.
| try: | |
| resolved_save_dir = _resolve_client_save_dir(body.save_dir).as_posix() if body.save_dir else None | |
| except ValueError as exc: | |
| raise HTTPException(status_code=400, detail=str(exc)) | |
| if not validate_user_path(token, body.adapter_name): | |
| raise HTTPException(status_code=400, detail=f'Invalid adapter_name: {body.adapter_name}') | |
| try: | |
| resolved_save_dir = _resolve_client_save_dir(body.save_dir).as_posix() if body.save_dir else None | |
| except ValueError as exc: | |
| raise HTTPException(status_code=400, detail=str(exc)) |
| import twinkle_client.types as types | ||
| from twinkle.data_format import InputFeature, Trajectory | ||
| from twinkle.server.common.checkpoint_factory import create_checkpoint_manager, create_training_run_manager | ||
| from twinkle.server.utils.checkpoint_base import _resolve_client_save_dir |
There was a problem hiding this comment.
Import validate_user_path to enable validation of client-provided adapter names, ensuring they are safe for filesystem operations.
| from twinkle.server.utils.checkpoint_base import _resolve_client_save_dir | |
| from twinkle.server.utils.checkpoint_base import _resolve_client_save_dir, validate_user_path |
| try: | ||
| resolved_save_dir = _resolve_client_save_dir(body.save_dir).as_posix() if body.save_dir else None | ||
| except ValueError as exc: | ||
| raise HTTPException(status_code=400, detail=str(exc)) |
What Changed
save_dirthrough training and adapter creation flowssave_dirvalues on the client side before sending requests, using the client process current working directorysave_diron the server side and return a clear error when the directory does not exist or is not a directoryImpact
save_dirwhen creating adapters or training runsExample
save_dircan be an absolute path:Or a relative path:
Relative paths are resolved from the client process current working directory before being sent to the server.
The resolved directory must already exist on the server. If it does not exist, the request fails with a clear error, for example: