fix: bound tensor header length during deserialization by dfgvaetyj3456356-hash · Pull Request #208 · coreweave/tensorizer

dfgvaetyj3456356-hash · 2026-06-04T02:41:30Z

Summary

Bounds each tensor header length against the metadata span that already describes where that tensor's data begins.

Before this change, _TensorHeaderDeserializer.from_io() trusted the 8-byte header length stored in the file and allocated a bytearray(header_len) before checking whether that length fit the metadata entry for the tensor. A malformed .tensors file could therefore make the lazy deserializer allocate memory based on an attacker-controlled header length. Slightly oversized headers could also be accepted, because the missing bytes were effectively padded in the header buffer.
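The failure mode can be reproduced in isolation. The sketch below is not tensorizer's code: `read_header_unchecked` is a hypothetical helper, and it assumes the length prefix is an unsigned little-endian 64-bit integer that counts the whole header, prefix included. It only illustrates the general pattern of sizing a buffer from an untrusted length field and then ignoring a short read.

```python
import io
import struct


def read_header_unchecked(stream: io.BytesIO) -> bytearray:
    """Hypothetical unsafe pattern: allocate from the length field, then read."""
    length_bytes = stream.read(8)
    # Assumption: unsigned little-endian 64-bit length covering the whole header.
    (header_len,) = struct.unpack("<Q", length_bytes)
    # The allocation size comes straight from the file, with no upper bound.
    buffer = bytearray(header_len)
    buffer[:8] = length_bytes
    # readinto() may fill fewer bytes than requested; the return value is
    # discarded here, so the unread tail keeps bytearray's zero fill.
    stream.readinto(memoryview(buffer)[8:])
    return buffer
```

Feeding this helper a stream that claims a 32-byte header but supplies only 3 bytes after the prefix returns a 32-byte buffer whose last 21 bytes are zeros, with no error raised. A much larger claimed length would trigger an equally large allocation before any validation runs.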

Fix

Add an optional max_header_len guard to _TensorHeaderDeserializer.from_io().
In _copy_thread(), derive the legal header span from the metadata entry at the current file offset: entry.data_offset - entry.offset.
Reject unexpected header offsets and oversized header lengths before allocating the header buffer.
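The three steps above can be sketched as follows. This is a minimal standalone illustration under stated assumptions, not the code in this PR: `MetadataEntry`, `HeaderBoundsError`, `legal_header_span`, and `read_header_bounded` are hypothetical names, and the length prefix is assumed to be an unsigned little-endian 64-bit integer that counts the whole header. Only `max_header_len`, `entry.offset`, and `entry.data_offset` come from the description.

```python
import io
import struct
from dataclasses import dataclass
from typing import Optional


class HeaderBoundsError(ValueError):
    """Hypothetical error type for headers that violate the metadata bounds."""


@dataclass
class MetadataEntry:
    """Hypothetical stand-in for a metadata entry; field names follow the PR text."""
    offset: int       # where the tensor's header starts in the file
    data_offset: int  # where the tensor's data starts in the file


def legal_header_span(entry: MetadataEntry, file_offset: int) -> int:
    """Return the maximum header length allowed at the current file offset."""
    if file_offset != entry.offset:
        raise HeaderBoundsError(f"unexpected header offset {file_offset}")
    span = entry.data_offset - entry.offset
    if span < 8:
        raise HeaderBoundsError(f"metadata span {span} cannot hold a header")
    return span


def read_header_bounded(
    stream: io.BytesIO, max_header_len: Optional[int] = None
) -> bytearray:
    """Read a length-prefixed header, validating the length before allocating."""
    length_bytes = stream.read(8)
    if len(length_bytes) != 8:
        raise HeaderBoundsError("truncated header length field")
    # Assumption: unsigned little-endian 64-bit length covering the whole header.
    (header_len,) = struct.unpack("<Q", length_bytes)
    if header_len < 8:
        raise HeaderBoundsError(f"header length {header_len} is too small")
    if max_header_len is not None and header_len > max_header_len:
        raise HeaderBoundsError(
            f"header length {header_len} exceeds limit {max_header_len}"
        )
    # Only now is it safe to size a buffer from the file-supplied length.
    buffer = bytearray(header_len)
    buffer[:8] = length_bytes
    read_count = stream.readinto(memoryview(buffer)[8:])
    if read_count != header_len - 8:
        raise HeaderBoundsError("header is shorter than its declared length")
    return buffer
```

A caller in this sketch would compute `limit = legal_header_span(entry, stream.tell())` and pass it as `max_header_len`. Keeping the guard optional mirrors the description of `max_header_len` as an optional argument, so callers without metadata context keep the previous behavior. The short-read check is an extra precaution in this sketch and is not claimed to be part of the PR.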

Tests

python -m py_compile tensorizer\serialization.py tests\test_serialization.py
python -m pytest tests\test_serialization.py::TestSerialization::test_oversized_tensor_header_is_rejected -q
python -m pytest tests\test_serialization.py::TestDeserialization::test_lazy_load -q
python -m pytest tests\test_serialization.py::TestSerialization::test_oversized_tensor_header_is_rejected tests\test_serialization.py::TestDeserialization::test_lazy_load -q
git diff --check

test_lazy_load emits the existing Hugging Face cache symlink warning on Windows, but passes.

fix: bound tensor header length during deserialization

1d16a64

dfgvaetyj3456356-hash requested a review from Eta0 as a code owner June 4, 2026 02:41

dfgvaetyj3456356-hash wants to merge 1 commit into coreweave:main from dfgvaetyj3456356-hash:security/bound-tensor-header-length
