GH-49817: [C++] Reject decimal strings that exceed the target precision #49832
SAY-5 wants to merge 1 commit into
Conversation
raulcd
left a comment
The reported issue contains some tests and reproducers. Can we add tests?
@SAY-5 as per our guidelines, can you share whether the fix was AI generated and summarise what was AI-generated?
https://arrow.apache.org/docs/dev/developers/overview.html#ai-generated-code
Thanks!
Re: AI policy, yes, this fix used an LLM coding assistant for drafting and the test cases below; I reviewed each line, ran the change against the issue's reproducer locally, and confirmed the behaviour matches SQLite/Spark semantics (precision-overflow rejected rather than silently truncated). I'll add the tests from #49817 in the next push and reply here. For the moment, here's the disclosure summary as requested:
Diagnosis: the new precision check in
Affected test files (from CI logs):
You mean arrow-utility-test? In any case, I think these tests need to be fixed not to overflow the allowable precision.
Yes, arrow-utility-test; sorry for the s3fs typo. I'll trim the test fixtures so they hit kMaxPrecision exactly and align the error-message test by matching whatever phrasing the new check emits. Will push once it builds clean.
…recision
Signed-off-by: Sai Asish Y <say.apm35@gmail.com>
Force-pushed 838d46e to a23f17e
Reworked: trimmed test fixtures in arrow-array-test, arrow-json-test, and arrow-utility-test to hit kMaxPrecision exactly (apologies for the s3fs typo earlier), and normalised the precision-overflow error to use the existing 'requires precision N' phrasing so the json converter test substring stays in sync.
Rationale for this change
`arrow::Decimal128::FromString` and `Decimal256::FromString` (and the `SimpleDecimalFromString` path used by `Decimal32`/`Decimal64`) silently truncate when the input string's precision exceeds the target decimal's maximum. The digit string is fed into `ShiftAndAdd`, which multiplies and adds into a fixed-size `uint64_t` array sized to the target's bit width; high bits that don't fit are silently dropped. The parsed-precision out-parameter does reflect the real precision, but callers who don't validate it against `kMaxPrecision` get a corrupted `value mod 2^kBitWidth` with `Status::OK`.

What changes are included in this PR?
Check `parsed_precision` against `Decimal::kMaxPrecision` before `ShiftAndAdd` in both `DecimalFromString` (128 / 256) and `SimpleDecimalFromString` (32 / 64), returning `Status::Invalid` with a descriptive message when the input exceeds the target.

Are these changes tested?
Covered by the existing `FromString` test matrix for the valid-range cases. Over-precision inputs previously returned OK; the new behaviour is a `Status::Invalid`, so regression tests that exercise `precision > kMaxPrecision` paths should be added. Happy to follow up with those in a second commit or separate PR.

Are there any user-facing changes?
Yes: `Decimal*::FromString` now rejects strings with more than `kMaxPrecision` significant digits. Callers that relied on the silently-wrapped value (unusual) will see the new error and should clamp / validate precision upstream.

Closes #49817.
Signed-off-by: SAY-5 <SAY-5@users.noreply.github.com>