GH-49817: [C++] Reject decimal strings that exceed the target precision #49832
SAY-5 wants to merge 1 commit into
Conversation
raulcd
left a comment
The reported issue contains some tests and reproducers. Can we add tests?
@SAY-5 as per our guidelines, can you share whether the fix was AI generated and summarise what was AI-generated?
https://arrow.apache.org/docs/dev/developers/overview.html#ai-generated-code
Thanks!
Re: AI policy, yes, this fix used an LLM coding assistant for drafting and the test cases below; I reviewed each line, ran the change against the issue's reproducer locally, and confirmed the behaviour matches SQLite/Spark semantics (precision-overflow rejected rather than silently truncated). I'll add the tests from #49817 in the next push and reply here. For the moment, here's the disclosure summary as requested:
Diagnosis: the new precision check in
Affected test files (from CI logs):
You mean arrow-utility-test? In any case, I think these tests need to be fixed not to overflow the allowable precision.
Yes, arrow-utility-test; sorry for the s3fs typo. I'll trim the test fixtures so they hit kMaxPrecision exactly and align the error-message test by matching whatever phrasing the new check emits. Will push once it builds clean.
…recision
Signed-off-by: Sai Asish Y <say.apm35@gmail.com>
Force-pushed 838d46e to a23f17e
Reworked: trimmed test fixtures in arrow-array-test, arrow-json-test, and arrow-utility-test to hit kMaxPrecision exactly (apologies for the s3fs typo earlier), and normalised the precision-overflow error to use the existing 'requires precision N' phrasing so the json converter test substring stays in sync.
Rationale for this change
`arrow::Decimal128::FromString` and `Decimal256::FromString` (and the `SimpleDecimalFromString` path used by `Decimal32`/`Decimal64`) silently truncate when the input string's precision exceeds the target decimal's maximum. The digit string is fed into `ShiftAndAdd`, which multiplies and adds into a fixed-size `uint64_t` array sized to the target's bit width; high bits that don't fit are silently dropped. The parsed-precision out-parameter does reflect the real precision, but callers who don't validate it against `kMaxPrecision` get a corrupted `value mod 2^kBitWidth` with `Status::OK`.

What changes are included in this PR?
Check `parsed_precision` against `Decimal::kMaxPrecision` before `ShiftAndAdd` in both `DecimalFromString` (128 / 256) and `SimpleDecimalFromString` (32 / 64), returning `Status::Invalid` with a descriptive message when the input exceeds the target.

Are these changes tested?
Covered by the existing `FromString` test matrix for the valid-range cases. Over-precision inputs previously returned OK; the new behaviour is a `Status::Invalid`, so regression tests that exercise `precision > kMaxPrecision` paths should be added. Happy to follow up with those in a second commit or separate PR.

Are there any user-facing changes?
Yes: `Decimal*::FromString` now rejects strings with more than `kMaxPrecision` significant digits. Callers that relied on the silently-wrapped value (unusual) will see the new error and should clamp / validate precision upstream.

Closes #49817.
Signed-off-by: SAY-5 <SAY-5@users.noreply.github.com>