branch-4.1 [fix](variant) Bind Variant search to nested indexes#63765
Merged
### What problem does this PR solve?

Issue Number: N/A

Related PR: apache#63660

Problem Summary:

Backport apache#63660 to branch-4.1. Bind Variant inverted-index search to the resolved scalar or nested Variant index reader, map nested leaf results back to the expected document scope, and preserve null bitmap semantics for empty bitset truth bitmaps. Adapt the segment index iterator call to the branch-4.1 ColumnReader API. Cherry-picked from commits 8310d28 and 315ad31.

### Release note

Fix Variant inverted-index search binding for scalar and nested Variant paths.

### Check List (For Author)

- Test:
    - Unit Test: `./run-be-ut.sh --run --filter='*Variant*:FunctionSearchTest.TestBuildLeafQueryDirectUnknownClauseUsesLeafMapper:FunctionSearchNestedTest.*:BitSetQueryTest.EmptyTruthBitmapPreservesNullBitmap'`
- Behavior changed: Yes. Fixes Variant inverted-index search binding and null bitmap handling.
- Does this need documentation: No
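For readers unfamiliar with "map nested leaf results back to the expected document scope", here is a minimal sketch of the idea. It assumes a hypothetical layout in which each parent row owns a contiguous range of nested (leaf) doc ids; the `parent_starts` array and the function name are illustrative and are not taken from the Doris code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// ASSUMPTION (not the Doris on-disk format): parent_starts[p] is the first
// leaf doc id owned by parent row p, and the last entry is the total number
// of leaf docs. parent_starts is sorted and starts at 0.
// leaf_hits must be sorted ascending so that duplicates end up adjacent.
std::vector<uint32_t> map_leaf_hits_to_parents(const std::vector<uint32_t>& parent_starts,
                                               const std::vector<uint32_t>& leaf_hits) {
    assert(!parent_starts.empty() && parent_starts.front() == 0);
    std::vector<uint32_t> parents;
    const uint32_t leaf_count = parent_starts.back();
    for (uint32_t leaf : leaf_hits) {
        if (leaf >= leaf_count) {
            continue; // ignore leaf ids outside the known range
        }
        // Find the first start strictly greater than the leaf id;
        // the owning parent is the slot just before it.
        auto it = std::upper_bound(parent_starts.begin(), parent_starts.end(), leaf);
        auto parent = static_cast<uint32_t>(it - parent_starts.begin() - 1);
        // Several leaves of one parent collapse into a single parent hit.
        if (parents.empty() || parents.back() != parent) {
            parents.push_back(parent);
        }
    }
    return parents;
}
```

With `parent_starts = {0, 2, 2, 5}` (parent 1 has no nested elements), leaf hits `{0, 1, 4}` collapse to parent rows `{0, 2}`.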
Member
Author
run buildall
Contributor
Pull request overview
This PR cherry-picks Variant inverted-index search fixes into branch-4.1, separating Variant binding/nested search logic from the generic search() function and adding diagnostics for Variant index binding.
Changes:
- Adds `variant_inverted_index_search` support for Variant subcolumn binding, direct BKD reads, UNKNOWN bitmap handling, and nested-doc mapping.
- Updates `search()` query construction and `BitSetQuery`/`BitSetWeight` to preserve null bitmap semantics.
- Adds diagnostics and focused unit tests for Variant binding, nested mapping, and empty-truth/null-bitmap behavior.
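Why null bitmap preservation matters can be shown with a small three-valued-logic sketch. This is an illustration under assumed names (`TernaryResult`, `negate`, `drop_if_truth_empty` are hypothetical), using `std::set` in place of real bitmaps; it is not the `BitSetQuery`/`BitSetWeight` implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Three-valued result over doc ids: rows in `truth` are TRUE, rows in
// `nulls` are NULL (unknown), and every other row is FALSE.
struct TernaryResult {
    std::set<uint32_t> truth;
    std::set<uint32_t> nulls;
};

// NOT under SQL three-valued logic: TRUE and FALSE swap, NULL stays NULL.
TernaryResult negate(const TernaryResult& in, uint32_t num_docs) {
    TernaryResult out;
    out.nulls = in.nulls;
    for (uint32_t doc = 0; doc < num_docs; ++doc) {
        if (in.truth.count(doc) == 0 && in.nulls.count(doc) == 0) {
            out.truth.insert(doc);
        }
    }
    return out;
}

// HYPOTHETICAL shortcut that models the failure mode: an empty truth set is
// treated as "nothing to score", so the null rows are discarded with it.
TernaryResult drop_if_truth_empty(const TernaryResult& in) {
    if (in.truth.empty()) {
        return TernaryResult{};
    }
    return in;
}
```

For three docs where only doc 1 is NULL and nothing is TRUE, `negate` on the intact result yields TRUE for docs 0 and 2 and keeps doc 1 NULL. If the result is first passed through `drop_if_truth_empty`, doc 1 wrongly becomes TRUE after negation.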
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| `be/src/exprs/function/function_search.cpp` | Integrates Variant resolver/evaluator, UNKNOWN queries, and direct-reader leaf query handling. |
| `be/src/exprs/function/function_search.h` | Moves Variant-specific resolver/evaluator APIs into the new header. |
| `be/src/exprs/function/variant_inverted_index_search.cpp` | Implements Variant field binding, nested leaf mapping, and nested search evaluation. |
| `be/src/exprs/function/variant_inverted_index_search.h` | Declares Variant search binding and nested mapping APIs. |
| `be/src/exprs/vsearch.cpp` | Adds Variant binding diagnostics during input collection and search evaluation. |
| `be/src/storage/index/inverted/query_v2/bit_set_query/bit_set_query.h` | Adds null-bitmap storage to BitSetQuery. |
| `be/src/storage/index/inverted/query_v2/bit_set_query/bit_set_weight.h` | Preserves scorers when only null bitmap data is present. |
| `be/src/storage/index/inverted/inverted_index_stats.h` | Stores capped Variant binding diagnostics. |
| `be/src/storage/index/inverted/inverted_index_profile.h` | Publishes binding diagnostics to runtime profile info strings. |
| `be/src/storage/segment/segment.cpp` | Adds index-file probe and iterator creation diagnostics. |
| `be/src/storage/segment/segment_iterator.cpp` | Passes stats into Variant subcolumn index discovery and logs iterator diagnostics. |
| `be/src/storage/segment/variant/variant_column_reader.cpp` | Adds direct/inherited/missing subcolumn index candidate diagnostics. |
| `be/src/storage/segment/variant/variant_column_reader.h` | Extends subcolumn index lookup API with optional stats. |
| `be/test/exprs/function/function_search_test.cpp` | Adds tests for Variant missing fields, resolver selection, and direct scalar reads. |
| `be/test/exprs/function/function_search_nested_test.cpp` | Adds tests for nested doc mapping and evaluator behavior. |
| `be/test/storage/index/inverted/query_v2/boolean_query_test.cpp` | Adds coverage for empty truth bitmap with preserved null bitmap. |
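The "capped" diagnostics mentioned for `inverted_index_stats.h` can be pictured roughly as follows. This is only a sketch of the general pattern; the class name, the cap, and the rendered format are assumptions and do not come from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// HYPOTHETICAL capped diagnostic buffer: keeps at most `cap` messages and
// counts the rest, so a query touching many Variant paths cannot grow the
// stats without bound.
class CappedDiagnostics {
public:
    explicit CappedDiagnostics(std::size_t cap) : _cap(cap) {}

    // Keep the message if there is room, otherwise only count it.
    void add(std::string message) {
        if (_messages.size() < _cap) {
            _messages.push_back(std::move(message));
        } else {
            ++_dropped;
        }
    }

    std::size_t dropped() const { return _dropped; }

    // Render a single line, e.g. for a profile info string.
    std::string to_string() const {
        std::string out;
        for (const auto& m : _messages) {
            if (!out.empty()) {
                out += "; ";
            }
            out += m;
        }
        if (_dropped > 0) {
            out += " (+" + std::to_string(_dropped) + " more)";
        }
        return out;
    }

private:
    std::size_t _cap;
    std::size_t _dropped = 0;
    std::vector<std::string> _messages;
};
```

With a cap of 2, adding the messages `a`, `b`, and `c` keeps the first two and records one dropped entry.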
Comment on lines +141 to +146

    const bool is_text_field =
            column_type != nullptr && is_string_type(column_type->get_storage_field_type());
    auto fb_it = _field_binding_map.find(field_name);
    std::string analyzer_key;
    if (is_text_field && is_variant_sub && fb_it != _field_binding_map.end() &&
        fb_it->second->__isset.index_properties && !fb_it->second->index_properties.empty()) {

@@ -252,6 +303,11 @@ Status VSearchExpr::evaluate_inverted_index(VExprContext* context, uint32_t segm

    if (bundle.iterators.empty() && !is_nested_query) {
Comment on lines +1938 to +1951

    FieldReaderBinding binding;
    binding.logical_field_name = "var.items.active";
    binding.stored_field_name = "1.var.items.active";
    binding.stored_field_wstr = L"1.var.items.active";
    binding.column_type = bool_type;
    binding.query_type = InvertedIndexQueryType::MATCH_PHRASE_QUERY;
    binding.state = SearchFieldBindingState::BOUND;
    TabletIndex index_meta;
    binding.inverted_reader = std::make_shared<DummyInvertedIndexReader>(&index_meta);

    std::string key = resolver.binding_key_for("1.var.items.active",
                                               InvertedIndexQueryType::MATCH_PHRASE_QUERY);
    binding.binding_key = key;
    resolver._cache[key] = binding;
Comment on lines +2008 to +2029

    FieldReaderBinding binding;
    binding.logical_field_name = "var.items.active";
    binding.stored_field_name = "1.var.items.active";
    binding.stored_field_wstr = L"1.var.items.active";
    binding.column_type = bool_type;
    binding.query_type = InvertedIndexQueryType::MATCH_ANY_QUERY;
    binding.state = SearchFieldBindingState::BOUND;
    TabletIndex index_meta;
    binding.inverted_reader = std::make_shared<DummyInvertedIndexReader>(&index_meta);

    std::string key =
            resolver.binding_key_for("1.var.items.active", InvertedIndexQueryType::MATCH_ANY_QUERY);
    binding.binding_key = key;
    resolver._cache[key] = binding;

    inverted_index::query_v2::QueryPtr out;
    std::string out_binding_key;
    Status st = function_search->build_leaf_query(clause, context, resolver, &out, &out_binding_key,
                                                  "OR", 0, 10);
    ASSERT_TRUE(st.ok());
    ASSERT_NE(out, nullptr);
    EXPECT_EQ(key, out_binding_key);
Comment on lines +2067 to +2088

    FieldReaderBinding binding;
    binding.logical_field_name = "var.items.flags.level";
    binding.stored_field_name = "1.var.items.flags.level";
    binding.stored_field_wstr = L"1.var.items.flags.level";
    binding.column_type = int_type;
    binding.query_type = InvertedIndexQueryType::MATCH_ANY_QUERY;
    binding.state = SearchFieldBindingState::BOUND;
    TabletIndex index_meta;
    binding.inverted_reader = std::make_shared<DummyInvertedIndexReader>(&index_meta);

    std::string key = resolver.binding_key_for("1.var.items.flags.level",
                                               InvertedIndexQueryType::MATCH_ANY_QUERY);
    binding.binding_key = key;
    resolver._cache[key] = binding;

    inverted_index::query_v2::QueryPtr out;
    std::string out_binding_key;
    Status st = function_search->build_leaf_query(clause, context, resolver, &out, &out_binding_key,
                                                  "OR", 0, 10);
    ASSERT_TRUE(st.ok());
    ASSERT_NE(out, nullptr);
    EXPECT_EQ(key, out_binding_key);
Contributor
skip buildall
yiguolei approved these changes May 28, 2026
Contributor
PR approved by at least one committer and no changes requested.
Contributor
PR approved by anyone and no changes requested.
cherry-pick #63660 to branch-4.1