Skip to content

Flag in _field_caps to return only fields with values in index #103651

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 122 commits into from
Feb 8, 2024
Merged
Show file tree
Hide file tree
Changes from 51 commits
Commits
Show all changes
122 commits
Select commit Hold shift + click to select a range
c8af341
add has_value to field_caps response
piergm Dec 19, 2023
e17fe2c
Merge branch 'elastic:main' into field_caps-has_value
piergm Dec 20, 2023
feccdc7
working prototype
piergm Dec 20, 2023
5ad8b99
move map to shard instead of index, adds queryparam
piergm Dec 21, 2023
ed3c3c6
rename query param to include_fields_with_no_value
piergm Dec 21, 2023
45d892c
code clean
piergm Dec 21, 2023
21061bb
include_fields_with_no_value defaults to true
piergm Dec 21, 2023
e92e9dc
defaults in tests
piergm Dec 21, 2023
838ebbd
defaults in tests
piergm Dec 21, 2023
f5bfd31
Update docs/changelog/103651.yaml
piergm Dec 21, 2023
4fc26d6
Merge branch 'main' into field_caps-has_value
piergm Dec 21, 2023
f886135
closing readers
piergm Dec 22, 2023
47af25e
small refactor
piergm Dec 22, 2023
683459f
move listener afterIndexShardStarted and close reader
piergm Dec 22, 2023
597f5a8
closes just searcher
piergm Dec 27, 2023
752d31e
updated tests
piergm Dec 28, 2023
a21d6fa
spotless
piergm Dec 28, 2023
c15dd78
added 1 to number of refresh of FrozenIndex test
piergm Dec 28, 2023
d6e9e65
updated docs and added tests
piergm Dec 28, 2023
8d1fa0a
Merge branch 'main' into field_caps-has_value
piergm Dec 28, 2023
58ab703
iter
piergm Dec 28, 2023
c19d573
iter
piergm Dec 28, 2023
76e5060
moved afterShardCreated to afterIndexCreated
piergm Dec 28, 2023
d53c56d
CI
piergm Dec 28, 2023
60d3731
Merge branch 'elastic:main' into field_caps-has_value
piergm Dec 29, 2023
322d65d
iter
piergm Dec 29, 2023
6e90f58
updated bwc tests
piergm Dec 29, 2023
5a91222
spotless
piergm Dec 29, 2023
4f4ef65
spotless
piergm Jan 3, 2024
8f37313
Merge branch 'main' into field_caps-has_value
piergm Jan 3, 2024
5add85b
alias are now returned if original field has value
piergm Jan 3, 2024
c1abe2b
mock IndexShard in tests
piergm Jan 3, 2024
d2332bd
reworked field caps request and response tests
piergm Jan 3, 2024
a500abd
removed todo
piergm Jan 3, 2024
62759c6
spotless
piergm Jan 3, 2024
39821d9
first set of tests
piergm Jan 3, 2024
7999611
set instead of map
piergm Jan 8, 2024
9c97ee5
iter
piergm Jan 10, 2024
734a72a
iter
piergm Jan 10, 2024
6da0369
iter
piergm Jan 10, 2024
ca6893a
more iter
piergm Jan 10, 2024
b6a7dee
Merge branch 'main' into field_caps-has_value
piergm Jan 10, 2024
db70f45
spotless
piergm Jan 10, 2024
e6c528d
updated tests
piergm Jan 10, 2024
cb2669b
iter
piergm Jan 15, 2024
a16e6ba
Merge branch 'main' into field_caps-has_value
piergm Jan 15, 2024
cc4edd3
correct tests resolve wrong merge of indexResponse
piergm Jan 15, 2024
75bea68
updated docs
piergm Jan 15, 2024
10ec350
add tests for nested fields
piergm Jan 15, 2024
8f848b8
add tests for obj fields
piergm Jan 15, 2024
774a21a
Merge branch 'elastic:main' into field_caps-has_value
piergm Jan 15, 2024
fc4c198
iter + tests
piergm Jan 16, 2024
b860d7e
Merge branch 'main' into field_caps-has_value
piergm Jan 16, 2024
7d17ca3
propagate flag in remote requests + comments
piergm Jan 16, 2024
9713967
Merge branch 'main' into field_caps-has_value
piergm Jan 18, 2024
c22377d
Update docs/reference/search/field-caps.asciidoc
piergm Jan 23, 2024
fdfbe3b
Merge branch 'main' into field_caps-has_value
piergm Jan 24, 2024
18e987b
iter
piergm Jan 24, 2024
31706c2
for loops instead of lambdas
piergm Jan 24, 2024
09bc9a6
Map instead of Set to leverage putIfAbsent for perf
piergm Jan 24, 2024
ed68390
Merge branch 'main' into field_caps-has_value
piergm Jan 25, 2024
49c6d87
handle updated docs
piergm Jan 25, 2024
341cd7c
lazy load non-empty fields
piergm Jan 25, 2024
8f8e623
revert frozenIndexTests due to lazy loading
piergm Jan 25, 2024
9d1fd93
sys prop to diable field has value feature
piergm Jan 30, 2024
25c906f
if feature disable use legacy hash
piergm Jan 30, 2024
7c768c8
moved field-has-value logic after refresh
piergm Jan 31, 2024
da717bd
Merge branch 'main' into field_caps-has_value
piergm Jan 31, 2024
578677d
Merge branch 'elastic:main' into field_caps-has_value
piergm Jan 31, 2024
72c972e
Update docs/reference/search/field-caps.asciidoc
piergm Jan 31, 2024
001e696
Merge branch 'main' into field_caps-has_value
piergm Feb 1, 2024
513083e
test for feature_rank
piergm Jan 31, 2024
da846a6
index shard to update field has value after refresh
piergm Feb 1, 2024
9547296
removed rank_feature tests from 'server'
piergm Feb 1, 2024
653a8f0
rank_feature tests in 'extras'
piergm Feb 1, 2024
50d8a8f
move fieldHasValue in MappedFieldType
piergm Feb 5, 2024
0cc060c
Merge branch 'main' into field_caps-has_value
piergm Feb 5, 2024
8fa1d0d
iter
piergm Feb 5, 2024
06b1641
iter
piergm Feb 5, 2024
f30daaf
javadoc + cleaning
piergm Feb 5, 2024
f290e71
fixing after restart issues
piergm Feb 5, 2024
e1f86ea
Merge branch 'main' into field_caps-has_value
piergm Feb 5, 2024
3ab7529
renamed include_fields_with_no_value to include_empty_fields + iter
piergm Feb 5, 2024
a5c269d
Merge branch 'elastic:main' into field_caps-has_value
piergm Feb 6, 2024
18c4353
keeping list of FieldInfos instead of merging them since they are alr…
piergm Feb 6, 2024
5a75987
rank_feature fieldType unitTests
piergm Feb 6, 2024
1bf7d39
RankFeatureMetaFieldType unitTests
piergm Feb 6, 2024
f5ae465
ScriptFieldTypes unitTests
piergm Feb 6, 2024
54edf53
ConstantFieldTypes unitTests
piergm Feb 6, 2024
09e6d92
MappedFieldTypeTest
piergm Feb 6, 2024
2f0ac60
iter
piergm Feb 6, 2024
8fa1b6e
renamed tests
piergm Feb 6, 2024
fa390e4
merging fieldInfos
piergm Feb 6, 2024
6f3ed17
rename
piergm Feb 6, 2024
1339277
removed tests covered by unit tests
piergm Feb 6, 2024
44904c8
basic yaml test
piergm Feb 6, 2024
29b93cd
iter
piergm Feb 6, 2024
8156756
corrected tests
piergm Feb 6, 2024
a7b64c4
yaml tests
piergm Feb 6, 2024
fe5245d
Merge branch 'main' into field_caps-has_value
piergm Feb 6, 2024
7391105
Merge branch 'elastic:main' into field_caps-has_value
piergm Feb 6, 2024
482c11f
yaml tests
piergm Feb 6, 2024
d9cd4f8
I've found restSpec
piergm Feb 6, 2024
9498dce
Merge branch 'main' into field_caps-has_value
piergm Feb 6, 2024
57de9f7
test
piergm Feb 6, 2024
8868e4d
iter
piergm Feb 7, 2024
f2db1cd
tests
piergm Feb 7, 2024
c0f453d
renamed test to standard
piergm Feb 7, 2024
5cef376
moved unmapped field to yml test
piergm Feb 7, 2024
daf034f
multi-cluster test
piergm Feb 7, 2024
5c8ea58
tested default behaviour in FieldTypeTestCase and overwritten when ne…
piergm Feb 7, 2024
4555a0e
removed old test class
piergm Feb 7, 2024
be7483f
standardizing tests
piergm Feb 7, 2024
058c6c8
Merge branch 'elastic:main' into field_caps-has_value
piergm Feb 7, 2024
b1aeb86
multi cluster tests
piergm Feb 7, 2024
4789d78
maybe fixed multi-cluster
piergm Feb 7, 2024
2e8f7b7
Merge branch 'main' into field_caps-has_value
piergm Feb 7, 2024
8115dcd
iter
piergm Feb 7, 2024
a4e2d8d
removed empty spaces
piergm Feb 8, 2024
b6d667b
Merge branch 'elastic:main' into field_caps-has_value
piergm Feb 8, 2024
4a2d503
iter
piergm Feb 8, 2024
be4e91a
Merge branch 'main' into field_caps-has_value
piergm Feb 8, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions docs/changelog/103651.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 103651
summary: Flag in `_field_caps` to return only fields with values in index
area: Search
type: enhancement
issues: []
4 changes: 4 additions & 0 deletions docs/reference/search/field-caps.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,10 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=index-ignore-unavailab
(Optional, Boolean) If `true`, unmapped fields that are mapped in one index but not in another are included in the response. Fields that don't have any mapping are never included.
Defaults to `false`.

`include_fields_with_no_value`::
(Optional, Boolean) If `false`, fields that never had a value in any shards are not included in the response. Fields that are not empty are always included. This flag does not consider deletions, if a field was non-empty and all the documents containing that field were deleted the field is returned even if the flag is `false`.
Defaults to `true`.

`filters`::
(Optional, string) Comma-separated list of filters to apply to the response.
+
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.unit.ByteSizeUnit;
import org.elasticsearch.common.unit.ByteSizeValue;
import org.elasticsearch.common.util.concurrent.ConcurrentCollections;
import org.elasticsearch.core.CheckedFunction;
import org.elasticsearch.core.CheckedRunnable;
import org.elasticsearch.core.IOUtils;
Expand Down Expand Up @@ -628,7 +629,8 @@ public static final IndexShard newIndexShard(
cbs,
IndexModule.DEFAULT_SNAPSHOT_COMMIT_SUPPLIER,
System::nanoTime,
null
null,
ConcurrentCollections.newConcurrentSet()
);
}

Expand Down

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -183,6 +183,7 @@ static TransportVersion def(int id) {
public static final TransportVersion HOT_THREADS_AS_BYTES = def(8_571_00_0);
public static final TransportVersion ML_INFERENCE_REQUEST_INPUT_TYPE_ADDED = def(8_572_00_0);
public static final TransportVersion ESQL_ENRICH_POLICY_CCQ_MODE = def(8_573_00_0);
public static final TransportVersion FIELD_CAPS_FIELD_HAS_VALUE = def(8_574_00_0);

/*
* STOP! READ THIS FIRST! No, really,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
import org.elasticsearch.core.Nullable;
import org.elasticsearch.index.IndexService;
import org.elasticsearch.index.engine.Engine;
import org.elasticsearch.index.mapper.ConstantFieldType;
import org.elasticsearch.index.mapper.MappedFieldType;
import org.elasticsearch.index.mapper.RuntimeField;
import org.elasticsearch.index.query.MatchAllQueryBuilder;
Expand Down Expand Up @@ -39,10 +40,12 @@
*/
class FieldCapabilitiesFetcher {
private final IndicesService indicesService;
private final boolean includeFieldsWithNoValue;
private final Map<String, Map<String, IndexFieldCapabilities>> indexMappingHashToResponses = new HashMap<>();

FieldCapabilitiesFetcher(IndicesService indicesService) {
FieldCapabilitiesFetcher(IndicesService indicesService, boolean includeFieldsWithNoValue) {
this.indicesService = indicesService;
this.includeFieldsWithNoValue = includeFieldsWithNoValue;
}

FieldCapabilitiesIndexResponse fetch(
Expand Down Expand Up @@ -100,7 +103,19 @@ private FieldCapabilitiesIndexResponse doFetch(
}

final MappingMetadata mapping = indexService.getMetadata().mapping();
final String indexMappingHash = mapping != null ? mapping.getSha256() : null;
String indexMappingHash;
if (includeFieldsWithNoValue) {
indexMappingHash = mapping != null ? mapping.getSha256() : null;
} else {
// even if the mapping is the same if we return only fields with values we need
// to make sure that we consider all the shard-mappings pair, that is why we
// calculate a different hash for this particular case.
StringBuilder sb = new StringBuilder(indexService.getShard(shardId.getId()).getShardUuid());
if (mapping != null) {
sb.append(mapping.getSha256());
}
indexMappingHash = sb.toString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to understand how badly this affects the CCS optimizations around de-duplicating mappings. Are we going back to duplicating mappings for the sake of honouring the flag? Is this a reason why we may want to consider incorporating the flag in the mappings perhaps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll test it out and get back to you, it seems that we force the other cluster to not merge the results but instead send us back all individual index results therefore this change will duplicate mappings.
I am thinking we could let the cluster merge the results in case this flag is active.
I'll create a test in CCSFieldCapabilitiesIT and keep you posted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This do not affect CCS and de-duplicating is still working fine.
Normal de-duplication is done through hash first and by index name later (indexResponses.putIfAbsent(resp.getIndexName(), resp)).
In case the flag is set and we have multiple responses we perform a merge of just the field capabilities, returning one response per index with mapping and field capabilities merged from all intra-cluster index responses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assessment does not sound accurate to me. The mapping hash is used to deduplicate mappings, and decide to send back one mapping per hash instead one mapping per index. But the whole mapping needs to be sent back?

Using the flag, the index mapping hash deduplication is entirely disabled. We do have now field level deduplication across indices though, and the benefit that not returning empty fields will make responses smaller and decrease overhead on the merging side. But again, it is an entirely different trade-off and we need to measure the impact in the worst case scenario, compared to the previous index mapping hash based deduplication. Makes sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct, my assessment was wrongly assuming that who different index would always have two different mapping.

}
if (indexMappingHash != null) {
final Map<String, IndexFieldCapabilities> existing = indexMappingHashToResponses.get(indexMappingHash);
if (existing != null) {
Expand All @@ -114,7 +129,9 @@ private FieldCapabilitiesIndexResponse doFetch(
fieldNameFilter,
filters,
fieldTypes,
fieldPredicate
fieldPredicate,
indicesService.getShardOrNull(shardId),
includeFieldsWithNoValue
);
if (indexMappingHash != null) {
indexMappingHashToResponses.put(indexMappingHash, responseMap);
Expand All @@ -127,7 +144,9 @@ static Map<String, IndexFieldCapabilities> retrieveFieldCaps(
Predicate<String> fieldNameFilter,
String[] filters,
String[] types,
Predicate<String> indexFieldfilter
Predicate<String> indexFieldfilter,
IndexShard indexShard,
boolean includeFieldsWithNoValue
) {
boolean includeParentObjects = checkIncludeParents(filters);

Expand All @@ -139,7 +158,8 @@ static Map<String, IndexFieldCapabilities> retrieveFieldCaps(
continue;
}
MappedFieldType ft = context.getFieldType(field);
if (filter.test(ft)) {
boolean includeField = includeFieldsWithNoValue || indexShard.fieldHasValue(ft.name()) || ft instanceof ConstantFieldType;
if (includeField && filter.test(ft)) {
IndexFieldCapabilities fieldCap = new IndexFieldCapabilities(
field,
ft.familyTypeName(),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ class FieldCapabilitiesNodeRequest extends ActionRequest implements IndicesReque
private final QueryBuilder indexFilter;
private final long nowInMillis;
private final Map<String, Object> runtimeFields;
private final boolean includeFieldsWithNoValue;

FieldCapabilitiesNodeRequest(StreamInput in) throws IOException {
super(in);
Expand All @@ -55,6 +56,11 @@ class FieldCapabilitiesNodeRequest extends ActionRequest implements IndicesReque
indexFilter = in.readOptionalNamedWriteable(QueryBuilder.class);
nowInMillis = in.readLong();
runtimeFields = in.readGenericMap();
if (in.getTransportVersion().onOrAfter(TransportVersions.FIELD_CAPS_FIELD_HAS_VALUE)) {
includeFieldsWithNoValue = in.readBoolean();
} else {
includeFieldsWithNoValue = true;
}
}

FieldCapabilitiesNodeRequest(
Expand All @@ -65,7 +71,8 @@ class FieldCapabilitiesNodeRequest extends ActionRequest implements IndicesReque
OriginalIndices originalIndices,
QueryBuilder indexFilter,
long nowInMillis,
Map<String, Object> runtimeFields
Map<String, Object> runtimeFields,
boolean includeFieldsWithNoValue
) {
this.shardIds = Objects.requireNonNull(shardIds);
this.fields = fields;
Expand All @@ -75,6 +82,7 @@ class FieldCapabilitiesNodeRequest extends ActionRequest implements IndicesReque
this.indexFilter = indexFilter;
this.nowInMillis = nowInMillis;
this.runtimeFields = runtimeFields;
this.includeFieldsWithNoValue = includeFieldsWithNoValue;
}

public String[] fields() {
Expand Down Expand Up @@ -119,6 +127,10 @@ public long nowInMillis() {
return nowInMillis;
}

public boolean includeFieldsWithNoValue() {
return includeFieldsWithNoValue;
}

@Override
public void writeTo(StreamOutput out) throws IOException {
super.writeTo(out);
Expand All @@ -132,6 +144,9 @@ public void writeTo(StreamOutput out) throws IOException {
out.writeOptionalNamedWriteable(indexFilter);
out.writeLong(nowInMillis);
out.writeGenericMap(runtimeFields);
if (out.getTransportVersion().onOrAfter(TransportVersions.FIELD_CAPS_FIELD_HAS_VALUE)) {
out.writeBoolean(includeFieldsWithNoValue);
}
}

@Override
Expand All @@ -143,16 +158,24 @@ public ActionRequestValidationException validate() {
public String getDescription() {
final StringBuilder stringBuilder = new StringBuilder("shards[");
Strings.collectionToDelimitedStringWithLimit(shardIds, ",", "", "", 1024, stringBuilder);
return completeDescription(stringBuilder, fields, filters, allowedTypes);
return completeDescription(stringBuilder, fields, filters, allowedTypes, includeFieldsWithNoValue);
}

static String completeDescription(StringBuilder stringBuilder, String[] fields, String[] filters, String[] allowedTypes) {
static String completeDescription(
StringBuilder stringBuilder,
String[] fields,
String[] filters,
String[] allowedTypes,
boolean includeFieldsWithNoValue
) {
stringBuilder.append("], fields[");
Strings.collectionToDelimitedStringWithLimit(Arrays.asList(fields), ",", "", "", 1024, stringBuilder);
stringBuilder.append("], filters[");
Strings.collectionToDelimitedString(Arrays.asList(filters), ",", "", "", stringBuilder);
stringBuilder.append("], types[");
Strings.collectionToDelimitedString(Arrays.asList(allowedTypes), ",", "", "", stringBuilder);
stringBuilder.append("], includeFieldsWithNoValue[");
stringBuilder.append(includeFieldsWithNoValue);
stringBuilder.append("]");
return stringBuilder.toString();
}
Expand All @@ -179,12 +202,13 @@ public boolean equals(Object o) {
&& Arrays.equals(allowedTypes, that.allowedTypes)
&& Objects.equals(originalIndices, that.originalIndices)
&& Objects.equals(indexFilter, that.indexFilter)
&& Objects.equals(runtimeFields, that.runtimeFields);
&& Objects.equals(runtimeFields, that.runtimeFields)
&& includeFieldsWithNoValue == that.includeFieldsWithNoValue;
}

@Override
public int hashCode() {
int result = Objects.hash(originalIndices, indexFilter, nowInMillis, runtimeFields);
int result = Objects.hash(originalIndices, indexFilter, nowInMillis, runtimeFields, includeFieldsWithNoValue);
result = 31 * result + shardIds.hashCode();
result = 31 * result + Arrays.hashCode(fields);
result = 31 * result + Arrays.hashCode(filters);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ public final class FieldCapabilitiesRequest extends ActionRequest implements Ind
private String[] filters = Strings.EMPTY_ARRAY;
private String[] types = Strings.EMPTY_ARRAY;
private boolean includeUnmapped = false;
private boolean includeFieldsWithNoValue = true;
// pkg private API mainly for cross cluster search to signal that we do multiple reductions ie. the results should not be merged
private boolean mergeResults = true;
private QueryBuilder indexFilter;
Expand All @@ -62,6 +63,9 @@ public FieldCapabilitiesRequest(StreamInput in) throws IOException {
filters = in.readStringArray();
types = in.readStringArray();
}
if (in.getTransportVersion().onOrAfter(TransportVersions.FIELD_CAPS_FIELD_HAS_VALUE)) {
includeFieldsWithNoValue = in.readBoolean();
}
}

public FieldCapabilitiesRequest() {}
Expand Down Expand Up @@ -100,6 +104,9 @@ public void writeTo(StreamOutput out) throws IOException {
out.writeStringArray(filters);
out.writeStringArray(types);
}
if (out.getTransportVersion().onOrAfter(TransportVersions.FIELD_CAPS_FIELD_HAS_VALUE)) {
out.writeBoolean(includeFieldsWithNoValue);
}
}

@Override
Expand Down Expand Up @@ -168,6 +175,11 @@ public FieldCapabilitiesRequest includeUnmapped(boolean includeUnmapped) {
return this;
}

public FieldCapabilitiesRequest includeFieldsWithNoValue(boolean includeFieldsWithNoValue) {
this.includeFieldsWithNoValue = includeFieldsWithNoValue;
return this;
}

@Override
public String[] indices() {
return indices;
Expand All @@ -192,6 +204,10 @@ public boolean includeUnmapped() {
return includeUnmapped;
}

public boolean includeFieldsWithNoValue() {
return includeFieldsWithNoValue;
}

/**
* Allows to filter indices if the provided {@link QueryBuilder} rewrites to `match_none` on every shard.
*/
Expand Down Expand Up @@ -247,12 +263,21 @@ public boolean equals(Object o) {
&& Objects.equals(nowInMillis, that.nowInMillis)
&& Arrays.equals(filters, that.filters)
&& Arrays.equals(types, that.types)
&& Objects.equals(runtimeFields, that.runtimeFields);
&& Objects.equals(runtimeFields, that.runtimeFields)
&& includeFieldsWithNoValue == that.includeFieldsWithNoValue;
}

@Override
public int hashCode() {
int result = Objects.hash(indicesOptions, includeUnmapped, mergeResults, indexFilter, nowInMillis, runtimeFields);
int result = Objects.hash(
indicesOptions,
includeUnmapped,
mergeResults,
indexFilter,
nowInMillis,
runtimeFields,
includeFieldsWithNoValue
);
result = 31 * result + Arrays.hashCode(indices);
result = 31 * result + Arrays.hashCode(fields);
result = 31 * result + Arrays.hashCode(filters);
Expand All @@ -264,7 +289,7 @@ public int hashCode() {
public String getDescription() {
final StringBuilder stringBuilder = new StringBuilder("indices[");
Strings.collectionToDelimitedStringWithLimit(Arrays.asList(indices), ",", "", "", 1024, stringBuilder);
return FieldCapabilitiesNodeRequest.completeDescription(stringBuilder, fields, filters, types);
return FieldCapabilitiesNodeRequest.completeDescription(stringBuilder, fields, filters, types, includeFieldsWithNoValue);
}

@Override
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,11 @@ public FieldCapabilitiesRequestBuilder setIncludeUnmapped(boolean includeUnmappe
return this;
}

public FieldCapabilitiesRequestBuilder setIncludeFieldsWithNoValue(boolean includeFieldsWithNoValue) {
request().includeFieldsWithNoValue(includeFieldsWithNoValue);
return this;
}

public FieldCapabilitiesRequestBuilder setIndexFilter(QueryBuilder indexFilter) {
request().indexFilter(indexFilter);
return this;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -179,7 +179,8 @@ private void sendRequestToNode(String nodeId, List<ShardId> shardIds) {
originalIndices,
fieldCapsRequest.indexFilter(),
nowInMillis,
fieldCapsRequest.runtimeFields()
fieldCapsRequest.runtimeFields(),
fieldCapsRequest.includeFieldsWithNoValue()
);
transportService.sendChildRequest(
node,
Expand All @@ -203,7 +204,10 @@ private void afterRequestsCompleted(int numRequests) {
private void onRequestResponse(List<ShardId> shardIds, FieldCapabilitiesNodeResponse nodeResponse) {
for (FieldCapabilitiesIndexResponse indexResponse : nodeResponse.getIndexResponses()) {
if (indexResponse.canMatch()) {
if (indexSelectors.remove(indexResponse.getIndexName()) != null) {
if (fieldCapsRequest.includeFieldsWithNoValue() == false) {
// we accept all the responses because they may vary from node to node if we exclude empty fields
onIndexResponse.accept(indexResponse);
} else if (indexSelectors.remove(indexResponse.getIndexName()) != null) {
onIndexResponse.accept(indexResponse);
}
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,15 @@ private void doExecuteForked(Task task, FieldCapabilitiesRequest request, final
resp = new FieldCapabilitiesIndexResponse(resp.getIndexName(), curr.getIndexMappingHash(), curr.get(), true);
}
}
indexResponses.merge(resp.getIndexName(), resp, (a, b) -> {
if (request.includeFieldsWithNoValue() || a.get().equals(b.get())) {
return a;
}
Map<String, IndexFieldCapabilities> mergedCaps = new HashMap<>(a.get());
mergedCaps.putAll(b.get());
return new FieldCapabilitiesIndexResponse(a.getIndexName(), a.getIndexMappingHash(), mergedCaps, a.canMatch());

});
indexResponses.putIfAbsent(resp.getIndexName(), resp);
if (fieldCapTask.isCancelled()) {
releaseResourcesOnCancel.run();
Expand Down Expand Up @@ -511,7 +520,7 @@ public void messageReceived(FieldCapabilitiesNodeRequest request, TransportChann
final Map<String, List<ShardId>> groupedShardIds = request.shardIds()
.stream()
.collect(Collectors.groupingBy(ShardId::getIndexName));
final FieldCapabilitiesFetcher fetcher = new FieldCapabilitiesFetcher(indicesService);
final FieldCapabilitiesFetcher fetcher = new FieldCapabilitiesFetcher(indicesService, request.includeFieldsWithNoValue());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall anymore: we said that the feature requires going to all shards, like index_filter does? Where do we make that decision? What happens if you provide the new flag without an index_filter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that all the _field_caps requests are going to all the shards. The current behaviour is to send from the coordinating node one request per index to all the data nodes which in turn sends a request to all the shards that have such index. Just the first response from a shard for a particular index is then accepted in a data node and one node response for each index is accepted in the coordinating node while all the other pending requests are cancelled or discarded.
The change in behaviour I introduced is to accept all responses in both stages (here and here) and merging (here) the index responses at the data node level in case we have multiple responses for one index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@original-brownbear is what I said correct?

Answering your question:

What happens if you provide the new flag without an index_filter

I have tests for requests with and without index_filter and the behaviour is consistent in both cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe your tests don't have enough data nodes or shards to see issues, but I am pretty sure that field caps without index filter will pick one shard per index, instead of all shards for every index. Can you double check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After further analysis all code paths can go to all shards and all but my code path (return non-empty-fields only) breaks out of this loop when they find the first canMatch shard. When no index_filter is passed the first responding shard is always canMatch and breaks directly (unless there is an error), while when an index_filter is provided that could not be the case and the loop iterates to other shards until one returns a canMatch.
That said we should be good, in both cases (with and without index_filter) if my flag is provided we do not break out of the loop and therefore collect all the field capabilities from all the shards.
Another safe measure in place to avoid going to all the shards is an early cached return if the mapping hash is already present in a map, we are covered also there since for non-empty-field we calculate a unique shard-mapping hash.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool thanks for the update. Could you check that we have test coverage about the new flag used in combination with index_filter, as well as not? Thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In FieldCapsHasValueTests we have tests both with and without index_filter, so we should be good

final Predicate<String> fieldNameFilter = Regex.simpleMatcher(request.fields());
for (List<ShardId> shardIds : groupedShardIds.values()) {
final Map<ShardId, Exception> failures = new HashMap<>();
Expand All @@ -532,7 +541,9 @@ public void messageReceived(FieldCapabilitiesNodeRequest request, TransportChann
unmatched.clear();
failures.clear();
allResponses.add(response);
break;
if (request.includeFieldsWithNoValue()) {
break;
}
} else {
unmatched.add(shardId);
}
Expand Down
Loading