-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Flag in _field_caps to return only fields with values in index #103651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @piergm, I've created a changelog YAML for you. |
A quick question, in a CCS scenario, when there are old CCS clusters targeted, and just the requesting machine has the |
@kertal, yes, in CCS if we target es version prior to 8.13 we will not consider the flag and therefore we will be returning all fields, empty and non-empty, from those non updated clusters. That said the response per se is bwc and can be handled from the updated cluster, therefore we will not return any error from this miss match, instead we will return also empty fields from non updated clusters. |
thx for the confirmation @piergm 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some cosmetic comments, mostly around testing refinements. LGTM otherwise.
...er-extras/src/test/java/org/elasticsearch/index/mapper/extras/FieldCapsRankFeatureTests.java
Outdated
Show resolved
Hide resolved
test/framework/src/main/java/org/elasticsearch/index/mapper/FieldTypeTestCase.java
Show resolved
Hide resolved
test/framework/src/main/java/org/elasticsearch/index/mapper/MapperServiceTestCase.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/IndexShard.java
Outdated
Show resolved
Hide resolved
@@ -519,7 +531,7 @@ public void messageReceived(FieldCapabilitiesNodeRequest request, TransportChann | |||
final Map<String, List<ShardId>> groupedShardIds = request.shardIds() | |||
.stream() | |||
.collect(Collectors.groupingBy(ShardId::getIndexName)); | |||
final FieldCapabilitiesFetcher fetcher = new FieldCapabilitiesFetcher(indicesService); | |||
final FieldCapabilitiesFetcher fetcher = new FieldCapabilitiesFetcher(indicesService, request.includeFieldsWithNoValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool thanks for the update. Could you check that we have test coverage about the new flag used in combination with index_filter, as well as not? Thanks!
if (mapping != null) { | ||
sb.append(mapping.getSha256()); | ||
} | ||
indexMappingHash = sb.toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assessment does not sound accurate to me. The mapping hash is used to deduplicate mappings, and decide to send back one mapping per hash instead one mapping per index. But the whole mapping needs to be sent back?
Using the flag, the index mapping hash deduplication is entirely disabled. We do have now field level deduplication across indices though, and the benefit that not returning empty fields will make responses smaller and decrease overhead on the merging side. But again, it is an entirely different trade-off and we need to measure the impact in the worst case scenario, compared to the previous index mapping hash based deduplication. Makes sense?
Performance wise this is just fine for the non-ccs case. The logic is optimal enough at this point it seems to make serialisation fo the response the most important aspect of this. I could not easily reproduce a case where the flag had much measurable overhead (maybe a couple percent when all fields have data but that's it, not a crazy scientific benchmark run here but still :)). E.g.
vs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not re-read everything again, but performance is fine in my testing and a quick read of the code looks just fine still -> LGTM
🎉 👋 🙇 ! |
Amazing 🤩 great job all 👍 |
@piergm according to this PR's labels, I need to update the changelog YAML, but I can't because the PR is closed. Please either update the changelog yourself on the appropriate branch, or adjust the labels. Specifically:
|
…d of custom query. (#178699) ## Summary Part of #178606. As of elastic/elasticsearch#103651 there is a new field caps option `include_empty_fields`. This PR updates AIOps Log Rate Analysis to make use of this option instead of a custom query and code that identified populated fields. ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/5482 - [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
We are adding a query parameter to the field_caps api in order to filter out fields with no values.
The parameter is called
include_empty_fields
and defaults to true, and if set to false itwill filter out from the field_caps response all the fields that has no value in the index.
We keep track of FieldInfos during refresh in order to know which field has value in an index.
We added also a sys.prop
es.field_caps_empty_fields_filter
in order to disable this feature if needed.