Commit abe8bf8

kosabogi and szabosteve authored
Document 'delay' subparameter in transform checkpoints and usage guide (#872)
This PR updates the documentation to clarify how the `delay` subparameter works in continuous transforms.

### Preview:
- [When to use transforms](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/872/explore-analyze/transforms/transform-usage)
- [How transform checkpoints work](https://docs-v3-preview.elastic.dev/elastic/docs-content/pull/872/explore-analyze/transforms/transform-checkpoints)

Related issue: elastic/developer-docs-team#198

---------

Co-authored-by: István Zoltán Szabó <[email protected]>
1 parent 83ab69a commit abe8bf8

2 files changed: +8 -0 lines changed

explore-analyze/transforms/transform-checkpoints.md

Lines changed: 2 additions & 0 deletions
@@ -19,6 +19,8 @@ To create a checkpoint, the {{ctransform}}:

 Using a simple periodic timer, the {{transform}} checks for changes to the source indices. This check is done based on the interval defined in the transform’s `frequency` property.

+If new data is ingested with a slight delay, it might not be immediately available when the {{transform}} runs. To prevent missing documents, you can use the `delay` parameter in the `sync` configuration. This shifts the search window backward, ensuring that late-arriving data is included before a checkpoint processes it. Adjusting this value based on your data ingestion patterns can help ensure completeness.
+
 If the source indices remain unchanged or if a checkpoint is already in progress then it waits for the next timer.

 If changes are found a checkpoint is created.
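
For reviewers who want to see where the `delay` setting sits, here is a minimal sketch of a continuous transform request body built in Python. It assumes the create transform API body layout (`frequency` at the top level, `delay` under `sync.time`). The index names, the `@timestamp` and `host.name` fields, and the helper `build_transform_body` are hypothetical and not part of this commit.

```python
import json


def build_transform_body(source_index, dest_index, time_field,
                         frequency="5m", delay="60s"):
    """Build a request body for a continuous transform (illustrative only)."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        # How often the transform checks the source indices for changes.
        "frequency": frequency,
        # Continuous mode: `delay` shifts the search window backward in time.
        "sync": {"time": {"field": time_field, "delay": delay}},
        # Hypothetical pivot: count events per host.
        "pivot": {
            "group_by": {"host": {"terms": {"field": "host.name"}}},
            "aggregations": {
                "event_count": {"value_count": {"field": time_field}}
            },
        },
    }


body = build_transform_body("logs-example", "logs-example-summary", "@timestamp")
print(json.dumps(body["sync"], indent=2))
```

A larger `delay` trades freshness for completeness, so it is worth matching the value to the ingestion lag actually observed in the source indices.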

explore-analyze/transforms/transform-usage.md

Lines changed: 6 additions & 0 deletions
@@ -27,3 +27,9 @@ You might want to consider using {{transforms}} instead of aggregations when:
 * You want to create summary tables to optimize queries.

 For example, if you have a high level dashboard that is accessed by a large number of users and it uses a complex aggregation over a large dataset, it may be more efficient to create a {{transform}} to cache results. Thus, each user doesn’t need to run the aggregation query.
+
+* You need to account for late-arriving data.
+
+In some cases, data might not be immediately available when a {{transform}} runs, leading to missing records in the destination index. This can happen due to ingestion delays, where documents take a few seconds or minutes to become searchable after being indexed. To handle this, the `delay` parameter in the {{transform}}’s sync configuration allows you to postpone processing new data. Instead of always querying the most recent records, the {{transform}} will skip a short period of time (for example, 60 seconds) to ensure all relevant data has arrived before processing.
+
+For example, if a {{transform}} runs every 5 minutes, it usually processes data from 5 minutes ago up to the current time. However, if you set `delay` to 60 seconds, the {{transform}} will instead process data from 6 minutes ago up to 1 minute ago, making sure that any documents that arrived late are included. By adjusting the `delay` parameter, you can improve the accuracy of transformed data while still maintaining near real-time results.
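
The window arithmetic in the added example can be checked with a short Python sketch. This is a simplified model of the paragraph above, not the actual checkpoint logic: it assumes each run covers exactly one `frequency` interval ending `delay` before the current time. The function name `checkpoint_window` and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta


def checkpoint_window(now, frequency, delay=timedelta(0)):
    """Return the (start, end) range one run covers in this simplified model."""
    end = now - delay          # skip the most recent `delay` worth of data
    start = end - frequency    # cover one full interval before that
    return start, end


now = datetime(2025, 1, 1, 12, 0, 0)
start, end = checkpoint_window(now, timedelta(minutes=5), timedelta(seconds=60))
print(start.time(), end.time())  # 11:54:00 11:59:00
```

With no delay the same call covers 11:55:00 to 12:00:00, which matches the "5 minutes ago up to the current time" case in the text.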
