[DOCS] Adds section about tokens to ELSER conceptual (#2568)

szabosteve · mergify[bot] · commit 7ac7c7b0b5a0 · 2023-10-18T07:49:38.000Z
* [DOCS] Adds section about tokens to ELSER conceptual.
* [DOCS] Adds 'discrete' flag to section.

(cherry picked from commit f9c8a20)

# Conflicts:
#	docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
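
For readers of this change: the section being added describes ELSER expanding both indexed and searched text into weighted tokens that are stored as term/weight pairs. The sketch below is only an illustration of that idea. It treats two expansions as sparse vectors and scores them by weighted token overlap. The tokens, weights, and the `sparse_dot` helper are all hypothetical, and the plain dot product is an assumption made for clarity, not the exact scoring that {es} applies.

```python
def sparse_dot(query_weights, doc_weights):
    """Sum the products of weights for tokens present in both expansions."""
    return sum(
        weight * doc_weights[token]
        for token, weight in query_weights.items()
        if token in doc_weights
    )


# Hypothetical expansions: invented tokens and weights, not real ELSER output.
doc_expansion = {"comfort": 1.5, "sofa": 2.0, "furniture": 0.5}
query_expansion = {"couch": 2.0, "sofa": 1.0, "furniture": 0.5}

# Only "sofa" and "furniture" appear on both sides, so only they contribute.
score = sparse_dot(query_expansion, doc_expansion)
print(score)
```

Note that the low-weight token (`furniture`) still contributes to the score, which is one reason the added text cautions against reading individual tokens and weights in isolation.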
diff --git a/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc b/docs/en/stack/ml/nlp/ml-nlp-elser.asciidoc
@@ -20,13 +20,36 @@ meaning and user intent, rather than exact keyword matches.
 ELSER is an out-of-domain model which means it does not require fine-tuning on 
 your own data, making it adaptable for various use cases out of the box.
 
+
+[discrete]
+[[elser-tokens]]
+== Tokens - not synonyms
+
 ELSER expands the indexed and searched passages into collections of terms that 
 are learned to co-occur frequently within a diverse set of training data. The 
 terms that the text is expanded into by the model _are not_ synonyms for the 
+<<<<<<< HEAD
 search terms; they are learned associations. These expanded terms are weighted 
 as some of them are more significant than others. Then the {es} 
 {ref}/rank-features.html[rank features field type] is used to store the terms 
 and weights at index time, and to search against later. 
+=======
+search terms; they are learned associations capturing relevance. These expanded 
+terms are weighted as some of them are more significant than others. Then the 
+{es} {ref}/sparse-vector.html[sparse vector] 
+(or {ref}/rank-features.html[rank features]) field type is used to store the 
+terms and weights at index time, and to search against later.
+>>>>>>> f9c8a202 ([DOCS] Adds section about tokens to ELSER conceptual (#2568))
+
+This approach provides a more understandable search experience compared to 
+vector embeddings. However, attempting to directly interpret the tokens and 
+weights can be misleading, as the expansion essentially results in a vector in a 
+very high-dimensional space. Consequently, certain tokens, especially those with 
+low weight, contain information that is intertwined with other low-weight tokens 
+in the representation. In this regard, they function similarly to a dense vector 
+representation, making it challenging to separate their individual 
+contributions. This complexity can potentially lead to misinterpretations if not 
+carefully considered during analysis.
 
 
 [discrete]