
Commit 5aa8764

Merge #1630
1630: update documents.md r=maryamsulemani97 a=maryamsulemani97 Part of #1333. This PR updates `/learn/core_concepts/documents.md`. I have removed the "[Limitations and requirements](https://docs.meilisearch.com/learn/core_concepts/documents.html#limitations-and-requirements)" section and moved its content to `known_limitations.md`. Co-authored-by: Maryam Sulemani <[email protected]> Co-authored-by: Maryam <[email protected]>
2 parents 36c7b96 + ffc8296 commit 5aa8764

5 files changed: +63 -97 lines changed

.code-samples.meilisearch.yaml

Lines changed: 0 additions & 10 deletions
@@ -480,16 +480,6 @@ add_movies_json_1: |-
480480
-X POST 'http://127.0.0.1:7700/indexes/movies/documents'\
481481
-H 'Content-Type: application/json' \
482482
--data-binary @movies.json
483-
documents_guide_add_movie_1: |-
484-
curl \
485-
-X POST `http://localhost:7700/indexes/movies/documents` \
486-
-H 'Content-Type: application/json' \
487-
--data-binary '[
488-
{
489-
"movie_id": "123sq178",
490-
"title": "Amelie Poulain"
491-
}
492-
]'
493483
getting_started_add_documents_md: |-
494484
```bash
495485
curl \

.vuepress/public/sample-template.yaml

Lines changed: 0 additions & 1 deletion
@@ -73,7 +73,6 @@ settings_guide_distinct_1: |-
7373
settings_guide_searchable_1: |-
7474
settings_guide_displayed_1: |-
7575
settings_guide_sortable_1: |-
76-
documents_guide_add_movie_1: |-
7776
getting_started_add_documents_md: |-
7877
getting_started_search_md: |-
7978
getting_started_check_task_status: |-

learn/advanced/known_limitations.md

Lines changed: 8 additions & 2 deletions
@@ -6,13 +6,19 @@ This guide covers hard limits that cannot be altered. Meilisearch also has some
66

77
## Maximum number of query words
88

9-
**Limitation:** The maximum number of terms taken into account for each [search query](/reference/api/search.md#query-q) is 10. **If a search query includes more than 10 words, all words after the 10th will be ignored.**
9+
**Limitation:** The maximum number of terms taken into account for each [search query](/reference/api/search.md#query-q) is 10. If a search query includes more than 10 words, all words after the 10th will be ignored.
1010

1111
**Explanation:** Queries with many search terms can lead to long response times. This goes against our goal of providing a [fast search-as-you-type experience](/learn/what_is_meilisearch/philosophy.md#front-facing-search).
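The query-word limit described above can be pictured with a small sketch. It assumes terms are separated by whitespace, which is a simplification and not how Meilisearch actually tokenizes queries; the function name `effective_query_terms` is hypothetical and only serves the illustration.

```python
MAX_QUERY_TERMS = 10  # limit stated in the "Maximum number of query words" section


def effective_query_terms(query: str, max_terms: int = MAX_QUERY_TERMS) -> list[str]:
    """Return the terms that would still be taken into account.

    Assumption: whitespace splitting stands in for the real tokenizer.
    """
    return query.split()[:max_terms]


query = "the quick brown fox jumps over the lazy dog near the river bank"
kept = effective_query_terms(query)
ignored = query.split()[len(kept):]

print("kept:", kept)        # the first ten terms
print("ignored:", ignored)  # every term after the tenth
```

A check like this could be useful on the client side to warn users that the tail of a long query is unlikely to influence results.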
1212

13+
## Maximum number of document fields
14+
15+
**Limitation:** Documents have a soft maximum of 1000 fields.
16+
17+
**Explanation:** There is no limit on how many fields a document can have. However, documents with more than 1000 fields may cause the [ranking rules](/learn/core_concepts/relevancy.md#ranking-rules) to stop working, leading to undefined behavior.
18+
1319
## Maximum number of words per attribute
1420

15-
**Limitation:** Meilisearch can index a maximum of **65535 positions per attribute**. Any words exceeding the 65535 position limit will be silently ignored.
21+
**Limitation:** Meilisearch can index a maximum of 65535 positions per attribute. Any words exceeding the 65535 position limit will be silently ignored.
1622

1723
**Explanation:** This limit is enforced for relevancy reasons. The more words there are in a given attribute, the less relevant the search queries will be.
1824

learn/core_concepts/documents.md

Lines changed: 51 additions & 84 deletions
@@ -1,8 +1,6 @@
11
# Documents
22

3-
A **document** is an object composed of one or more **fields**. Each field consists of an **attribute** and its associated **value**.
4-
5-
Documents function as **containers for organizing data**, and are the basic building blocks of a Meilisearch database. To search for a document, it must first be added to an [index][indexes].
3+
A document is an object composed of one or more fields. Each field consists of an **attribute** and its associated **value**. Documents function as containers for organizing data, and are the basic building blocks of a Meilisearch database. To search for a document, you must first add it to an [index](/learn/core_concepts/indexes.md).
64

75
## Structure
86

@@ -11,28 +9,64 @@ Documents function as **containers for organizing data**, and are the basic buil
119
### Important terms
1210

1311
- **Document**: an object which contains data in the form of one or more fields
14-
- **[Field][fields]**: a set of two data items that are linked together: an **attribute** and a **value**
15-
- **Attribute**: the first part of a field. Acts as a name or description for its associated value.
12+
- **[Field](#fields)**: a set of two data items that are linked together: an attribute and a value
13+
- **Attribute**: the first part of a field. Acts as a name or description for its associated value
1614
- **Value**: the second part of a field, consisting of data of any valid JSON type
17-
- **[Primary Field][primary-field]**: A special field that is mandatory in all documents. It contains the primary key and document identifier.
18-
- **[Primary Key][primary-key]**: the attribute of the primary field. **All documents in the same index must possess the same primary key.** Its associated value is the document identifier.
19-
- **[Document Identifier][document-id]**: the value of the primary field. **Every document in a given index must have a unique identifier**.
15+
- **[Primary Field](#primary-field)**: a special field that is mandatory in all documents. It contains the primary key and document identifier
16+
17+
## Fields
18+
19+
A field is a set of two data items linked together: an attribute and a value. Documents are made up of fields.
20+
21+
An attribute functions a bit like a variable in most programming languages, i.e., it is a name that allows you to store, access, and describe some data. That data is the attribute's value. In the case of strings, a value **[can contain at most 65535 positions](/learn/advanced/known_limitations.md#maximum-number-of-words-per-attribute)**. Words exceeding the 65535 position limit will be ignored.
22+
23+
Every field has a data type dictated by its value. Every value must be a valid [JSON data type](https://www.w3schools.com/js/js_json_datatypes.asp).
24+
25+
If a field contains an object, Meilisearch flattens it during indexation using dot notation and brings the object's keys and values to the root level of the document itself. This flattened object is only an intermediary representation—you will get the original structure upon search. You can read more about this in our [dedicated guide](/learn/advanced/datatypes.md#objects).
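To make the dot-notation flattening more concrete, here is a minimal sketch of the idea. It is not Meilisearch's implementation: the `flatten` helper and the sample document are hypothetical, and arrays are left untouched, which may differ from how Meilisearch treats arrays of objects.

```python
from typing import Any


def flatten(document: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Bring nested object keys to the root level using dot notation."""
    flat: dict[str, Any] = {}
    for key, value in document.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty nested objects.
            flat.update(flatten(value, path))
        else:
            # Scalars, arrays, and empty objects are kept as-is (simplification).
            flat[path] = value
    return flat


# Hypothetical document used only for illustration.
movie = {
    "id": 1,
    "title": "Example",
    "details": {"release": {"year": 2008}, "language": "en"},
}

print(flatten(movie))
```

With this sketch, the nested `details` object becomes the root-level keys `details.release.year` and `details.language`, while the original `movie` dictionary is left unchanged, mirroring the point that the flattened form is only an intermediary representation.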
26+
27+
With [ranking rules](/learn/core_concepts/relevancy.md#ranking-rules), you can decide what fields are more relevant than others. For example, you may decide recent movies should be more relevant than older ones. You can also configure how Meilisearch handles certain fields at an [index level](/learn/configuration/settings.md) in the settings.
28+
29+
### Displayed and searchable fields
30+
31+
By default, all fields in a document are both displayed and searchable. Displayed fields are contained in each matching document, while searchable fields are searched for matching query words.
32+
33+
You can modify this behavior using the [update settings endpoint](/reference/api/settings.md#update-settings), or the respective update endpoints for [displayed attributes](/reference/api/displayed_attributes.md#update-displayed-attributes), and [searchable attributes](/reference/api/searchable_attributes.md#update-searchable-attributes) so that a field is:
34+
35+
- Searchable but not displayed
36+
- Displayed but not searchable
37+
- Neither displayed nor searchable
38+
39+
In the latter case, the field will be completely ignored during search. However, it will still be [stored](/learn/configuration/displayed_searchable_attributes.md#data-storing) in the document.
40+
41+
## Primary field
42+
43+
The primary field is a special field that must be present in all documents. Its attribute is the [primary key](/learn/core_concepts/primary_key.md#primary-key-2) and its value is the [document id](/learn/core_concepts/primary_key.md#document-id). If you try to [index a document](/learn/getting_started/quick_start.md#add-documents) that's missing a primary key or possessing the wrong primary key for a given index, it will cause an error and no documents will be added.
44+
45+
To learn more, refer to the [primary key explanation](/learn/core_concepts/primary_key.md).
46+
47+
## Upload
48+
49+
By default, Meilisearch limits the size of all payloads—and therefore document uploads—to 100MB. You can [change the payload size limit](/learn/configuration/instance_options.md#payload-limit-size) at runtime using the `http-payload-size-limit` option.
50+
51+
Meilisearch uses a lot of RAM when indexing documents. Be aware of your [RAM availability](/resources/faq.md#what-are-the-recommended-requirements-for-hosting-a-meilisearch-instance) as you increase your batch size as this could cause Meilisearch to crash.
52+
53+
When using the [add new documents endpoint](/reference/api/documents.md#add-or-update-documents), all documents must be sent in an array even if there is only one document.
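The two upload constraints above (array-only payloads and the default size limit) can be checked before sending anything. The sketch below is an assumption-laden illustration: `build_payload` is a hypothetical helper, and it treats "100MB" as 100 × 1,048,576 bytes, in line with the `1048576000` value the removed example used for 1GB.

```python
import json
from typing import Any, Union

# Assumption: the default 100MB limit is counted in units of 1,048,576 bytes.
DEFAULT_PAYLOAD_LIMIT_BYTES = 100 * 1024 * 1024


def build_payload(
    documents: Union[dict[str, Any], list[dict[str, Any]]],
    limit_bytes: int = DEFAULT_PAYLOAD_LIMIT_BYTES,
) -> bytes:
    """Serialize documents as a JSON array and reject oversized payloads."""
    if isinstance(documents, dict):
        # A single document must still be sent inside an array.
        documents = [documents]
    body = json.dumps(documents).encode("utf-8")
    if len(body) > limit_bytes:
        raise ValueError(f"payload is {len(body)} bytes, limit is {limit_bytes}")
    return body


body = build_payload({"movie_id": "123sq178", "title": "Amelie Poulain"})
print(body.decode("utf-8"))
```

The resulting bytes could then be sent as the request body with the `application/json` content type; splitting a large dataset into several smaller batches is one way to stay under the limit and keep RAM usage predictable.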
2054

2155
### Dataset format
2256

23-
You can provide your dataset in the following formats:
57+
Meilisearch accepts datasets in the following formats:
2458

2559
- [JSON](#json)
2660
- [NDJSON](#ndjson)
2761
- [CSV](#csv)
2862

2963
#### JSON
3064

31-
Documents represented as JSON objects are key-value pairs enclosed by curly brackets. As such, [any rule that applies to formatting JSON objects](https://www.w3schools.com/js/js_json_objects.asp) also applies to formatting Meilisearch documents. For example, **an attribute must be a string**, while **a value must be a valid [JSON data type](https://www.w3schools.com/js/js_json_datatypes.asp)**.
65+
Documents represented as JSON objects are key-value pairs enclosed by curly brackets. As such, [any rule that applies to formatting JSON objects](https://www.w3schools.com/js/js_json_objects.asp) also applies to formatting Meilisearch documents. For example, an attribute must be a string, while a value must be a valid [JSON data type](https://www.w3schools.com/js/js_json_datatypes.asp).
3266

3367
Meilisearch will only accept JSON documents when it receives the `application/json` content-type header.
3468

35-
As an example, let's say you are creating an **[index][indexes]** that contains information about movies. A sample document might look like this:
69+
As an example, let's say you are creating an index that contains information about movies. A sample document might look like this:
3670

3771
```json
3872
{
@@ -47,9 +81,11 @@ As an example, let's say you are creating an **[index][indexes]** that contains
4781
}
4882
```
4983

50-
In the above example, `"id"`, `"title"`, `"genres"`, `"release-year"`, and `"cast"` are **attributes**.
51-
Each attribute must be associated with a **value**, e.g. `"Kung Fu Panda"` is the value of `"title"`.
52-
At minimum, the document must contain one field with the **[primary key][primary-key]** attribute and a unique **[document id][document-id]** as its value. Above, that's: `"id": 1564`.
84+
In the above example:
85+
86+
- `"id"`, `"title"`, `"genres"`, `"release-year"`, and `"cast"` are attributes
87+
- Each attribute is associated with a value, e.g. `"Kung Fu Panda"` is the value of `"title"`
88+
- The document contains a field with the primary key attribute and a unique document id as its value: `"id": "1564saqw12ss"`
5389

5490
#### NDJSON
5591

@@ -82,75 +118,6 @@ The above JSON document would look like this in CSV:
82118

83119
Since CSV does not support arrays or nested objects, `cast` cannot be converted to CSV.
84120

85-
::: tip
121+
::: note
86122
If you don't specify the data type for an attribute, it will default to `:string`.
87123
:::
88-
89-
### Limitations and requirements
90-
91-
Documents have a **soft maximum of 1000 fields**; beyond that the [ranking rules](/learn/core_concepts/relevancy.md#ranking-rules) may no longer be effective, leading to undefined behavior.
92-
93-
Additionally, every document must have at minimum one field containing the **[primary key][primary-key]** and a **[unique id][document-id]**.
94-
95-
If you try to [index a document](/learn/getting_started/quick_start.md#add-documents) that's incorrectly formatted, missing a primary key, or possessing the [wrong primary key for a given index](/learn/core_concepts/indexes.md#primary-key), it will cause an error and no documents will be added.
96-
97-
## Fields
98-
99-
A field is a set of two data items linked together: an attribute and a value. Documents are made up of fields.
100-
101-
An attribute functions a bit like a variable in most programming languages, i.e. it is a name that allows you to store, access, and describe some data. That data is the attribute's **value**.
102-
103-
Every field has a [data type](/learn/advanced/datatypes.md) dictated by its value. Every value must be a valid [JSON data type](https://www.w3schools.com/js/js_json_datatypes.asp).
104-
105-
If a field contains an object, you can refer directly to its internal properties using dot notation: `attributeA.objectKeyA`. Dot notation also works with nested objects: `attributeA.objectKeyA.objectKeyB`. This syntax is supported across Meilisearch, including index settings and search parameters.
106-
107-
Take note that, in the case of strings, a value **[can contain at most 65535 positions](/learn/advanced/known_limitations.md#maximum-number-of-words-per-attribute). Words exceeding the 65535 position limit will be ignored.**
108-
109-
You can also apply [ranking rules](/learn/core_concepts/relevancy.md#ranking-rules) to some fields. For example, you may decide recent movies should be more relevant than older ones.
110-
111-
If you would like to adjust how a field gets handled by Meilisearch, you can do so in the [settings](/learn/configuration/settings.md).
112-
113-
### Field properties
114-
115-
A field may also possess **[field properties](/learn/configuration/displayed_searchable_attributes.md)**. Field properties determine the characteristics and behavior of the data added to that field.
116-
117-
At this time, there are two field properties: [searchable](/learn/configuration/displayed_searchable_attributes.md#searchable-fields) and [displayed](/learn/configuration/displayed_searchable_attributes.md#displayed-fields). A field can have one, both, or neither of these properties. **By default, all fields in a document are both displayed and searchable.**
118-
119-
To clarify, a field may be:
120-
121-
- Searchable but not displayed
122-
- Displayed but not searchable
123-
- Both displayed and searchable (default)
124-
- Neither displayed nor searchable
125-
126-
In the latter case, the field will be completely ignored when a search is performed. However, it will still be [stored](/learn/configuration/displayed_searchable_attributes.md#data-storing) in the document.
127-
128-
## Primary field
129-
130-
The primary field is a special field that must be present in all documents. Its attribute is the [primary key](/learn/core_concepts/primary_key.md#primary-key-2) and its value is the [document id](/learn/core_concepts/primary_key.md#document-id).
131-
132-
To learn more, refer to the [primary key explanation](/learn/core_concepts/primary_key.md).
133-
134-
## Upload
135-
136-
By default, Meilisearch limits the size of all payloads—and therefore document uploads—to 100MB.
137-
138-
To upload more documents in one go, it is possible to [change the payload size limit](/learn/configuration/instance_options.md#payload-limit-size) at runtime using the `http-payload-size-limit` option.
139-
140-
```bash
141-
./meilisearch --http-payload-size-limit=1048576000
142-
```
143-
144-
The above code sets the payload limit to 1GB, instead of the 100MB default.
145-
146-
**Meilisearch uses a lot of RAM when indexing documents**. Be aware of your [RAM availability](/resources/faq.md#what-are-the-recommended-requirements-for-hosting-a-meilisearch-instance) as you increase the size of your batch as this could cause Meilisearch to crash.
147-
148-
When using the [route to add new documents](/reference/api/documents.md#add-or-update-documents), all documents must be sent in an array **even if there is only one document**.
149-
150-
<CodeSamples id="documents_guide_add_movie_1" />
151-
152-
[primary-field]: /learn/core_concepts/documents.md#primary-field
153-
[primary-key]: /learn/core_concepts/primary_key.md#primary-key-2
154-
[document-id]: /learn/core_concepts/primary_key.md#document-id
155-
[fields]: /learn/core_concepts/documents.md#fields
156-
[indexes]: /learn/core_concepts/indexes.md

learn/getting_started/quick_start.md

Lines changed: 4 additions & 0 deletions
@@ -180,6 +180,10 @@ Open a new terminal window and run the following command:
180180

181181
Meilisearch stores data in the form of discrete records, called [documents](/learn/core_concepts/documents.md). Documents are grouped into collections, called [indexes](/learn/core_concepts/indexes.md).
182182

183+
::: note
184+
Currently, Meilisearch only supports [JSON, CSV, and NDJSON formats](/learn/core_concepts/documents.md#dataset-format).
185+
:::
186+
183187
The previous command added documents from `movies.json` to a new index called `movies`. After adding documents, you should receive a response like this:
184188

185189
```json
