<!-- description --> Create a custom schema for custom documents (that are not supported out of the box) to extract information from similar documents using the Document Information Extraction service.
+
<!-- description --> Create a custom schema for custom documents (which are not supported out of the box) to extract information from similar documents using the Document Information Extraction service.
## You will learn
- How to create a custom schema for custom documents
- How to add standard and custom data fields for the header information of custom documents
## Intro
-
The core functionality of Document Information Extraction is to automatically extract structured information from documents using machine learning. The service supports extraction from the following standard document types out of the box: invoices, payment advices and purchase orders.
+
The core functionality of Document Information Extraction is to automatically extract structured information from documents using machine learning. The service supports extraction from the following standard document types out of the box: invoices, payment advices, and purchase orders.
You can also use the [Schema Configuration](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/3c7862e30fc2488ea95f58f1d77e424e.html) and [Template](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/1eeb08998f49409681c06a01febc3172.html) features to extract information from custom documents that are different from the standard document types. You can customize the information extracted from custom document types by creating a custom schema and adding the specific information that you have in your documents.
In this tutorial, we'll use power of attorney documents as an example of a custom document type that is not supported by Document Information Extraction out of the box. A power of attorney document is a legal instrument authorizing one to act as the attorney or agent for another person in specified or all legal or financial matters.
-
If you are new to the Document Information Extraction UI, try out first the tutorial: [Use Machine Learning to Extract Information from Documents with Document Information Extraction UI](cp-aibus-dox-ui).
+
If you are new to the Document Information Extraction UI, first try out the tutorial: [Use Machine Learning to Extract Information from Documents with Document Information Extraction UI](cp-aibus-dox-ui).
---
@@ -30,7 +30,7 @@ If you are new to the Document Information Extraction UI, try out first the tuto
1. Open the Document Information Extraction UI, as described in the tutorial: [Use Trial to Set Up Account for Document Information Extraction and Go to Application](cp-aibus-dox-booster-app) or [Use Free Tier to Set Up Account for Document Information Extraction and Go to Application](cp-aibus-dox-free-booster-app).
-
>If you **HAVE NOT** just used the **Set up account for Document Information Extraction** booster to create a service instance for Document Information Extraction, and subscribe to the Document Information Extraction UI, observe the following:
+
>If you **HAVE NOT** just used the **Set up account for Document Information Extraction** booster to create a service instance for Document Information Extraction and subscribe to the Document Information Extraction UI, observe the following:
>- To access the [Schema Configuration](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/3c7862e30fc2488ea95f58f1d77e424e.html) and [Template](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/1eeb08998f49409681c06a01febc3172.html) features, ensure that you use the `blocks_of_100` plan to create the service instance for Document Information Extraction Trial.
@@ -51,7 +51,7 @@ If you are new to the Document Information Extraction UI, try out first the tuto
Here, you find the SAP schemas. The Document Information Extraction UI includes preconfigured SAP schemas for the following standard document types: purchase order, payment advice, and invoice. In addition, there’s an SAP schema for custom documents (`SAP_OCROnly_schema`). You can't delete SAP schemas. You can use them as they're, you can edit them directly, or create copies and adapt the list of fields according to your needs.
+
Here, you find the SAP schemas. The Document Information Extraction UI includes preconfigured SAP schemas for the following standard document types: purchase order, payment advice, and invoice. In addition, there’s an SAP schema for custom documents (`SAP_OCROnly_schema`). You can't delete SAP schemas. You can use them as they are, you can edit them directly, or create copies and adapt the list of fields according to your needs.
@@ -98,13 +98,13 @@ As your first header field, add the shipper number of your power of attorney doc
1. Enter an appropriate name for your field, `shipperNumber`, for example.
-
2. Select `string` for the `Data Type`. Note that a shipper number is a `string`, even though it consists of numbers, as it is an arbitrary combination of numbers without meaning. In contrast, price is an example for the data type `number`.
+
2. Select `string` for the `Data Type`. Note that a shipper number is a `string`, even though it consists of numbers, as it is an arbitrary combination of numbers without meaning. In contrast, price is an example of the data type `number`.
-
3. Click **Add** to create the header field.
+
3. Select `default` for the `Setup Type` and click **Add** to create the header field.
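To see why an identifier such as a shipper number is modeled as a `string`, consider the following minimal Python sketch. The sample values are invented for illustration and do not come from a real document or from the service.

```python
from decimal import Decimal

# Hypothetical sample values, for illustration only.
shipper_number_text = "00417"  # identifier as printed on the document
price_text = "19.90"           # a value you can calculate with

# Converting the identifier to a number silently drops the leading zeros,
# so the printed value can no longer be recovered from the number.
as_number = int(shipper_number_text)
print(as_number)                              # 417
print(str(as_number) == shipper_number_text)  # False

# A price, in contrast, is meaningful as a number: arithmetic makes sense.
total = Decimal(price_text) * 3
print(total)                                  # 59.70
```

Because the conversion to a number loses the leading zeros, keeping identifiers as `string` preserves the value exactly as it is printed on the document.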
<!-- border -->
-
The field now displays in your list of header fields where you find all the information again that you have just entered. You can edit or delete the field by clicking the respective icons on the right.
+
The field now displays in your list of header fields, where you again find all the information that you have just entered. You can edit or delete the field by clicking the respective icons on the right.
<!-- border -->
@@ -114,7 +114,7 @@ Click **Add** again to open the `Add Data Field` dialog.
2. Select `string` for the `Data Type`.
-
3. Click **Add** to create the field.
+
3. Select `default` for the `Setup Type` and click **Add** to create the field.
- How to add standard and custom data fields for the header and line item information of purchase order documents
## Intro
-
The core functionality of Document Information Extraction is to automatically extract structured information from documents using machine learning. The service supports extraction from the following standard document types out of the box: invoices, payment advices and purchase orders. You can customize the information extracted from these document types by creating a custom schema and adding the specific information that you have in your documents. Additionally, you can add completely new document types.
+
The core functionality of Document Information Extraction is to automatically extract structured information from documents using machine learning. The service supports extraction from the following standard document types out of the box: invoices, payment advices, and purchase orders. You can customize the information extracted from these document types by creating a custom schema and adding the specific information that you have in your documents. Additionally, you can add completely new document types.
-
If you are new to the Document Information Extraction UI, try out first the tutorial: [Use Machine Learning to Extract Information from Documents with Document Information Extraction UI](cp-aibus-dox-ui).
+
If you are new to the Document Information Extraction UI, first try out the tutorial: [Use Machine Learning to Extract Information from Documents with Document Information Extraction UI](cp-aibus-dox-ui).
---
@@ -28,7 +28,7 @@ If you are new to the Document Information Extraction UI, try out first the tuto
1. Open the Document Information Extraction UI, as described in the tutorial: [Use Trial to Set Up Account for Document Information Extraction and Go to Application](cp-aibus-dox-booster-app) or [Use Free Tier to Set Up Account for Document Information Extraction and Go to Application](cp-aibus-dox-free-booster-app).
-
>If you **HAVE NOT** just used the **Set up account for Document Information Extraction** booster to create a service instance for Document Information Extraction, and subscribe to the Document Information Extraction UI, observe the following:
+
>If you **HAVE NOT** just used the **Set up account for Document Information Extraction** booster to create a service instance for Document Information Extraction and subscribe to the Document Information Extraction UI, observe the following:
>- To access the [Schema Configuration](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/3c7862e30fc2488ea95f58f1d77e424e.html) and [Template](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/1eeb08998f49409681c06a01febc3172.html) features, ensure that you use the `blocks_of_100` plan to create the service instance for Document Information Extraction Trial.
@@ -49,7 +49,7 @@ If you are new to the Document Information Extraction UI, try out first the tuto
Here, you find the SAP schemas. The Document Information Extraction UI includes preconfigured SAP schemas for the following standard document types: purchase order, payment advice, and invoice. In addition, there’s an SAP schema for custom documents (`SAP_OCROnly_schema`). You can't delete SAP schemas. You can use them as they're, you can edit them directly, or create copies and adapt the list of fields according to your needs.
+
Here, you find the SAP schemas. The Document Information Extraction UI includes preconfigured SAP schemas for the following standard document types: purchase order, payment advice, and invoice. In addition, there’s an SAP schema for custom documents (`SAP_OCROnly_schema`). You can't delete SAP schemas. You can use them as they are, edit them directly, or create copies and adapt the list of fields according to your needs.
@@ -87,11 +87,11 @@ Now, your schema shows up in the list. Access the schema by clicking on the row.
A schema defines a list of header fields and line item fields that represent the information you want to extract from a document.
-
Header fields represent information that are specific to your document and only occur one time. Those may include the document number, any sender information or the total amount of the order. In contrast, line item fields represent the products that you ordered where each line is one product, often with a certain quantity attached. Thus, the line item fields extract the information for each product in your order. Those may include the article number, the price and the quantity.
+
Header fields represent information that is specific to your document and only occurs one time. This may include the document number, any sender information, or the total amount of the order. In contrast, line item fields represent the products that you ordered, where each line is one product, often with a certain quantity attached. Thus, the line item fields extract the information for each product in your order. Those may include the article number, the price and the quantity.
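The difference between header fields and line item fields can be sketched as a plain Python data structure. This is only an illustration: the values are invented, the field names other than `purchaseOrderNumber` are hypothetical, and the layout is not the response format of the service.

```python
# Hypothetical purchase order, for illustration only.
purchase_order = {
    # Header fields: each value occurs once per document.
    "header_fields": {
        "purchaseOrderNumber": "4500001234",
        "totalAmount": 149.5,
    },
    # Line item fields: the same set of fields repeats for every product line.
    "line_items": [
        {"articleNumber": "A-100", "price": 25.0, "quantity": 4},
        {"articleNumber": "B-200", "price": 16.5, "quantity": 3},
    ],
}

# Each line item carries its own price and quantity, while the total
# amount is a single header value for the whole order.
line_total = sum(
    item["price"] * item["quantity"] for item in purchase_order["line_items"]
)
print(line_total == purchase_order["header_fields"]["totalAmount"])  # True
```

Adding a third product to the order would add another entry to `line_items`, while the header fields would still hold exactly one value each.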
-
Document Information Extraction already contains an amount of fields it can extract. See [here](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/b1c07d0c51b64580881d11b4acb6a6e6.html) which header fields are supported and [here](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/ff3f5efe11c14744b2ce60b95d210486.html) which line item fields are supported. Additionally, you can define custom fields. In the next step, you'll learn about both.
+
Document Information Extraction already includes a number of fields that it can extract. See [here](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/b1c07d0c51b64580881d11b4acb6a6e6.html) which header fields are supported and [here](https://help.sap.com/viewer/5fa7265b9ff64d73bac7cec61ee55ae6/SHIP/en-US/ff3f5efe11c14744b2ce60b95d210486.html) which line item fields are supported. Additionally, you can define custom fields. In the next step, you'll learn about both.
-
The image below shows an example purchase order. All the fields that you define in your schema in this tutorial are highlighted. All information outside of the table that occur once are header fields. All information within the table occur per product and are line item fields. You can of course extend or reduce the information that you want to extract.
+
The image below shows an example purchase order. All the fields that you define in your schema in this tutorial are highlighted. The header fields represent all information outside of the table that occurs once. The line item fields represent all information within the table, which occurs per product. You can, of course, extend or reduce the information that you want to extract.
<!-- border -->
@@ -105,21 +105,21 @@ To define your first header field, click **Add** to the right of the headline `H
For each field, you have to enter a name, a data type and optionally a default extractor and a description. The potential data types are `string`, `number`, `date`, `discount` and `currency`. To use one of the included standard fields of Document Information Extraction, select them for the default extractor.
+
For each field, you have to enter a name, a data type, a setup type, and optionally a default extractor and a description. The potential data types are `string`, `number`, `date`, `discount` and `currency`. To use one of the included standard fields of Document Information Extraction, select them for the default extractor.
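As a rough illustration, the information you enter for each field can be modeled as a small Python data class. The class name `FieldDefinition` and its attribute names are assumptions made for this sketch and are not part of the Document Information Extraction API; only the listed data types and the example field names come from this tutorial.

```python
from dataclasses import dataclass
from typing import Optional

# Data types named in the tutorial text.
ALLOWED_DATA_TYPES = {"string", "number", "date", "discount", "currency"}


@dataclass
class FieldDefinition:
    """Hypothetical stand-in for the values entered when adding a field."""
    name: str
    data_type: str
    setup_type: str = "default"
    default_extractor: Optional[str] = None  # left blank for custom fields
    description: str = ""

    def __post_init__(self) -> None:
        # Mirror the mandatory inputs: a name and a known data type.
        if not self.name:
            raise ValueError("a field needs a name")
        if self.data_type not in ALLOWED_DATA_TYPES:
            raise ValueError(f"unknown data type: {self.data_type!r}")


# A field backed by a standard extractor.
po_number = FieldDefinition(
    name="purchaseOrderNumber",
    data_type="string",
    default_extractor="documentNumber",
)

# A custom field: there is no equivalent standard field,
# so the default extractor stays blank.
shipper = FieldDefinition(name="shipperNumber", data_type="string")
```

In this sketch, a field with a default extractor reuses one of the included standard fields, and a field without one corresponds to a custom field.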
As your first header field, add the number of your purchase order which identifies your document.
1. Enter an appropriate name for your field, `purchaseOrderNumber`, for example.
-
2. Select `string` for the `Data Type`. Note that a document number is a `string`, even though it consists of numbers, as it is an arbitrary combination of numbers without meaning. In contrast, price is an example for the data type `number`.
+
2. Select `string` for the `Data Type`. Note that a document number is a `string`, even though it consists of numbers, as it is an arbitrary combination of numbers without meaning. In contrast, price is an example of the data type `number`.
-
3. As all business documents have a unique identification, Document Information Extraction already includes a standard field. Select `documentNumber` for the `Default Extractor`.
+
3. As all business documents have a unique identification, Document Information Extraction already includes a standard field. Select `default` for the `Setup Type` and then select `documentNumber` for the `Default Extractor`.
4. Click **Add** to create the header field.
<!-- border -->
-
The field now displays in your list of header fields where you find all the information again that you have just entered. You can edit or delete the field by clicking the respective icons on the right.
+
The field now displays in your list of header fields, where you again find all the information that you have just entered. You can edit or delete the field by clicking the respective icons on the right.
<!-- border -->
@@ -131,11 +131,11 @@ Click **Add** again to open the dialog.
2. Select `string` for the `Data Type`.
-
3. As Document Information Extraction offers no equivalent field, leave the default extractor blank. Click **Add** to create the field.
+
3. As Document Information Extraction offers no equivalent field, select `default` for the `Setup Type` but leave the default extractor blank. Click **Add** to create the field.
<!-- border -->
-
You have now created your first custom field. Go ahead and create the list of header fields as shown in the table and image below. Pay attention which fields have a default extractor and which do not. Feel free to extend or reduce the list of header fields.
+
You have now created your first custom field. Go ahead and create the list of header fields as shown in the table and image below. Pay attention to which fields have a default extractor and which do not. Feel free to extend or reduce the list of header fields.
You have now created your first line item field. Go ahead and create the list of line item fields as shown in the table and image below. Pay attention which fields have a default extractor and which do not. Feel free to extend or reduce the list of line item fields.
+
You have now created your first line item field. Go ahead and create the list of line item fields as shown in the table and image below. Pay attention to which fields have a default extractor and which do not. Feel free to extend or reduce the list of line item fields.