Commit ce9d165

Merge branch 'docs' of github.com:meilisearch/docs-scraper into docs
2 parents 606d36e + a075155 commit ce9d165

1 file changed: +83 -0 lines changed

README.md

Lines changed: 83 additions & 0 deletions
@@ -3,21 +3,32 @@
 A scraper for your documentation website, indexing the content into a MeiliSearch instance.

 - [Installation and Usage](#installation-and-usage)
+<<<<<<< HEAD
 - [From Source Code](#from-source-code)
+=======
+- [From source code](#from-source-code)
+>>>>>>> a0751550092b2ba1e88fed4ce3b8dbc6fea5d1e1
 - [With Docker](#with-docker)
 - [In a GitHub Action](#in-a-github-action)
 - [About the API Key](#about-the-api-key)
 - [Configuration file](#configuration-file)
 - [Related projects](#related-projects)
+<<<<<<< HEAD
 - [Development Workflow](#development-workflow)
+=======
+>>>>>>> a0751550092b2ba1e88fed4ce3b8dbc6fea5d1e1
 - [Credits](#credits)


 ## Installation and Usage

 This project supports Python 3.6+.

+<<<<<<< HEAD
 ### From Source Code
+=======
+### From source code
+>>>>>>> a0751550092b2ba1e88fed4ce3b8dbc6fea5d1e1

 Set both environment variables `MEILISEARCH_HOST_URL` and `MEILISEARCH_API_KEY`.

@@ -35,6 +46,7 @@ $ docker run -t --rm \
     -e MEILISEARCH_API_KEY=<your-meilisearch-api-key> \
     -v <absolute-path-to-your-config-file>:/docs-scraper/config.json \
     getmeili/docs-scraper:v0.9.0 pipenv run ./docs_scraper config.json
+<<<<<<< HEAD
 ```

 ### In a GitHub Action
@@ -101,6 +113,74 @@ A generic configuration file:
 }
 ```

+=======
+```
+
+### In a GitHub Action
+
+To run after your deployment job:
+
+```yml
+run-scraper:
+  needs: <your-deployment-job>
+  runs-on: ubuntu-18.04
+  steps:
+    - uses: actions/checkout@master
+    - name: Run scraper
+      env:
+        HOST_URL: ${{ secrets.MEILISEARCH_HOST_URL }}
+        API_KEY: ${{ secrets.MEILISEARCH_API_KEY }}
+        CONFIG_FILE_PATH: <path-to-your-config-file>
+      run: |
+        docker run -t --rm \
+          -e MEILISEARCH_HOST_URL=$HOST_URL \
+          -e MEILISEARCH_API_KEY=$API_KEY \
+          -v $CONFIG_FILE_PATH:/docs-scraper/config.json \
+          getmeili/docs-scraper:v0.9.0 pipenv run ./docs_scraper config.json
+```
+
+Here is the [GitHub Action file](https://github.com/meilisearch/documentation/blob/master/.github/workflows/gh-pages-scraping.yml) we use in production for the MeiliSearch documentation.
+
+### About the API Key
+
+The API key you must provide as environment variable should have the permissions to add documents into your MeiliSearch instance.
+
+Thus, you need to provide the private key or the master key.
+
+_More about [MeiliSearch authentication](https://docs.meilisearch.com/guides/advanced_guides/authentication.html)._
+
+## Configuration file
+
+A generic configuration file:
+
+```json
+{
+  "index_uid": "docs",
+  "start_urls": ["https://www.example.com/doc/"],
+  "sitemap_urls": ["https://www.example.com/sitemap.xml"],
+  "stop_urls": [],
+  "selectors": {
+    "lvl0": {
+      "selector": ".docs-lvl0",
+      "global": true,
+      "default_value": "Documentation"
+    },
+    "lvl1": {
+      "selector": ".docs-lvl1",
+      "global": true,
+      "default_value": "Chapter"
+    },
+    "lvl2": ".docs-content .docs-lvl2",
+    "lvl3": ".docs-content .docs-lvl3",
+    "lvl4": ".docs-content .docs-lvl4",
+    "lvl5": ".docs-content .docs-lvl5",
+    "lvl6": ".docs-content .docs-lvl6",
+    "text": ".docs-content p, .docs-content li"
+  }
+}
+```
+
+>>>>>>> a0751550092b2ba1e88fed4ce3b8dbc6fea5d1e1
 The scraper will focus on the highlighted information depending on your selectors.

 Here is the [configuration file](https://github.com/meilisearch/documentation/blob/master/.vuepress/scraper/config.json) we use for the MeiliSearch documentation.
@@ -110,6 +190,7 @@ Here is the [configuration file](https://github.com/meilisearch/documentation/bl
 After having crawled your documentation, you might need a search bar to improve your user experience!

 For the front part, check out the [docs-searchbar.js repository](https://github.com/meilisearch/docs-searchbar.js), wich provides a front-end search bar adapted for documentation.
+<<<<<<< HEAD

 ## Development Workflow

@@ -135,6 +216,8 @@ $ git push --tag origin master
 ```

 A GitHub Action will be triggered and push the `latest` and `vX.X.X` version of Docker image on [DockerHub](https://hub.docker.com/repository/docker/getmeili/docs-scraper)
+=======
+>>>>>>> a0751550092b2ba1e88fed4ce3b8dbc6fea5d1e1

 ## Credits

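The added lines in this diff include literal `<<<<<<< HEAD`, `=======`, and `>>>>>>> a0751550092b2ba1e88fed4ce3b8dbc6fea5d1e1` lines, so the merge appears to have been committed with conflict markers still present in `README.md`. As a rough illustration only, here is a minimal Python sketch of a check that could flag such lines before a commit. The function name `find_conflict_markers` and the regular expression are assumptions made for this example; they are not part of docs-scraper or of Git.

```python
import re
from typing import List, Tuple

# Hypothetical pattern: a line that starts with exactly seven '<', '=' or '>'
# characters, followed by a space or the end of the line.
CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def find_conflict_markers(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if CONFLICT_MARKER.match(line):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # Assumes a README.md in the current directory; adjust the path as needed.
    with open("README.md", encoding="utf-8") as handle:
        for number, line in find_conflict_markers(handle.read()):
            print(f"README.md:{number}: {line}")
```

A line made of exactly seven `=` characters can also be a legitimate Markdown heading underline, so a hit from this sketch should be read as a hint to inspect the file rather than as proof of an unresolved conflict.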