-[🤖 Compatibility with MeiliSearch](#-compatibility-with-meilisearch)
@@ -459,6 +462,41 @@ If used, `min_indexed_level` is ignored.
}
```

#### `js_render` (optional)

When `js_render` is set to `true`, the scraper uses ChromeDriver to render pages. This is needed for pages that are rendered with JavaScript, for example pages generated with React or Vue, or applications running in development mode (`autoreload`, `watch`).

After installing ChromeDriver, provide the path to its binary using the `CHROMEDRIVER_PATH` environment variable (default value: `/usr/bin/chromedriver`).

The default value of `js_render` is `false`.

```json
{
  "js_render": true
}
```
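
As a rough illustration of the environment-variable fallback described above, the lookup could be sketched in Python as follows. The function name `resolve_chromedriver_path` is hypothetical and is not taken from the scraper's code; only the variable name `CHROMEDRIVER_PATH` and the default path come from this section. Treating an empty value as unset is an assumption of the sketch.

```python
import os

# Default stated in this section of the README.
DEFAULT_CHROMEDRIVER_PATH = "/usr/bin/chromedriver"


def resolve_chromedriver_path(env=None):
    """Return CHROMEDRIVER_PATH if set and non-empty, else the documented default.

    Hypothetical helper: `env` defaults to the process environment and can be
    replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    # Assumption of this sketch: an empty string counts as "not set".
    return env.get("CHROMEDRIVER_PATH") or DEFAULT_CHROMEDRIVER_PATH


if __name__ == "__main__":
    print(resolve_chromedriver_path())
```

Running the script with `CHROMEDRIVER_PATH=/opt/chromedriver python script.py` (an example path) would print that path, and running it with the variable unset would print the default.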

#### `js_wait` (optional)

This setting can be used when `js_render` is set to `true` and the pages need time to fully load. `js_wait` takes an integer that specifies the number of seconds the scraper should wait for the page to load.

```json
{
  "js_render": true,
  "js_wait": 1
}
```
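
The relationship between the two settings can be sketched in Python as below. This is only an illustration under stated assumptions, not the scraper's implementation: the helper name `effective_js_wait` is hypothetical, and both the fallback of `0` seconds when `js_wait` is absent and the choice to ignore `js_wait` when `js_render` is `false` are assumptions this section does not confirm.

```python
import json


def effective_js_wait(config):
    """Return how many seconds to wait after loading a page.

    Hypothetical helper. Assumptions: `js_wait` is ignored unless `js_render`
    is true, and a missing `js_wait` means no extra wait.
    """
    if not config.get("js_render", False):
        return 0
    wait = config.get("js_wait", 0)
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(wait, bool) or not isinstance(wait, int) or wait < 0:
        raise ValueError("js_wait must be a non-negative integer (seconds)")
    return wait


if __name__ == "__main__":
    config = json.loads('{"js_render": true, "js_wait": 1}')
    print(effective_js_wait(config))  # prints 1
```

Validating the type up front means a value such as `"1"` (a string) is caught before any page is loaded.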

#### `allowed_domains` (optional)

This setting specifies the domains that the scraper is allowed to access. In most cases, `allowed_domains` is set automatically from `start_urls` and `stop_urls`. When scraping a domain that includes a port, for example `http://localhost:8080`, the domain needs to be added to the configuration manually.

```json
{
  "allowed_domains": ["localhost"]
}
```
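
Note that the example lists `localhost`, not `localhost:8080`. A minimal Python sketch of deriving such entries from a list of URLs is shown below. The helper name `allowed_domain_entries` is hypothetical, and the sketch does not claim to reproduce how the scraper computes `allowed_domains` internally; it only uses the standard library's `urllib.parse` to strip the scheme, port, and path.

```python
from urllib.parse import urlparse


def allowed_domain_entries(urls):
    """Return the unique hostnames of the given URLs, in first-seen order.

    Hypothetical helper: the scheme, port, and path are dropped, so
    "http://localhost:8080/docs" contributes "localhost".
    """
    entries = []
    for url in urls:
        host = urlparse(url).hostname  # lowercased host without ":port"
        if host and host not in entries:
            entries.append(host)
    return entries


if __name__ == "__main__":
    urls = ["http://localhost:8080", "http://localhost:8080/docs"]
    print(allowed_domain_entries(urls))  # prints ['localhost']
```

The resulting list can be pasted into the `allowed_domains` array of the configuration file.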

### Authentication

__WARNING:__ Please be aware that the scraper will send authentication headers to every scraped site, so use `allowed_domains` to adjust the scope accordingly!