Commit 65c0f38

Add blog post describing 'HtmlUnit Remote' (#1874)
* Add blog post describing 'HtmlUnit Remote'
* Small adjustment and attribution
---------
Co-authored-by: Diego Molina <[email protected]> [deploy site]
1 parent 8a57ce0 commit 65c0f38

1 file changed

+84
-0
lines changed

@@ -0,0 +1,84 @@
---
title: "HtmlUnit Remote: Acquiring Remote HtmlUnitDriver Session in Selenium 4 Grid"
linkTitle: "HtmlUnit Remote: Acquiring Remote HtmlUnitDriver Session in Selenium 4 Grid"
date: 2024-08-19
tags: ["Grid", "HtmlUnitDriver"]
categories: ["Grid"]
author: Scott Babcock [@sbabcoc](https://www.github.com/sbabcoc)
description: >
  This post describes 'HtmlUnit Remote', a wrapper for HtmlUnitDriver that enables Selenium 4 Grid to manage remote instances of this "headless" browser.
---

# HTMLUNIT REMOTE
[![Maven Central](https://img.shields.io/maven-central/v/com.nordstrom.ui-tools/htmlunit-remote.svg)](https://central.sonatype.com/search?q=com.nordstrom.ui-tools+htmlunit-remote&core=gav)

The [HtmlUnit Remote](https://github.com/seleniumhq-community/htmlunit-remote) project implements a [W3C WebDriver protocol](https://www.w3.org/TR/webdriver2) wrapper for [HtmlUnitDriver](https://github.com/SeleniumHQ/htmlunit-driver), which enables **Selenium 4 Grid** to supply remote sessions of this headless browser.

### Background

To eliminate behavioral differences between local and remote configurations, the [Selenium Foundation](https://github.com/sbabcoc/Selenium-Foundation) framework always acquires browser sessions from a **Grid** instance, managing its own local grid instance when not configured to use an existing grid. **Selenium 3 Grid** could be configured to supply **HtmlUnitDriver** sessions, supported by special-case handling within the Node server itself. This handling was not carried over into **Selenium 4 Grid**, which was completely re-engineered with a new architecture and vastly expanded capabilities.

The lack of **HtmlUnitDriver** support in **Selenium 4 Grid** necessitated reconfiguring the **Selenium Foundation** project unit tests from using this Java-only managed artifact to using a standard browser like Chrome, an external dependency that requires additional resources and imposes additional risks of failure.

The driver service implemented by **HtmlUnit Remote** enables **Selenium 4 Grid** to supply **HtmlUnitDriver** sessions.

### Project Rationale

My initial objective for creating **HtmlUnit Remote** was to retain feature parity in **Selenium Foundation** for the set of browsers supported with **Selenium 3** and **Selenium 4**. Although I could configure my unit tests to target a conventional browser, I also wanted to avoid additional external dependencies and their associated risks.

Once I began investigating the features and functionality I would need to enable **Selenium 4 Grid** to supply **HtmlUnitDriver** sessions, I recognized an additional benefit this project could provide: comprehensive, standardized configurability.

### HtmlUnitDriver Configuration

All remote drivers are configured via a standard **Selenium** feature: the [Capabilities](https://github.com/SeleniumHQ/selenium/blob/trunk/java/src/org/openqa/selenium/Capabilities.java) object. Prior to the **HtmlUnit Remote** project, many of the options of [HtmlUnit](https://www.htmlunit.org/) could not be accessed or modified via the **Capabilities** API. These were only available via custom **HtmlUnitDriver** methods, and the way that non-standard capabilities had been added to the **Capabilities** object didn't conform to the **W3C** specification.

This meant that the initial phase of the **HtmlUnit Remote** project was to implement a comprehensive W3C-compliant configuration object: the **HtmlUnitDriverOptions** class. This class extends [AbstractDriverOptions](https://github.com/SeleniumHQ/selenium/blob/trunk/java/src/org/openqa/selenium/remote/AbstractDriverOptions.java), adding driver-specific capabilities under an extension named `garg:htmlunitOptions`. Support for this class provides full configurability of all **HtmlUnitDriver** options through the standard **Capabilities** API.

This standardized configuration API has been incorporated directly into **HtmlUnitDriver**, providing the core implementation for manipulating every driver setting. To maintain backward compatibility, all of the existing constructors and configuration methods have been retained, reimplemented to use this new core API.

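To make the extension mechanism concrete, here is a minimal, library-free sketch of how driver-specific settings can be nested under the `garg:htmlunitOptions` key of a capabilities map. The class name and the option key `javaScriptEnabled` are illustrative assumptions, not the actual **HtmlUnitDriverOptions** API. The one rule taken from the W3C specification is that extension capability names contain a `:` separator.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; this is not the HtmlUnitDriverOptions implementation.
class HtmlUnitCapabilitiesSketch {

    // Extension capability named in this post; driver-specific options nest under it.
    static final String EXTENSION_KEY = "garg:htmlunitOptions";

    // Builds a capabilities map with driver options nested under the extension key.
    static Map<String, Object> buildCapabilities(Map<String, Object> driverOptions) {
        Map<String, Object> caps = new LinkedHashMap<>();
        caps.put("browserName", "htmlunit");
        // Copy the options so later changes by the caller don't leak into the result.
        caps.put(EXTENSION_KEY, new LinkedHashMap<>(driverOptions));
        return caps;
    }

    // W3C WebDriver: an extension capability name contains a ':' character.
    static boolean isExtensionCapability(String name) {
        return name.contains(":");
    }

    public static void main(String[] args) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("javaScriptEnabled", true); // hypothetical option key
        System.out.println(buildCapabilities(options));
    }
}
```

Grouping every driver-specific setting under a single prefixed key keeps the top level of the capabilities object limited to standard names, which is what lets the settings travel through the standard **Capabilities** API.
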
### W3C Remote Protocol Wrapper

With full standardized configurability in place, the next step was to create a server that implements the [W3C WebDriver protocol](https://www.w3.org/TR/webdriver2). The **HtmlUnitDriverServer** functions as a remote protocol wrapper around one or more **HtmlUnitDriver** sessions, performing the following tasks:
* Create and manage driver sessions
* Route driver commands to specified driver sessions
* Package driver method results into HTTP responses

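The three tasks above can be sketched without any Selenium dependency. This is an assumption-laden illustration, not the **HtmlUnitDriverServer** source: the class `SessionRouterSketch`, its `Response` holder, and the use of a `Function<String, String>` as a stand-in for a driver are all hypothetical. The parts drawn from the W3C protocol are the `value` wrapper around results and the `invalid session id` error with HTTP status 404.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative sketch only; HtmlUnitDriverServer's real internals may differ.
class SessionRouterSketch {

    // Minimal HTTP-style response: a status code and a JSON body.
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    // Active sessions by ID; each "driver" is a stand-in mapping a command name to a result.
    private final Map<String, Function<String, String>> sessions = new ConcurrentHashMap<>();

    // Task 1: create and register a driver session, returning its ID.
    String createSession(Function<String, String> driver) {
        String sessionId = UUID.randomUUID().toString();
        sessions.put(sessionId, driver);
        return sessionId;
    }

    // Tasks 2 and 3: route a command to its session and package the result.
    Response execute(String sessionId, String command) {
        Function<String, String> driver = sessions.get(sessionId);
        if (driver == null) {
            // W3C WebDriver reports an unknown session as "invalid session id" with HTTP 404.
            return new Response(404, "{\"value\":{\"error\":\"invalid session id\","
                    + "\"message\":\"Unknown session\",\"stacktrace\":\"\"}}");
        }
        // W3C WebDriver wraps successful results in a "value" field.
        return new Response(200, "{\"value\":" + quote(driver.apply(command)) + "}");
    }

    // Ends a session; returns false if the ID was not registered.
    boolean deleteSession(String sessionId) {
        return sessions.remove(sessionId) != null;
    }

    // Minimal JSON string quoting (backslashes and double quotes only).
    private static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        SessionRouterSketch server = new SessionRouterSketch();
        String id = server.createSession(cmd -> "getTitle".equals(cmd) ? "Home" : "");
        System.out.println(server.execute(id, "getTitle").body);
        System.out.println(server.execute("no-such-session", "getTitle").status);
    }
}
```
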
### HtmlUnit Remote Packaging

Rather than bulk up the existing driver with remote-specific features, **HtmlUnitDriverServer** and associated facilities are packaged in a companion `htmlunit-remote` artifact. In addition to the server, this artifact defines a driver information provider (**HtmlUnitDriverInfo**), a driver service (**HtmlUnitDriverService**), and a custom slot matcher (**HtmlUnitSlotMatcher**).

### Connecting to the Grid

Next up is **HtmlUnitDriverInfo**, which specifies the basic characteristics of the driver and provides a method that creates a driver session with specified capabilities. This class implements the standard [WebDriverInfo](https://github.com/SeleniumHQ/selenium/blob/trunk/java/src/org/openqa/selenium/WebDriverInfo.java) interface.

With availability of **HtmlUnitDriver** advertised by this information provider, **Selenium 4 Grid** nodes can be configured to provide driver sessions:

##### htmlunit.toml
```
[node]
detect-drivers = false
[[node.driver-configuration]]
display-name = "HtmlUnit"
stereotype = "{\"browserName\": \"htmlunit\"}"

[distributor]
slot-matcher = "org.openqa.selenium.htmlunit.remote.HtmlUnitSlotMatcher"
```
The `selenium-server` JAR doesn't include the **HtmlUnitDriver** artifacts; these need to be specified as extensions to the grid class path via the `--ext` option:

```
java -jar selenium-server-<version>.jar --ext htmlunit-remote-<version>-grid-extension.jar standalone --config htmlunit.toml
```
The `grid-extension` artifact provides all of the specifications and service providers required to enable **Selenium 4 Grid** to supply remote sessions of **HtmlUnitDriver**. This artifact combines `htmlunit-remote` with `htmlunit3-driver`, `htmlunit`, and all of their unique dependencies.

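Once a grid is running with this configuration, a client asks for a session by posting a W3C "New Session" request whose capabilities name `htmlunit` as the browser. The sketch below only builds that request with the JDK's `java.net.http` classes so the wire format is visible. The class and method names are hypothetical, and the URL assumes a standalone grid on Selenium's default port 4444. In real test code a WebDriver client library would normally issue this request for you.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative sketch only; test code would normally use a WebDriver client library instead.
class NewSessionRequestSketch {

    // W3C "New Session" payload that requires the given browser name.
    static String newSessionPayload(String browserName) {
        return "{\"capabilities\":{\"alwaysMatch\":{\"browserName\":\"" + browserName + "\"}}}";
    }

    // Builds (but does not send) the POST /session request for a grid at gridUrl.
    static HttpRequest newSessionRequest(String gridUrl, String browserName) {
        return HttpRequest.newBuilder(URI.create(gridUrl + "/session"))
                .header("Content-Type", "application/json; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(newSessionPayload(browserName)))
                .build();
    }

    public static void main(String[] args) {
        // Assumes a standalone grid listening on Selenium's default port, 4444.
        HttpRequest request = newSessionRequest("http://localhost:4444", "htmlunit");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(newSessionPayload("htmlunit"));
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```
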
### Implementation Details

**HtmlUnit Remote** provides the following elements:
* **HtmlUnitDriverInfo** - This class informs **Selenium 4 Grid** that **HtmlUnitDriver** is available and provides a method to create new driver instances.
* **HtmlUnitSlotMatcher** - This custom slot matcher extends **DefaultSlotMatcher**, indicating a match if both the slot stereotype and requested browser capabilities specify `htmlunit` as the browser name.
* **HtmlUnitDriverService** - This class manages a server that hosts instances of **HtmlUnitDriver**.
* **HtmlUnitDriverServer** - This is the server class that hosts **HtmlUnitDriver** instances, enabling remote operation via the [W3C WebDriver protocol](https://www.w3.org/TR/webdriver2).

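The matching rule described for **HtmlUnitSlotMatcher** can be sketched with plain maps standing in for Selenium's capabilities objects. This is not the project's source: the class name is hypothetical, and deferring to a fallback matcher when the browser names are not both `htmlunit` is an assumption based on the statement that the class extends **DefaultSlotMatcher**.

```java
import java.util.Map;
import java.util.function.BiPredicate;

// Illustrative sketch only; the real class extends Selenium's DefaultSlotMatcher.
class HtmlUnitSlotMatchSketch {

    static final String HTMLUNIT = "htmlunit";

    // Matches when both sides name "htmlunit"; otherwise defers to the fallback matcher.
    static boolean matches(Map<String, Object> stereotype,
                           Map<String, Object> requested,
                           BiPredicate<Map<String, Object>, Map<String, Object>> fallback) {
        if (HTMLUNIT.equals(stereotype.get("browserName"))
                && HTMLUNIT.equals(requested.get("browserName"))) {
            return true;
        }
        return fallback.test(stereotype, requested);
    }

    public static void main(String[] args) {
        Map<String, Object> slot = Map.of("browserName", "htmlunit");
        Map<String, Object> request = Map.of("browserName", "htmlunit", "platformName", "linux");
        // Stand-in for the default rules: here, a fallback that never matches.
        System.out.println(matches(slot, request, (a, b) -> false));
    }
}
```
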
In operation, **HtmlUnitDriverService** is instantiated by **Selenium 4 Grid** node servers that are configured to support **HtmlUnitDriver**. Unlike other driver services, which launch a new process for each created driver session, **HtmlUnitDriverService** starts a single in-process server that hosts all of the driver sessions it creates.

_This is a guest blog post by [Scott Babcock](https://www.github.com/sbabcoc)_
