Clojure and Java Dependencies
This page covers the basics of JVM dependency management and some of the issues you may encounter while trying to work with clj-webdriver (a Clojure library) and Selenium-WebDriver (a suite of Java libraries).
- Basics of how JVM dependencies are managed
- Problems that arise from transitive dependency resolution
The Aether library provides the underpinnings of repository and artifact resolution used by both Java and Clojure build tools. A repository is either a local directory or a remote server location that houses JVM artifacts. Artifacts in this context are packaged archives of libraries or applications that target the Java Virtual Machine, usually JAR files. Dependency management, then, refers to the tools and processes by which developers specify the artifacts their project requires and let the dependency resolution tooling find those JARs, along with any further JARs that they in turn require to function properly (this is transitive dependency management).
Maven, Leiningen and other build tools utilize Aether under the covers to handle transitive dependency management. Since Maven popularized the use of transitive dependency management in the JVM community, we often refer to repositories that are compatible with Aether's approach as "Maven repositories."
A Maven-style repository is structured according to group ids, artifact ids, and versions. Group ids grow out of the Java convention for naming source code packages: use the fully qualified domain name of your company or project, reversed. So since Clojure's official website is at http://clojure.org, its JAR's group id is org.clojure. The artifact id, then, names the specific library being released. In the case of Clojure, the main language's JAR has an artifact id of just clojure, whereas core libraries have artifact ids like core.logic or tools.logging.
Although Clojure namespaces are sometimes organized according to the same Java package conventions, more often than not a more concise approach is used. In the case of Clojure, although its dependency coordinates (group id + artifact id + version) are [org.clojure/clojure "x.y.z"], its namespaces look like clojure.core and clojure.pprint, without the org prefix. This demonstrates that there is no requirement that your published artifact's group id and the namespaces in your code match in any way, though you should obviously document the difference.
The final piece of the dependency puzzle is versions. A version is either a snapshot or a stable release. A snapshot version is suffixed with -SNAPSHOT. Artifacts with snapshot versions are generally stored in separate repositories from stable, production-ready releases. When a snapshot is used, be aware that dependency management tools like Leiningen and Maven will attempt to find the most up-to-date snapshot. So if you're depending on [some.library "0.1.0-SNAPSHOT"] and everything works today, you may find that tomorrow the developer of some.library has pushed up a new SNAPSHOT and your code is no longer compatible.
For production applications and stable library releases, never depend on snapshot versions of libraries.
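As a rough guard for this rule, a release checklist could search the build file for snapshot versions. The sketch below is only an illustration: find_snapshots is a hypothetical helper name, and it assumes versions are written literally in project.clj or pom.xml rather than computed or inherited from elsewhere.

```shell
# Hypothetical helper: print (with line numbers) every line of FILE that
# mentions a -SNAPSHOT version. Like grep itself, it exits non-zero when
# nothing matches, so "no output" means no snapshot versions were found.
find_snapshots() {
  grep -n -e '-SNAPSHOT' "$1"
}

# Example (assumes a project.clj in the current directory):
#   find_snapshots project.clj && echo "snapshot versions present"
```

Note that a match is not always a dependency: a project whose own version is still a snapshot will show up on its defproject line as well.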
If you've used Leiningen or Maven for any project, you should have a ~/.m2/repository directory on your machine that contains your local Maven repository. This local cache of JVM artifacts is laid out the same way as a remote Maven repository such as Maven Central or Clojars. Artifacts are stored under folders in the format {group id}/{artifact id}/{version}. If the group id has any . characters, a separate folder is created for each part of the group id, so org.clojure is stored under org/clojure. The same is not true of artifact ids, so for example Clojure's org.clojure/tools.logging is stored under org/clojure/tools.logging.
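The layout rule above can be sketched as a small shell function. This is an illustration, not part of any build tool: m2_path is a hypothetical helper, M2_REPO is a hypothetical override variable introduced here, and the version number in the example call is just a placeholder.

```shell
# Hypothetical helper: print the local repository directory for a coordinate.
# Usage: m2_path GROUP_ID ARTIFACT_ID VERSION
m2_path() {
  # Only the group id has its dots turned into directory separators;
  # the artifact id and version are used as-is.
  group_dir=$(printf '%s' "$1" | tr '.' '/')
  printf '%s/%s/%s/%s\n' "${M2_REPO:-$HOME/.m2/repository}" "$group_dir" "$2" "$3"
}

# Example call; the version is a placeholder.
m2_path org.clojure tools.logging 0.3.1
```

Listing the directory it prints (for a version you have actually downloaded) should show the JAR alongside its supporting files.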
If you look inside the version folder for a particular group and artifact id of a project, you'll notice that there's more than just a JAR file in there. You'll likely see files such as:
- clj-webdriver-0.7.2.jar
- clj-webdriver-0.7.2.jar.sha1
- clj-webdriver-0.7.2.pom
- clj-webdriver-0.7.2.pom.sha1
Each .sha1 file contains the SHA1 hash of the corresponding file's contents, which can be used to verify that the artifact downloaded successfully. The .pom file is a Maven-style XML file that specifies metadata about the corresponding artifact, including its own dependencies. This is how your build tooling is able to discover the transitive dependencies of the JARs you specify in your own pom.xml or project.clj file.
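If you ever suspect a corrupted download, the hash comparison can be done by hand. The following is a minimal sketch, assuming the sha1sum command from GNU coreutils is available (on macOS, shasum -a 1 plays the same role) and that the .sha1 file begins with the hex digest. verify_sha1 is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if FILE's SHA-1 digest matches the digest
# stored next to it in FILE.sha1.
verify_sha1() {
  # Take the first whitespace-delimited field of each, in case the .sha1
  # file carries a file name after the digest.
  expected=$(cut -d ' ' -f 1 "$1.sha1")
  actual=$(sha1sum "$1" | cut -d ' ' -f 1)
  [ "$expected" = "$actual" ]
}

# Example (assumes this artifact is in your local repository):
#   verify_sha1 ~/.m2/repository/clj-webdriver/clj-webdriver/0.7.2/clj-webdriver-0.7.2.jar \
#     && echo "checksum matches"
```

If the check fails, deleting that version's folder from the local repository forces the build tool to download the artifact again on its next run.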
The biggest problem with automated transitive dependency management is understanding its behavior when two different artifacts both depend on a third artifact. If they both depend on the exact same version, there's no problem; if they depend on different versions, which one "wins"?
The answer is: you should check.
You can use mvn dependency:tree in Maven projects or lein deps :tree in Clojure projects to see exactly how the final dependency resolution works itself out. For clj-webdriver, look at the output of those commands to be certain that the versions of Selenium-WebDriver's JARs that you want are the ones that "win".
A nice one-liner I use to help focus on dependencies I'm investigating pipes lein deps :tree to egrep like this:
lein deps :tree | egrep --color "$|selenium-java"
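This will show you all the output of lein deps :tree but also colorize, in this case, selenium-java so that it's easier to see in the (often ludicrously verbose) output of this command. The $ alternative in the pattern matches the end of every line, which is why no line is filtered out.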
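If you reach for this trick often, it can be wrapped in a function. The sketch below is one possible shape: highlight is a hypothetical name, grep -E is the equivalent spelling of egrep, and the sample lines are made up for illustration so the block runs without Leiningen.

```shell
# Hypothetical helper: pass every line of stdin through unchanged, coloring
# any text that matches PATTERN when the output is a terminal.
highlight() {
  # "$" matches the (empty) end of every line, so nothing is dropped;
  # only text matching PATTERN is actually highlighted.
  grep -E --color "\$|$1"
}

# Made-up sample lines standing in for real `lein deps :tree` output.
printf '%s\n' \
  ' [clj-webdriver "0.7.2"]' \
  ' [org.seleniumhq.selenium/selenium-java "2.47.1"]' \
  ' [org.clojure/clojure "1.6.0"]' | highlight selenium-java
```

In a real project the call would be lein deps :tree | highlight selenium-java.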
If you discover that some JAR you depend on is "winning" the dependency resolution and you need different versions of the JARs it depends on, you can use exclusions to prevent its transitive dependencies from being respected, and then specify the JARs that you need explicitly.
A good example is the [com.codeborne/phantomjsdriver "1.2.1"] artifact for PhantomJS. This JAR specifies dependencies on some core Selenium-WebDriver JARs, but in your own project you're likely using more recent versions. To prevent the PhantomJS JAR from controlling which core Selenium-WebDriver JARs your project pulls in, you can include it in the dependencies in your project.clj like this:
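;; Exclude the Selenium JARs that phantomjsdriver would pull in transitively,
;; then declare the Selenium version you want as a direct dependency.
:dependencies [[com.codeborne/phantomjsdriver "1.2.1"
                :exclusions [org.seleniumhq.selenium/selenium-java
                             org.seleniumhq.selenium/selenium-server
                             org.seleniumhq.selenium/selenium-remote-driver]]
               [org.seleniumhq.selenium/selenium-java "2.47.1"]]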
This excludes the PhantomJS dependencies on Selenium-WebDriver and explicitly pulls in version 2.47.1 for use in your code.
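After changing exclusions, it is worth confirming what was actually resolved. The following sketch pulls the version of a single artifact out of dependency-tree text. resolved_versions is a hypothetical helper, and it assumes each dependency is printed as a bracketed coordinate with the version as the first quoted string on the line.

```shell
# Hypothetical helper: read dependency-tree text on stdin and print each
# distinct version listed for ARTIFACT (an artifact id such as selenium-java).
resolved_versions() {
  # Match the artifact id right after "group/" or an opening bracket and
  # right before its quoted version, then keep only the first quoted string.
  grep -F -e "/$1 \"" -e "[$1 \"" | sed 's/^[^"]*"\([^"]*\)".*/\1/' | sort -u
}

# Example (assumes Leiningen is installed and you are in the project root):
#   lein deps :tree | resolved_versions selenium-java
```

If the version printed is not the one you declared, another dependency is still controlling the resolution and may need its own exclusion.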
You might ask: won't that break the library that needed the older versions? It might. If it does, you're in deeper trouble, and will likely need to port the functionality you need directly. Frequently, however, you can specify newer versions of JARs that other JARs also depend on without breaking them, because the underlying code they rely on is still present in the newer JARs.
The clj-webdriver Wiki by Daniel Gregoire and community is licensed under a Creative Commons Attribution 4.0 International License. Based on a work at https://github.com/semperos/clj-webdriver/wiki.