This repository was archived by the owner on Nov 9, 2017. It is now read-only.

Clojure and Java Dependencies

Daniel Gregoire edited this page Aug 14, 2015 · 6 revisions

This page covers the basics of JVM dependency management and some of the issues you may encounter while trying to work with clj-webdriver (a Clojure library) and Selenium-WebDriver (a suite of Java libraries).

  • Basics of how JVM dependencies are managed
  • Problems that arise from transitive dependency resolution

Basics of JVM Dependency Management

The Aether library provides the underpinnings of repository and artifact resolution used by both Java and Clojure build tools. A repository is either a local directory or a remote server location that houses JVM artifacts. Artifacts in this context are packaged archives of libraries or applications that target the Java Virtual Machine, usually JAR files. Dependency management, then, refers to the tools and processes that let developers specify the artifacts their project requires and leave it to the dependency resolution tooling to find those JARs, along with any further JARs they in turn require to function properly (this is transitive dependency management).

Maven, Leiningen and other build tools utilize Aether under the covers to handle transitive dependency management. Since Maven popularized the use of transitive dependency management in the JVM community, we often refer to repositories that are compatible with Aether's approach as "Maven repositories."

A Maven-style repository is structured according to group ids, artifact ids, and versions. Both ids grow out of the Java convention for naming the packages that organize your source code. By that convention, a package name, and therefore your artifact's group id, is the fully qualified domain name of your company or project in reverse. Since Clojure's official website is at http://clojure.org, its JAR's group id is org.clojure. The artifact id then names the specific library being released. In the case of Clojure, the main language's JAR has an artifact id of just clojure, whereas core libraries have artifact ids like core.logic or tools.logging.
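
The reverse-domain convention is mechanical enough to sketch in a few lines of shell. The reverse_domain function below is a hypothetical helper written for illustration; it is not part of Maven or Leiningen:

```shell
# Hypothetical helper: turn a domain name into a conventional group id
# by reversing its dot-separated parts.
reverse_domain() {
  printf '%s\n' "$1" |
    awk -F. '{
      out = $NF
      for (i = NF - 1; i >= 1; i--) out = out "." $i
      print out
    }'
}

reverse_domain clojure.org   # prints org.clojure
```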

Although Clojure namespaces are sometimes organized according to the same Java package conventions, more often than not a more concise approach is used. Clojure itself is an example: its dependency coordinates (group id + artifact id + version) are [org.clojure/clojure "x.y.z"], yet its namespaces look like clojure.core and clojure.pprint, without the org prefix. There is no requirement that your published artifact's group id and the namespaces in your code match in any way, though you should document the difference.

The final piece of the dependency puzzle is versions. A version is either a snapshot or a stable release. A snapshot version is suffixed with -SNAPSHOT. Artifacts with snapshot versions are generally stored in separate repositories from stable, production-ready releases. When you depend on a snapshot, be aware that dependency management tools like Leiningen and Maven will attempt to find the most up-to-date build of that snapshot. So if you depend on [some.library "0.1.0-SNAPSHOT"] and everything works today, you may find tomorrow that the developer of some.library has pushed a new SNAPSHOT that your code is no longer compatible with.

For production applications and stable library releases, never depend on snapshot versions of libraries.
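
One way to act on this advice is to scan a project file for snapshot versions before a release. The following is a minimal sketch, assuming versions are written as quoted strings in a project.clj; the find_snapshots name is made up for this example. Note that it also matches the project's own version on the defproject line if that is a snapshot.

```shell
# Hypothetical helper: list the lines of a project file that reference
# a snapshot version, i.e. a quoted version string ending in -SNAPSHOT.
# Exits non-zero when no snapshot is found (grep's usual behavior).
find_snapshots() {
  grep -n -e '-SNAPSHOT"' "$1"
}
```

Running find_snapshots project.clj prints each matching line prefixed with its line number, so a release script could refuse to continue whenever the function succeeds.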

If you've used Leiningen or Maven for any project, you should have a ~/.m2/repository directory on your machine that contains your local Maven repository. This local cache of JVM artifacts is laid out the same way as a remote Maven repository such as Maven Central or Clojars. Artifacts are stored under folders in the format {group id}/{artifact id}/{version}. If the group id has any . characters, a separate folder is created for each part of the group id, so org.clojure is stored under org/clojure. The same is not true of artifact ids, so for example Clojure's org.clojure/tools.logging is stored under org/clojure/tools.logging.
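
The layout rule can be sketched as a small shell function. artifact_dir is a hypothetical helper, and the version 0.3.1 is used purely as an example value:

```shell
# Hypothetical helper: print the folder, relative to the repository root,
# where an artifact with the given coordinates is stored.
# Arguments: group id, artifact id, version.
artifact_dir() {
  # Only the group id has its dots turned into folder separators.
  group_path=$(printf '%s' "$1" | tr '.' '/')
  printf '%s/%s/%s\n' "$group_path" "$2" "$3"
}

artifact_dir org.clojure tools.logging 0.3.1
# prints org/clojure/tools.logging/0.3.1
```

Prefixing the result with ~/.m2/repository/ gives the expected location in your local repository.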

If you look inside the version folder for a particular group and artifact id of a project, you'll notice that there's more than just a JAR file in there. You'll likely see files such as:

  • clj-webdriver-0.7.2.jar
  • clj-webdriver-0.7.2.jar.sha1
  • clj-webdriver-0.7.2.pom
  • clj-webdriver-0.7.2.pom.sha1

Each .sha1 file holds the SHA-1 hash of the corresponding file's contents, which can be used to verify that the artifact downloaded successfully. The .pom file is a Maven-style XML file that specifies metadata about the corresponding artifact, including its own dependencies. This is how your build tooling is able to discover the transitive dependencies of the JARs you specify in your own pom.xml or project.clj file.
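
Your build tool performs this check for you, but it can be reproduced by hand. The sketch below assumes that either sha1sum (GNU coreutils) or shasum is installed, and that the digest is the first whitespace-separated field of the .sha1 file. The function names are hypothetical:

```shell
# Print the SHA-1 digest of a file, using whichever tool is installed.
sha1_of() {
  if command -v sha1sum >/dev/null 2>&1; then
    sha1sum "$1" | awk '{ print $1 }'
  else
    shasum -a 1 "$1" | awk '{ print $1 }'
  fi
}

# Hypothetical helper: succeed only when FILE matches the digest in FILE.sha1.
verify_sha1() {
  # Keep only the first field, in case anything follows the digest.
  expected=$(awk '{ print $1; exit }' "$1.sha1")
  actual=$(sha1_of "$1")
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

Called as verify_sha1 clj-webdriver-0.7.2.jar from inside the version folder, the function exits with status 0 when the JAR matches its recorded hash and non-zero otherwise.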

What's inside a JAR?

A JAR file is actually just a ZIP file with a different extension and some conventions about its contents. A JAR file usually contains a manifest file (META-INF/MANIFEST.MF) that specifies things like how the artifact was packaged and whether it has a main class to use when running the JAR via java -jar. Here's Clojure's manifest for 1.7.0:

Manifest-Version: 1.0
Archiver-Version: Plexus Archiver
Created-By: Apache Maven
Built-By: hudson
Build-Jdk: 1.6.0_20
Main-Class: clojure.main
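
Because the manifest is plain text made of Name: value lines, individual attributes are easy to pull out with standard tools. A minimal sketch, where manifest_attr is a hypothetical helper that reads a manifest on standard input; it ignores the continuation lines that the manifest format uses for long values:

```shell
# Hypothetical helper: print the value of one manifest attribute.
# Reads the manifest text from standard input.
manifest_attr() {
  # Strip carriage returns first, since manifests may use CRLF line endings.
  tr -d '\r' | awk -v name="$1" '
    index($0, name ": ") == 1 { print substr($0, length(name) + 3); exit }'
}
```

Assuming unzip is installed, unzip -p clojure-1.7.0.jar META-INF/MANIFEST.MF | manifest_attr Main-Class should print clojure.main for the manifest shown above.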

In addition to the manifest, the project's pom.xml file is included in the JAR so that its dependencies can be tracked. Some artifacts are packaged just for their POM file, and act as a convenient way to group together a set of related dependencies (see an example below).

Most importantly, a JAR file contains the code for your library or application. In the case of Clojure, the JAR file can contain a mixture of Clojure source files (ending in .clj) as well as compiled Java class files (ending in .class). Auxiliary configuration or properties files needed by your code can also be included in the JAR.

You can inspect the contents of a JAR file by running jar -tvf on it at the command line:

jar -tvf selenium-java-2.47.1.jar

The output for this particular JAR is:

     0 Thu Jul 30 10:46:10 EDT 2015 META-INF/
   131 Thu Jul 30 10:46:08 EDT 2015 META-INF/MANIFEST.MF
     0 Thu Jul 30 10:46:10 EDT 2015 META-INF/maven/
     0 Thu Jul 30 10:46:10 EDT 2015 META-INF/maven/org.seleniumhq.selenium/
     0 Thu Jul 30 10:46:10 EDT 2015 META-INF/maven/org.seleniumhq.selenium/selenium-java/
  5136 Thu Jul 30 10:45:32 EDT 2015 META-INF/maven/org.seleniumhq.selenium/selenium-java/pom.xml
   122 Thu Jul 30 10:46:10 EDT 2015 META-INF/maven/org.seleniumhq.selenium/selenium-java/pom.properties

This is an example, mentioned above, of an artifact whose sole purpose is to pull in a group of related JARs, in this case the JARs needed to work with Selenium-WebDriver from Java. Note that it contains no class files at all, only the manifest and its Maven metadata. The dependencies in that POM file are:

    <dependencies>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-chrome-driver</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-edge-driver</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-htmlunit-driver</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-firefox-driver</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-ie-driver</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-safari-driver</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-support</artifactId>
            <version>${project.version}</version>
        </dependency>
        <dependency>
            <groupId>org.webbitserver</groupId>
            <artifactId>webbit</artifactId>
        </dependency>
        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-leg-rc</artifactId>
            <version>${project.version}</version>
        </dependency>
    </dependencies>

A more conventional example of a JAR's contents would be clj-webdriver:

jar -tvf clj-webdriver-0.7.1.jar

Its output is:

   127 Wed Aug 05 21:50:06 EDT 2015 META-INF/MANIFEST.MF
  2866 Wed Aug 05 21:50:06 EDT 2015 META-INF/maven/clj-webdriver/clj-webdriver/pom.xml
  2745 Wed Aug 05 21:50:06 EDT 2015 META-INF/leiningen/clj-webdriver/clj-webdriver/project.clj
  2745 Wed Aug 05 21:50:06 EDT 2015 project.clj
  6100 Wed Aug 05 21:50:06 EDT 2015 META-INF/leiningen/clj-webdriver/clj-webdriver/README.md
     0 Wed Aug 05 21:50:06 EDT 2015 META-INF/
     0 Wed Aug 05 21:50:06 EDT 2015 META-INF/maven/
     0 Wed Aug 05 21:50:06 EDT 2015 META-INF/maven/clj-webdriver/
     0 Wed Aug 05 21:50:06 EDT 2015 META-INF/maven/clj-webdriver/clj-webdriver/
   154 Wed Aug 05 21:50:06 EDT 2015 META-INF/maven/clj-webdriver/clj-webdriver/pom.properties
    32 Sun Aug 02 00:19:18 EDT 2015 log4j.properties
     0 Wed Aug 05 14:46:18 EDT 2015 clj_webdriver/
  6185 Wed Aug 05 09:55:10 EDT 2015 clj_webdriver/cache.clj
  2598 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/cookie.clj
 19400 Wed Aug 05 17:07:34 EDT 2015 clj_webdriver/core.clj
  4232 Fri Jul 31 16:48:14 EDT 2015 clj_webdriver/core_by.clj
 14570 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/core_driver.clj
 12659 Wed Aug 05 09:54:10 EDT 2015 clj_webdriver/core_element.clj
  1804 Wed Aug 05 10:57:00 EDT 2015 clj_webdriver/driver.clj
   709 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/element.clj
  1739 Wed Aug 05 13:55:40 EDT 2015 clj_webdriver/firefox.clj
  2225 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/form_helpers.clj
     0 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/js/
  2930 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/js/browserbot.clj
   594 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/options.clj
     0 Wed Aug 05 14:46:18 EDT 2015 clj_webdriver/remote/
  1046 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/remote/driver.clj
  6749 Wed Aug 05 14:46:18 EDT 2015 clj_webdriver/remote/server.clj
 46777 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/taxi.clj
 14173 Wed Aug 05 13:55:40 EDT 2015 clj_webdriver/util.clj
  1900 Thu Jul 30 17:01:42 EDT 2015 clj_webdriver/wait.clj
  5045 Wed Aug 05 09:54:10 EDT 2015 clj_webdriver/window.clj
  1668 Wed Aug 05 12:38:34 EDT 2015 clj_webdriver/wire.clj

When running a JVM application, JARs added to the JVM classpath make their contents available at the root of the classpath. In the above output, this means clj-webdriver's log4j.properties file is at the root of the classpath, even though in the project it is saved under a resources folder. See Leiningen's documentation for which folders in your project are "on the classpath" by default and how you can change those settings.
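
To see which files of a JAR sit directly at the root of the classpath, you can filter a listing down to the paths that contain no folder separator. A sketch, assuming the listing comes from jar -tf (one entry per line, without sizes or dates); root_entries is a name chosen for this example:

```shell
# Hypothetical helper: from a one-entry-per-line JAR listing on standard
# input, keep only the files that sit at the root of the archive.
root_entries() {
  grep -v '/'
}
```

For the clj-webdriver listing above, jar -tf clj-webdriver-0.7.1.jar | root_entries would leave just project.clj and log4j.properties.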

Dependency Resolution Problems

The biggest problem with automated transitive dependency management is understanding its behavior when two different artifacts both depend on a third artifact. If they both depend on the exact same version, there's no problem; if they depend on different versions, which one "wins"?

The answer is: you should check.

You can use mvn dependency:tree in Maven projects or lein deps :tree in Leiningen-based Clojure projects to see exactly how the final dependency resolution works out. For clj-webdriver, check the output of those commands to make certain that the versions of the Selenium-WebDriver JARs you want are the ones that "win".

A nice one-liner I use to help focus on dependencies I'm investigating pipes lein deps :tree to egrep like this:

lein deps :tree | egrep --color "$|selenium-java"

This prints all of the output of lein deps :tree, because the $ alternative in the pattern matches the end of every line, while colorizing selenium-java so that it's easier to spot in the (often ludicrously verbose) output of this command.
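
If you only want to know which version of a single artifact won, you can extract it from the tree instead of highlighting it. The sketch below assumes each dependency appears in the output as [group/artifact "version"]; resolved_version is a hypothetical helper, and it matches the artifact name literally rather than as a regular expression:

```shell
# Hypothetical helper: read dependency-tree output on standard input and
# print the version shown for the given artifact.
resolved_version() {
  awk -v artifact="$1" '
    {
      # Look for the literal text: [artifact "
      needle = "[" artifact " \""
      start = index($0, needle)
      if (start > 0) {
        rest = substr($0, start + length(needle))
        end = index(rest, "\"")
        if (end > 0) print substr(rest, 1, end - 1)
      }
    }'
}
```

It would be used as lein deps :tree | resolved_version org.seleniumhq.selenium/selenium-java.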

If you discover that some JAR you depend on is "winning" the dependency resolution and you need different versions of the JARs it depends on, you can use exclusions to prevent its transitive dependencies from being respected, and then explicitly specify the JARs that you need.

A good example is the [com.codeborne/phantomjsdriver "1.2.1"] artifact for PhantomJS. This JAR specifies dependencies on some core Selenium-WebDriver JARs, but in your own project you're likely using more recent versions. To prevent the PhantomJS JAR from controlling which core Selenium-WebDriver JARs your project pulls in, you can include it in the dependencies of your project.clj like this:

:dependencies [;; Ignore the Selenium-WebDriver versions that this JAR asks for.
               [com.codeborne/phantomjsdriver "1.2.1"
                :exclusions [org.seleniumhq.selenium/selenium-java
                             org.seleniumhq.selenium/selenium-server
                             org.seleniumhq.selenium/selenium-remote-driver]]
               ;; Pull in the version you actually want.
               [org.seleniumhq.selenium/selenium-java "2.47.1"]]

This excludes the PhantomJS dependencies on Selenium-WebDriver and explicitly pulls in version 2.47.1 for use in your code.

You might ask: won't that break the library that needed the older versions? It might. If it does, you're in deeper trouble, and will likely need to port the functionality you need manually. Frequently, however, you can specify newer versions of JARs that other JARs also depend on without breaking them, because the underlying code they rely on is still present in the newer JARs. Libraries that honor semantic versioning tend to behave well within their respective version ranges, but since errors for missing classes can show up at runtime in Clojure applications, you should test your code thoroughly whenever you use exclusions.
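
Alongside testing, it can help to confirm that a class another library needs is actually packaged in the JAR that won. Class files live in a JAR at a path derived from the fully qualified class name, so a quick check can be sketched as follows. class_entry is a hypothetical helper, and the sketch does not handle nested classes:

```shell
# Hypothetical helper: map a fully qualified class name to the path of
# its class file inside a JAR.
class_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

class_entry org.openqa.selenium.WebDriver
# prints org/openqa/selenium/WebDriver.class
```

Piping a JAR listing through grep -Fx "$(class_entry org.openqa.selenium.WebDriver)", for example from jar -tf on the JAR you are investigating, succeeds only when that class file is present.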
