Commit 5a1b54b

Fixing out-of-date requirements docs (#1844) (#1883)
This commit makes several fixes to the requirements documentation, which had fallen out of date for Java, Hadoop, Hive, and Spark. Closes #1843
1 parent 425a3ac commit 5a1b54b

2 files changed (+13 lines, -20 lines)

docs/src/reference/asciidoc/core/intro/requirements.adoc

Lines changed: 10 additions & 17 deletions
@@ -10,7 +10,7 @@ TIP: {eh} adds no extra requirements to Hadoop (or the various libraries built o
 [[requirements-jdk]]
 === JDK
 
-JDK level 6.0 (or above) just like Hadoop. As JDK 6 as well as JDK 7 have been both EOL-ed and are not supported by recent product updates, we strongly recommend using the latest JDK 8 (at least u20 or higher). If that is not an option, use JDK 7.0 update u55 (required for Elasticsearch 1.2 or higher). An up-to-date support matrix for Elasticsearch is available https://www.elastic.co/subscriptions/matrix[here]. Do note that the JVM versions are *critical* for a stable environment as an incorrect version can corrupt the data underneath as explained in this http://www.elastic.co/blog/java-1-7u55-safe-use-elasticsearch-lucene/[blog post].
+JDK level 8 (u20 or higher). An up-to-date support matrix for Elasticsearch is available https://www.elastic.co/subscriptions/matrix[here]. Do note that the JVM version is *critical* for a stable environment, as an incorrect version can corrupt the data underneath, as explained in this http://www.elastic.co/blog/java-1-7u55-safe-use-elasticsearch-lucene/[blog post].
 
 One can check the available JDK version from the command line:
 
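For reference, the command-line check mentioned in the context line above sits between these hunks and is untouched by this commit. It typically looks like the following; the output is illustrative and varies by JDK vendor and update level:

[source, bash]
----
$ java -version
java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)
----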
@@ -54,21 +54,17 @@ $ curl -XGET http://localhost:9200
 [[requirements-hadoop]]
 === Hadoop
 
-Hadoop 2.x (ideally the latest stable version, currently 2.7.3). {eh} is tested daily against Apache Hadoop; any distro compatible with Apache Hadoop should work just fine.
+{eh} is compatible with Hadoop 2 and Hadoop 3 (ideally the latest stable version). It is tested daily against Apache Hadoop, but any distro
+compatible with Apache Hadoop should work just fine.
 
 To check the version of Hadoop, one can refer either to its folder or jars (which contain the version in their names) or from the command line:
 
 [source, bash]
 ----
 $ bin/hadoop version
-Hadoop 2.4.1
+Hadoop 3.3.1
 ----
 
-[[requirements-yarn]]
-=== Apache YARN / Hadoop 2.x
-
-{eh} binary is tested against Hadoop 2.x and designed to run on Yarn without any changes or modifications.
-
 [[requirements-hive]]
 === Apache Hive
 
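The jar-based check mentioned in the context above is not shown in the diff. A minimal sketch, assuming the stock Apache Hadoop layout (the `share/hadoop/common` path and the `/opt/hadoop` install prefix are illustrative assumptions):

[source, bash]
----
# List the common Hadoop jars; the version is embedded in the file names
$ ls $HADOOP_HOME/share/hadoop/common/hadoop-common-*.jar
/opt/hadoop/share/hadoop/common/hadoop-common-3.3.1.jar
----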
@@ -105,7 +101,7 @@ Spark 1.3.0 or higher. We recommend using the latest release of Spark (currently
 The same applies when using the Hadoop layer to integrate the two as {eh} supports the majority of
 Hadoop distributions out there.
 
-The Spark version can be typically discovery by looking at its folder name:
+The Spark version can typically be discovered by looking at its folder name:
 
 ["source","bash",subs="attributes"]
 ----
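The folder-name listing that follows this hunk is elided here. A minimal sketch of the check; the distribution folder name `spark-3.2.0-bin-hadoop3.2` is an illustrative assumption:

[source, bash]
----
# The distribution folder carries the version in its name
$ ls
spark-3.2.0-bin-hadoop3.2
# Alternatively, ask Spark itself; this prints the "Welcome to ... version" banner
$ spark-submit --version
----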
@@ -131,16 +127,13 @@ Welcome to
 [[requirements-spark-sql]]
 ==== Apache Spark SQL
 
-If planning on using Spark SQL make sure to download the appropriate jar. While it is part of the Spark distribution,
-it is _not_ part of Spark core but rather has its own jar. Thus, when constructing the classpath make sure to
+If planning on using Spark SQL, make sure to add the appropriate Spark SQL jar as a dependency. While it is part of the Spark distribution,
+it is _not_ part of the Spark core jar but rather has its own jar. Thus, when constructing the classpath, make sure to
 include +spark-sql-<scala-version>.jar+ or the Spark _assembly_ : +spark-assembly-{sp-v}-<distro>.jar+
 
-{eh} supports Spark SQL 1.3 though 1.6 and also Spark SQL 2.0. Since Spark 2.x is not compatible with Spark 1.x,
-two different artifacts are provided by {eh}.
-{eh} supports Spark SQL {sp-v} through its main jar. Since Spark SQL 2.0 is _not_
-https://spark.apache.org/docs/latest/sql-programming-guide.html#upgrading-from-spark-sql-10-12-to-13[backwards compatible]
-with Spark SQL 1.6 or lower, {eh} provides a dedicated jar. See the Spark chapter for more information.
-Note that Spark 1.0-1.2 are no longer supported (again due to backwards incompatible changes in Spark).
+{eh} supports Spark SQL 1.3 through 1.6, Spark SQL 2.x, and Spark SQL 3.x. {eh} supports Spark SQL 2.x on Scala 2.11 through its main jar.
+Since Spark 1.x, 2.x, and 3.x are not compatible with each other, and Scala versions are not compatible, multiple
+artifacts are provided by {eh}. Choose the jar appropriate for your Spark and Scala version. See the Spark chapter for more information.
 
 [[requirements-storm]]
 === Apache Storm
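As a sketch of the jar selection described in the new paragraph: a Spark 3.x / Scala 2.12 job might put the matching {eh} artifact on the classpath as follows. The artifact version, entry point, and jar names are illustrative assumptions; the Spark chapter remains the authoritative reference.

[source, bash]
----
# elasticsearch-spark-30_2.12 targets Spark 3.x on Scala 2.12 (version assumed);
# org.example.MyJob and my-job.jar are hypothetical placeholders
$ spark-submit \
    --jars elasticsearch-spark-30_2.12-8.1.0.jar \
    --class org.example.MyJob \
    my-job.jar
----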

docs/src/reference/asciidoc/index.adoc

Lines changed: 3 additions & 3 deletions
@@ -10,12 +10,12 @@
 :ey: Elasticsearch on YARN
 :description: Reference documentation of {eh}
 :ver-d: {version}-SNAPSHOT
-:sp-v: 2.2.0
+:sp-v: 3.2.0
 :st-v: 1.0.1
 :pg-v: 0.15.0
-:hv-v: 1.2.1
+:hv-v: 2.3.8
 :cs-v: 2.6.3
-:hadoop-docs-v: 2.7.6
+:hadoop-docs-v: 3.3.1
 
 include::{asciidoc-dir}/../../shared/versions/stack/{source_branch}.asciidoc[]
 include::{asciidoc-dir}/../../shared/attributes.asciidoc[]
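The bumped attribute values track component releases that can be cross-checked against a local installation. The commands below are standard; the versions in the comments are the ones these docs now assume:

[source, bash]
----
$ hadoop version          # e.g. "Hadoop 3.3.1"  ({hadoop-docs-v})
$ hive --version          # e.g. "Hive 2.3.8"    ({hv-v})
$ spark-submit --version  # banner reports e.g. "version 3.2.0"  ({sp-v})
----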
