Commit d7d7541

Fixing out of date requirements docs (#1844) (#1884)
This commit makes several fixes to the requirements documentation, which had fallen out of date for Java, Hadoop, Hive, and Spark. Closes #1843
1 parent 87ee0ed commit d7d7541

2 files changed (+13, -20 lines)


docs/src/reference/asciidoc/core/intro/requirements.adoc

Lines changed: 10 additions & 17 deletions
@@ -10,7 +10,7 @@ TIP: {eh} adds no extra requirements to Hadoop (or the various libraries built o
 [[requirements-jdk]]
 === JDK
 
-JDK level 6.0 (or above) just like Hadoop. As JDK 6 as well as JDK 7 have been both EOL-ed and are not supported by recent product updates, we strongly recommend using the latest JDK 8 (at least u20 or higher). If that is not an option, use JDK 7.0 update u55 (required for Elasticsearch 1.2 or higher). An up-to-date support matrix for Elasticsearch is available https://www.elastic.co/subscriptions/matrix[here]. Do note that the JVM versions are *critical* for a stable environment as an incorrect version can corrupt the data underneath as explained in this http://www.elastic.co/blog/java-1-7u55-safe-use-elasticsearch-lucene/[blog post].
+JDK level 8 (at least u20 or higher). An up-to-date support matrix for Elasticsearch is available https://www.elastic.co/subscriptions/matrix[here]. Do note that the JVM versions are *critical* for a stable environment as an incorrect version can corrupt the data underneath as explained in this http://www.elastic.co/blog/java-1-7u55-safe-use-elasticsearch-lucene/[blog post].
 
 One can check the available JDK version from the command line:
 
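For context, the command-line check referenced in this hunk is the standard `java -version`. A minimal illustration follows; the exact banner varies by JDK vendor and build, so treat the output as an example only:

[source, bash]
----
$ java -version
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)
----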
@@ -54,21 +54,17 @@ $ curl -XGET http://localhost:9200
 [[requirements-hadoop]]
 === Hadoop
 
-Hadoop 2.x (ideally the latest stable version, currently 2.7.3). {eh} is tested daily against Apache Hadoop; any distro compatible with Apache Hadoop should work just fine.
+{eh} is compatible with Hadoop 2 and Hadoop 3 (ideally the latest stable version). It is tested daily against Apache Hadoop, but any distro
+compatible with Apache Hadoop should work just fine.
 
 To check the version of Hadoop, one can refer either to its folder or jars (which contain the version in their names) or from the command line:
 
 [source, bash]
 ----
 $ bin/hadoop version
-Hadoop 2.4.1
+Hadoop 3.3.1
 ----
 
-[[requirements-yarn]]
-=== Apache YARN / Hadoop 2.x
-
-{eh} binary is tested against Hadoop 2.x and designed to run on Yarn without any changes or modifications.
-
 [[requirements-hive]]
 === Apache Hive
 
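The hunk above also mentions reading the Hadoop version off the installation folder or jar names. A quick sketch of that approach, assuming a stock Apache tarball unpacked under /opt (the paths are illustrative, not part of the docs):

[source, bash]
----
# the distribution folder carries the version in its name...
$ ls -d /opt/hadoop-*
/opt/hadoop-3.3.1

# ...and so do the bundled jars
$ ls /opt/hadoop-3.3.1/share/hadoop/common/hadoop-common-*.jar
/opt/hadoop-3.3.1/share/hadoop/common/hadoop-common-3.3.1.jar
----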
@@ -103,7 +99,7 @@ native integration (which is recommended) with {sp} it does not matter what bina
 The same applies when using the Hadoop layer to integrate the two as {eh} supports the majority of
 Hadoop distributions out there.
 
-The Spark version can be typically discovery by looking at its folder name:
+The Spark version can be typically discovered by looking at its folder name:
 
 ["source","bash",subs="attributes"]
 ----
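The folder-name check mentioned in this hunk amounts to something like the following, assuming $SPARK_HOME points at an unpacked distribution (the name encodes both the Spark version and the bundled Hadoop version):

[source, bash]
----
$ basename "$SPARK_HOME"
spark-3.2.0-bin-hadoop3.2
----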
@@ -129,16 +125,13 @@ Welcome to
 [[requirements-spark-sql]]
 ==== Apache Spark SQL
 
-If planning on using Spark SQL make sure to download the appropriate jar. While it is part of the Spark distribution,
-it is _not_ part of Spark core but rather has its own jar. Thus, when constructing the classpath make sure to
+If planning on using Spark SQL make sure to add the appropriate Spark SQL jar as a dependency. While it is part of the Spark distribution,
+it is _not_ part of the Spark core jar but rather has its own jar. Thus, when constructing the classpath make sure to
 include +spark-sql-<scala-version>.jar+ or the Spark _assembly_ : +spark-assembly-{sp-v}-<distro>.jar+
 
-{eh} supports Spark SQL 1.3 though 1.6 and also Spark SQL 2.0. Since Spark 2.x is not compatible with Spark 1.x,
-two different artifacts are provided by {eh}.
-{eh} supports Spark SQL {sp-v} through its main jar. Since Spark SQL 2.0 is _not_
-https://spark.apache.org/docs/latest/sql-programming-guide.html#upgrading-from-spark-sql-10-12-to-13[backwards compatible]
-with Spark SQL 1.6 or lower, {eh} provides a dedicated jar. See the Spark chapter for more information.
-Note that Spark 1.0-1.2 are no longer supported (again due to backwards incompatible changes in Spark).
+{eh} supports Spark SQL 1.3 through 1.6, Spark SQL 2.x, and Spark SQL 3.x. {eh} supports Spark SQL 2.x on Scala 2.11 through its main jar.
+Since Spark 1.x, 2.x, and 3.x are not compatible with each other, and Scala versions are not compatible, multiple different artifacts are
+provided by {eh}. Choose the jar appropriate for your Spark and Scala version. See the Spark chapter for more information.
 
 [[requirements-storm]]
 === Apache Storm
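Since the rewritten Spark SQL paragraph tells readers to choose the jar matching their Spark and Scala versions, here is a hedged sketch of what that choice looks like with spark-submit. The artifact coordinates follow the {eh} naming scheme (elasticsearch-spark-30 for Spark 3.x, elasticsearch-spark-20 for Spark 2.x); substitute the release you actually use for <es-hadoop-version>:

[source, bash]
----
# Spark 3.x on Scala 2.12 -> the elasticsearch-spark-30 artifact
$ spark-submit --packages org.elasticsearch:elasticsearch-spark-30_2.12:<es-hadoop-version> app.jar

# Spark 2.x on Scala 2.11 -> the elasticsearch-spark-20 artifact
$ spark-submit --packages org.elasticsearch:elasticsearch-spark-20_2.11:<es-hadoop-version> app.jar
----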

docs/src/reference/asciidoc/index.adoc

Lines changed: 3 additions & 3 deletions
@@ -10,12 +10,12 @@
 :ey: Elasticsearch on YARN
 :description: Reference documentation of {eh}
 :ver-d: {version}-SNAPSHOT
-:sp-v: 2.2.0
+:sp-v: 3.2.0
 :st-v: 1.0.1
 :pg-v: 0.15.0
-:hv-v: 1.2.1
+:hv-v: 2.3.8
 :cs-v: 2.6.3
-:hadoop-docs-v: 2.7.6
+:hadoop-docs-v: 3.3.1
 
 include::{asciidoc-dir}/../../shared/versions/stack/{source_branch}.asciidoc[]
 include::{asciidoc-dir}/../../shared/attributes.asciidoc[]
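For context on how these attribute bumps propagate: AsciiDoc substitutes an attribute such as :sp-v: wherever {sp-v} appears, so a reference like +spark-assembly-{sp-v}-<distro>.jar+ in requirements.adoc now renders with 3.2.0 instead of 2.2.0, and the Hive and Hadoop attributes flow through the same way.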
