Apache® Spark™ News

Spark In MapReduce (SIMR)

Hadoop integration has always been a key goal of Spark, and YARN users have long been able to run Spark on YARN. Until now, however, it has been relatively hard to run Apache Spark on Hadoop MapReduce v1 clusters, i.e., clusters that do not have YARN installed. Typically, users would have to get permission to install Spark/Scala on some subset of the machines, a process that could be time-consuming. Enter SIMR (Spark In MapReduce), which has been released in conjunction with Spark 0.8.1.