Apache® Spark™ News

A new model of data with streaming and Spark

Streaming data is changing the way people look at Big Data processing. The benefit of streaming is that data can be used as it comes in, rather than waiting for it to be sorted, stored and evaluated. In a market where data-driven customer interactions happen in seconds, this is a huge advantage. To help bring streaming data into focus, Jeff Frick and George Gilbert, cohosts of theCUBE, from the SiliconANGLE Media team, joined Reynold Xin at the Spark Summit East 2016 conference. Xin is the cofounder and chief architect for Spark at Databricks, Inc.

Diving into Spark Streaming’s Execution Model

With so many distributed stream processing engines available, people often ask us about the unique benefits of Spark Streaming. From early on, Apache Spark has provided a unified engine that natively supports both batch and streaming workloads. This is different from other systems that either have a processing engine designed only for streaming, or have similar batch and streaming APIs but compile internally to different engines. Spark’s single execution engine and unified programming model for batch and streaming lead to some unique benefits over other traditional streaming systems. In particular, four major aspects are:

Four Things to Know about Reliable Spark Streaming with Typesafe and Databricks

Last week, we were happy to have a Typesafe co-webinar with Databricks, the company founded by the team that started the Spark research project at UC Berkeley that later became Apache Spark. Our Big Data Architect Dean Wampler and Databricks' Lead Engineer for Spark Streaming, Tathagata Das (TD), provided a one-hour presentation with Q&A on Spark Streaming, which makes it easy to build scalable fault-tolerant streaming applications with Apache Spark. In this webinar, we reviewed:

Databricks is now Generally Available

We are excited to announce today, at Spark Summit 2015, the general availability of Databricks, a hosted data platform from the team that created Apache Spark. With Databricks, you can effortlessly launch Spark clusters, explore data interactively, run production jobs, and connect third-party applications. We believe Databricks is the easiest way to use big data.