Boston Apache Spark User Group: April Presentation Night
April 21, 2017 @ 6:00 pm - 8:00 pm
A big thank you to Databricks for sponsoring this event on short notice!
* 6:00 – 6:30: Mingling + food and drink
* 6:30 – 6:35: Opening Remarks
* 6:40 – 7:20: Feature Talk
Feature Talk Title: Evolution of an Apache Spark Architecture for Processing Game Data
Intended Audience: All levels (new to Spark through running Spark in production)
Speaker: Nick Afshartous (WB Games)
Feature Talk Abstract: We discuss lessons learned from our first production deployment of a Spark Streaming pipeline for processing game data. The pipeline is deployed to the AWS Cloud, where we use managed services (i.e., EMR, S3, and Redshift). However, downstream dependencies that suffer outages and unpredictable response latencies can pose significant challenges. To address this, we evolved the architecture by separating data processing from post-processing tasks (i.e., copying data into Redshift). Post-processing tasks are sent downstream from Spark to a task executor that was built using Akka Streams and Reactive Kafka. The end result is a loosely coupled architecture in which the Spark Streaming job acts as a firehose to S3 and remains fault-tolerant when Redshift is unavailable.
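
For readers new to this pattern, the decoupling described in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's implementation, which per the abstract uses Spark Streaming, Akka Streams, and Reactive Kafka. Everything here is a hypothetical stand-in: a dict plays the role of S3, an in-memory deque plays the role of the Kafka topic, and `FlakyWarehouse` plays the role of Redshift. All names (`CopyTask`, `process_batch`, `run_executor`, `demo`) are invented for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CopyTask:
    """A post-processing task: load one stored object into the warehouse."""
    object_key: str


@dataclass
class FlakyWarehouse:
    """Hypothetical stand-in for Redshift that can simulate an outage."""
    available: bool = True
    loaded: list = field(default_factory=list)

    def copy_from(self, object_key: str) -> None:
        if not self.available:
            raise ConnectionError("warehouse unavailable")
        self.loaded.append(object_key)


def process_batch(batch_id, records, object_store, task_queue):
    """Processing stage: write output to the store, then enqueue a task.

    It never calls the warehouse, so a warehouse outage cannot block it.
    """
    key = f"batches/{batch_id}.json"
    object_store[key] = list(records)
    task_queue.append(CopyTask(key))
    return key


def run_executor(task_queue, warehouse):
    """Task executor: drain the queue, stopping at the first failure.

    A failed task stays at the head of the queue for a later retry.
    Returns the number of tasks completed in this run.
    """
    done = 0
    while task_queue:
        task = task_queue[0]
        try:
            warehouse.copy_from(task.object_key)
        except ConnectionError:
            break
        task_queue.popleft()
        done += 1
    return done


def demo():
    """Process three batches during an outage, then recover."""
    store, queue, warehouse = {}, deque(), FlakyWarehouse(available=False)
    for i in range(3):
        process_batch(i, [{"event": i}], store, queue)
    during_outage = run_executor(queue, warehouse)
    warehouse.available = True
    after_recovery = run_executor(queue, warehouse)
    return len(store), during_outage, after_recovery, warehouse.loaded
```

Calling `demo()` shows the key property: all three batches reach the store while the warehouse is down, the pending copy tasks simply wait in the queue, and they are loaded in order once the warehouse returns. A durable log such as Kafka would additionally preserve the queued tasks across process restarts, which this in-memory sketch does not.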