Suan Pan

OpenX Inc.

Bo Liu

Suan Pan

A behind the scenes look at the the design and the technology

Bo Liu

Our Story

We need to count number

Grid

Real time?

Real Time Counting Engine

  1. Receiving LWES events
  2. Parse events to key count pairs
  3. Aggregating counts
  4. Storing result
  5. Batch/Ad-hoc time range query
  6. High Available
  7. Fault Tolerant
  8. Soft Real Time

Abacus

    Approach
  1. Event listeners receive LWES events and parses them into key/count pairs, cache and flush them to relay servers
  2. Relay servers call Couchbase to increment counts
  3. Query servers convert range query to Couchbase get operations

HA vs FT

  1. High Availablity

Fault Tolerance Semantics

  1. At Most Once
  2. At Least Once
  3. Exactly Once

Example: Incremental counting problem

Exactly once is important

Exactly Once Counting System

Suanpan

Exactly Onece - Receiving Data

  1. Kafka
    • High level consumer API
    • Simple consumer API
  2. Spark Streaming
    • Kafka high level API + WAL
    • Kafka simple API + checkpoint

Exactly Onece - Processing Stream

Exactly Onece - Processing Stream

Exactly Onece - Updating Result

  1. Spark Streaming output operations have at least once semantics
    • Idempotent updates
    • Transactional updates

How Spark Streaming Works - Representation

http://spark.apache.org/docs/latest/streaming-programming-guide.html https://dzone.com/refcardz/apache-spark

How Spark Streaming Works - Scheduling

http://subprotocol.com/system/spark-darling-of-big-data.html https://dzone.com/refcardz/apache-spark

How Spark Streaming Works - Parallelism

From http://blog.octo.com/en/gather-shopping-receipts-architecture-overview/

Suanpan Clusters

Road Map

  1. HOX/OTE - PROD 1/20/16
  2. TargetCPM - PROD 1/20/16
  3. RTG - turned on in PROD 2/23/16
  4. NTR - QA
  5. RTRM- DEV
  6. Bidder Monitoring - design

Functional Programming

  1. No side effects
  2. First class function
  3. Lazy evaluation
  4. Just feels good

Suanpan

  1. Exactly once / Idempotent / FT
  2. M/R
  3. Stream processing

Lessons learned

Lessons learned

      val keyCounts =
        stream
        .flatMap {eventToKeyCounts(_)}
        .reduceByKey {(x, y) => x + y, new SimplePartitioner(parN)}
        .map { case ((partition, key), cnt) => (key, cnt) }
        .updateStateByKey {updateCallback _, new SimplePartitioner(parN)}
        // mapWithState - faster state updating API since Spark 1.6
        
      keyCounts .foreachRDD { (rdd: RDD[(Key, Cnt)], time: Time) =>
        rdd .foreachPartition { partition => pushToRiak(partition) }
      }
    

Lessons learned

Lessons learned

Lessons learned

Special Thanks!!!

Q & A

Compute Engine

Spark Storm Flink
Latency Seconds Sub-Second Sub-Second
Processing Model Mini-Batch Spout/Bolt Spout/Bolt
API Model M/R Spout/Bolt M/R
Version (by 6/2015) 1.3.1 0.10.0 0.9.0
Fault-tolerance Semantics exactly-once at-least-once exactly-once