Strata Presentation: “Messaging, storage, or both” (Part 1)

October 24, 2017

Matteo Merli

Last month I had the privilege of presenting at the Strata Data Conference in New York for the first time. Fellow Streamlio engineer Sijie Guo and I gave a talk entitled “Messaging, storage, or both.” It covered the technologies we have been working on for the past couple of years at Yahoo (and also at Twitter, in Sijie’s case) and now at Streamlio: Apache Pulsar (incubating) and Apache BookKeeper.

It was a great experience. Messaging plus stream storage is a hot topic nowadays, and several people who attended, as well as some who missed the talk, have asked us for the slides, so we’ve decided to record the presentation in three parts. In this first part I give an overview of the Pulsar messaging system and what sets it apart from other messaging systems from an enterprise standpoint: its focus on scalability, multi-tenancy, and geo-replication. I also outline how BookKeeper makes much of this enterprise functionality possible, a topic Sijie will cover in more depth in parts 2 and 3.

For those of you unfamiliar with Apache Pulsar, it’s a next-generation pub-sub messaging system developed at Yahoo. Pulsar was built from the ground up to address several shortcomings of existing open-source messaging systems and has been running in production for three years, powering critical applications like Yahoo! Mail, Yahoo! Finance, Yahoo! Sports, Flickr, the Gemini Ads Platform, and Sherpa, Yahoo’s distributed key-value store. Pulsar was open-sourced in late 2016 and is currently in incubation at the Apache Software Foundation.

Watch the 10-minute video below, or read more on the Streamlio blog.