Streamlio Architecture

A cloud-native architecture for messaging, processing and stream storage

An introduction to a unified architecture for streaming

Legacy architectures for data processing were designed for enterprise datacenters and monolithic applications. Combining all components of data infrastructure in a monolithic stack, they are unable to take full advantage of modern virtualization, container, and cloud technologies. That is also true for existing messaging and streaming solutions.

The Streamlio platform, powered by Apache Pulsar, was built with a modern architecture designed for performance, scalability and flexibility.

Multi-Layer Architecture

Most data processing systems are built with monolithic architectures: either scale-up single-node designs or “distributed monoliths” in which all components are co-resident on each node. In contrast, Streamlio uses a decoupled architecture made up of three layers: message serving, processing, and stream storage. Each layer is distributed and can be scaled independently to match workload demands. Read more about the architecture.

Message serving

Receiving and serving data is handled by a scalable set of stateless message brokers. Because the brokers in this layer hold no persistent data, failover is fast and simple: another broker can take over for a failed broker without recopying or redistributing data.
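The failover behavior described above can be pictured with a small toy model. This is only an illustrative sketch, not Pulsar's or Streamlio's implementation: the `Cluster` class, its fields, and the "pick the first surviving broker" rule are all hypothetical. The point it shows is that when brokers hold only ownership metadata, failover changes a mapping and leaves the stored messages untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model (hypothetical): storage and serving are separate layers."""
    # Storage layer: topic -> list of messages. Lives outside any broker.
    storage: dict = field(default_factory=dict)
    # Serving layer: topic -> name of the broker that currently serves it.
    owner: dict = field(default_factory=dict)
    # Names of the brokers that are currently alive.
    brokers: set = field(default_factory=set)

    def publish(self, topic, message):
        # Assign an owner on first use; the rule (alphabetically first
        # broker) is an arbitrary choice made for this sketch.
        if topic not in self.owner:
            self.owner[topic] = min(self.brokers)
        self.storage.setdefault(topic, []).append(message)

    def fail_broker(self, name):
        # Failover only rewrites ownership metadata; no message is copied.
        # Assumes at least one other broker is still alive.
        self.brokers.discard(name)
        for topic, broker in self.owner.items():
            if broker == name:
                self.owner[topic] = min(self.brokers)


cluster = Cluster(brokers={"broker-a", "broker-b"})
cluster.publish("orders", "m1")
cluster.publish("orders", "m2")
stored_before = cluster.storage["orders"]

cluster.fail_broker("broker-a")
print(cluster.owner["orders"], cluster.storage["orders"])
```

In this sketch the topic moves from `broker-a` to `broker-b` while the stored message list stays the same object, which is the property that makes stateless failover cheap.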

Processing

Processing data as quickly as it arrives has typically required complex frameworks, new programming paradigms and new skills. Apache Pulsar, which powers the Streamlio solution, takes a different approach. Inspired by serverless processing, Pulsar allows developers to write Pulsar Functions without requiring a bulky SDK or new paradigms. Developers simply write their processing logic as a function that they deploy to a Pulsar cluster. Execution of functions is handled by a scalable set of processing resources that can be run locally or using containers and schedulers like Kubernetes. Learn more about processing capabilities in the Streamlio solution.
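As a minimal sketch of what "processing logic as a function" can look like, the example below uses the plain-function style that Pulsar Functions support for Python, where a module exposes a function named `process` that takes one input and returns one output. The transformation itself (trimming and upper-casing text) is an assumption chosen only for illustration, and packaging and deployment steps are not shown because they depend on the Pulsar version in use.

```python
def process(input):
    """Return the incoming message with surrounding whitespace removed
    and the text upper-cased.

    The parameter name `input` follows the convention used in Pulsar's
    Python function examples; the logic here is purely illustrative.
    """
    return input.strip().upper()


# The function is ordinary Python, so it can be exercised locally
# before it is ever deployed to a cluster.
print(process("  hello streams "))
```

Because the function has no dependency on a client library, it can be unit-tested like any other Python code.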

Stream storage

Data persistence and retention are provided by a scalable log-storage subsystem based on Apache BookKeeper. This decoupled storage layer provides greater flexibility, resiliency and throughput than monolithic architectures, which depend on the storage capacity and throughput of a single broker. Read more about the storage architecture used in the Streamlio solution.

Highlights

The technology powering the Streamlio solution, Apache Pulsar, has been proven in production, handling millions of topics and millions of messages per second.

Unique architecture

A scale-out message processing layer combined with Apache BookKeeper stream storage provides the performance, durability, and scalability needed for modern streaming

Proven performance

A scale-out architecture, performance isolation of read and write operations, and fine-grained tunability deliver low latency and high throughput for publishing and consuming data

No data loss

Data persistence guarantees and built-in replication across datacenters and geographic regions ensure that data is always protected and available without needing additional components or complex configurations

Easy scalability

Independent scaling of message processing and streaming data storage, without data redistribution, makes it possible to scale on the fly to support millions of topics and messages per second

Multi-tenancy

Designed with the security, isolation, resource management, and scalable performance needed to support large numbers of topics, publishers and consumers in a single solution to avoid the complexities of siloed data

Ease of use

Unified publish-subscribe and queuing in one solution, multi-language APIs, Kafka compatibility, and support for the OpenMessaging standard make it easy to develop data flows that meet the needs of diverse streaming applications
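To make the idea of unified publish-subscribe and queuing concrete, here is a toy in-memory model. It is a sketch under stated assumptions, not the Pulsar client API: the `Topic` class and its methods are hypothetical, and round-robin is just one possible way to share messages among consumers. It illustrates the general pattern in which every subscription sees every message (publish-subscribe), while consumers attached to the same subscription split the messages between them (queuing).

```python
class Topic:
    """Toy model (hypothetical) of one topic with named subscriptions."""

    def __init__(self):
        self._consumers = {}  # subscription name -> list of consumer inboxes
        self._next = {}       # subscription name -> index of the next consumer

    def subscribe(self, subscription):
        # Each call attaches one more consumer to the named subscription
        # and returns that consumer's inbox (a plain list).
        inbox = []
        self._consumers.setdefault(subscription, []).append(inbox)
        self._next.setdefault(subscription, 0)
        return inbox

    def publish(self, message):
        # Every subscription gets the message once (publish-subscribe);
        # within a subscription, consumers take turns (queuing).
        for name, inboxes in self._consumers.items():
            i = self._next[name] % len(inboxes)
            inboxes[i].append(message)
            self._next[name] = i + 1


topic = Topic()
audit = topic.subscribe("audit")        # single consumer: sees everything
worker_1 = topic.subscribe("workers")   # two consumers share one subscription
worker_2 = topic.subscribe("workers")

for n in range(4):
    topic.publish(f"msg-{n}")

print(audit)
print(worker_1, worker_2)
```

Here the `audit` subscription receives all four messages, while the two `workers` consumers each receive half of them.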

Powered by Open Source

Streamlio’s platform is powered by Apache Pulsar and the associated open source technologies that we helped to create and operate at production scale. Streamlio is committed to continuing to help innovate and extend these technologies through our contributions. Learn more about the open source communities we help support and how you can engage.