A cloud-native architecture for messaging, processing and stream storage
Legacy architectures for data processing were designed for enterprise datacenters and monolithic applications. Combining all components of data infrastructure in a monolithic stack, they are unable to take full advantage of modern virtualization, container, and cloud technologies. That is also true for existing messaging and streaming solutions.
The Streamlio platform, powered by Apache Pulsar, was built with a modern architecture designed for performance, scalability and flexibility.
Most data processing systems are built with monolithic architectures–either scale-up single-node architectures or “distributed monoliths” in which all components are co-resident on each node. In contrast, Streamlio uses a decoupled architecture consisting of multiple layers–message serving, processing and stream storage. Each of those layers is distributed and independently scalable to precisely match workload demands. Read more about the architecture.
Receiving and serving data is handled by a scalable, stateless set of message brokers. Because the brokers in this layer hold no persistent data, failover is fast and simple: a new broker can take over for a failed broker without recopying or redistributing data.
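To illustrate why stateless brokers make failover cheap, here is a toy model in Python. It is only a sketch under simplifying assumptions, not Pulsar's actual implementation: the `Cluster` class, its fields and the least-loaded assignment rule are all hypothetical. The point it shows is that a broker failure changes topic ownership only, while the stored messages stay where they are.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model (hypothetical): brokers own topics, storage lives elsewhere."""
    brokers: set
    storage: dict = field(default_factory=dict)  # topic -> messages, held outside the brokers
    owner: dict = field(default_factory=dict)    # topic -> broker currently serving it

    def assign(self, topic):
        # Count topics per live broker, then pick the least-loaded one
        # (ties broken by name so the result is deterministic).
        load = {b: 0 for b in self.brokers}
        for b in self.owner.values():
            if b in load:
                load[b] += 1
        chosen = min(sorted(load), key=lambda b: load[b])
        self.owner[topic] = chosen
        return chosen

    def publish(self, topic, message):
        # Make sure a live broker owns the topic, then persist the message
        # in the separate storage layer.
        if self.owner.get(topic) not in self.brokers:
            self.assign(topic)
        self.storage.setdefault(topic, []).append(message)

    def fail(self, broker):
        # Failover: reassign ownership only. No message is copied or moved.
        self.brokers.discard(broker)
        orphaned = [t for t, b in self.owner.items() if b == broker]
        for topic in orphaned:
            self.assign(topic)
        return orphaned


cluster = Cluster(brokers={"b1", "b2"})
cluster.publish("orders", "m1")
cluster.publish("clicks", "m2")
reassigned = cluster.fail("b1")
```

In this sketch, failing `b1` hands the `orders` topic to `b2`, and `cluster.storage` is exactly what it was before the failure. The model assumes at least one broker survives; it does not handle an empty cluster.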
Processing data as quickly as it arrives has typically required complex frameworks, new programming paradigms and new skills. Apache Pulsar, which powers the Streamlio solution, takes a different approach. Inspired by serverless processing, Pulsar allows developers to write Pulsar Functions without requiring a bulky SDK or new paradigms. Developers simply write their processing logic as a function that they deploy to a Pulsar cluster. Execution of functions is handled by a scalable set of processing resources that can be run locally or using containers and schedulers like Kubernetes. Learn more about processing capabilities in the Streamlio solution.
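As a minimal sketch of what such a function can look like, the example below assumes Pulsar's Python "native function" style, in which a module exposes a plain function named `process` that receives each message payload and whose return value is published to the output topic. The processing logic (appending an exclamation mark) is purely illustrative.

```python
def process(input):
    """Illustrative processing logic: append "!" to each incoming message.

    The parameter is named `input` to follow the convention used in
    Pulsar's examples, even though it shadows a Python builtin.
    """
    return "{}!".format(input)
```

Because the function is ordinary Python with no framework imports, it can be unit-tested locally by calling `process("hello")` before it is deployed to a cluster.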
Data persistence and retention are provided by a scalable log-storage subsystem based on Apache BookKeeper. This decoupled storage layer provides greater flexibility, resiliency and throughput than monolithic architectures, in which each stream is limited by the storage capacity and throughput of a single broker. Read more about the storage architecture used in the Streamlio solution.
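One way to picture how a distributed log store spreads load is round-robin striping: each entry is written to a subset of storage nodes, and consecutive entries start at consecutive nodes. The Python sketch below illustrates that idea only. The function name `write_set`, the node names and the parameter values are assumptions made for this example, and the sketch is not BookKeeper's actual code or API.

```python
def write_set(entry_id, ensemble, write_quorum):
    """Return the storage nodes that hold a given entry (hypothetical helper).

    ensemble:     ordered list of storage nodes available to the log
    write_quorum: number of copies written for each entry
    """
    n = len(ensemble)
    # Start at a position derived from the entry id and take the next
    # `write_quorum` nodes, wrapping around the end of the list.
    return [ensemble[(entry_id + k) % n] for k in range(write_quorum)]


nodes = ["bk1", "bk2", "bk3"]
placements = [write_set(i, nodes, write_quorum=2) for i in range(3)]
```

With three nodes and two copies per entry, the three entries land on `bk1`/`bk2`, `bk2`/`bk3` and `bk3`/`bk1`. Every node therefore takes part in both writes and reads, so no single machine bounds the capacity or throughput of the stream.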
The technology powering the Streamlio solution, Apache Pulsar, has been proven in production, handling millions of topics and millions of messages per second.
A scale-out message processing layer combined with the Apache BookKeeper stream storage solution provides the combination of performance, durability, and scalability needed for modern streaming.
Scale-out architecture, performance isolation of read and write operations, and fine-grained tunability deliver low latency and high throughput for publishing and consuming data.
Data persistence guarantees and built-in replication across datacenters and geographic regions ensure that data is always protected and available without needing additional components or complex configurations.
Independent scaling of message processing and streaming data storage, without data redistribution, makes it possible to scale on the fly to support millions of topics and millions of messages per second.
The platform is designed with the security, isolation, resource management, and scalable performance needed to support large numbers of topics, publishers and consumers in a single solution, avoiding the complexities of siloed data.
Streamlio’s platform is powered by Apache Pulsar and the associated open source technologies that we helped to create and operate at production scale. Streamlio is committed to continuing to help innovate and extend these technologies through our contributions. Learn more about the open source communities we help support and how you can engage.