As a Product Manager, you sit at the intersection of UX, business, and technology.
Your existing technology stack may limit your ability to implement your vision, especially if you are a data-driven Product Manager confronting exponential, almost unimaginable data growth. Customers require a modern experience, one that feels effortless, intuitive, and personalized. No customer has ever asked for a slower, static, generic experience, and your product vision includes a real-time requirement operating on massive amounts of data to deliver a meaningful new experience to your users.
You run a feasibility study internally and discover limitations in your current batch-first data warehouse stack. You hear the words: “We have too much data to process. You want real-time but our batch jobs run overnight and take 8 to 10 hours.” Your vision becomes a stale, wishful dream that you knew would work if only you had the right tools.
The key to augmenting your existing batch-first environment, unlocking the real-time potential of your data, and realizing your dream product consists of three equally important components, all operating in real-time: messaging, compute, and stream storage.
This transition from a batch-first paradigm to a real-time-first (or streaming-first) paradigm can be accomplished seamlessly using Streamlio, an enterprise-grade, unified, real-time solution integrating messaging, compute, and stream storage. Streamlio augments the existing batch-first infrastructure (typically a monolithic, legacy infrastructure) and makes turn-key real-time infrastructure possible. We seamlessly ingest and process incoming “hot” data (real-time data on the order of milliseconds) and “warm” data (stored data with an age ranging from seconds to days, months, or even a year). Connectors from Streamlio tap into data warehouse sources and transport their data via messaging, allowing real-time infrastructure to augment existing data warehouses and data lakes.
Real-time personalization is one example of a real-time use case. Imagine the data-driven physical market of the future, where a shopper places tomatoes, pasta sauce, and garlic bread into his smart-basket and companies are then able to bid, in real-time, on advertisements offering pricing discounts to the shopper’s smartphone. A shopper can choose to make his shopping profile accessible to advertisers, and this historical purchasing profile, combined with demographic information, food allergies, time of day, and real-time basket of physical goods, can drive real-time mobile advertising bids for, say, discounted Italian pasta campaigns. The value of showing him pasta sauce advertisements could have been near zero based on his previous day’s purchase history, but with the real-time basket of tomatoes, pasta sauce, and garlic bread, there is an increased propensity to purchase pasta, hence the increased value of a real-time ad bid.
This type of real-time personalization requires messaging, compute, and stream storage as follows:
The winning advertiser's offer is then routed directly to the edge node (via messaging) and transmitted to the shopper, with a display notification for a discounted pasta brand influencing his purchasing decision, all as he walks through the store and before he has made his next purchase.
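To make the compute step of this scenario concrete, here is a minimal Python sketch of how bids might be scored against a live basket and a winner chosen. Everything in it is an assumption made for illustration: the `Campaign` type, the `score_bid` and `pick_winner` functions, the affinity values, and the scoring formula itself are hypothetical and are not part of any Streamlio API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Campaign:
    """Hypothetical ad campaign: who is bidding and which items it targets."""
    advertiser: str
    target_items: Set[str]
    base_bid: float


def score_bid(campaign: Campaign, basket: Set[str],
              profile_affinity: Dict[str, float]) -> float:
    """Illustrative scoring rule (an assumption, not a real bidding model).

    The bid grows with the number of targeted items currently in the
    basket (real-time signal) plus an affinity derived from the shopper's
    historical profile (stored signal).
    """
    overlap = len(campaign.target_items & basket)
    affinity = profile_affinity.get(campaign.advertiser, 0.0)
    return campaign.base_bid * (overlap + affinity)


def pick_winner(campaigns: List[Campaign], basket: Set[str],
                profile_affinity: Dict[str, float]) -> Optional[Campaign]:
    """Return the campaign with the highest positive score, or None."""
    scored = [(score_bid(c, basket, profile_affinity), c) for c in campaigns]
    scored = [(s, c) for s, c in scored if s > 0]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    campaigns = [
        Campaign("italian-pasta", {"tomatoes", "pasta sauce", "garlic bread"}, 0.5),
        Campaign("breakfast-cereal", {"milk"}, 1.0),
    ]
    basket = {"tomatoes", "pasta sauce", "garlic bread"}
    winner = pick_winner(campaigns, basket, profile_affinity={})
    print(winner.advertiser if winner else "no bid")
```

With an empty purchase history the pasta campaign scores zero until the basket fills with matching items, which mirrors the point above: the real-time basket is what raises the value of the bid.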
Another emerging landscape of real-time needs is the Internet of Things and the Industrial Internet of Things (IoT/IIoT). The pattern is again messaging, compute, and stream storage. Sensor data from millions or hundreds of millions of sources flow into edge nodes. These edge nodes require a messaging system to buffer and transport data to a compute system, often within the same node, to aggregate and perform calculations (such as outlier detection) prior to discarding raw data and transmitting transformed data to a central datacenter. The volume of raw data in IoT/IIoT use cases would overwhelm the current network, and is only expected to grow.
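The edge-node pattern described here (aggregate a window of raw readings, flag outliers, then forward only a compact summary) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the `WindowSummary` type, the `summarize_window` function, and the z-score threshold are hypothetical choices, not a description of how any particular product implements edge compute.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass
class WindowSummary:
    """Compact record sent to the datacenter instead of the raw readings."""
    count: int
    mean: float
    minimum: float
    maximum: float
    outliers: List[float]


def summarize_window(readings: List[float],
                     z_threshold: float = 3.0) -> WindowSummary:
    """Aggregate one window of sensor readings at the edge.

    A reading is flagged as an outlier when its z-score (distance from the
    window mean in population standard deviations) exceeds z_threshold.
    The threshold of 3.0 is an assumed default, chosen for illustration.
    """
    if not readings:
        raise ValueError("window is empty")
    mu = mean(readings)
    sigma = pstdev(readings)
    outliers = [r for r in readings
                if sigma > 0 and abs(r - mu) / sigma > z_threshold]
    return WindowSummary(len(readings), mu, min(readings), max(readings),
                         outliers)


if __name__ == "__main__":
    window = [10.0] * 20 + [100.0]  # twenty normal readings and one spike
    summary = summarize_window(window)
    # Only this summary would leave the edge node; the raw window is dropped.
    print(summary)
```

The design point is the size reduction: however many readings arrive in a window, the datacenter receives a handful of numbers plus the flagged values.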
One such use case is anomaly detection to prevent a coordinated network attack or bot attack. To discover an attack, the aggregate sensor traffic must be calculated in real-time at the edge node and compared to historical ranges of normal traffic. If raw traffic were sent to the central datacenter, a bot attack could overwhelm the network and cause cascading outages. Preventing such a coordinated attack requires all data to be aggregated and compared against historical norms; if a surge is detected at the edge node, security measures can be deployed to prevent cascading failures.
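The comparison against historical norms can be illustrated with a short Python sketch. It assumes the edge node keeps a list of per-window traffic counts from normal operation and treats anything above the mean plus `k` sample standard deviations as a surge; the `is_surge` function, the choice of `k = 3.0`, and the sample numbers are all hypothetical and only meant to show the shape of the check.

```python
from statistics import mean, stdev
from typing import Sequence


def is_surge(current_count: int, historical_counts: Sequence[int],
             k: float = 3.0) -> bool:
    """Return True when current traffic exceeds the historical normal range.

    The upper bound of "normal" is assumed to be mean + k * stdev of the
    historical per-window counts. Real deployments would likely account
    for seasonality and time of day; this sketch does not.
    """
    if len(historical_counts) < 2:
        raise ValueError("need at least two historical windows")
    upper_bound = mean(historical_counts) + k * stdev(historical_counts)
    return current_count > upper_bound


if __name__ == "__main__":
    history = [100, 110, 90, 105, 95]  # made-up counts from normal windows
    for count in (120, 500):
        action = "deploy security measures" if is_surge(count, history) else "forward summary"
        print(count, action)
```

Because the check runs at the edge on aggregated counts, a surge can be acted on locally without first shipping the raw traffic to the central datacenter.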
Product Managers leveraging real-time data and developing use cases around Smart Cities can combine geolocation data generated by mobile phones, traffic data from autonomous vehicles and smart surfaces, and weather forecasts to create new products. Vast amounts of data are no longer merely stored and queried for simple business intelligence and policy decisions; they are used in real-time to drive automated functionality with no human in the loop. Automated actions can include optimizing traffic lights to reduce fuel consumption, or linking emergency services directly to sensor data generated by vehicles in an accident so that they can deploy instantly and save lives.
Streamlio is designed to seamlessly integrate messaging, compute, and stream storage, the three requirements of real-time. We deliver an end-to-end real-time solution that augments existing data stores and data warehouses, unlocking real-time products that Product Managers have envisioned but could never bring to market because of the difficulty of moving an existing batch-first technology stack to a streaming-first paradigm. Data will continue to increase in volume and velocity, and a batch-first paradigm based on legacy storage is insufficient to deliver the user experience modern customers expect. We are the ex-Twitter/Yahoo co-creators of the underlying technologies, Apache Pulsar (for durable messaging), Heron (for compute), and BookKeeper (for stream storage), with proven reliability at scale.