A wrapper around the Flink Java framework for deploying INEv plans for complex event processing (CEP) with extended functionality. The work in this repository builds on the efforts of S. Akili, S. Purtzel, and D. Pahl1 and features adaptivity of the originally generated executution plans to flactuations in event frequency at runtime.
-
🎓 This project originated as a product of the research and provides empirical support for the claims made in the graduation thesis.
-
📝 Check out my blog posts on the learnings.
The core feature is the ability of the deployed system where one of the sub-queries in the evaluation plan has a multi-sink placement to fall back to a single-node placement of the said sub-query as soon as the original plan becomes ineffient. The inefficiency refers to inflated amounts of data transmitted in the network.
In brief, each node in the event processing network runs its own instance of the Flink CEP engine fronted with a TCP server to process incoming connections and backed with the Sink
to filter, deduplicate and forward relevant events further. Data is trasmitted using Flink streaming. The Monitor
component track event rates and triggers the global Coordinator
component based on the conditions specified in the paper on INEv graphs. The Coordinator
acts as a centralized manager to generate a new evaluation plan and disseminate it among the nodes.
- No revert feature for the applied repair
- The repository provides a POC for evaluation plans featuring just one instance of a multi-sink query. This, however, can be extended to be more generic.
- Only
AND
andSEQ
query operators - The
Coordinator
components can be integrated with one of the nodes in the network to eliminate the centralization element - Low data throughput in the Flink CEP engine. A slight increase in data input frequency incurs great processing latency.
Footnotes
-
Affilicated with the Humboldt University of Berlin. The codebase is stored in a private repository on a privately hosted instance of GitLab. ↩