Skip to content

A wrapper around the Apache Flink Java lib to deploy INEv plans for distributed complex event processing with extended functionality

License

Notifications You must be signed in to change notification settings

Kristina-Pianykh/flink-multinode

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Flink Multinode

A wrapper around the Flink Java framework for deploying INEv plans for complex event processing (CEP) with extended functionality. The work in this repository builds on the efforts of S. Akili, S. Purtzel, and D. Pahl1 and features adaptivity of the originally generated executution plans to flactuations in event frequency at runtime.

  • 🎓 This project originated as a product of the research and provides empirical support for the claims made in the graduation thesis.

  • 📝 Check out my blog posts on the learnings.

Adaptivity Feature

The core feature is the ability of the deployed system where one of the sub-queries in the evaluation plan has a multi-sink placement to fall back to a single-node placement of the said sub-query as soon as the original plan becomes ineffient. The inefficiency refers to inflated amounts of data transmitted in the network.

Original Network Original Network Repaired Network Repaired Network

Architecture

In brief, each node in the event processing network runs its own instance of the Flink CEP engine fronted with a TCP server to process incoming connections and backed with the Sink to filter, deduplicate and forward relevant events further. Data is trasmitted using Flink streaming. The Monitor component track event rates and triggers the global Coordinator component based on the conditions specified in the paper on INEv graphs. The Coordinator acts as a centralized manager to generate a new evaluation plan and disseminate it among the nodes.

Architecture

Limitations (aka Optimization Potential)

  • No revert feature for the applied repair
  • The repository provides a POC for evaluation plans featuring just one instance of a multi-sink query. This, however, can be extended to be more generic.
  • Only AND and SEQ query operators
  • The Coordinator components can be integrated with one of the nodes in the network to eliminate the centralization element
  • Low data throughput in the Flink CEP engine. A slight increase in data input frequency incurs great processing latency.

Latency Cumulative

Footnotes

  1. Affilicated with the Humboldt University of Berlin. The codebase is stored in a private repository on a privately hosted instance of GitLab.

About

A wrapper around the Apache Flink Java lib to deploy INEv plans for distributed complex event processing with extended functionality

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published