In recent years we have witnessed the rise of applications in which data is continuously generated and pushed towards consumers in real time through complex processing pipelines. Software connectors such as remote procedure call (RPC) do not fit the needs of such applications, for which the publish/subscribe and stream connectors are more suitable.
This paper introduces the design space of the stream software connector by analyzing recent stream processing engine frameworks and domain-specific languages featuring native streaming support. On the one hand, we want to classify and compare streaming systems based on a taxonomy derived from the wide range of features they offer (i.e., pipeline dynamicity and representation, load balancing, and deployment flexibility).
On the other hand, the gaps we identify in the design space point to future research directions in the area of distributed stream processing.
To do so, we gather architectural knowledge, expressed as architectural issues and alternatives, by surveying the most important architectural decisions made by the designers of several representative streaming framework architectures.