Commit

update the post
Sung-Soo Kim committed Jan 27, 2014
1 parent a33971d commit 182d914
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion _posts/2014-01-28-operators.md
@@ -17,7 +17,8 @@ Operators
The simplest continuous query operators are *stateless*; examples include *duplicate-preserving projection*, *selection*, and *union*. These operators process new tuples *on-the-fly* without storing any temporary results, either by discarding unwanted attributes (projection) or by dropping tuples that do not satisfy the selection condition (selection). Technically, the union operator does buffer its inputs temporarily to ensure that its output stream is ordered. Figure 2.1(a) shows a simple example of selection (of all the “a” tuples) over the character stream *S1*.
A *non-blocking*, *pipelined join* (of two character streams, *S1* and *S2*) is illustrated in Figure 2.1(b). A *hash-based implementation* maintains hash tables on both inputs. When a new tuple arrives on one of the inputs, it is inserted into its hash table and probed against the other stream’s hash table to generate results involving the new tuple, if any. Joins of more than two streams and joins of streams with a static relation are straightforward extensions. In the former, for each arrival on one input, the states of all the other inputs are probed in some order. In the latter, new arrivals on the stream trigger the probing of the relation.
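The hash-based pipelined join described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of any particular DSMS: the class name `SymmetricHashJoin`, the `insert` method, and the `(side, key, value)` tuple layout are all hypothetical, and the join is assumed to be an equi-join on a single key.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Non-blocking equi-join of two streams, one hash table per input.

    Hypothetical sketch: names and tuple layout are assumptions.
    """

    def __init__(self):
        # tables[0] indexes tuples seen on S1, tables[1] those seen on S2.
        self.tables = (defaultdict(list), defaultdict(list))

    def insert(self, side, key, value):
        """Process one arrival on input `side` (0 for S1, 1 for S2).

        Returns the results produced by this arrival as
        (s1_value, s2_value) pairs; the list is empty if nothing matches.
        """
        # Step 1: insert the new tuple into its own hash table.
        self.tables[side][key].append(value)
        # Step 2: probe the other input's hash table.
        # .get() avoids creating empty buckets while probing.
        matches = self.tables[1 - side].get(key, [])
        if side == 0:
            return [(value, m) for m in matches]
        return [(m, value) for m in matches]


join = SymmetricHashJoin()
arrivals = [(0, "a", "s1-1"), (1, "b", "s2-1"), (1, "a", "s2-2"), (0, "b", "s1-2")]
results = []
for side, key, value in arrivals:
    results.extend(join.insert(side, key, value))
print(results)
```

Each arrival produces its results immediately, so the operator never waits for the end of either input. Both hash tables grow without bound in this sketch, which is the problem that window joins address.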
Since maintaining hash tables on *unbounded* streams is not practical, most DSMSs only support *window joins*. Query *Q2* below is an example of a *tumbling window join* (on the attribute attr) of two streams, *S1* and *S2*, where the result tuples must satisfy the join predicate and belong to the same one-minute tumbling window. Similar to *Q1*, tumbling windows are created by grouping on the timestamp attribute. At the end of each window, the *join operator* can clear its hash tables and start producing results for the next window.
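As a rough illustration of the tumbling window join, the sketch below clears both hash tables whenever an arrival falls into a new window. It assumes integer timestamps in seconds, arrivals in non-decreasing timestamp order across both streams, and an equi-join on one key; the class name `TumblingWindowJoin` and its method are hypothetical.

```python
from collections import defaultdict


class TumblingWindowJoin:
    """Equi-join restricted to tuples in the same tumbling window.

    Hypothetical sketch: assumes timestamps arrive in order.
    """

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.current_window = None
        self.tables = (defaultdict(list), defaultdict(list))

    def insert(self, side, timestamp, key, value):
        """Process one arrival on input `side` (0 for S1, 1 for S2).

        Returns (window_id, s1_value, s2_value) triples.
        """
        # Same grouping as timestamp/60 in the query: one id per window.
        window = timestamp // self.window_seconds
        if window != self.current_window:
            # A new window has started: discard the previous window's state.
            self.tables = (defaultdict(list), defaultdict(list))
            self.current_window = window
        # Insert into this input's table, then probe the other input's table.
        self.tables[side][key].append(value)
        matches = self.tables[1 - side].get(key, [])
        if side == 0:
            return [(window, value, m) for m in matches]
        return [(window, m, value) for m in matches]


join = TumblingWindowJoin(window_seconds=60)
arrivals = [
    (0, 10, "a", "x1"),
    (1, 50, "a", "y1"),  # same window as x1, so they join
    (0, 58, "b", "x2"),
    (1, 62, "b", "y2"),  # next window: state was cleared, x2 is gone
    (0, 70, "b", "x3"),  # same window as y2, so they join
]
results = []
for side, ts, key, value in arrivals:
    results.extend(join.insert(side, ts, key, value))
print(results)
```

Because state is dropped at every window boundary, memory use is bounded by the number of tuples in a single window.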
-```Q2: SELECT *
+```sql
+Q2: SELECT *
FROM S1, S2 WHERE S1.attr = S2.attr GROUP BY S1.timestamp/60 AS minute
```One disadvantage of *Q2* is that *matching tuples*, whose timestamps are only a few seconds apart but happen to fall into different tumbling windows, will not be reported. Another option is a *sliding window join*, where two tuples join if they satisfy the join predicate and if their timestamps are at most one window length, call it *w*, apart. A sliding window join may be expressed in a similar way to *Q3* below:
```Q3: SELECT *