In LINQ Where
is a streaming operator. Where-as OrderByDescending
is a non-streaming operator. AFAIK, a streaming operator only gathers the next item that is necessary. A non-streaming operator evaluates the entire data stream at once.
I fail to see the relevance of defining a Streaming Operator. To me, it is redundant with Deferred Execution. Take the example where I have written a custom extension and consumed it using the where operator and and orderby.
public static class ExtensionStuff
{
public static IEnumerable<int> Where(this IEnumerable<int> sequence, Func<int, bool> predicate)
{
foreach (int i in sequence)
{
if (predicate(i))
{
yield return i;
}
}
}
}
public static void Main()
{
TestLinq3();
}
private static void TestLinq3()
{
int[] items = { 1, 2, 3,4 };
var selected = items.Where(i => i < 3)
.OrderByDescending(i => i);
Write(selected);
}
private static void Write(IEnumerable<int> selected)
{
foreach(var i in selected)
Console.WriteLine(i);
}
In either case, Where
needs to evaluate each element in order to determine which elements meet the condition. The fact that it yields seems to only become relevant because the operator gains deferred execution.
So, what is the importance of Streaming Operators?
There are two aspects: speed and memory.
The speed aspect becomes more apparent when you use a method like .Take()
to only consume a portion of the original result set.
// Consumes ten elements, yields 5 results.
Enumerable.Range(1, 1000000).Where(i => i % 2 == 0)
.Take(5)
.ToList();
// Consumes one million elements, yields 5 results.
Enumerable.Range(1, 1000000).Where(i => i % 2 == 0)
.OrderByDescending(i => i)
.Take(5)
.ToList();
Because the first example uses only streaming operators before the call to Take
, you only end up yielding values 1 through 10 before Take
stops evaluating. Furthermore, only one value is loaded into memory at a time, so you have a very small memory footprint.
In the second example, OrderByDescending
is not streaming, so the moment Take pulls the first item, the entire result that's passed through the Where
filter has to be placed in memory for sorting. This could take a long time and produce a big memory footprint.
Even if you weren't using Take
, the memory issue can be important. For example:
// Puts half a million elements in memory, sorts, then outputs them.
var numbers = Enumerable.Range(1, 1000000).Where(i => i % 2 == 0)
.OrderByDescending(i => i);
foreach(var number in numbers) Console.WriteLine(number);
// Puts one element in memory at a time.
var numbers = Enumerable.Range(1, 1000000).Where(i => i % 2 == 0);
foreach(var number in numbers) Console.WriteLine(number);
The fact that it yields seems to only become relevant because the operator gains deferred execution.
So, what is the importance of Streaming Operators?
I.e. you could not process infinite sequences with buffering / non-streaming extension methods - while you can "run" such a sequence (until you abort) just fine using only streaming extension methods.
Take for example this method:
public IEnumerable<int> GetNumbers(int start)
{
int num = start;
while(true)
{
yield return num;
num++;
}
}
You can use Where
just fine:
foreach (var num in GetNumbers(0).Where(x => x % 2 == 0))
{
Console.WriteLine(num);
}
OrderBy()
would not work in this case since it would have to exhaustively enumerate the results before emitting a single number.
while(true)yield return 4;
if you wanted - Servy 2012-04-05 21:23
Just to be explicit; in the case you mentioned there's no advantage to the fact that where streams, since the orderby sucks the whole thing in anyway. There are however times where the advantages of streaming are used (other answers/comments have given examples), so all LINQ operators stream to the best of their ability. Orderby streams as much as it can, which happens to be not very much. Where streams very effectively.
Where
and OrderBy
- Zev Spitz 2016-09-29 01:35
SkipWhile
could qualify, depending on how you define what the quantity of "streaming" is - Servy 2016-09-29 13:19
Select
), whereas non-streaming requires all the values in order to evaluate the first value (e.g. OrderBy
). By that definition, why should SkipWhile
be any less streaming than Select
? The first iteration doesn't require all the values - Zev Spitz 2016-09-29 13:27
SkipWhile
doesn't need to consume all of the input sequence in order to return the first item, but it needs to consume an arbitrary percentage of it, at which point it then switches to consuming one item from the input sequence for each result yielded. So again, this is why I say it depends on how you define your terms; there is no single well defined definition in this context - Servy 2016-09-29 13:31
Where
should be thought of as intermittently streaming and non-streaming. But some condition that determines if the value is passed to the next operator / foreach
doesn't fundamentally change the behavior of the query; each value can be evaluated precisely when it's needed to pass onto other operators. OrderBy
is very different -- the first result requires an evaluation of all the results - Zev Spitz 2016-09-30 00:33
Where
does or doesn't stream. I'm using a analog definition of the term to describe the degree to which something streams its data. Where
doesn't stream the data perfectly, the way something like Select
does. It streams the data quite a lot more than other operators, such as OrderBy
. It's a spectrum. Where
is rather close to the "perfect streaming" site without being quite there, OrderBy
is very close to the other side. If you want to draw a line on that spectrum you can - Servy 2016-09-30 13:07
Where
can span the entire gamut of the degree of streaming -- Where(x=>true)
is fully streaming like Select
, Where(x=>false)
is nonstreaming like OrderBy
, and the degree of streamin in var rnd = new Random(); var query = data.Where(x=>x>rnd.Next())
depends on the algorithm of the psuedorandom number generator. This means that it is pointless to categorize methods in terms of their degree of streaming, because it depends on the arguments passed to the method - Zev Spitz 2016-10-01 16:47
items
- cHao 2012-04-05 21:11