Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

Came across this code.

var dic = new Dictionary<int, string>();
for(int i=0; i<20000; i++)
{
    dic.Add(i, i.ToString());
}

var list = dic.Where(f => f.Value.StartsWith("1")).Select(f => f.Key);//.ToList(); //uncomment for fast results 
Console.WriteLine(list.GetType());
var list2 = dic.Where(f => list.Contains(f.Key)).ToList();
Console.WriteLine(list2.Count());

So when .ToList() is commented it's slow, when not - it's fast. Reproducable here How could this be explained? Should I always make everything ToList() to ensure speed (i.e. in which circumstances IEnumerable would be more preferable)? Note I'm talking only about linq to objects, I know linq to sql laziness and stuff.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
351 views
Welcome To Ask or Share your Answers For Others

1 Answer

This is because of deferred execution: when you comment out ToList, the enumeration is produced by evaluating the sequence of filters for each item in the dictionary. When you do a ToList, however, the sequence is "materialized" in memory, so all the evaluations are performed exactly once.

The logic behind the second Where without ToList looks like this:

// The logic is expanded for illustration only.
var list2 = new List<KeyValuePair<int,string>>();
foreach (var d in dict) {
    var list = new List<int>();
    // This nested loop does the same thing on each iteration,
    // redoing n times what could have been done only once.
    foreach (var f in dict) {
        if (f.Value.StartsWith("1")) {
            list.Add(f.Key);
        }
    }
    if (list.Contains(d.Key)) {
        list2.Add(d);
    }
}

The logic with ToList looks like this:

// The list is prepared once, and left alone
var list = new List<int>();
foreach (var f in dict) {
    if (f.Value.StartsWith("1")) {
        list.Add(f.Key);
    }
}
var list2 = new List<KeyValuePair<int,string>>();
// This loop uses the same list in all its iterations.
foreach (var d in dict) {
    if (list.Contains(d.Key)) {
        list2.Add(d);
    }
}

As you can see, the ToList transforms an O(n^2) program with two nested loops of size n into O(2*n) with two sequential loops of size n each.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...