Safe foreach loops with C#
This is a rewrite of my old post about bullet-proof foreach loops. It covers how foreach works internally, shows how to write safe loops, and explains how to modify a collection that foreach is stepping through. It is useful reading for anyone who wants to understand foreach loops better.
Internals of foreach
Internally, foreach operates on IEnumerable or IEnumerable<T>. Take the following piece of code:
public static void DoSomething(IList<MyClass> items)
{
    foreach (var item in items)
    {
    }
}
The compiler translates it to the following when we compile the code:
public static void DoSomething(IList<MyClass> items)
{
    foreach (MyClass myClass in (IEnumerable<MyClass>)items)
        ;
}
Here is the hierarchy of interfaces that foreach needs in order to operate on a collection.
public interface IEnumerable<out T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}
public interface IEnumerable
{
    IEnumerator GetEnumerator();
}
public interface IEnumerator<out T> : IEnumerator, IDisposable
{
    T Current { get; }
}
public interface IEnumerator
{
    object Current { get; }
    bool MoveNext();
    void Reset();
}
We can use foreach on everything that implements the IEnumerable interface. Notice that the generic IEnumerator<T> hides the Current property of IEnumerator and replaces it with a generic version.
When foreach runs, it asks the collection for an enumerator and steps through it by calling the MoveNext() method. After every successful call the current element is available through the Current property. This element is assigned to our loop variable and the loop body is executed.
In a nutshell, this is how foreach works.
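The steps described above can be written out by hand. The following is a minimal sketch of roughly what the compiler generates for a foreach over IEnumerable<T>. The class and method names are made up for illustration, and the real generated code differs in details.

```csharp
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Sums the items with a plain foreach loop.
    public static int SumWithForeach(IEnumerable<int> items)
    {
        var sum = 0;
        foreach (var item in items)
        {
            sum += item;
        }
        return sum;
    }

    // Approximately the same loop written against the enumerator directly.
    public static int SumByHand(IEnumerable<int> items)
    {
        var sum = 0;
        // IEnumerator<T> is IDisposable, so the enumerator is disposed when the loop ends.
        using (IEnumerator<int> enumerator = items.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                int item = enumerator.Current; // the loop variable
                sum += item;                   // the loop body
            }
        }
        return sum;
    }
}
```

Both methods should return the same result for the same input, for example 6 for a list holding 1, 2 and 3.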
Back to early days: ArrayList
Let’s look at an old classic before coming back to the present: the good old ArrayList. ArrayList, as we remember, is a list of objects of any type, backed by an array. Here is a piece of demo code.
public static void DoSomething(ArrayList items)
{
    foreach(MyClass item in items)
    {
        // Do something with item
    }
}
Although this is a simple piece of code, it already has problems built in:
- items can be null,
- item can be null,
- item can be of some other type,
- there’s a cast and it comes with some overhead.
The first three are concrete error situations and the last one is outside the scope of this blog post. Here is a piece of code that shows how to run into trouble with the DoSomething() method.
var item = new MyClass();
DoSomething(null); // NullReferenceException
DoSomething(new ArrayList { item, null }); // NullReferenceException as soon as the loop body uses item
DoSomething(new ArrayList { item, "aaa" }); // InvalidCastException
By moving from ArrayList to the generic IEnumerable<T> we get rid of the last two problems. Additional null checks cover the first two.
public static void DoSomething(IEnumerable<MyClass> items)
{
    if(items == null)
    {
        return;
    }
    foreach(var item in items)
    {
        if(item == null)
        {
            continue;
        }
        item.MyMethod();
    }
}
Some notes:
- IEnumerable<T> is the smallest generic interface that foreach works with,
- It’s okay to use IList<T> or ICollection<T> or some other generic collection instead,
- Using generics makes it impossible to add objects of some other type to the collection.
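Sometimes the ArrayList comes from code we cannot change. In that case one option is the LINQ OfType<T>() method, which skips null entries and entries of other types instead of throwing. The sketch below assumes a hypothetical Document class in place of MyClass; the class and method names are illustrative only.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type used only for this example.
public class Document
{
    public string Title { get; set; } = "";
}

public static class LegacyListReader
{
    // Collects the titles of all Document instances found in a legacy ArrayList.
    public static List<string> ReadTitles(ArrayList items)
    {
        var titles = new List<string>();
        if (items == null)
        {
            return titles;
        }
        // OfType<Document>() yields only non-null entries that really are Documents.
        foreach (var document in items.OfType<Document>())
        {
            titles.Add(document.Title);
        }
        return titles;
    }
}
```

The trade-off is that unexpected entries are silently ignored. If a wrong type in the list indicates a bug, an explicit type check that logs or throws may be the better choice.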
Now we have the basics covered. The loop above is bullet-proof enough: when something goes wrong, it goes wrong either in the enumerator or in the loop body. Why do I say something can go wrong in the enumerator? Well, we can write our own enumerator and make a mistake there. But this is something that our loop cannot control.
Dark side of foreach
foreach is tricky for beginners. Take a look at the following code and guess what happens when we run it.
static void Main(string[] args)
{
    var list = new List<MyClass> { new MyClass(), new MyClass() };
    DoSomething(list);
}
public static void DoSomething(ICollection<MyClass> items)
{
    if(items == null)
    {
        return;
    }
    foreach(var item in items)
    {
        if(item == null)
        {
            continue;
        }
        items.Remove(item);
    }
}
We end up with the following exception: InvalidOperationException: ‘Collection was modified; enumeration operation may not execute.’ We cannot modify a collection that foreach is traversing. The following piece of code ends up with exactly the same exception.
foreach(var item in items)
{
    if(item == null)
    {
        continue;
    }
    items.Add(new MyClass());
}
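It may help to see where the exception actually comes from. With List<T>, the modification itself succeeds, and it is the next MoveNext() call on the enumerator that notices the change and throws. The following minimal sketch drives the enumerator by hand to show this; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ModificationCheck
{
    // Returns true when MoveNext() rejects the enumerator after the list was modified.
    public static bool MoveNextThrowsAfterAdd()
    {
        var list = new List<int> { 1, 2 };
        using (IEnumerator<int> enumerator = list.GetEnumerator())
        {
            enumerator.MoveNext();   // fine, Current is now 1
            list.Add(3);             // the modification itself does not throw
            try
            {
                enumerator.MoveNext();   // the enumerator detects the change here
                return false;
            }
            catch (InvalidOperationException)
            {
                return true;
            }
        }
    }
}
```

This is why the stack trace of the exception points at the foreach line and not at the Add() or Remove() call inside the loop body.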
For loops are free of this problem, but it is easy to run past the collection boundaries if we are not careful.
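To illustrate what "not careful" can mean, here is a small sketch of two common mistakes when removing items in a for loop that counts upwards. The class and method names are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ForwardLoopPitfalls
{
    // Tries to remove every item while counting upwards.
    // Each removal shifts the remaining items left, so every second item is skipped.
    // Returns the number of items that were left behind.
    public static int RemoveAllCountingUp(List<int> items)
    {
        for (var i = 0; i < items.Count; i++)
        {
            items.RemoveAt(i);
        }
        return items.Count;
    }

    // Same loop, but the count is read only once before the loop starts.
    // Returns true when the loop ran past the end of the shrinking list.
    public static bool CachedCountRunsOutOfBounds(List<int> items)
    {
        var count = items.Count;
        try
        {
            for (var i = 0; i < count; i++)
            {
                items.RemoveAt(i);
            }
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }
}
```

With a list of four items, the first method leaves two items in the list and the second one ends in ArgumentOutOfRangeException on its third iteration.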
Deleting elements in loop
We often need to delete items from a collection while going through it. Here are some simple tricks for doing that.
With foreach we can use LINQ to get a copy of the collection.
foreach(var item in items.ToList())
{
    if(item == null)
    {
        continue;
    }
    items.Remove(item);
}
Calling the ToList() LINQ method on a collection gives us a new list with the same items. The new list is not bound to the original collection in any way, so removing items from the original does not disturb the loop.
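Here is a self-contained sketch of the same pattern with a condition added. It assumes a hypothetical Order class with a Shipped flag; both are invented for this example. Note that ToList() copies only the references, so the snapshot and the original collection still share the same objects.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type used only for this example.
public class Order
{
    public bool Shipped { get; set; }
}

public static class SnapshotDemo
{
    // Removes shipped orders from the collection while looping over a snapshot of it.
    public static void RemoveShipped(ICollection<Order> orders)
    {
        if (orders == null)
        {
            return;
        }
        // The loop walks the copy, so removing from the original is safe.
        foreach (var order in orders.ToList())
        {
            if (order != null && order.Shipped)
            {
                orders.Remove(order);
            }
        }
    }
}
```

The copy costs extra memory proportional to the number of items, which is fine for small collections.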
From real life: I once helped debug a nasty piece of spaghetti code where changes were made to the collection that foreach was iterating over. I don’t know the reason, but the author of this code ran into InvalidOperationException multiple times when trying to edit the collection in the foreach body. Instead of taking a moment to read the documentation, he or she wrapped the problematic loops in try-catch blocks and set a flag when the catch was hit. Then there was code that tried some other approach based on the data in the collection. Lesson: if an unfamiliar exception is thrown, stop for a moment and find out what is actually wrong with the code.
Big collections are something we don’t want to duplicate in memory. We want to use the same collection in the loop and at the same time remove items from it. In this case we can use a regular for loop that moves from the end of the collection to the start. This requires a collection with an indexer, such as IList<T>. Here is a sample.
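for(var i = items.Count - 1; i >= 0; i--)
{
    var item = items[i];
    if(item == null)
    {
        continue;
    }
    // RemoveAt(i) removes exactly the element at index i.
    // Remove(item) would search from the start and remove the first equal element.
    items.RemoveAt(i);
}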
The trick is simple: we only delete elements whose index is greater than or equal to the loop variable i. There is no danger of running past the loop boundaries as long as we don’t delete elements that the loop has not stepped over yet.
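When the collection is a List<T>, its built-in RemoveAll() method is another way to remove items in place without copying the list. The sketch below uses a list of integers and a made-up condition to keep the example short; the class and method names are illustrative only.

```csharp
using System.Collections.Generic;

public static class InPlaceRemoval
{
    // Removes all negative numbers from the list without creating a copy of it.
    // Returns the number of removed elements, as reported by RemoveAll().
    public static int RemoveNegatives(List<int> numbers)
    {
        return numbers.RemoveAll(n => n < 0);
    }
}
```

RemoveAll() is defined on List<T> itself and not on IList<T> or ICollection<T>, so for those interfaces the backward for loop remains the general option.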
Good tips… especially for all the beginners out there.
I understand your issues with ArrayList, but could you please elaborate on what’s wrong with IList?
Afaik, using IList over List basically tells the consumer of the code a thing or two about the intent of the value; it doesn’t make it any more or less dangerous.
IList is just an interface that lists implement. You can also use IList instead of ArrayList in this example.
What’s wrong with this code?
Looks pretty similar and works just the same, no?
public void SaveChanges(ArrayList entries)
{
    if(entries == null)
        return;
    foreach(MyObject bizObject in entries)
    {
        if(bizObject == null)
            continue;
        if(bizObject.IsDirty)
            bizObject.Save();
    }
}
ArrayList can contain all kinds of objects. If you have objects of class A in an ArrayList, nothing stops you from adding objects of class B to it. So, to be safe, you should check the object type. Using a generic list in this case guarantees that there are no objects of other types.
If you are using value types then you lose in performance because of unboxing operations.