Getting distinct values from arrays (through .NET Framework history)

Getting distinct values from arrays is not a unique problem. Here I will show some options for doing it. We will use an array of integers in the examples. This blog entry also shows something of the mighty evolution of the .NET Framework.

Let’s say we have an array of integers like this:

int[] nrs = new int[10];
nrs[0] = 1;
nrs[1] = 2;
nrs[2] = 2;
nrs[3] = 3;
nrs[4] = 4;
nrs[5] = 4;
nrs[6] = 5;
nrs[7] = 5;
nrs[8] = 6;
nrs[9] = 7;

We can see that this array contains duplicate elements. Let’s now try to get the distinct values out of it. The result we expect is the following array:

nrs[0] = 1;
nrs[1] = 2;
nrs[2] = 3;
nrs[3] = 4;
nrs[4] = 5;
nrs[5] = 6;
nrs[6] = 7;

1. .NET Framework 1.0/1.1

The first method we will look at works on all .NET Framework versions, including the oldest ones.

// Requires: using System.Collections;
public int[] GetDistinctValues(int[] array)
{
    ArrayList list = new ArrayList();

    for (int i = 0; i < array.Length; i++)
    {
        // Skip values that are already in the list.
        if (list.Contains(array[i]))
            continue;
        list.Add(array[i]);
    }
    return (int[])list.ToArray(typeof(int));
}

This method works on all versions of the .NET Framework, but it has performance drawbacks on large arrays. ArrayList is built to hold objects, so using it with value types adds overhead: each value is boxed when it is stored as an object, and unboxed again when it is cast back from object to the value type.
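To make the boxing cost concrete, here is a minimal sketch. The class and method names are made up for this illustration and are not part of the code above. Every int added to an ArrayList is stored as an object, so reading it back needs a cast that unboxes it:

```csharp
using System;
using System.Collections;

public static class BoxingDemo
{
    // Illustrative helper: stores ints in an ArrayList and sums them again.
    public static int SumBoxed(int[] values)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < values.Length; i++)
        {
            list.Add(values[i]);      // int is boxed: Add takes an object
        }

        int sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += (int)list[i];      // the cast unboxes the object back to int
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumBoxed(new int[] { 1, 2, 2, 3 }));   // prints 8
    }
}
```

In GetDistinctValues the same boxing also happens on every Contains call, because ArrayList.Contains takes an object as well.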

2. .NET Framework 2.0

.NET Framework 2.0 lets us avoid the overhead of boxing and unboxing because we can use generics. The following method is more efficient because List<int> stores the integers directly, so no boxing or unboxing takes place.

// Requires: using System.Collections.Generic;
public int[] GetDistinctValues(int[] array)
{
    List<int> list = new List<int>();

    for (int i = 0; i < array.Length; i++)
    {
        // Skip values that are already in the list.
        if (list.Contains(array[i]))
            continue;
        list.Add(array[i]);
    }
    return list.ToArray();
}

This method doesn’t use the full power of generics. So let’s make it usable with any type we want.

// Requires: using System.Collections.Generic;
public T[] GetDistinctValues<T>(T[] array)
{
    List<T> tmp = new List<T>();

    for (int i = 0; i < array.Length; i++)
    {
        // Skip values that are already in the list.
        if (tmp.Contains(array[i]))
            continue;
        tmp.Add(array[i]);
    }
    return tmp.ToArray();
}

Now you can use this method with strings, doubles and objects as well. Here is an example with the array shown at the beginning of this blog entry:

nrs = GetDistinctValues<int>(nrs);

Now we have used the power of .NET Framework 2.0 generics, and as we can see, the method has become very general.
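As a sketch of that generality, the same generic method can be called with a string array. The wrapper class name DistinctHelper and the sample data are assumptions made for this example; note that type inference lets you drop the explicit type argument.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctHelper
{
    // Same algorithm as the generic method: keep the first occurrence of each value.
    public static T[] GetDistinctValues<T>(T[] array)
    {
        List<T> tmp = new List<T>();
        for (int i = 0; i < array.Length; i++)
        {
            if (tmp.Contains(array[i]))
                continue;
            tmp.Add(array[i]);
        }
        return tmp.ToArray();
    }

    public static void Main()
    {
        string[] names = new string[] { "a", "b", "a", "c", "b" };
        string[] distinct = GetDistinctValues(names);    // T is inferred as string
        Console.WriteLine(string.Join(", ", distinct));  // prints a, b, c
    }
}
```

List<T>.Contains compares elements using the type's default equality, so for your own classes duplicates are only detected by value if the class overrides Equals (or implements IEquatable<T>).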

3. .NET Framework 3.5

The last option is the shortest one: it takes no more than one line of code.

nrs = nrs.Distinct().ToArray();

This is possible thanks to the LINQ extension methods (in the System.Linq namespace) that .NET Framework 3.5 offers for arrays, lists and other sequences. So when you are using Visual Studio 2008 you can solve problems like this easily.
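For completeness, here is a small sketch of that call inside a full program, including the using directive it depends on. It also shows the Distinct overload that takes an IEqualityComparer<T>, here with StringComparer.OrdinalIgnoreCase. The class name, method names and sample strings are made up for this example.

```csharp
using System;
using System.Linq;

public static class LinqDistinctDemo
{
    public static int[] DistinctInts(int[] values)
    {
        return values.Distinct().ToArray();
    }

    // Treats "Apple" and "apple" as the same value.
    public static string[] DistinctIgnoreCase(string[] values)
    {
        return values.Distinct(StringComparer.OrdinalIgnoreCase).ToArray();
    }

    public static void Main()
    {
        int[] nrs = new int[] { 1, 2, 2, 3, 4, 4, 5, 5, 6, 7 };
        Console.WriteLine(DistinctInts(nrs).Length);            // prints 7

        string[] words = new string[] { "Apple", "apple", "Pear" };
        Console.WriteLine(DistinctIgnoreCase(words).Length);    // prints 2
    }
}
```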

Gunnar Peipman

Gunnar Peipman is an ASP.NET, Azure and SharePoint fan, Estonian Microsoft user group leader, blogger, conference speaker, teacher, and tech maniac. Since 2008 he has been a Microsoft MVP specializing in ASP.NET.

    9 thoughts on “Getting distinct values from arrays (through .NET Framework history)”

    • May 15, 2008 at 2:28 pm

      Nice, interesting post, and a good application of a generic method in the .NET 2.0 section.

    • May 16, 2008 at 3:20 pm

      Word of warning: if you have more than a couple hundred elements in your array, you’d be much better off doing a quick sort on your list, then doing a single pass through the array and only adding to the distinct list when the element is not the same as the previous element. Depending on your memory requirements, you could also use a dictionary (or the new HashSet) to keep track of what has already been added. If I remember correctly, this latter method is what LINQ’s Distinct method does.

    • May 18, 2008 at 4:40 pm

      Yes, that’s correct: for larger arrays you should sort first, because once the input array is sorted the duplicates sit next to each other and each check only has to look at the previous element.

    • May 23, 2008 at 4:15 pm

      Nice, simply superb to understand …

    • November 4, 2009 at 11:25 am

      Use LINQ to get distinct values. Just add a using directive for the System.Linq namespace and your arrays get a lot of extension methods you can use. Just go to the array and open IntelliSense to see what is there for you.

    • December 27, 2010 at 12:26 pm

      Very helpful, and a nice article.

      Thanks!

    • October 12, 2011 at 6:04 pm

      Very helpful article! Thanks

    • June 7, 2012 at 5:03 pm

      Very helpful, as it shows the progress from version to version. Elaborately explained.
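
The two alternatives suggested in the comments above (sort and scan, or track seen values in a HashSet) can be sketched as follows. This is an illustration under assumptions, not code from the article: the class and method names are made up, the first variant returns the values in sorted order rather than in their original order, and the second needs HashSet<T>, which arrived with .NET Framework 3.5.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctAlternatives
{
    // Sort a copy, then keep an element only when it differs from the previous one.
    public static int[] DistinctBySorting(int[] array)
    {
        int[] sorted = (int[])array.Clone();
        Array.Sort(sorted);

        List<int> result = new List<int>();
        for (int i = 0; i < sorted.Length; i++)
        {
            if (i == 0 || sorted[i] != sorted[i - 1])
                result.Add(sorted[i]);
        }
        return result.ToArray();
    }

    // Track seen values in a HashSet; Add returns false for a duplicate.
    public static T[] DistinctByHashSet<T>(T[] array)
    {
        HashSet<T> seen = new HashSet<T>();
        List<T> result = new List<T>();
        for (int i = 0; i < array.Length; i++)
        {
            if (seen.Add(array[i]))
                result.Add(array[i]);
        }
        return result.ToArray();
    }

    public static void Main()
    {
        int[] nrs = new int[] { 5, 1, 5, 3, 1 };
        Console.WriteLine(DistinctBySorting(nrs).Length);   // prints 3
        Console.WriteLine(DistinctByHashSet(nrs).Length);   // prints 3
    }
}
```

The List.Contains loop from the article compares each element against everything collected so far, which is quadratic in the worst case. Sorting brings this down to O(n log n), and the HashSet variant is roughly linear on average at the cost of extra memory.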
