Getting distinct values from arrays is not a unique problem. Here I will show some ways to do it, using an array of integers in the examples. Along the way, this blog entry also shows the mighty evolution of the .NET Framework.
Let’s say we have an array of integers like this:
int[] nrs = new int[10];
nrs[0] = 1;
nrs[1] = 2;
nrs[2] = 2;
nrs[3] = 3;
nrs[4] = 4;
nrs[5] = 4;
nrs[6] = 5;
nrs[7] = 5;
nrs[8] = 6;
nrs[9] = 7;
We can see that this array contains duplicate elements. Let’s now try to get the distinct values out of it. The result we expect is the following array:
nrs[0] = 1;
nrs[1] = 2;
nrs[2] = 3;
nrs[3] = 4;
nrs[4] = 5;
nrs[5] = 6;
nrs[6] = 7;
1. .NET Framework 1.0/1.1
The first method we will look at works on all .NET Framework versions, including the oldest ones.
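// Requires: using System.Collections;
public int[] GetDistinctValues(int[] array)
{
    ArrayList list = new ArrayList();
    for (int i = 0; i < array.Length; i++)
    {
        // Skip values that are already in the list
        if (list.Contains(array[i]))
            continue;
        list.Add(array[i]);
    }
    // Copy the boxed values into a new int[]
    return (int[])list.ToArray(typeof(int));
}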
This method works on all versions of .NET Framework, but it has a cost on large arrays. ArrayList is built to hold objects, so using it with value types adds overhead: every value is boxed when it is stored as an object, and unboxed again when it is cast back from object to its value type.
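To make the boxing and unboxing steps visible, here is a minimal sketch. The class name BoxingSketch and the method RoundTrip are made up for this illustration; only ArrayList itself comes from the example above.

```csharp
using System;
using System.Collections;

// Hypothetical helper class, used only to illustrate boxing with ArrayList.
public static class BoxingSketch
{
    // Stores an int in an ArrayList and reads it back again.
    public static int RoundTrip(int number)
    {
        ArrayList list = new ArrayList();

        // Boxing: ArrayList.Add takes an object, so the int is copied
        // into a newly allocated object on the heap.
        list.Add(number);

        // The indexer returns object, not int.
        object boxed = list[0];

        // Unboxing: an explicit cast copies the value back out of the object.
        return (int)boxed;
    }
}
```

The value survives the round trip unchanged, but every element stored this way costs an extra allocation and a cast, and that adds up when the array is large.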
2. .NET Framework 2.0
.NET Framework 2.0 helps us avoid the overhead of boxing and unboxing because we can use generics. The following method is more efficient because List<int> stores the integers directly, so nothing is boxed or unboxed.
// Requires: using System.Collections.Generic;
public int[] GetDistinctValues(int[] array)
{
    List<int> list = new List<int>();
    for (int i = 0; i < array.Length; i++)
    {
        if (list.Contains(array[i]))
            continue;
        list.Add(array[i]);
    }
    return list.ToArray();
}
This method doesn’t use all the power of generics yet. So let’s make it usable with any type we want.
// Requires: using System.Collections.Generic;
public T[] GetDistinctValues<T>(T[] array)
{
    List<T> tmp = new List<T>();
    for (int i = 0; i < array.Length; i++)
    {
        if (tmp.Contains(array[i]))
            continue;
        tmp.Add(array[i]);
    }
    return tmp.ToArray();
}
Now you can also use this method with strings, doubles and objects. Here is an example with the array shown at the beginning of this blog entry:
nrs = GetDistinctValues<int>(nrs);
Now we have used all the power of .NET Framework 2.0 and, as we can see, we made our method very general.
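As a small illustration of that generality, here is a sketch that calls the same generic method with strings. The wrapper class DistinctSketch, the method DistinctNames and the sample names are assumptions made for this example; the method is declared static only so it can be called without an instance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper class for this illustration.
public static class DistinctSketch
{
    // Same logic as the generic method above.
    public static T[] GetDistinctValues<T>(T[] array)
    {
        List<T> tmp = new List<T>();
        for (int i = 0; i < array.Length; i++)
        {
            if (tmp.Contains(array[i]))
                continue;
            tmp.Add(array[i]);
        }
        return tmp.ToArray();
    }

    // Calls the generic method with a string array.
    public static string[] DistinctNames()
    {
        string[] names = { "Ann", "Bob", "Ann", "Carl", "Bob" };

        // The compiler infers T as string, so <string> can be left out.
        return GetDistinctValues(names);
    }
}
```

DistinctNames returns "Ann", "Bob" and "Carl", in the order of their first appearance. Keep in mind that the default string comparison is case-sensitive, so "ann" and "Ann" would count as two different values.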
3. .NET Framework 3.5
The last option is the shortest one: you need no more than one line of code here.
nrs = nrs.Distinct().ToArray();
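This is possible thanks to the LINQ extension methods that .NET Framework 3.5 offers for arrays, lists and other sequences. They become available once the System.Linq namespace is imported with a using directive. So when you are using Visual Studio 2008 you can solve problems like this easily.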
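For completeness, here is a minimal self-contained sketch of the LINQ version, including the using directive it needs. The class and method names are made up for this illustration. The second method shows the Distinct overload that takes an equality comparer, which is handy when strings should be compared without regard to case.

```csharp
using System;
using System.Linq;

// Hypothetical wrapper class for this illustration.
public static class LinqDistinctSketch
{
    // The one-liner from above, applied to the sample array of this post.
    public static int[] DistinctNumbers()
    {
        int[] nrs = { 1, 2, 2, 3, 4, 4, 5, 5, 6, 7 };
        return nrs.Distinct().ToArray();
    }

    // Distinct also accepts an IEqualityComparer<T>; here "a" and "A"
    // are treated as the same value.
    public static string[] DistinctIgnoringCase(string[] words)
    {
        return words.Distinct(StringComparer.OrdinalIgnoreCase).ToArray();
    }
}
```

DistinctNumbers returns the seven values 1 to 7. The documentation describes the result of Distinct as an unordered sequence, so if the order matters it is safer to sort the result explicitly than to rely on it.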
Comments (9)
Nice, interesting post, and a good application of a generic method in the .NET 2.0 section.
Word of warning: If you have more than a couple hundred elements in your array, you'd be much better off doing a quick sort on your list, then doing a single pass through the array and only adding to the distinct list when the element is not the same as the previous element. Depending on your memory requirements, you could also use a dictionary (or the new HashSet) to keep track of what has already been added. If I remember correctly, this latter method is what LINQ's Distinct method does.
Yes, that's correct - you should sort larger arrays before asking for distinct values, because in a sorted array duplicates sit next to each other and each element only has to be compared with the previous one.
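The two alternatives suggested in the comment above could look roughly like the following sketch. The class and method names are made up for this illustration, and this is one possible reading of the suggestion, not a tuned implementation. Both variants avoid calling list.Contains, which walks the whole list for every element.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class for this illustration.
public static class FasterDistinctSketch
{
    // Variant 1: sort a copy, then keep an element only when it differs
    // from the previous one. The result comes back in sorted order.
    public static int[] DistinctBySorting(int[] array)
    {
        int[] sorted = (int[])array.Clone(); // leave the caller's array untouched
        Array.Sort(sorted);

        List<int> result = new List<int>();
        for (int i = 0; i < sorted.Length; i++)
        {
            if (i == 0 || sorted[i] != sorted[i - 1])
                result.Add(sorted[i]);
        }
        return result.ToArray();
    }

    // Variant 2: remember what was already seen in a HashSet<T>
    // (available since .NET Framework 3.5). Keeps first-appearance order.
    public static T[] DistinctByHashSet<T>(T[] array)
    {
        HashSet<T> seen = new HashSet<T>();
        List<T> result = new List<T>();
        foreach (T item in array)
        {
            // Add returns false when the item is already in the set.
            if (seen.Add(item))
                result.Add(item);
        }
        return result.ToArray();
    }
}
```

For the input 3, 1, 3, 2, 1 the sorting variant returns 1, 2, 3, while the HashSet variant returns 3, 1, 2. The first trades the original order for a single pass, and the second keeps the order at the price of some extra memory for the set.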
Nice, simply superb to understand ...
Is there an algorithm to get distinct values in an array?
Thx
Use LINQ to get distinct values. Just add a using directive for the System.Linq namespace and your arrays get a lot of extension methods you can use. Just go to the array and open IntelliSense to see what is there for you.
Very helpful... and nice article.
thanx...!
Very helpful article! Thanks
Very helpful, as it shows the progress from version to version... explained in detail.