Using LINQ and Extension Methods to chunk large data sets
Ever needed to take a large list and split it into smaller subsets of data for processing? Well, this is the Extension Method for you. Tonight we had to split a small dataset (500 items) into even smaller sets of 10 so the provider’s web service wouldn’t time out.
Seeing as I was going to miss out on my evening, I thought I’d see if I could do it a little differently using LINQ, and this is what I came up with:
/// <summary>
/// Simple method to chunk a source IEnumerable into smaller (more manageable) lists
/// </summary>
/// <param name="source">The large IEnumerable to split</param>
/// <param name="chunkSize">The maximum number of items each subset should contain</param>
/// <returns>An IEnumerable of the original source IEnumerable in bite size chunks</returns>
public static IEnumerable<IEnumerable<TSource>> ChunkData<TSource>(this IEnumerable<TSource> source, int chunkSize)
{
    for (int i = 0; i < source.Count(); i += chunkSize)
        yield return source.Skip(i).Take(chunkSize);
}
It extends any IEnumerable<T> and lets you split it into smaller chunks, which you can then process to your heart’s content.
Here’s a quick example of it in use:
var list = new List<string>()
{
    "Item 1", "Item 2", "Item 3", "Item 4", "Item 5",
    "Item 6", "Item 7", "Item 8", "Item 9", "Item 10"
};

Console.WriteLine("Original list is {0} items", list.Count);

var chunked = list.ChunkData(3);
Console.WriteLine("Returned the data in {0} subsets", chunked.Count());

int i = 1;
foreach (var subset in chunked)
{
    Console.WriteLine("{0} items are in subset #{1}", subset.Count(), i++);

    int si = 1;
    foreach (var s in subset)
        Console.WriteLine("\t\tItem #{0}: {1}", si++, s);
}
And this will output:
Original list is 10 items
Returned the data in 4 subsets
3 items are in subset #1
		Item #1: Item 1
		Item #2: Item 2
		Item #3: Item 3
3 items are in subset #2
		Item #1: Item 4
		Item #2: Item 5
		Item #3: Item 6
3 items are in subset #3
		Item #1: Item 7
		Item #2: Item 8
		Item #3: Item 9
1 items are in subset #4
		Item #1: Item 10
Two lines of code to do all that work. Neat!
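One trade-off worth knowing about: ChunkData calls source.Count() on every pass through the loop and Skip(i) for every chunk. That is fine for a 500-item in-memory list, but for a source that is not an in-memory collection (a database query or a generator, say) each of those calls walks the sequence again from the start. A chunkSize of 0 also means the loop never advances. If either matters, a single-pass variant is one option. The sketch below is an illustration only, not the method from this post: the names ChunkExtensions, ChunkDataSinglePass and Iterate are made up for the example, and it returns List<TSource> chunks instead of lazy ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkExtensions
{
    // Hypothetical single-pass variant of ChunkData (name invented for this sketch).
    // Arguments are validated eagerly, then the work is handed to an iterator.
    public static IEnumerable<List<TSource>> ChunkDataSinglePass<TSource>(
        this IEnumerable<TSource> source, int chunkSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        return Iterate(source, chunkSize);
    }

    // Walks the source exactly once, buffering up to chunkSize items at a time.
    private static IEnumerable<List<TSource>> Iterate<TSource>(
        IEnumerable<TSource> source, int chunkSize)
    {
        var chunk = new List<TSource>(chunkSize);
        foreach (var item in source)
        {
            chunk.Add(item);
            if (chunk.Count == chunkSize)
            {
                yield return chunk;
                chunk = new List<TSource>(chunkSize);
            }
        }

        // Emit whatever is left over as a final, shorter chunk.
        if (chunk.Count > 0)
            yield return chunk;
    }
}

public static class Program
{
    public static void Main()
    {
        // The scenario from the top of the post: 500 items in batches of 10.
        var items = Enumerable.Range(1, 500);

        int batches = 0;
        foreach (var batch in items.ChunkDataSinglePass(10))
        {
            // A real caller would send 'batch' to the web service here.
            batches++;
        }

        Console.WriteLine("Sent {0} batches", batches); // prints "Sent 50 batches"
    }
}
```

The cost of this version is that it is no longer two lines, and each chunk is held in memory as a list while it is being processed. On .NET 6 and later the framework ships its own Enumerable.Chunk(size) method, which returns the chunks as arrays, so a hand-rolled version is mainly useful on older targets.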