Beyond the Interview: Data Structures (#1 Arrays & Lists)
Dive beyond the typical interview prep and explore how choosing the right data structures—like arrays, lists, and spans—can revolutionize your coding efficiency. This series offers real-world insights, practical benchmarks, and a touch of humour to enhance your development skills.

About This Series
We’ve all been there—grinding LeetCode, memorizing tree traversals, and pretending to care about Red-Black Trees because some Big Tech interviewer insists they’re “fundamental.”
But here’s the reality: 99% of the time, you’re just using Lists, Arrays, and HashMaps—and that’s fine.
So why bother with the rest? Because sometimes the default isn’t the best choice.
- Maybe a different data structure completely revolutionizes your system.
- Maybe it opens up new possibilities you hadn’t considered.
- Maybe it’s the key to removing that one persistent bottleneck that refuses to budge no matter how many CPU cores you throw at it.
Some data structures solve real-world problems, while others exist purely to torture CS students. In this series, we’ll cut through the nonsense—breaking down:
✅ When data structures actually matter
✅ What trade-offs they introduce
✅ Where they fit into practical software development
No fluff. No pointless academic exercises. Just real-world insights, benchmarks, and maybe a few laughs along the way.
Bonus: In your next tech interview, you’ll be able to talk about the pros and cons of Bloom Filters and Suffix Trees as easily as reversing an array—and that will make you stand out.
I mention some internal workings related to performance (stack vs heap, garbage collection, etc). If you’re not up to speed, check out Dig Deeper: Memory Management in .NET.
Explore This Series
Arrays & Lists | Linked Lists
📌 Arrays
We all know Arrays and Lists—we can write them in our sleep. And let’s be honest, sometimes we do.
They’re the default, the workhorses, the structures we reach for without a second thought.
But just because they’re basic doesn’t mean they’re simple.
Under the hood, there’s more going on than just arr[i] and .Add(). So, for this first instalment, let’s dig deeper into our default data structures.
We’ll explore their variations, quirks, optimizations, and little-known features—because even after years of using them, you might still be missing some tricks.
📌 Arrays in C# – More Than Just T[]
Array vs. List<T> vs. Span<T> – Choosing the Right Structure
Arrays are low-level and fixed size, while List<T> is a dynamic array wrapper that resizes automatically. But .NET also gives us Span<T>, which can avoid heap allocations and improve performance when working with large datasets.
int[] array = new int[100]; // Fixed-size, fast, heap-allocated
List<int> list = new(); // Dynamically resizes, higher overhead, GC pressure for large collections
Span<int> span = stackalloc int[100]; // Stack-allocated, zero GC pressure
Memory & Performance: Why Arrays Are Fast
✅ Arrays have contiguous memory allocation, meaning CPU caches love them.
✅ Accessing arr[i] is O(1) due to pointer arithmetic (a quick sketch of this follows below).
✅ List<T> internally uses an array, but with resizing overhead.
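To make “pointer arithmetic” concrete, here is a tiny illustrative sketch of my own (it needs AllowUnsafeBlocks enabled in the project file). It is not something you’d write in production; it just shows that the indexer and raw pointer maths land on the same element.
int[] arr = { 10, 20, 30, 40, 50 };

unsafe
{
    fixed (int* basePtr = arr) // pin the array and take its base address
    {
        Console.WriteLine(arr[3]);         // 40: the normal indexer
        Console.WriteLine(*(basePtr + 3)); // 40: same element, base address + 3 * sizeof(int)
    }
}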
📌 Resizing & Growth Strategy of List<T>
Ever wondered why List<T>.Add() doesn’t always allocate new memory?
.NET doubles the capacity when resizing, making most insertions amortized O(1).
If you're not careful, your list will keep doubling in size like a Gremlin you fed after midnight.
When we add an element to a list, .NET doesn’t grow the backing array by exactly one slot. In fact, most Adds don’t grow it at all, because there’s usually spare capacity waiting.
📌 Example:
- First insert: Capacity = 4
- Second, third, and fourth inserts? Still 4.
- Add a fifth? Boom. Doubles to 8.
🚀 But here’s the problem: Arrays need contiguous memory.
We can’t just expand into adjoining memory like the USA when they hear a country has untapped oil and needs some “freedom.”
Instead, we build a new array (twice the size), move all our stuff over, and abandon the old one.
And when we fill up that space? Do it all again.
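Roughly, a grow step looks like the sketch below. This is a simplification, not the actual List<T> source, but it is the pattern: allocate a new array at double the capacity, copy everything across, and leave the old array for the GC.
// Simplified sketch of one grow step (not the real List<T> implementation).
static int[] Grow(int[] oldArray)
{
    int newCapacity = oldArray.Length == 0 ? 4 : oldArray.Length * 2;
    var newArray = new int[newCapacity];             // brand new contiguous block
    Array.Copy(oldArray, newArray, oldArray.Length); // copy every existing element
    return newArray;                                 // the old array becomes garbage
}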
📌 Example: Dynamic List Growth
public void CompletelyManaged()
{
    var list = new List<int>();
    for (int i = 0; i < 1000; i++)
    {
        list.Add(i);
        Console.WriteLine($"List size: {list.Count}, Capacity: {list.Capacity}");
    }
}
📌 Output:
List size: 1, Capacity: 4
List size: 2, Capacity: 4
List size: 3, Capacity: 4
List size: 4, Capacity: 4
// Allocate a new array, copy 4 items over, leave the old array for the GC
List size: 5, Capacity: 8
List size: 6, Capacity: 8
List size: 7, Capacity: 8
List size: 8, Capacity: 8
// Allocate a new array, copy 8 items over, leave the old array for the GC
List size: 9, Capacity: 16
<etc>
At first glance, it seems fine.
But if performance matters, or if you’re working with large collections, this overhead can be brutal.
How to Avoid Performance Issues
Instead of:
var list = new List<int>();
for (int i = 0; i < 1000; i++)
{
    list.Add(i);
}
Use:
var list = new List<int>(1000); // Pre-allocate to avoid resizes.
Alternatively, set the list capacity when you know its final size:
list.Capacity = someObject.Items.Count();
📌 Example: Pre-allocating List Capacity
public void SetCapacity()
{
    var list = new List<int>();
    list.Capacity = 1000; // Setting the capacity
    for (int i = 0; i < 1000; i++)
    {
        list.Add(i);
        Console.WriteLine($"List size: {list.Count}, Capacity: {list.Capacity}");
    }
}
📌 Output:
List size: 1, Capacity: 1000
List size: 2, Capacity: 1000
List size: 3, Capacity: 1000
List size: 4, Capacity: 1000
List size: 5, Capacity: 1000
...
List size: 999, Capacity: 1000
List size: 1000, Capacity: 1000
Now, no unnecessary reallocations. No wasted CPU cycles copying arrays.
🔹 Even better? Set capacity in the constructor:
var list = new List<int>(1000);
📌 Why?
Creating a list without specifying capacity starts it at 0; the first Add bumps it to 4, and it doubles from there, causing unnecessary growth.
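You can verify this with a quick snippet of your own. This is illustrative only: the 0-then-4 behaviour is what current .NET prints, but the exact defaults are an implementation detail.
var list = new List<int>();
Console.WriteLine(list.Capacity); // 0: no backing array allocated yet
list.Add(1);
Console.WriteLine(list.Capacity); // 4: first Add allocates the default capacity

var preSized = new List<int>(1000);
Console.WriteLine(preSized.Capacity); // 1000: a single allocation up front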
📊 Performance Comparison
Scenario | Growth Strategy | GC Pressure |
---|---|---|
List<int>() | Starts empty, jumps to 4 on first Add, then doubles | High (frequent resizing) |
List<int>(1000) | Pre-allocated upfront | Low (One allocation) |
🚀 Let’s see the actual performance impact.
📌 Benchmark: Dynamic Growth vs Pre-Allocated List
public List<LegacyBusinessObject> DynamicGrowth()
{
    var list = new List<LegacyBusinessObject>();
    for (int i = 0; i < 10000; i++)
    {
        list.Add(new LegacyBusinessObject());
    }
    return list;
}

public List<LegacyBusinessObject> FixedSize()
{
    var list = new List<LegacyBusinessObject>(10000);
    for (int i = 0; i < 10000; i++)
    {
        list.Add(new LegacyBusinessObject());
    }
    return list;
}
📌 Benchmark Results:
Method | Mean | Error | StdDev | Median |
---|---|---|---|---|
DynamicGrowth() | 1,299.5 ms | 175.68 ms | 512.47 ms | 1,022.3 ms |
FixedSize() | 124.4 ms | 2.49 ms | 5.19 ms | 124.6 ms |
🤯 Over 10× faster just by setting the capacity upfront!
That’s definitely something to consider when dealing with large collections.
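For reference, numbers like these come from a BenchmarkDotNet run. The harness below is my own sketch of how you would wire the two methods up (LegacyBusinessObject is just a stand-in type here), not the article’s exact project.
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class LegacyBusinessObject { } // placeholder for whatever heavy object you're collecting

[MemoryDiagnoser]
public class ListGrowthBenchmarks
{
    [Benchmark(Baseline = true)]
    public List<LegacyBusinessObject> DynamicGrowth()
    {
        var list = new List<LegacyBusinessObject>();
        for (int i = 0; i < 10000; i++) list.Add(new LegacyBusinessObject());
        return list;
    }

    [Benchmark]
    public List<LegacyBusinessObject> FixedSize()
    {
        var list = new List<LegacyBusinessObject>(10000);
        for (int i = 0; i < 10000; i++) list.Add(new LegacyBusinessObject());
        return list;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<ListGrowthBenchmarks>();
}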
📌 Key Takeaways
✅ Pre-allocate list capacity if you know the final size.
✅ Avoid unnecessary reallocations by setting list.Capacity = X.
✅ Dynamic list growth can be 10× slower than pre-allocating capacity.
📌 Span<T> & Memory<T> – High-Performance .NET Features
When performance is critical, newer features like Span<T>, ReadOnlySpan<T>, and Memory<T> help reduce allocations and optimize large buffer handling.
🔹 Span<T> – Avoid Heap Allocations for Faster Performance
✅ Lightweight, stack-allocated view of memory
✅ No heap allocations (when used with stackalloc)
✅ Better cache locality
// Stack allocation: No GC, memory exists only for method duration
Span<int> numbers = stackalloc int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Just remember we can only use stackalloc with value types. Reference types will always be on the heap.
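For example, a stackalloc’d char buffer is a handy way to format values without touching the heap. This snippet is purely illustrative (int.TryFormat has been available since .NET Core 2.1):
Span<char> buffer = stackalloc char[32];      // value-type elements, lives on the stack
if (12345.TryFormat(buffer, out int written)) // writes the digits straight into the span
{
    Console.WriteLine(buffer.Slice(0, written).ToString()); // "12345"
}

// This will not compile: reference types cannot be stack-allocated.
// Span<string> names = stackalloc string[4];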
Fast slicing without heap allocation
int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Span<int> slice = numbers.AsSpan(2, 4); // View of [3, 4, 5, 6]
🚀 It’s like a Formula 1 car—amazing when used correctly, but dangerous in the wrong hands (like Mazepin).
🔹 Memory<T> – Persistent Buffer Handling (Async-Safe)
Unlike Span<T>, Memory<T> can persist beyond method calls and works in async methods.
Best for handling large buffers in streaming, networking, or file processing.
Memory<byte> buffer = new byte[1024]; // Heap allocated but efficiently sliceable
Memory<byte> slice = buffer.Slice(100, 50); // No new memory allocation!
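Because a Span<T> cannot cross an await, this async-friendliness is the whole point of Memory<T>. Here is a small sketch of the typical pattern; the 4 KB buffer size and the CountBytesAsync name are placeholders of mine, not from the article.
// Needs: using System.IO; using System.Threading.Tasks;
public async Task<long> CountBytesAsync(string path)
{
    Memory<byte> buffer = new byte[4096]; // heap-allocated once, sliced and reused freely
    long total = 0;

    await using var stream = File.OpenRead(path);
    int read;
    while ((read = await stream.ReadAsync(buffer)) > 0) // the Memory<byte>-based ReadAsync overload
    {
        total += read; // buffer survives each await, which a Span<byte> could not
    }
    return total;
}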
🔹 When to Use?
Use Case | Best Choice |
---|---|
Fast, temporary buffer operations | Span<T> |
Immutable slicing (strings, arrays) | ReadOnlySpan<T> |
Large buffers across async calls | Memory<T> |
To look like a badass and keep junior devs awake at night | stackalloc |
✅ Use List<T> unless you have a specific reason not to.
✅ Arrays are best when performance & memory efficiency matter.
✅ Consider Span<T> and Memory<T> for high-performance scenarios.
📌 Jagged Arrays vs. Multidimensional Arrays ([][] vs. [,])
So you need a list of things that have their own list of things? Like a list of student IDs, each with a list of test scores. Let’s say it’s a small school—only 3 students. Great, let’s get the multidimensional array out!
var scores = new int[3,5]; // 3 students, 5 test scores each
scores[0, 0] = 95; // First score for first student
scores[2, 3] = 88; // Fourth score for third student
This is lovely and fast, due to the contiguous memory layout. It’s a simple, predictable structure.
However, with the new school year starting, some new students arrive. Bob is here part-time—he only takes one subject. Ok, not a problem, we’ll just throw in some null checks or whatever, it’ll be grand.
Then Sue arrives, and she takes 8 subjects. Ugh, fine, resize the array to cater for this overachiever’s 8 scores. But now every other student’s row in the array has lots of wasted space. And Bob’s row? Almost completely empty.
Maybe it’s time for a Jagged Array instead.
var scores = new int[3][]; // Jagged array
This means we can have a variable number of scores per student without any wasted space.
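Continuing that snippet, each row is its own array, so every student can have a different number of scores (the values below are made up for illustration):
scores[0] = new[] { 95, 82, 78, 91, 88 }; // a regular student with 5 subjects
scores[1] = new[] { 73 };                 // Bob: part-time, one subject, no wasted slots
scores[2] = new int[8];                   // Sue: 8 subjects, scores filled in later

Console.WriteLine(scores[1].Length); // 1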
So which one do we choose?
Jagged arrays look like the clear winner on flexibility, but they can be slower to traverse: each row is its own array object, so the rows may end up scattered across the heap, costing cache locality and an extra indirection per access.
Multidimensional arrays, on the other hand, are stored contiguously in memory, resulting in much faster access.
📌 Rule of Thumb:
- ✅ If your row sizes vary considerably, jagged arrays ([][]) are your best option.
- ✅ For dense, fixed-size grids, multidimensional arrays ([,]) win.
Further Reading: Array Pooling, while not covered here, can be an absolute game-changer in high-performance apps. Pre-allocating memory pools can reduce garbage collection pressure even more than manually tweaking list.Capacity. Check out: ArrayPool<T>.Shared.Rent(size);
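As a taste, here is a minimal sketch using the ArrayPool<T> API from System.Buffers; the 4096 size and the empty processing block are placeholders.
using System.Buffers;

// Rent a reusable buffer instead of allocating a fresh array on every call.
byte[] buffer = ArrayPool<byte>.Shared.Rent(4096); // may hand back a larger array than requested
try
{
    // ... fill and process the first 4096 bytes of buffer here ...
}
finally
{
    ArrayPool<byte>.Shared.Return(buffer); // give it back so the next caller can reuse it
}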