ºÝºÝߣ

ºÝºÝߣShare a Scribd company logo
Collection Classes Deep Dive

        By Gary Short
     Head of Gibraltar Labs
      Gibraltar Software


                               1
Introduction
?   Gary Short
?   Head of Gibraltar Labs
?   C# MVP
?   @garyshort
?   gary.short@gibraltarsoftare.com
?   facebook.com/TheOtherGaryShort



                                      2
Why do we Care About This Stuff?




                                   3
4
Let¡¯s Start With Something We Know




                                     5
List<T> Demo




               6
What we Learned
?   Don¡¯t add elements in a loop
?   Add causes capacity growths
?   Capacity growths uses Array.Copy()
?   Array.Copy() is a O(n) operation
?   O(n) is sloooooowwwwwww. ?
?   Use AddRange() instead
?   Or set ¡°large enough¡± initial capacity.

                                              7
How Slow is Slow?




                    8
Performance: Add Versus AddRange
                  30000


                  25000


                  20000
Number of Ticks




                  15000
                                                                               Add
                                                                               AddRange
                  10000


                  5000


                      0
                          10   100       1000      10000    100000   1000000
                                     Number of Elements Added
What About Removing Stuff?




                             10
Demo




       11
What we Learned



Prefer RemoveAt() as there¡¯s no IndexOf() step




                                                 12
List<T> - Sorting
?   Uses QuickSort under the hood
?   Fastest general purpose sort algorithm
?   O(n log n) in best case
?   O(n log n) in average case
?   Though worst case is O(n^2) ?




                                             13
Performance: O(n log n) Vs O(n^2)
         120


         100


          80
Effort




          60
                                                                   O(n log n)
                                                                   O(n^2)
          40


          20


           0
               1   2      3   4     5      6      7   8   9   10
                              Elements to be Sorted
QuickSort Demo




                 15
So What is the Worst Case?

? If the list is already sorted
   ¨C First partition has lower = 0, upper = n
   ¨C Then calls Partition(n-1);
   ¨C This happens a further n-2 times




                                                16
.Net Collection Classes Deep Dive  - Rocksolid Tour 2013
Can we Mitigate the Worst Case?
? Median of Three
  ¨C Take an element from the ¡°top¡± of the array
  ¨C Take an element from the ¡°middle¡± of the array
  ¨C Take an element from the ¡°bottom¡± of the array
  ¨C Find the median value of the three
  ¨C Pivot on the median
? Let¡¯s see if Microsoft uses this algorithm.


                                                     18
Disadvantage: O(n) Add, Insert, Remove




                                         19
What if we Need Fast Add, Insert & Remove?




                                             20
LinkedList<T>
? Double linked
  ¨C Each item points to the previous and next items
  ¨C This means it¡¯s super fast
     ? Add, insert and remove are all O(1) operations




                                                        21
Demo




       22
Disadvantage: O(n) lookups




                             23
What if we Need Fast Lookups?




                                24
Dictionary<TKey, TValue>
? Performance depends on key.GetHashCode()
  ¨C Hash codes must be evenly distributed across int
     ? If two keys return hashes that give the same index
        ¨C Dictionary must look for nearest free location to store item
        ¨C Must search later to return the item
        ¨C This hurts performance
  ¨C Use your own type, then this is on you. ?




                                                                         25
Dictionary<TKey, TValue>
? Objects used as keys must also implement
  IEquatable.Equals()
? Or override Equals()
? Why?
  ¨C Different keys may return the same hashcode
  ¨C Equals() is used by the dictionary comparing keys
  ¨C So you must ensure the following
     ? If A.Equals(B) then A.HashCode() and B.HashCode() return
       the same HashCode()
     ? Override Equals() but not GetHashCode() == compile error.


                                                                   26
Disadvantage: one value per key




                                  27
What if I Need Multiple Values per Key?




                                          28
Lookup<TKey, TElement> Demo




                              29
Concurrent Collections




                         30
Types of Concurrent Collections
?   ConcurrentBag<T>
?   ConcurrentDictionary<T>
?   ConcurrentQueue<T>
?   ConcurrentStack<T>
?   OrderablePartitioner<T>
?   BlockingCollection<T>.



                                      31
Key Characteristics
? New .Net 4.0
? Guards against multi-thread collection conflicts
? Implements IProducerConsumerCollections<T>
   ¨C TryAdd()
      ? Tries to add item to collection returns success bool
   ¨C TryTake()
      ? Tries to remove and return item returns success bool
          ¨C Returns the item in an out param.

? Always check the return value before moving on.

                                                               32
Do I Have To Check Every Time?!
? BlockingCollection<T>
  ¨C Blocks and waits until task completes
  ¨C Uses Add() and Take() methods
     ? Block the thread and wait until task completes
     ? Add() has an overload to pass a CancellationToken
     ? Add() may also block if bounding capacity was used.




                                                             33
But I Don¡¯t Want it to Wait For Ever!
? So we don¡¯t want to wait forever
? Nor do we want to cancel the Add() from
  outside
? TryAdd() and TryTake() are offered too
? Where you can specify a timeout.




                                            34
Summary
? List is a good general purpose collection
    ¨C Construct to size if possible
    ¨C Construct to upper threshold then trim
    ¨C Prefer AddRange() over Add()
    ¨C Be aware of ¡°Quicksort Killers¡±
?   Use LinkedList if you need fast insert/remove
?   Use Dictionary if you need fast lookup
?   Use Lookup if you need multi values
?   Use concurrent collections for thread safety.   35
Questions
? gary.short@gibraltarsoftware.com
? @garyshort
? Facebook.com/TheOtherGaryShort




                                     36

More Related Content

.Net Collection Classes Deep Dive - Rocksolid Tour 2013

  • 1. Collection Classes Deep Dive By Gary Short Head of Gibraltar Labs Gibraltar Software 1
  • 2. Introduction ? Gary Short ? Head of Gibraltar Labs ? C# MVP ? @garyshort ? gary.short@gibraltarsoftare.com ? facebook.com/TheOtherGaryShort 2
  • 3. Why do we Care About This Stuff? 3
  • 4. 4
  • 5. Let¡¯s Start With Something We Know 5
  • 7. What we Learned ? Don¡¯t add elements in a loop ? Add causes capacity growths ? Capacity growths uses Array.Copy() ? Array.Copy() is a O(n) operation ? O(n) is sloooooowwwwwww. ? ? Use AddRange() instead ? Or set ¡°large enough¡± initial capacity. 7
  • 8. How Slow is Slow? 8
  • 9. Performance: Add Versus AddRange 30000 25000 20000 Number of Ticks 15000 Add AddRange 10000 5000 0 10 100 1000 10000 100000 1000000 Number of Elements Added
  • 10. What About Removing Stuff? 10
  • 11. Demo 11
  • 12. What we Learned Prefer RemoveAt() as there¡¯s no IndexOf() step 12
  • 13. List<T> - Sorting ? Uses QuickSort under the hood ? Fastest general purpose sort algorithm ? O(n log n) in best case ? O(n log n) in average case ? Though worst case is O(n^2) ? 13
  • 14. Performance: O(n log n) Vs O(n^2) 120 100 80 Effort 60 O(n log n) O(n^2) 40 20 0 1 2 3 4 5 6 7 8 9 10 Elements to be Sorted
  • 16. So What is the Worst Case? ? If the list is already sorted ¨C First partition has lower = 0, upper = n ¨C Then calls Partition(n-1); ¨C This happens a further n-2 times 16
  • 18. Can we Mitigate the Worst Case? ? Median of Three ¨C Take an element from the ¡°top¡± of the array ¨C Take an element from the ¡°middle¡± of the array ¨C Take an element from the ¡°bottom¡± of the array ¨C Find the median value of the three ¨C Pivot on the median ? Let¡¯s see if Microsoft uses this algorithm. 18
  • 19. Disadvantage: O(n) Add, Insert, Remove 19
  • 20. What if we Need Fast Add, Insert & Remove? 20
  • 21. LinkedList<T> ? Double linked ¨C Each item points to the previous and next items ¨C This means it¡¯s super fast ? Add, insert and remove are all O(1) operations 21
  • 22. Demo 22
  • 24. What if we Need Fast Lookups? 24
  • 25. Dictionary<TKey, TValue> ? Performance depends on key.GetHashCode() ¨C Hash codes must be evenly distributed across int ? If two keys return hashes that give the same index ¨C Dictionary must look for nearest free location to store item ¨C Must search later to return the item ¨C This hurts performance ¨C Use your own type, then this is on you. ? 25
  • 26. Dictionary<TKey, TValue> ? Objects used as keys must also implement IEquatable.Equals() ? Or override Equals() ? Why? ¨C Different keys may return the same hashcode ¨C Equals() is used by the dictionary comparing keys ¨C So you must ensure the following ? If A.Equals(B) then A.HashCode() and B.HashCode() return the same HashCode() ? Override Equals() but not GetHashCode() == compile error. 26
  • 28. What if I Need Multiple Values per Key? 28
  • 31. Types of Concurrent Collections ? ConcurrentBag<T> ? ConcurrentDictionary<T> ? ConcurrentQueue<T> ? ConcurrentStack<T> ? OrderablePartitioner<T> ? BlockingCollection<T>. 31
  • 32. Key Characteristics ? New .Net 4.0 ? Guards against multi-thread collection conflicts ? Implements IProducerConsumerCollections<T> ¨C TryAdd() ? Tries to add item to collection returns success bool ¨C TryTake() ? Tries to remove and return item returns success bool ¨C Returns the item in an out param. ? Always check the return value before moving on. 32
  • 33. Do I Have To Check Every Time?! ? BlockingCollection<T> ¨C Blocks and waits until task completes ¨C Uses Add() and Take() methods ? Block the thread and wait until task completes ? Add() has an overload to pass a CancellationToken ? Add() may also block if bounding capacity was used. 33
  • 34. But I Don¡¯t Want it to Wait For Ever! ? So we don¡¯t want to wait forever ? Nor do we want to cancel the Add() from outside ? TryAdd() and TryTake() are offered too ? Where you can specify a timeout. 34
  • 35. Summary ? List is a good general purpose collection ¨C Construct to size if possible ¨C Construct to upper threshold then trim ¨C Prefer AddRange() over Add() ¨C Be aware of ¡°Quicksort Killers¡± ? Use LinkedList if you need fast insert/remove ? Use Dictionary if you need fast lookup ? Use Lookup if you need multi values ? Use concurrent collections for thread safety. 35

Editor's Notes

  • #5: Why should we care?It¡¯s all about performance.Performance is the most important thing¡­ apart from everything elsePerformance is like currency, the more you have, the more stuff you can buy.