Monday, November 21, 2011

ADVANTAGES OF GENERICS RATHER THAN COLLECTIONS

As with any new technology, it is helpful to ask just why it's useful. Those of you who are familiar with templates in C++ will find that generics serve a similar purpose in managed code. However, I hesitate to draw too much of a comparison between CLR generics and C++ templates because generics have some additional benefits, including the absence of two common problems: code bloat and developer confusion.
Some of the strengths of CLR generics are compile-time type safety, binary code reuse, performance, and clarity. I'll briefly describe these benefits, and as you read the rest of this column, you'll understand them in more detail. As an example let's take two hypothetical collection classes: SortedList, a collection of Object references, and GenericSortedList, a collection of any type.
Type safety When a user adds a String to a collection of type SortedList, there is an implicit cast to Object. Similarly, if a String object is retrieved from the list, it must be cast at run time from an Object reference to a String reference. This lack of type safety at compile time is both tedious for the developer and prone to error. In contrast, a use of GenericSortedList where T is typed to String causes all add and lookup methods to work with String references. This allows for specifying and checking the type of the elements at compile time rather than at run time.
Binary code reuse For maintenance purposes, a developer may elect to achieve compile-time type safety using SortedList by deriving a SortedListOfStrings from it. The problem with this approach is that new code has to be written for every type for which you want a type-safe list, which can quickly become laborious. With GenericSortedList, all you need to do is instantiate the type with the desired element type as T. As an added value, generics code is generated at run time, so two expansions over unrelated element types such as GenericSortedList and GenericSortedList are able to reuse most of the same just-in-time (JIT)-compiled code. The CLR simply works out the details—no more code bloat!
Performance The bottom line is this: if type checking is done at compile time rather than at run time, performance improves. In managed code, casts between references and values incur both boxings and unboxings, and avoiding such casts can have an equally negative impact on performance. Current benchmarks of a quick-sort of an array of one million integers shows the generic method is three times faster than the non-generic equivalent. This is because boxing of the values is avoided completely. The same sort over an array of string references resulted in a 20 percent improvement in performance with the generic method due to the absence of a need to perform type checking at run time.
Clarity Clarity with generics comes in a number of forms. Constraints are a feature in generics that have the effect of making incompatible expansions of generic code impossible; with generics, you're never faced with the cryptic compiler errors that plague C++ template users. In the GenericSortedList example, the collection class would have a constraint that makes it only work with types for T that can be compared and therefore sorted. Also, generic methods can often be called without employing any special syntax by using a feature called type inference. And, of course, type safety at compile time increases clarity in application code. I will look at constraints, type inference, and type safety in detail throughout this column.

No comments:

Post a Comment