Saturday, February 14, 2009

Redesigning System.Object/java.lang.Object

I've had quite a few discussions with a colleague about some failures of Java and .NET. The issue we keep coming back to is the root of the inheritance tree. There's no doubt in my mind that having a tree with a single top-level class is a good thing, but that class has grown a bit too big for its boots.

Pretty much everything in this post applies to both .NET and Java, sometimes with a few small changes. Where it might be unclear, I'll point out the changes explicitly - otherwise I'll just use the .NET terminology.

What's in System.Object?

Before we work out what we might be able to change, let's look at what we've got. I'm only talking about instance methods, which at the moment fall into the groups below.

Life-cycle and type identity

There are three members which I believe really need to be left alone.

We need a parameterless constructor because (at least with the current system of chaining constructors to each other) we have to have some constructor, and I can't imagine what parameter we might want to give it. I certainly find it hard to believe there's a particular piece of state which really deserves to be a part of every object but which we're currently missing.

I really don't care that much about finalizers. Should the finalizer be part of Object itself, or should it just get handled automatically by the CLR if and only if it's defined somewhere in the inheritance chain? Frankly, who cares. No doubt it makes a big difference to the implementation somewhere, but that's not my problem. All I care about when it comes to finalizers is that when I have to write them it's as easy as possible to do it properly, and that I don't have to write them very often in the first place. (With SafeHandle, it should be a pretty rare occurrence in .NET, even when you're dealing directly with unmanaged resources.)

GetType() (or getClass() in Java) is pretty important. I can't see any particular alternative to having this within Object, unless you make it a static method somewhere else with an Object parameter. In fact, that would have the advantage of freeing up the name for use within your own classes. The functionality is sufficiently important (and really does apply to every object) that I think it's worth keeping.

Comparison methods

Okay, time to get controversial. I don't think every object should have to be able to compare itself with another object. Of course, most types don't really support this anyway - we just end up with reference equality by default.

The trouble with comparisons is that everyone's got a different idea of what makes something equal. There are some types where it really is obvious - there's only one natural comparison. Integers spring to mind. There are other types which have multiple natural equality comparisons - floating point numbers (exact, within an absolute epsilon, and within a relative epsilon) and strings (ordinal, culture sensitive and/or case sensitive) are examples of this. Then there are composite types where you may or may not care about certain aspects - when comparing URLs, do I care about case? Do I care about fragments? For http, if the port number is explicitly specified as 80, is that different to a URL which is still http but leaves the port number implicit?
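To make the floating point case concrete, here is a minimal Java sketch of the three comparisons mentioned above. The class name DoubleEquality and its method names are made up for illustration; nothing like this is part of either platform.

```java
// Hypothetical helper: three different "natural" equality tests for doubles.
final class DoubleEquality {
    private DoubleEquality() {}

    // Exact comparison: true only if both values are numerically identical.
    static boolean exact(double a, double b) {
        return a == b;
    }

    // Absolute tolerance: the difference itself must be at most epsilon.
    static boolean withinAbsolute(double a, double b, double epsilon) {
        return Math.abs(a - b) <= epsilon;
    }

    // Relative tolerance: the difference is scaled by the larger magnitude.
    static boolean withinRelative(double a, double b, double epsilon) {
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return Math.abs(a - b) <= epsilon * scale;
    }
}
```

With these definitions, 0.1 + 0.2 and 0.3 are unequal under the exact test but equal under both tolerance tests with an epsilon of 1e-9, while 1e10 and 1e10 + 1 are equal only under the relative test. None of the three is obviously "the" answer, which is rather the point.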

.NET represents these reasonably well already, with the IEquatable<T> interface saying "I know how to compare myself with an instance of type T, and how to produce a hashcode for myself" and IEqualityComparer<T> interface saying "I know how to compare two instances of T, and how to produce a hashcode for one instance of T." Now suppose we didn't have the (nongeneric!) Equals() method and GetHashCode() in System.Object. Any type which had a natural equality comparison would still let you compare it for equality by implementing IEquatable<T>.Equals - but anything else would either force you to use reference equality or an implementation of IEqualityComparer<T>.

Some of the principal consumers of equality comparisons are collections - particularly dictionaries (which is why it's so important that the interfaces should include hashcode generation). With the current way that .NET generics work, it would be tricky to have a constraint on a constructor such that if you only specified the types, it would only work if the key type implemented IEquatable<T>, but it's easy enough to do with static methods (on a non-generic type). Alternatively you could specify any type and an appropriate IEqualityComparer<T> to use for the keys. We'd need an IdentityComparer<T> to work just with references (and provide the equivalent functionality to Object.GetHashCode) but that's not hard - and it would be absolutely obvious what the comparison was when you built the dictionary.
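Here is a rough Java sketch of a dictionary that is told how to compare its keys when it is built. EqualityComparer, IdentityComparer, IgnoreCaseComparer and ComparerMap are all hypothetical names, loosely mirroring the .NET interfaces rather than any real Java API. The key wrapper exists only because java.util.HashMap itself relies on Object.equals and Object.hashCode, which is exactly the coupling this post would like to remove.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical analogue of .NET's IEqualityComparer<T>.
interface EqualityComparer<T> {
    boolean equal(T a, T b);
    int hash(T value);
}

// Reference equality only: what Object gives every type by default today.
final class IdentityComparer<T> implements EqualityComparer<T> {
    @Override public boolean equal(T a, T b) { return a == b; }
    @Override public int hash(T value) { return System.identityHashCode(value); }
}

// One of several plausible string comparisons: ignore case, culture-independent.
final class IgnoreCaseComparer implements EqualityComparer<String> {
    @Override public boolean equal(String a, String b) {
        return a.toLowerCase(Locale.ROOT).equals(b.toLowerCase(Locale.ROOT));
    }
    @Override public int hash(String value) {
        return value.toLowerCase(Locale.ROOT).hashCode();
    }
}

// A small dictionary whose notion of key equality is stated at construction.
final class ComparerMap<K, V> {
    // Adapts a key to HashMap by delegating equals/hashCode to the comparer.
    private static final class Key<K> {
        final K value;
        final EqualityComparer<K> comparer;

        Key(K value, EqualityComparer<K> comparer) {
            this.value = value;
            this.comparer = comparer;
        }

        @Override public boolean equals(Object other) {
            if (!(other instanceof Key)) return false;
            @SuppressWarnings("unchecked")
            Key<K> that = (Key<K>) other;
            return comparer.equal(value, that.value);
        }

        @Override public int hashCode() { return comparer.hash(value); }
    }

    private final EqualityComparer<K> comparer;
    private final Map<Key<K>, V> entries = new HashMap<>();

    ComparerMap(EqualityComparer<K> comparer) { this.comparer = comparer; }

    void put(K key, V value) { entries.put(new Key<>(key, comparer), value); }
    V get(K key) { return entries.get(new Key<>(key, comparer)); }
    int size() { return entries.size(); }
}
```

A map built with an IdentityComparer treats two distinct String instances containing "key" as two separate keys, while one built with an IgnoreCaseComparer treats "Host" and "HOST" as the same key. Either way, the choice is visible at the point where the map is created.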

Monitors and threading

This is possibly my biggest gripe. The fact that every object has a monitor associated with it was a mistake in Java, and was unfortunately copied in .NET. This promotes the bad practice of locking on "this" and on types - both of which are typically publicly accessible references. I believe that unless a reference is exposed explicitly for the purpose of locking (like ICollection.SyncRoot) then you should avoid locking on any reference which other code knows about. I typically have a private read-only variable for locking purposes. If you're following these guidelines, it makes no sense to be able to lock on absolutely any reference - it would be better to make the Monitor class instantiable, and make Wait/Pulse/PulseAll instance members. (In Java this would mean creating a new class and moving Object.wait/notify/notifyAll members to that class.)

This would lead to cleaner, more readable code in my view. I'd also do away with the "lock" statement in C#, making Monitor.Enter return a token implementing IDisposable - so "using" statements would replace locks, freeing up a keyword and giving the flexibility of having multiple overloads of Monitor.Enter. Arguably if one were redesigning all of this anyway, it would be worth looking at whether or not monitors should really be reentrant. Any time you use lock reentrancy, you're probably not thinking hard enough about the design. Now there's a nice overgeneralisation with which to end this section...
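As a sketch of how this might look on the Java side, the class below wraps the real java.util.concurrent.locks.ReentrantLock and Condition in an instantiable monitor. The Monitor, Token and Counter classes and their method names are hypothetical. The wait operation is called await here because Object.wait is final in Java, and the underlying lock is still reentrant, which is a simplification rather than a recommendation.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical instantiable monitor; names loosely mirror .NET's Monitor.
final class Monitor {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition condition = lock.newCondition();

    // Returned by enter(); closing the token releases the monitor.
    final class Token implements AutoCloseable {
        private Token() {}
        @Override public void close() { lock.unlock(); }
    }

    // Acquires the monitor and hands back a token for try-with-resources.
    Token enter() {
        lock.lock();
        return new Token();
    }

    // The three methods below must be called while holding the monitor.
    void await() throws InterruptedException { condition.await(); }
    void pulse() { condition.signal(); }
    void pulseAll() { condition.signalAll(); }

    boolean isHeldByCurrentThread() { return lock.isHeldByCurrentThread(); }
}

// Example user: the monitor is a private final field nobody else can lock on.
final class Counter {
    private final Monitor monitor = new Monitor();
    private int count;

    void increment() {
        try (Monitor.Token token = monitor.enter()) {
            count++;
            monitor.pulseAll(); // wake any threads waiting for a new value
        }
    }

    int get() {
        try (Monitor.Token token = monitor.enter()) {
            return count;
        }
    }

    // Blocks until the count reaches at least the target value.
    void awaitAtLeast(int target) throws InterruptedException {
        try (Monitor.Token token = monitor.enter()) {
            while (count < target) {
                monitor.await();
            }
        }
    }
}
```

The try-with-resources blocks play the role that "using" statements would play in C#: the monitor is released when the token is closed, even if the body throws.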

String representations

This is an interesting one. I'm genuinely on the fence here. I find ToString() (and the fact that it's called implicitly in many circumstances) hugely useful, but it feels like it's attempting to satisfy three different goals:

  • Giving a developer-readable representation when logging and debugging
  • Giving a user-readable representation as part of a formatted message in a UI
  • Giving a machine-readable format (although this is relatively rare for anything other than numeric types)

It's interesting to note that Java and .NET differ as to which of these to use for numbers - Java plumps for "machine-readable" and .NET goes for "human-readable in the current thread's culture". Of course it's clearer to explicitly specify the culture on both platforms.
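On the Java side, the difference between the two forms can be sketched as follows. NumberText and its method names are hypothetical; Double.toString, String.format and Locale are the standard library calls.

```java
import java.util.Locale;

// Hypothetical helper contrasting the two kinds of numeric text.
final class NumberText {
    private NumberText() {}

    // Culture-independent form, as produced by Java's Double.toString.
    static String machineReadable(double value) {
        return Double.toString(value);
    }

    // Two decimal places, formatted for an explicitly named locale.
    static String forLocale(double value, Locale locale) {
        return String.format(locale, "%.2f", value);
    }
}
```

For 1234.5, machineReadable returns "1234.5", forLocale with Locale.ROOT returns "1234.50", and forLocale with Locale.GERMANY returns "1234,50". Naming the locale at the call site removes any doubt about which representation a reader of the code should expect.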

The trouble is that very often, it's not immediately clear which of these has been implemented. This leads to guidelines such as "don't use ToString() other than for logging" on the grounds that at least if it's implemented inappropriately, it'll only be a log file which ends up with difficult-to-understand data.

Should this usage be explicitly stated - perhaps even codified in the name: "ToDebugString" or something similar? I will leave this for smarter minds to think about, but I think there's enough value in the method to make it worth keeping.

MemberwiseClone

Again, I'm not sure on this one. It would perhaps be better as a static (generic!) method somewhere in a class whose name indicated "this is for sneaky runtime stuff". After all, it constructs a new object without calling a constructor, and other funkiness. I'm less bothered by this than the other items though.

Conclusion

To summarise, in an ideal world:

  • Equals and GetHashCode would disappear from Object. Types would have to explicitly say that they could be compared.
  • Wait/Pulse/PulseAll would become instance methods in Monitor, which would be instantiated every time you want a lock.
  • ToString might be renamed to give clearer usage guidance.
  • MemberwiseClone might be moved to a different class.

Obviously it's far too late for either Java or .NET to make these changes, but it's always interesting to dream. Any more changes you'd like to see? Or violent disagreements with any of the above?
