A Joke
September 26, 2008 humor, programming
Why do computer scientists often confuse Christmas and Halloween?
Because Oct 31 = Dec 25
I’m not going to explain this one. If you don’t get it, you’re not going to find it funny anyway.
Stack Overflow is a new site that aims to be the Wikipedia of programming questions and topics. It takes the basic concept of a question/answer messageboard on programming topics, adds in voting and wiki-editing, puts a very good user interface on it, and makes it available for free. The result is an excellent site to find solutions to software development problems of all sorts.
Joel Spolsky is one of the founders; he’s written his own post about the launch.
I’ve been in the beta for about a month now, and have been listening to the podcasts since their inception. I’m really impressed with the quality of the site and how well they’ve achieved their goals. Their aim is a lofty one: be the best site on the entire Internet for finding answers to programming questions. Even at this early stage, I think they’ve already accomplished that; it’s now just a matter of time before Google confirms it.
If you develop software, it’s in your best interest to familiarize yourself with the site and discover its capabilities. As time goes on it could easily become your best resource on the Web for solving tricky problems effectively and efficiently.
By the way, here’s my user page, for your enjoyment.
Let’s say that you have a class that takes a generic type parameter:
public class Foo<T> {
public void accept(T someObject) { }
public void doSomething() { }
}
It’s difficult to doSomething() to a collection of Foo<T>s when you don’t know what T is and/or there are multiple types used for T. To work around this, you can create a typeless interface:
public interface FooTypeless {
void doSomething();
}
The declaration of Foo<T> now becomes:
public class Foo<T> : FooTypeless
Now you can have this method:
public void doSomethingOnAllFoos(
IEnumerable<FooTypeless> foos) {
foreach( FooTypeless foo in foos ) {
foo.doSomething();
}
}
Also, say that you have the following extension method:
public static IEnumerable<T> toEnumerable<T>(this T obj) {
// Any IEnumerable implementation should work here;
// but we'll get to that in a bit
return new T[] { obj };
}
Type inference would let you write this:
Foo<String> aFoo = new Foo<String>();
doSomethingOnAllFoos( aFoo.toEnumerable() );
However, this code does not work; you get a compile error:
cannot convert from ‘System.Collections.Generic.IEnumerable<Foo<string>>’ to ‘System.Collections.Generic.IEnumerable<FooTypeless>’
This occurs even though Foo<string> implements FooTypeless. The generic type arguments are compatible, but the generic-enabled reference isn’t, at least according to the compiler. (We humans could probably see that IEnumerable is an interface that could safely be converted, but the existing compiler cannot.)
Now, you can do this:
IEnumerable<FooTypeless> someFoos
= new Foo<String>[] { foo };
…but not this:
IEnumerable<Foo<String>> someStringFoos
= new Foo<String>[] { foo };
IEnumerable<FooTypeless> someFoos
= someStringFoos;
You may be tempted to try casting to work around this. For example, this works:
IEnumerable<Foo<String>> someStringFoos
= new Foo<String>[] { foo };
IEnumerable<FooTypeless> someFoos
= (IEnumerable<FooTypeless>) someStringFoos;
However, this only works if someStringFoos is assigned an array. The following does not work:
IEnumerable<Foo<String>> someStringFoos
= new List<Foo<String>> { foo };
IEnumerable<FooTypeless> someFoos
= (IEnumerable<FooTypeless>) someStringFoos;
If you try this, you’ll get an InvalidCastException. Furthermore, this generates a compiler error:
IEnumerable<FooTypeless> someFoos
= new List<Foo<String>> { foo };
Currently I have no idea why the array works but the List doesn’t; it doesn’t make a lot of sense to me.
I originally discovered this by using type inference with my toEnumerable() extension method. An easy way around these problems is to avoid the type inference completely and explicitly specify the type of IEnumerable you want:
Foo<String> aFoo = new Foo<String>();
doSomethingOnAllFoos( aFoo.toEnumerable<FooTypeless>() );
This works fine, even if toEnumerable() uses a non-array type (such as List) as its IEnumerable implementation. Execution-wise, nothing has changed, but the generic type used in the call can make or break the code.
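Another option, when the sequence already exists as a List<Foo<String>>, is to convert it element by element rather than casting the list reference. The sketch below assumes .NET 3.5's LINQ is available and uses its Enumerable.Cast<TResult>() method; the timesCalled counter and the CastSketch class are hypothetical additions, included only so the effect can be observed.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface FooTypeless {
    void doSomething();
}

public class Foo<T> : FooTypeless {
    // Hypothetical counter, not part of the original Foo<T>.
    public int timesCalled = 0;

    public void accept(T someObject) { }
    public void doSomething() { timesCalled++; }
}

public static class CastSketch {
    public static void doSomethingOnAllFoos(IEnumerable<FooTypeless> foos) {
        foreach (FooTypeless foo in foos) {
            foo.doSomething();
        }
    }

    // Returns how many times doSomething() ran on the single Foo<string>.
    public static int run() {
        Foo<string> foo = new Foo<string>();
        List<Foo<string>> someStringFoos = new List<Foo<string>> { foo };

        // Cast<FooTypeless>() hands back a sequence typed as
        // IEnumerable<FooTypeless>; conceptually it casts each element
        // rather than casting the list reference itself.
        IEnumerable<FooTypeless> someFoos = someStringFoos.Cast<FooTypeless>();

        doSomethingOnAllFoos(someFoos);
        return foo.timesCalled;
    }
}
```

Calling CastSketch.run() passes the list through doSomethingOnAllFoos() without the InvalidCastException described above.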
Stuff like this starts to make Java’s type erasure look not so ugly in comparison.
In Java, making a field “final” means that you can only assign a value to it once. It’s an important part of making a class immutable. It helps to prevent bugs too: make a field final and you’ll get a compiler error if you leave it unassigned or try to reassign it anywhere.
.NET has a similar concept with the “readonly” keyword for fields. However, there’s one important difference compared to Java: a “readonly” field can only be assigned in the class’s constructor, but it can be assigned multiple times within that constructor. The only restriction it places is that the field can’t be reassigned outside of the constructor. You don’t even get a compiler error for not assigning it at all; you only get a compiler warning (which can be turned off).
This has encouraged bugs at least twice in my code: I assigned a value to a field twice within the constructor and then got unexpected results due to the incorrect object being used. One of these was caused by a conflict resolution during a Subversion merge (and thus it was less than obvious that it had been introduced).
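A minimal sketch of the kind of bug described above. The Connection and Service classes are hypothetical, invented only for illustration; the point is that the second assignment compiles without complaint.

```csharp
public class Connection {
    public readonly string name;

    public Connection(string name) {
        this.name = name;
    }
}

public class Service {
    private readonly Connection connection;

    public Service(Connection primary, Connection fallback) {
        this.connection = primary;
        // Imagine a merge introduced the next line by accident.
        // It compiles: readonly allows reassignment inside the constructor.
        this.connection = fallback;
    }

    public string connectionName() {
        return connection.name;
    }
}
```

Constructing a Service with a "primary" and a "fallback" connection leaves it silently using the fallback.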
Disclaimer: There may, in fact, be a good reason for the following .NET behavior. However, if there is, it’s certainly not clear to me. I pose a question at the bottom; dear lazyweb, please explain.
.NET’s type inference lets you avoid specifying the type of a generic parameter on a method call some, but not all, of the time. Consider the following infrastructure:
public String getSomeString() {
return "Some String";
}
public T returnTheValue<T>(T value) {
return value;
}
public delegate T Factory<T>();
public T createSomeValue<T>(Factory<T> factory) {
return factory();
}
The following code works fine:
Console.WriteLine(returnTheValue("A String"));
.NET is smart enough to know that, since you’re passing in a String for T, it can assume that String is used for T elsewhere in the method (including the return type), and so it doesn’t require you to explicitly state that returnTheValue is using String for T, like so:
Console.WriteLine(returnTheValue<String>("A String"));
The above code does work, and if you specify a non-compatible type (e.g. int), then you get an error. However, it’s not necessary due to type inference.
The following code also works:
Console.WriteLine( returnTheValue( getSomeString() ) );
There’s no difference between using a literal and calling a method (they’re both dealing with values).
The following does not work:
Console.WriteLine( createSomeValue( getSomeString ) );
Here we’re not passing a value, but passing a method that will create/return a value. In this case, the .NET compiler returns the following error message:
The type arguments for method ‘createSomeValue<T>( Factory<T> )’ cannot be inferred from the usage. Try specifying the type arguments explicitly.
“Specifying the type arguments explicitly” means doing this, which works:
Console.WriteLine( createSomeValue<String>( getSomeString ) );
What I don’t understand is why the type inference works in one case and not the other. We know that getSomeString() satisfies Factory<String>, and thus we know that the return value is a String. If returnTheValue() can infer that its generic type parameter is String based on the String argument, why can’t createSomeValue() infer that its generic type parameter is String based on its Factory<String> argument? It already knows for sure that it’s getting a Factory; it just needs to know what type of Factory.
Is there some inherent limitation to the .NET type inference system that makes it impossible to know this? Or is this simply that the .NET implementors didn’t carry the concept one step further?
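For reference, here is a sketch of two ways to make the call compile. The first is the explicit type argument shown above; the second wraps the call in a lambda (C# 3.0 syntax), on the assumption that the compiler infers T from the lambda's return type. The InferenceSketch class and its two wrapper methods are hypothetical names used only to keep the example self-contained.

```csharp
public delegate T Factory<T>();

public static class InferenceSketch {
    public static string getSomeString() {
        return "Some String";
    }

    public static T createSomeValue<T>(Factory<T> factory) {
        return factory();
    }

    // Workaround 1: spell out the type argument.
    public static string explicitTypeArgument() {
        return createSomeValue<string>(getSomeString);
    }

    // Workaround 2: pass a lambda instead of the bare method name;
    // T is inferred from what the lambda returns.
    public static string viaLambda() {
        return createSomeValue(() => getSomeString());
    }
}
```

Neither workaround answers the question of why the bare method name isn't enough; they only sidestep it.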
Java’s generic system (with all of its type erasure horror) is generally regarded as being bad, but it does have one feature that is very useful: wildcard generic types. The lack of wildcards in .NET trips me up time and time again.
Today, the culprit was Nullable<T>. I want to have a method that will effectively do a ToString() of the Nullable’s value, but return some other string if the Nullable does not contain a value. (I happen to want to make it an extension method, but that’s irrelevant here).
I want to do this:
int? foo = null;
foo.ToStringNullable("No Value"); // returns "No Value"
foo = 42;
foo.ToStringNullable("No Value"); // returns "42"
The code I want to write is this:
public static String ToStringNullable(
this Nullable<?> nullable,
String inCaseOfNull) {
String value;
if (nullable == null || !nullable.HasValue) {
value = inCaseOfNull;
}
else {
value = nullable.ToString();
}
return value;
}
Note the <?>: this is a Java-ism that does not work in .NET. Java has the wildcard type “?” for generics: it means accept any type. .NET doesn’t have this. Instead, I’d be forced to write:
public static String ToStringNullable<T>(
this Nullable<T> nullable,
String inCaseOfNull) where T : struct
…which in turn means I’d have to redundantly specify the type of the first parameter, like so:
int? foo = null;
// returns "No Value"
foo.ToStringNullable<int>("No Value");
…which in turn gives me the opportunity to make mistakes like:
int? foo = null;
foo.ToStringNullable<bool>("No Value");
(although this does generate a compiler error, so it’s not so bad — just tiresome).
(Also note: The “where T : struct” at the end is an additional requirement for the use of Nullable; it’s not important to the argument.)
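A minimal sketch of the generic version, written against T? (shorthand for Nullable<T>). One hedge worth stating: as far as I can tell, the compiler can infer T from the receiver in calls like these, so the sketch leaves the explicit <int> off; treat that as something to verify against your own compiler version. The NullableExtensions and NullableSketch class names are hypothetical.

```csharp
public static class NullableExtensions {
    public static string ToStringNullable<T>(
            this T? nullable,
            string inCaseOfNull) where T : struct {
        // HasValue is false when the nullable holds no value.
        return nullable.HasValue ? nullable.Value.ToString() : inCaseOfNull;
    }
}

public static class NullableSketch {
    public static string whenEmpty() {
        int? foo = null;
        // No explicit type argument: T is taken from the int? receiver.
        return foo.ToStringNullable("No Value");
    }

    public static string whenSet() {
        int? foo = 42;
        return foo.ToStringNullable("No Value");
    }
}
```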
One would legitimately think that you could use <Object> in place of <T>. However, this is not allowable in .NET. The problem is that some, but not all operations are necessarily typesafe when doing this. Consider:
// This is conceptually sound, but problematic
List<Object> aListOfObjects = new List<String>();
// This won't ever work
aListOfObjects.Add( -1 );
// However, there's nothing conceptually wrong with this
aListOfObjects.Contains( -1 );
This unsolved problem is why I didn’t tag this post as a “Stupid .NET Trick.” The .NET language team made a reasonable decision to forgo wildcards so that they could avoid the erasure mess of Java. It was a tradeoff, and probably a good one overall — but that doesn’t mean that it doesn’t cause problems.
I’ve come up with one way around this problem:
Thus, I would do something like this:
public interface NullableTypeless {
bool HasValue { get; }
// Not necessary, but provided for example purposes
String ToString();
}
public struct Nullable<T> : NullableTypeless {}
Note that NullableTypeless does not have any type-specific members such as T Value { get; }.
Then my extension method could be written like so:
public static String ToStringNullable(
this NullableTypeless nullable,
String inCaseOfNull ) {
String value;
if (nullable == null || !nullable.HasValue) {
value = inCaseOfNull;
}
else {
value = nullable.ToString();
}
return value;
}
// This would work fine:
int? foo = null;
foo.ToStringNullable("No Value");
This works acceptably well in my own code. Unfortunately, since I can’t rewrite .NET, this option isn’t available in the case of Nullable.
Ideally I’d like to see some sort of modifier to methods that says “this method is safe to use if the value supplied to a generic argument is higher up on the inheritance tree than the type that was actually used to create the class”. You could use this on a method-by-method basis. For example, List<T>.Contains(T) would be tagged with it (since it doesn’t matter what the argument is; everything has Equals()) but List<T>.Add(T) would not (since it can only accept members that are the same or lower down than the List’s inherent type). I don’t think this ever will happen in .NET though.
This came dangerously close to a “Stupid .NET Trick”, but I’ve figured out an elegant solution. As far as I can see though, it wasn’t documented on Google, so I decided to post it here.
In Java (and, I suspect, many other languages), a method signature is determined by the method’s name and its parameters, but not by its return type. Thus, you cannot have two methods that differ only by their return type:
public boolean convertTo(String value) { ... }
public int convertTo(String value) { ... }
Here, the method signatures are identical, and thus the compiler sees two duplicate methods, not an overload. The usual way around this is to suffix the return type to the end of the method name.
C# nearly has the same restriction; you cannot write two (simple) methods in the same class that differ only by return type. However: the .NET runtime does not have this restriction. C# (and VB, and presumably the other CLR languages) take advantage of this by allowing overloading based on return types when implementing different interfaces.
For example: Consider these interfaces:
public interface Foo {
void perform();
}
public interface Bar {
int perform();
}
This doesn’t work:
public class NoInterfaces {
public void perform() {
}
public int perform() {
return -1;
}
}
If you declare a class that implements both Foo and Bar, you must provide both perform() methods. C# gives you a way past the signature conflict by allowing you to specify the source interface of the duplicate method (C# calls this an explicit interface implementation). However, you cannot make both methods public: you must choose at most one (and potentially zero) method to make public, and the others must have no access modifier at all:
public class BothInterfaces : Foo, Bar {
public void perform() {
}
int Bar.perform() {
return -1;
}
}
Note that the Bar version of BothInterfaces.perform() (or any other method without an access modifier) is very nearly uncallable, even from within the class itself (that is: it’s even less accessible than a private method). There is a way to call it though, which I’ll discuss below.
To further complicate things: you can in fact declare overloaded public methods with differing return types if they’re declared at different levels of inheritance:
public class ImplementsFoo : Foo {
public void perform() {
throw new NotImplementedException();
}
}
public class ExtendsFooImplementsBar
: ImplementsFoo, Bar {
public new int perform() {
return -1;
}
}
ExtendsFooImplementsBar gains “public void perform()” by virtue of subclassing ImplementsFoo, and also must declare “public int perform()” when implementing Bar. The trick is that ExtendsFooImplementsBar.perform() must hide (not overload/override) ImplementsFoo.perform() (hence the “new” keyword in the ExtendsFooImplementsBar declaration). The hidden ImplementsFoo.perform() is not accessible by external callers; however it is accessible from within ExtendsFooImplementsBar by calling base.perform().
I’ve seen this crop up in two places:
System.Collections.IEnumerable declares GetEnumerator(), which returns System.Collections.IEnumerator. System.Collections.Generic.IEnumerable<T> declares GetEnumerator(), which returns System.Collections.Generic.IEnumerator<T>. Since System.Collections.Generic.IEnumerable<T> also inherits from System.Collections.IEnumerable, it also must declare System.Collections.IEnumerator GetEnumerator() (without the generic type parameter).
This isn’t much of a problem, as:
Iesi.Collections.Generic.Set<T> (and its subclasses) implements System.Collections.Generic.ICollection<T>, and thus inherits ICollection<T>’s void Add(T). However, it also implements Iesi.Collections.Generic.ISet<T>, which in turn declares bool Add(T).
Both methods have good reasons for declaring their respective return types. Set’s bool Add() returns true or false depending on whether the value passed in is already in the Set. (Keep in mind that Sets can only contain a particular value once. Knowing whether the Add() call actually added the value can be useful, especially in sorted implementations of Set.) On the other hand, that functionality is not useful in ICollection (as determining the results of the Add() may be meaningless or expensive for other collection implementations), so returning void from Add() is appropriate for the most part. Set definitely should be implementing ICollection for interoperability purposes.
A problem occurs when you need to access the void version of Set.Add() instead of the bool version. This is most common when using the method with a delegate. Consider this:
// This is a standard .NET delegate
public delegate void Action<T>(T target);
public void performOnAll(Action<Foo> action) {
// Perform some action on multiple objects in
// some sort of data structure
// (perhaps the results of a database query?)
}
List<Foo> aList = new List<Foo>();
// This works fine
performOnAll(aList.Add);
HashedSet<Foo> aSet = new HashedSet<Foo>();
// This doesn't work because HashedSet.Add()
// returns bool, not void
performOnAll(aSet.Add);
// Instead, you're forced to do this:
performOnAll(delegate(Foo foo){
aSet.Add(foo);
});
// (Or use a lambda in .NET 3.5)
Iesi could have avoided this problem by renaming “bool Add()” to something like “bool AddIfPossible()”… but they didn’t. Fortunately, there’s a way around this.
There’s a relatively simple solution to this complicated problem: cast the reference to one of its superclasses/interfaces and you can access the methods from that interface.
int result;
BothInterfaces bothInterfaces = new BothInterfaces();
// This method returns void and thus
// can't be assigned to result
((Foo)bothInterfaces).perform();
result = ((Bar)bothInterfaces).perform();
ExtendsFooImplementsBar extendsFooImplementsBar
= new ExtendsFooImplementsBar();
// This method returns void and thus
// can't be assigned to result
((Foo)extendsFooImplementsBar).perform();
result = ((Bar)extendsFooImplementsBar).perform();
All of these calls access the method from the respective interface as expected.
This also works for my Set/delegate problem:
performOnAll( ( (ICollection<Foo>) aSet).Add);
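To try the cast-to-interface trick without pulling in Iesi.Collections, here is a self-contained sketch. TinySet<T> is a hypothetical stand-in for a set class (it is not the Iesi implementation), and this performOnAll() is a simplified version that just feeds three fixed strings to the action.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical set type: a public bool Add(T), plus ICollection<T>'s void Add(T).
public class TinySet<T> : ICollection<T> {
    private readonly List<T> items = new List<T>();

    // The "set" flavor: reports whether the value was actually added.
    public bool Add(T item) {
        if (items.Contains(item)) { return false; }
        items.Add(item);
        return true;
    }

    // The ICollection<T> flavor: same name and parameter, but returns void.
    void ICollection<T>.Add(T item) { Add(item); }

    public int Count { get { return items.Count; } }
    public bool IsReadOnly { get { return false; } }
    public void Clear() { items.Clear(); }
    public bool Contains(T item) { return items.Contains(item); }
    public void CopyTo(T[] array, int arrayIndex) { items.CopyTo(array, arrayIndex); }
    public bool Remove(T item) { return items.Remove(item); }
    public IEnumerator<T> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class SetDelegateSketch {
    // Simplified, hypothetical performOnAll: calls the action on fixed values.
    public static void performOnAll(Action<string> action) {
        action("a");
        action("b");
        action("a");
    }

    // Returns the number of distinct values that ended up in the set.
    public static int run() {
        TinySet<string> aSet = new TinySet<string>();
        // The cast selects ICollection<string>.Add, which returns void
        // and therefore fits Action<string>.
        performOnAll(((ICollection<string>) aSet).Add);
        return aSet.Count;
    }
}
```

The duplicate "a" is dropped by the set's own logic; the delegate only ever sees the void-returning version of Add().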
Lastly, it also allows you to call the otherwise uncallable zero-access-modifier method when implementing multiple interfaces:
public class BothInterfaces : Foo, Bar {
public void perform() {
}
int Bar.perform() {
return -1;
}
public void test() {
// Can call the void version normally
perform();
// Can call Bar's perform via casting
int result = ( (Bar) this).perform();
}
}
All in all, it’s a pretty good solution… but understanding the problem it solves is quite sticky.
TextWriter is an abstract class that has heavily-overloaded Write() and WriteLine() methods for all of the more basic types such as Int32, Boolean, and String. The idea is that it takes these values, converts them to character representations (given a specified Encoding), and then outputs the results (usually to a StringBuilder or a Stream). It’s equivalent to Java’s PrintWriter. However, unlike PrintWriter, there’s no underlying output source / OutputStream: you’re free to do whatever you like with the resulting characters.
The stupidity lies in the interface and its accompanying documentation. TextWriter is an abstract class, but the only abstract member is the Encoding property’s get accessor; you get to decide how and when you want the Encoding to be specified. All of the Write() and WriteLine() methods, however, are not abstract: they’re simply overridable. That’s not necessarily bad, but there’s little to no documentation on what any of these 36 methods actually do (specifically, whether they call any of the other Write() methods and thus don’t need to be overridden in any subclass). For example:
The documentation for the TextWriter class itself is no help either:
A derived class must minimally implement the Write method to make a useful instance of TextWriter.
(The link points to the list of all the Write() overloads, not one specific method.)
All Microsoft needed to do was:
Then, a TextWriter implementation would only have to override one method (i.e. Write(char)) and reliably get all of the conversion behavior for free. That’s almost exactly what PrintWriter already does (except that it defers to an aggregated Writer/OutputStream to write its characters rather than deferring to its own abstract method). Instead, we get this mess.
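For illustration, this is roughly what the "override one method" subclass would look like. It is a sketch under a loud assumption: that the base-class overloads funnel down to Write(char), which is exactly the behavior the documentation discussed here doesn't promise. CollectingWriter and TextWriterSketch are hypothetical names.

```csharp
using System.IO;
using System.Text;

// Hypothetical TextWriter that overrides only Write(char) and the Encoding getter.
public class CollectingWriter : TextWriter {
    private readonly StringBuilder collected = new StringBuilder();

    // The one abstract member of TextWriter.
    public override Encoding Encoding {
        get { return Encoding.UTF8; }
    }

    // The only Write() overload overridden here.
    public override void Write(char value) {
        collected.Append(value);
    }

    public string Collected {
        get { return collected.ToString(); }
    }
}

public static class TextWriterSketch {
    // Exercises overloads that CollectingWriter does not override itself.
    public static string run() {
        CollectingWriter writer = new CollectingWriter();
        writer.Write("abc");
        writer.Write(42);
        writer.Write(true);
        return writer.Collected;
    }
}
```

If the assumption holds, the string, int, and bool overloads all end up in the StringBuilder via Write(char); if it doesn't hold for some overload, that overload's output would silently go missing, which is the risk the post is complaining about.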
Bonus Points: Write(Object) simply does a .ToString() on the given object, which is reasonable. Most of the other overloads (for Boolean, Double, Int32, Int64, Single, UInt32, UInt64, and probably Decimal) state that they do exactly the same thing… thus making them unnecessary. It’s clear that the .NET API designers were trying to emulate Java’s PrintWriter, which is forced to create the equivalent overloads due to the separation between primitive and Object types. Of course, if you’re implementing your own TextWriter, you still have to override all of the superfluous methods (and don’t forget the WriteLine()s!).
Extra credit: There’s Write(String, Object) to format the Object argument given the formatting String argument (which is a very common way to write out dates and numbers in varying ways). There’s also a Write(String, Object, Object) method that does the exact same thing but with two Object arguments, and then a Write(String, Object, Object, Object) that does the same thing but with three Object arguments. There’s no method that takes four arguments though; I guess they considered that excessive.
I saw “24 hours”, “Must Have”, and “Crap”, and immediately thought “software development rush job.” That’s not quite the intent this Indexed card had, but I think it’s still appropriate.
Update: Hrm, I can’t hotlink to the image, and I’m not going to make a copy on my server… so I guess you’ll have to click the link if you want to see the card. That lessens the impact.
Some .NET classes (System.Data.DataSet is the one I’m currently using) define a property called “Namespace”. In VB, “Namespace” (note the capital “N”) is a reserved word used to associate a class with a particular namespace. Since it’s a reserved word, it’s not a valid identifier in VB… meaning that the MS classes that contain the property “Namespace” were not written in VB (at least without some sort of compiler hack).
Currently I’m trying to reimplement a (legacy) class that subclasses DataSet; I want to break the DataSet inheritance and replace it with custom members. Due to the “Namespace” conflict, I can’t do this in VB, although it wouldn’t be a problem in C#, and isn’t a problem for all of the other properties in the old class.
So much for language independence.
Update: I spoke too soon; VB does have a mechanism for specifying identifier names that match keywords: just enclose the identifier in brackets. It’s unfortunate that it’s necessary, but it’s easy enough to implement once you know the trick.