Monday, September 28, 2009

.Net Design Mistakes - Streams

For the most part, .Net is a great framework. But it has its share of "WTF?" classes and design decisions. In this series I will cover (vent) these objective (subjective) faults. Today I am covering the mess better known as System.IO.Stream: a class trying to be five interfaces at once. Let's dive right in and cover the major problems.

Problem #1: Tedious to Implement

Suppose I want to implement a stream that flips the bits of a substream. Sounds pretty simple, but here is what comes out:

Public Class BinaryNotStream
Inherits IO.Stream
Private ReadOnly substream As IO.Stream

Public Sub New(ByVal substream As IO.Stream)
If substream Is Nothing Then Throw New ArgumentNullException("substream")
If Not substream.CanRead Then Throw New ArgumentException("substream")
Me.substream = substream
End Sub

Public Overrides Function Read(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer) As Integer
Dim n = substream.Read(buffer, offset, count)
For i = offset To offset + n - 1
buffer(i) = Not buffer(i)
Next i
Return n
End Function

Public Overrides ReadOnly Property CanRead As Boolean
Get
Return True
End Get
End Property

Protected Overrides Sub Dispose(ByVal disposing As Boolean)
If disposing Then
substream.Dispose()
End If
MyBase.Dispose(disposing)
End Sub

Public Overrides ReadOnly Property CanSeek As Boolean
Get
Return False
End Get
End Property
Public Overrides ReadOnly Property CanWrite As Boolean
Get
Return False
End Get
End Property
Public Overrides Sub Flush()
End Sub
Public Overrides ReadOnly Property Length As Long
Get
Throw New NotSupportedException()
End Get
End Property
Public Overrides Property Position As Long
Get
Throw New NotSupportedException()
End Get
Set(ByVal value As Long)
Throw New NotSupportedException()
End Set
End Property
Public Overrides Function Seek(ByVal offset As Long, ByVal origin As System.IO.SeekOrigin) As Long
Throw New NotSupportedException()
End Function
Public Overrides Sub SetLength(ByVal value As Long)
Throw New NotSupportedException()
End Sub
Public Overrides Sub Write(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer)
Throw New NotSupportedException()
End Sub
End Class

Allow me to summarize my reaction to this: AAAAAAAAAAAAAAAUUUUUUUUUUUUUUUUUUUUUUGGGHHH! Half of the code is throwing NotSupportedExceptions (which is appropriate!)! Pick almost any stream class and you'll find this "NotSupported" boilerplate code. It is a pain to write, a pain to read, and guarantees even dead simple stream class are at least a few dozen lines.

Problem #2: Guess the Type

It boggles the mind that a class must call itself a 'stream' when it can't actually do two thirds of the things a 'stream' does. There is no way to tell the type system a class is a "readable" stream, and no way to require a readable stream parameter. When you implement a function that accepts a stream, you must check that the stream supports reading by querying CanRead. Otherwise, an exception will be thrown when you try to use the stream (possibly much later). Technically, a stream could even change from readable to writable during execution.

Many .net streams change their readability based on arguments given to their constructor. For example, the file stream is not a readable stream and it is not a writable stream. The file stream is a "readable and/or writable" stream (depending on the open type specified). Similarly, the compression streams are "readable xor writable" streams (when compressing you may only write and when decompressing you may only read).

Because of this type flexibility, many of the stream classes have too many responsibilities. Poor DeflateStream has to both compress and decompress data, even though those operations have very different behaviours. Similarly, FileStream has to enforce read/write permissions at runtime instead of just relying on the type system.

Problem #3: Redundant Redundancy

First, streams have a Position property and a Seek method and both have to be implemented, despite the fact that Seek can be implemented entirely in terms of Position and Length. Every time you implement a seekable stream, you re-implement Seek exactly the same way as last time. What a waste of code, and what a great way to introduce blind copy-paste bugs.

Second, streams have an overridable Close method and a Dispose method. Here are their implementations in IO.Stream according to Reflector:

public virtual void Close() {
this.Dispose(true);
GC.SuppressFinalize(this);
}
public void Dispose() {
this.Close();
}

I think they assumed programmers would find it easier to understanding "closing" a stream as opposed to "disposing" a stream. This was a bad idea. Not only does it create confusion (is disposing equivalent to closing?) and add cruft, it implies you can later re-"open" the stream (you can't). Actually, this issue was mentioned in a Microsoft channel 9 show about the upcoming Linq-to-Events/Reactive-framework (as one of the reasons they decided to only use Dispose).

Problem #4: Not an Interface

Extension methods didn't exist when the stream class was designed and implemented, so the only way to include derived methods (eg. ReadByte uses Read) 'within' a type was an overridable class (I am guessing at Microsoft's reasoning, of course).

I don't really fault Microsoft for this problem: they probably chose the best option at the time. However, times have changed and there are downsides to using classes instead of interfaces. Classes don't allow multiple inheritance, which is probably why there is a single stream type instead of multiple (eg. ReadableStream, WriteableStream). It also means implementers can't be derived from a non-stream base class, which is annoying because sometimes "oh, and you can use it like a stream" is a feature you want to tack onto an existing class.

Proposed Solution

Split IO.Stream's various responsibilities into interfaces:

Public Interface IReadableStream
Inherits IDisposable
Function Read(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer) As Integer
End Interface
Public Interface IWritableStream
Inherits IDisposable
Sub Write(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer)
Sub Flush()
End Interface
Public Interface ISeekableStream
Inherits IDisposable
Property Position As Long
Property Length As Long
End Interface

IO.Stream can implement the interfaces for forwards compatibility. Extension methods, which wrap stream interface instances inside an IO.Stream, provide backwards compatibility. Extension methods can implement the derived methods from IO.Stream:

<Extension()>
Public Function ReadByte(ByVal this As IReadableStream) As Integer
If this Is Nothing Then Throw New ArgumentNullException("this")
Dim buffer(0 To 0) As Byte
Dim n = this.Read(buffer, 0, 1)
If n = 0 Then Return -1
Return buffer(0)
End Function

Eventually, IO.Stream can be deprecated.

Furthermore, define interfaces for combinations of responsibilities:

Public Interface IRereadableStream
Inherits IReadableStream
Inherits ISeekableStream
End Interface
Public Interface IRewritableStream
Inherits IWritableStream
Inherits ISeekableStream
End Interface
...


The result is the same functionality we have now, except type safe. You no longer need to check if a stream is readable, you no longer need to throw NotSupportExceptions, and you no longer need to write blog posts about how bad streams are.

1 comment:

  1. "You no longer need to check if a stream is readable" - This is only partially true. Streams can change their CanRead behavior on-the-fly which would still mandate such a property.

    ReplyDelete