Monday, September 28, 2009

.Net Design Mistakes - Streams

For the most part, .Net is a great framework. But it has its share of "WTF?" classes and design decisions. In this series I will cover (vent) these objective (subjective) faults. Today I am covering the mess better known as System.IO.Stream: a class trying to be five interfaces at once. Let's dive right in and cover the major problems.

Problem #1: Tedious to Implement

Suppose I want to implement a stream that flips the bits of a substream. Sounds pretty simple, but here is what comes out:

Public Class BinaryNotStream
Inherits IO.Stream
Private ReadOnly substream As IO.Stream

Public Sub New(ByVal substream As IO.Stream)
If substream Is Nothing Then Throw New ArgumentNullException("substream")
If Not substream.CanRead Then Throw New ArgumentException("substream")
Me.substream = substream
End Sub

Public Overrides Function Read(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer) As Integer
Dim n = substream.Read(buffer, offset, count)
For i = offset To offset + n - 1
buffer(i) = Not buffer(i)
Next i
Return n
End Function

Public Overrides ReadOnly Property CanRead As Boolean
Get
Return True
End Get
End Property

Protected Overrides Sub Dispose(ByVal disposing As Boolean)
If disposing Then
substream.Dispose()
End If
MyBase.Dispose(disposing)
End Sub

Public Overrides ReadOnly Property CanSeek As Boolean
Get
Return False
End Get
End Property
Public Overrides ReadOnly Property CanWrite As Boolean
Get
Return False
End Get
End Property
Public Overrides Sub Flush()
End Sub
Public Overrides ReadOnly Property Length As Long
Get
Throw New NotSupportedException()
End Get
End Property
Public Overrides Property Position As Long
Get
Throw New NotSupportedException()
End Get
Set(ByVal value As Long)
Throw New NotSupportedException()
End Set
End Property
Public Overrides Function Seek(ByVal offset As Long, ByVal origin As System.IO.SeekOrigin) As Long
Throw New NotSupportedException()
End Function
Public Overrides Sub SetLength(ByVal value As Long)
Throw New NotSupportedException()
End Sub
Public Overrides Sub Write(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer)
Throw New NotSupportedException()
End Sub
End Class

Allow me to summarize my reaction to this: AAAAAAAAAAAAAAAUUUUUUUUUUUUUUUUUUUUUUGGGHHH! Half of the code is throwing NotSupportedExceptions (which is appropriate!)! Pick almost any stream class and you'll find this "NotSupported" boilerplate code. It is a pain to write, a pain to read, and guarantees even dead simple stream class are at least a few dozen lines.

Problem #2: Guess the Type

It boggles the mind that a class must call itself a 'stream' when it can't actually do two thirds of the things a 'stream' does. There is no way to tell the type system a class is a "readable" stream, and no way to require a readable stream parameter. When you implement a function that accepts a stream, you must check that the stream supports reading by querying CanRead. Otherwise, an exception will be thrown when you try to use the stream (possibly much later). Technically, a stream could even change from readable to writable during execution.

Many .net streams change their readability based on arguments given to their constructor. For example, the file stream is not a readable stream and it is not a writable stream. The file stream is a "readable and/or writable" stream (depending on the open type specified). Similarly, the compression streams are "readable xor writable" streams (when compressing you may only write and when decompressing you may only read).

Because of this type flexibility, many of the stream classes have too many responsibilities. Poor DeflateStream has to both compress and decompress data, even though those operations have very different behaviours. Similarly, FileStream has to enforce read/write permissions at runtime instead of just relying on the type system.

Problem #3: Redundant Redundancy

First, streams have a Position property and a Seek method and both have to be implemented, despite the fact that Seek can be implemented entirely in terms of Position and Length. Every time you implement a seekable stream, you re-implement Seek exactly the same way as last time. What a waste of code, and what a great way to introduce blind copy-paste bugs.

Second, streams have an overridable Close method and a Dispose method. Here are their implementations in IO.Stream according to Reflector:

public virtual void Close() {
this.Dispose(true);
GC.SuppressFinalize(this);
}
public void Dispose() {
this.Close();
}

I think they assumed programmers would find it easier to understanding "closing" a stream as opposed to "disposing" a stream. This was a bad idea. Not only does it create confusion (is disposing equivalent to closing?) and add cruft, it implies you can later re-"open" the stream (you can't). Actually, this issue was mentioned in a Microsoft channel 9 show about the upcoming Linq-to-Events/Reactive-framework (as one of the reasons they decided to only use Dispose).

Problem #4: Not an Interface

Extension methods didn't exist when the stream class was designed and implemented, so the only way to include derived methods (eg. ReadByte uses Read) 'within' a type was an overridable class (I am guessing at Microsoft's reasoning, of course).

I don't really fault Microsoft for this problem: they probably chose the best option at the time. However, times have changed and there are downsides to using classes instead of interfaces. Classes don't allow multiple inheritance, which is probably why there is a single stream type instead of multiple (eg. ReadableStream, WriteableStream). It also means implementers can't be derived from a non-stream base class, which is annoying because sometimes "oh, and you can use it like a stream" is a feature you want to tack onto an existing class.

Proposed Solution

Split IO.Stream's various responsibilities into interfaces:

Public Interface IReadableStream
Inherits IDisposable
Function Read(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer) As Integer
End Interface
Public Interface IWritableStream
Inherits IDisposable
Sub Write(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer)
Sub Flush()
End Interface
Public Interface ISeekableStream
Inherits IDisposable
Property Position As Long
Property Length As Long
End Interface

IO.Stream can implement the interfaces for forwards compatibility. Extension methods, which wrap stream interface instances inside an IO.Stream, provide backwards compatibility. Extension methods can implement the derived methods from IO.Stream:

<Extension()>
Public Function ReadByte(ByVal this As IReadableStream) As Integer
If this Is Nothing Then Throw New ArgumentNullException("this")
Dim buffer(0 To 0) As Byte
Dim n = this.Read(buffer, 0, 1)
If n = 0 Then Return -1
Return buffer(0)
End Function

Eventually, IO.Stream can be deprecated.

Furthermore, define interfaces for combinations of responsibilities:

Public Interface IRereadableStream
Inherits IReadableStream
Inherits ISeekableStream
End Interface
Public Interface IRewritableStream
Inherits IWritableStream
Inherits ISeekableStream
End Interface
...


The result is the same functionality we have now, except type safe. You no longer need to check if a stream is readable, you no longer need to throw NotSupportExceptions, and you no longer need to write blog posts about how bad streams are.

Tuesday, September 15, 2009

Exceptions to Die By - The InvalidStateException

Introduction

When your program fails, you want to throw the right type of exception. You want to provide useful information in the type so the caller can react appropriately to the problem. Lack of type information is one reason why throwing an exception of type Exception is such a bad idea. The .net base class library includes exceptions for most common cases (ArgumentException, InvalidOperationException, IOException, etc.). This post covers a class of custom exceptions I like to use in my own projects, which I call invalid state exceptions.

Description

Most bugs immediately cause an exception to be thrown: if you try to pass null into a non-null method you get an ArgumentNullException, and if you try to pop an empty stack and you get an InvalidOperationException. But the difficult bugs don't throw exceptions immediately: they occur, corrupt the program state, and at some later point this causes an exception to be thrown.

Enter the InvalidStateException. An invalid state exception occurs when the program fails because a state-corrupting bug has already occurred. For example, a list enumeration failing because the list was modified should be an InvalidStateException instead of an InvalidOperationException.

'''<summary>Indicates an invalid program state has been detected (eg. due to a fault).</summary>
Public Class InvalidStateException
Inherits InvalidOperationException
Public Sub New(Optional ByVal message As String = Nothing,
Optional ByVal innerException As Exception = Nothing)
MyBase.New(If(message, "Reached an unexpected state."), innerException)
End Sub
End Class

Invalid state exceptions are generally irrecoverable because it is already too late to fix the problem. They tend to indicate bugs in the interaction of multiple components, and are usually thrown by assumption-checking code. I have found that adding this exception type has increased the amount of assumption checking code I write, which is definitely a good thing.

Derived Types

Usually we know more than just "the program is in an invalid state" when throwing an invalid state exception. The most common assumption I check is that a switch statement hits a defined case, so I have a derived type called ImpossibleValueException. Here are the three derived types I have used:

1. UnreachableException: I usually use this one to silence the compiler's reachability analyzer or to indicate to readers that a case is impossible.


'''<summary>Indicates a path expected to be unreachable has been executed.</summary>
Public Class UnreachableException
Inherits InvalidStateException
Public Sub New(Optional ByVal message As String = Nothing,
Optional ByVal innerException As Exception = Nothing)
MyBase.New(If(message, "Reached a state which was expected to not be reachable."), innerException)
End Sub
End Class

2. ImpossibleValueException: I often throw this one in the default cases of switch statements over enums. I even defined an extension method for constructing it from any value.


'''<summary>Indicates an internal value expected to be impossible has been encountered.</summary>
Public Class ImpossibleValueException(Of T)
Inherits InvalidStateException
Public ReadOnly Value As T
Public Sub New(ByVal value As T,
Optional ByVal message As String = Nothing,
Optional ByVal innerException As Exception = Nothing)
MyBase.new(If(message, "The {0} value ""{1}"" was not expected.".Frmt(GetType(T).Name, String.Concat(value))),
innerException)
Me.Value = value
End Sub
End Class

3. InfiniteLoopException: I defined this one when making a single linked list iterator with cycle detection.


'''<summary>Indicates the program entered an infinite loop but the problem was caught and the loop aborted.</summary>
Public Class InfiniteLoopException
Inherits InvalidStateException
Public Sub New(Optional ByVal message As String = Nothing,
Optional ByVal innerException As Exception = Nothing)
MyBase.New(If(message, "Detected and aborted an infinite loop."), innerException)
End Sub
End Class

Conclusion

The invalid state exception is not well covered by any of the default exception types, which is why I defined it. Once defined, I found myself using it in all kinds of (appropriate!) places. The moral: don't be afraid to define generic exceptions when you find yourself shoehorning a weird exception into the existing types. (Just don't go crazy.)

Concurrency #4: Using the Call Queue

Introduction

Now that we have implemented a lock free queue, futures, and a call queue, we will start using them to perform concurrent programming. For example, consider a bank account class synchronized using locks:

Public Class BankAccount
Private balance As UInteger
Private ReadOnly lock As New Object()
Public Function Deposit(ByVal value As UInteger) As UInteger
SyncLock lock
balance += value
Return balance
End SyncLock
End Function
Public Function TryWithdraw(ByVal value As UInteger) As UInteger?
SyncLock lock
If balance < style="color: blue;">Then Return Nothing
balance -= value
Return balance
End SyncLock
End Function
End Class

Rewriting the class to use queues and futures is dead simple, because the locking is at the method level. Just replace SyncLock blocks with QueueFunc blocks.

Public Class BankAccount
Private balance As UInteger
Private queue As ICallQueue = New ThreadPooledCallQueue()
Public Function FutureDeposit(ByVal value As UInteger) As IFuture(Of UInteger)
Return queue.QueueFunc(
Function()
balance += value
Return balance
End Function)
End Function
Public Function FutureTryWithdraw(ByVal value As UInteger) As IFuture(Of UInteger?)
Return queue.QueueFunc(
Function() As UInteger?
If balance < style="color: blue;">Then Return Nothing
balance -= value
Return balance
End Function)
End Function
End Class

Message Passing

Look closely at the bank account example. Calling FutureTryWithdraw is like sending a message to the bank account object (saying "please try to withdraw X dollars"), and the future returned by the method is like a reply saying either "Ok, you now have X dollars!" or "Not enough funds!". We've implemented a form of message passing, which is what highly concurrent languages like Erlang are based on.

Message passing is not the holy grail of concurrency, but it is an extremely powerful abstraction. For example, it would be much easier to message-pass to a remote machine than do some type of lock sharing. Message passing also makes it impossible to cause a deadlock (the closest analogue is creating a never-ending message cycle; much easier to debug). Most importantly, it is easier to reason about messages than locks. Proving a lock is used correctly requires looking at the entire program. Proving a message is used correctly only requires looking at part of the program (usually one or two functions).

Messages are not superior to locks in every way, of course. Different methods work better in different situations. For example, there is no message-passing analogue for holding two locks at the same time. In addition, it is natural to lock only part a method but it is not always clear what that would mean in terms of passing messages.

Those Darn Philosophers

I'm going to finish with an implementation of the dining philosophers done with message passing. Put it in a command line project (alongside the other classes introduced in this series). Notice how each class is simple on its own. In fact, the most complex parts are related to writing readable information to the console! In the next post in this series, I will supply implementations of futures for various asynchronous tasks.

Public Module Main
Public Sub Main()
Const NumPhilosphers As Integer = 5

'Setup table
Dim chopSticks(0 To NumPhilosphers - 1) As Chopstick
Dim philosophers(0 To NumPhilosphers - 1) As DiningPhilosopher
For i = 0 To NumPhilosphers - 1
chopSticks(i) = New Chopstick()
Next i
For i = 0 To NumPhilosphers - 1
philosophers(i) = New DiningPhilosopher(New Random(Environment.TickCount + i), chopSticks(i), chopSticks((i + 1) Mod NumPhilosphers))
Next i

'Give a bit of stabilizing time for measurements
Threading.Thread.Sleep(5000)

'Print status updates
Dim n = 0
Do
n += 1
System.Console.WriteLine("Update #" + n.ToString())
For i = 0 To NumPhilosphers - 1
System.Console.WriteLine("Philosopher #" + i.ToString() + " is " + philosophers(i).Status + ".")
Next i
Threading.Thread.Sleep(1000)
Loop
End Sub
End Module

'''Represents a shared resource.
Public Class Chopstick
Public queue As ICallQueue = New ThreadPooledCallQueue()
Private inUse As Boolean

Public Function QueueTryPickUp() As IFuture(Of HeldChopStick)
Return queue.QueueFunc(Function()
If inUse Then Return Nothing
inUse = True
Return New HeldChopStick(AddressOf QueuePutDown)
End Function)
End Function
Private Function QueuePutDown() As IFuture
Return queue.QueueAction(Sub()
inUse = False
End Sub)
End Function
End Class

'''Represents an acquired resource.
Public Class HeldChopStick
Private wasDropped As Boolean
Private ReadOnly drop As Func(Of IFuture)
Public Sub New(ByVal drop As Func(Of IFuture))
Me.drop = drop
End Sub
Public Function PutDown() As IFuture
If wasDropped Then Throw New InvalidOperationException()
wasDropped = True
Return drop()
End Function
End Class

'''Represents a philosopher trying to share chopsticks for eating.
Public Class DiningPhilosopher
Private ReadOnly leftChopStick As Chopstick
Private ReadOnly rightChopStick As Chopstick
Private feedings As Integer
Private lastTime As Date
Private weightedRate As Double

Public Sub New(ByVal rng As Random, ByVal leftChopStick As Chopstick, ByVal rightChopStick As Chopstick)
Me.lastTime = Now
Me.leftChopStick = leftChopStick
Me.rightChopStick = rightChopStick
Call TryGrabChopsticks(rng)
End Sub

'This method isn't thread-safe, but the errors will be marginal [race condition on increments and resets of 'feeding']
Public ReadOnly Property Status As String
Get
'Measure
Dim t = Now
Dim dt = t - lastTime
Dim rate = feedings / dt.TotalSeconds

'Estimate
If weightedRate = 0 Then
weightedRate = rate
Else
'Exponential approach w.r.t. time
Dim lambda = 0.9 ^ dt.TotalSeconds
weightedRate *= lambda
weightedRate += rate * (1 - lambda)
End If

'Categorize
Dim category = ""
Select Case weightedRate
Case 0 : category = "Empty"
Case 0 To 1 : category = "Starving"
Case 1 To 2 : category = "Hungry"
Case 2 To 3 : category = "Satisfied"
Case Else : category = "Plump"
End Select

'Finish
lastTime = t
feedings = 0
Return category + " (~" + weightedRate.ToString("0.00") + "/s)"
End Get
End Property

Private Sub TryGrabChopsticks(ByVal rng As Random)
'Try to grab both chopsticks
Dim futureLeft = leftChopStick.QueueTryPickUp()
Dim futureRight = rightChopStick.QueueTryPickUp()

'React once both chopsticks have been attempted
futureLeft.CallWhenValueReady(
Sub(left) futureRight.CallWhenValueReady(
Sub(right)
'Call on a new thread because there are sleeps
Call New Threading.Thread(Sub() TryHoldChopsticks(rng, left, right)).Start()
End Sub))
End Sub
Private Sub TryHoldChopsticks(ByVal rng As Random, ByVal left As HeldChopStick, ByVal right As HeldChopStick)
If right Is Nothing OrElse left Is Nothing Then
'Oh no! Put down and wait a bit before trying again.
If left IsNot Nothing Then left.PutDown()
If right IsNot Nothing Then right.PutDown()
Threading.Thread.Sleep(10 + rng.Next(10))
Else
'Yay! Food!
feedings += 1
Threading.Thread.Sleep(50 + rng.Next(50))
left.PutDown()
right.PutDown()

'Snooze
Threading.Thread.Sleep(50 + rng.Next(50))
End If

'Hungry! Repeat!
Call TryGrabChopsticks(rng)
End Sub
End Class