Aspects and Garbage Collection

Earlier today I wrote about an Assert method which can be used in unit tests to check whether or not an object is collected during Garbage Collection (GC). I wrote this method because I was suspicious about a certain PostSharp Aspect I just implemented. The aspect works great, and I’ll surely write about it later; however, since it’s an aspect which is meant to be applied on an entire assembly by using multicasting, I thought twice before calling it a day.

After being skeptical and writing unit tests to ensure my IsGarbageCollected() method was actually working as expected, I came to the unfortunate conclusion that my aspects were keeping my objects in memory!

It only makes sense really; why wouldn’t aspects be able to keep objects alive? It might not be the first thing you think about, but hopefully this post makes you aware of the hidden dangers. As an example I’ll show you a very basic aspect, and how to fix it.

[Serializable]
public class SampleAspect : IInstanceScopedAspect
{
	[NonSerialized]
	object _instance;

	public object CreateInstance( AdviceArgs adviceArgs )
	{
		_instance = adviceArgs.Instance;
		return MemberwiseClone();
	}

	public void RuntimeInitializeInstance()	{ }
}

As with any other memory leak, it suffices to leave a single reference to an object lying around to prevent the GC from collecting it. There are two mistakes in the previous code; can you find them?

The first one was most obvious to me. _instance is holding a reference to the object, keeping it alive. Luckily .NET offers us an easy solution, WeakReference. A WeakReference references an object while still allowing that object to be reclaimed by garbage collection.

[Serializable]
public class SampleAspect : IInstanceScopedAspect
{
	[NonSerialized]
	WeakReference _instance;

	public object CreateInstance( AdviceArgs adviceArgs )
	{
		_instance = new WeakReference( adviceArgs.Instance );
		return MemberwiseClone();
	}

	public void RuntimeInitializeInstance()	{ }
}

However, Gael was kind enough to quickly point out this wasn’t the real issue. PostSharp uses prototypes to initialize its aspects. An existing prototype object is cloned to create new aspects. The CreateInstance() method is called on the prototype object, and you are expected to return a new aspect. The real error in my code was I was setting _instance on the prototype, while I should be setting it on the cloned object instead. The prototype object is kept in memory by PostSharp, and as a result the last created object was also held in memory. Not a big issue, but the following implementation is a bit cleaner.

[Serializable]
public class SampleAspect : IInstanceScopedAspect
{
	[NonSerialized]
	object _instance;

	public object CreateInstance( AdviceArgs adviceArgs )
	{
		var newAspect = (SampleAspect)MemberwiseClone();
		newAspect._instance = adviceArgs.Instance;

		return newAspect;
	}

	public void RuntimeInitializeInstance()	{ }
}

No need to use WeakReference any longer, although I imagine it might still be useful in other scenarios.

Garbage Collection Unit Test

Unit tests are useful to guarantee code works as expected. Today my spidey sense told me I might have caused a memory leak. Let’s not consider unmanaged code for now, but wouldn’t it be great if you could write a unit test to prove that a particular object can be garbage collected? I found some interesting ideas online, but no reusable solution. What follows is my attempt at an Assert method which can be used during unit testing.

object empty = new object();
AssertHelper.IsGarbageCollected( ref empty );

It works by using a WeakReference, which references an object while still allowing that object to be reclaimed by garbage collection. The object to test is passed using the ref keyword, which allows the Assert method to set it to null. This allows it to be garbage collected.

public static void IsGarbageCollected<TObject>( ref TObject @object )
	where TObject : class
{
	Action<TObject> emptyAction = o => { };
	IsGarbageCollected( ref @object, emptyAction );
}

public static void IsGarbageCollected<TObject>( ref TObject @object, Action<TObject> useObject )
	where TObject : class
{
	if ( typeof( TObject ) == typeof( string ) )
	{
		// Strings can be interned, in which case they are never collected; skip the check.
		return;
	}

	int generation = GC.GetGeneration( @object );
	useObject( @object );
	WeakReference reference = new WeakReference( @object );
	@object = null;

	GC.Collect( generation, GCCollectionMode.Forced );
	GC.WaitForPendingFinalizers();

	Assert.IsFalse( reference.IsAlive );
}

As it turns out I did introduce a memory leak in the code I wanted to test. 😦 At least I know about it now! As usual, I added the full code and some unit tests to my FCL library.

WARNING: Due to differences when using the Release configuration (I assume compiler optimizations), some of these unit tests will fail. Be sure to run them in Debug!
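The failures presumably relate to how the JIT tracks the lifetime of local references: depending on optimization settings, a local can keep its object alive until the end of the method, or be considered dead as soon as it is last used. The following standalone sketch (a made-up example, not part of the library) shows the core mechanism the assert relies on; the non-inlined helper guarantees no local strong reference survives into Main.

```csharp
using System;
using System.Runtime.CompilerServices;

class Program
{
	// NoInlining ensures the temporary strong reference only ever
	// exists inside this method's stack frame.
	[MethodImpl( MethodImplOptions.NoInlining )]
	static WeakReference Allocate()
	{
		return new WeakReference( new object() );
	}

	static void Main()
	{
		WeakReference reference = Allocate();

		GC.Collect();
		GC.WaitForPendingFinalizers();

		// Once the object is collected, the weak reference reports it is gone.
		Console.WriteLine( reference.IsAlive );
	}
}
```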

Massive-scale Online Software Development

Cutting right to the chase: Would it be possible to create software, entirely developed and moderated by an open community?

Call it democratic software development, or open source on steroids if you will. While discussing this the default answer I usually get is “it can’t be done”, which is why I gladly filed this post under the newly created category “Crazy Ideas”. Nevertheless, I find it a valuable exercise to discuss any nutcase ideas, in order to evaluate how far-fetched they actually are.

A person with a new idea is a crank until the idea succeeds. – Mark Twain

So what would such a system look like? What would be some of the requirements?

Easily accessible.
And when I say easy, I mean it. You shouldn’t have to download the repository first. You shouldn’t have to set up a development environment. It should run as a web service in the cloud (more buzzwords coming up!). A user account and an internet connection are all you need to get going.

Motivate people to participate.
Ever heard of gamification? “Gamification is the use of game design techniques and mechanics to solve problems and engage audiences.” If you are a software developer, chances are you ended up on Stack Overflow at some point. It’s a Q&A site for programmers which is quickly becoming one of the main help resources for professional programmers. Stack Overflow incorporates many aspects of gamification, and its mere existence shows their power. A significant number of developers are prepared to share and learn in this fun environment. Quality content is pushed to the top via a voting system, while erroneous posts are addressed by the community.

Divide work in small enough tasks.
The key to dividing work across many people is to divide it in such a way that any person only has to implement one small aspect of it at a time. Traditional software development, where somebody develops a feature from A to Z, won’t work. One programming paradigm which at first sight seems extremely suitable for this is functional programming. A person could implement functions, and define new functions on which they rely. Combining this with aspects of test-driven development, where the caller has to document and write tests for the desired function, would result in automated testing.

Without worrying about the specifics too much (it’s just a nutcase idea after all), consider what the possibilities would be. In Luis von Ahn’s great TED talk, the CAPTCHA inventor discusses how he repurposed CAPTCHA in order to digitize books. Around 2.5 million books a year can be digitized through this massive-scale collaborative effort. Their next project indicates this doesn’t have to be limited to really mundane tasks: they are now working on translating the web!

Moderation guided by conventions.
Conventions are important in a group effort. Unfortunately, when discussing programming conventions people most often discuss naming and formatting conventions, while there are plenty of other important conventions to agree on. This will most likely be the topic of one of my future posts. Conventions should be as unambiguous as possible in order to know where to expect a certain piece of code, or where to place it. Conventions like these could be agreed upon through a democratic process, which seems to be working pretty well for Stack Overflow through its meta site. This allows for community moderation following the guidelines established by the community.

Couple all the separate work together into one entity.
Going from a set of loosely coupled functions to a working library would result in plenty of extra challenges, but also opportunities. Since nobody wants an all-encompassing library just to use part of its functionality, the system should allow you to extract just the functionality you are interested in.

Beyond the idea

Well, … I went a bit further and attempted to start a small proof of concept. I figured the Stack Exchange platform on which Stack Overflow runs already encompasses much of the desired functionality, and creating a small-scale library on it would be possible. The idea was to create an extension method library for C#, which consists primarily of a set of functions. Requesting new Stack Exchange sites is possible through Area 51. Not unexpectedly, my idea got shot down since it doesn’t fit the intended Q&A format. Oh well, … one can only try.

UPDATE:

At the UIST 2014 conference, Thomas D. LaToza presented “Microtask Programming: Building Software with a Crowd”, a system encompassing many of these ideas and actually evaluating them, resulting in usable code. The paper is available on ACM.

List of Tuples

I still had a few posts which I was planning to write first, but the simplicity and usefulness of the following discovery prompted me to let it cut in line in my post queue.

.NET 4.0 introduced a set of very interesting new data structures, called tuples. They allow you to easily group a set of strongly typed variables together, without having to create a struct or class for it. A set of factory methods allows you to instantiate them with ease.

// The following is a Tuple<int, string>.
var data = Tuple.Create( 1, "apple" );

Now how do you go about creating a list of them? Initially I attempted an approach as follows, which made my DRY brain hurt.

var groceryList = new List<Tuple<int, string>>
{
    Tuple.Create( 1, "kiwi" ),
    Tuple.Create( 5, "apples" ),
    Tuple.Create( 3, "potatoes" ),
    Tuple.Create( 1, "tomato" )
};

When Dictionary has such a nice initializer, why can’t we? Well, we can!

The { } syntax of the collection initializer works on any IEnumerable type which has an Add method with the correct number of arguments. Without bothering with how that works under the covers, this means you can simply extend from List, add a custom Add method to initialize your T, and you are done! Here we go …

public class TupleList<T1, T2> : List<Tuple<T1, T2>>
{
    public void Add( T1 item1, T2 item2 )
    {
        Add( Tuple.Create( item1, item2 ) );
    }
}

… and you have your easily initializable list of tuples!

var groceryList = new TupleList<int, string>
{
    { 1, "kiwi" },
    { 5, "apples" },
    { 3, "potatoes" },
    { 1, "tomato" }
};
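The same trick scales to more type parameters. A hypothetical three-parameter variant (not part of the original code) looks like this:

```csharp
public class TupleList<T1, T2, T3> : List<Tuple<T1, T2, T3>>
{
    public void Add( T1 item1, T2 item2, T3 item3 )
    {
        Add( Tuple.Create( item1, item2, item3 ) );
    }
}

var stock = new TupleList<int, string, double>
{
    { 1, "kiwi", 0.80 },
    { 5, "apples", 2.50 }
};
```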

If that doesn’t make you want to scroll over your existing codebase and start refactoring, this blog probably isn’t for you. 🙂

Attribute metabehavior

Attributes in .NET can be used to add metadata to language elements. In combination with reflection they can be put to use in plenty of powerful scenarios. A separate parser can process the added data and act upon the annotations in any way desired. A common example of this is a serializer which knows how to serialize an instance of a class based on attributes applied to its members. E.g. a NonSerialized attribute indicates the member shouldn’t be serialized.
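To make that concrete, a tiny parser could look as follows. The Person class and Ignore attribute are made up for the example; the parser simply reflects over the properties and skips those carrying the annotation.

```csharp
[AttributeUsage( AttributeTargets.Property )]
public class IgnoreAttribute : Attribute { }

public class Person
{
	public string Name { get; set; }

	[Ignore]
	public string Cached { get; set; }
}

// The 'parser': select only properties which aren't annotated.
IEnumerable<string> toSerialize = typeof( Person )
	.GetProperties()
	.Where( p => !p.IsDefined( typeof( IgnoreAttribute ), false ) )
	.Select( p => p.Name );
// Results in just "Name".
```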

In my previous post I promised I would show some interesting scenarios where the mandatory enum values in my solution, needed to identify different dependency properties (DPs), can be put to good use. It is not required to grasp the entire subject previously discussed, but it does help to know about DPs and their capabilities. Simply put, they are special properties (used by XAML) which can notify whenever they change, allowing other data to be bound to them. They can also be validated (checking whether a value assigned to them is valid or not) and coerced (adjusting the value so it satisfies a certain condition). I consider these capabilities to be metadata, describing particular behavior associated with a given property. After extensive tinkering I found a way to express this behavior in an attribute, but it wasn’t easy. Ordinarily, WPF requires you to pass along callback methods which implement the desired behavior when ‘registering’ the DPs.

Two solutions can be considered:

  1. The parser knows how to interpret the metadata and acts accordingly.
  2. The attribute implements the actual behavior itself, and is simply called by the parser.

Solution 1 is the easiest to implement, and is how attributes are used most often. Solution 2 however has a great added benefit: it allows you to apply the strategy pattern. Unlike solution 1, the parser doesn’t need to know about the specific implementation, but only the interface. Additional behaviors can be implemented and used without having to modify the parser. In contrast to simple metadata, an actual behavior is attached, hence I am dubbing this metabehavior. I will discuss the hoops you have to jump through to achieve this in a next post. For now, consider the following examples to see how it could be used.
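A bare-bones illustration of solution 2, outside of WPF: the attribute itself implements a handler interface, and the parser only ever talks to that interface. All names below are made up for the example; the actual WPF implementation is considerably more involved.

```csharp
public interface IValidationHandler
{
	bool IsValid( object value );
}

[AttributeUsage( AttributeTargets.Property )]
public class RegexValidationAttribute : Attribute, IValidationHandler
{
	readonly string _pattern;

	public RegexValidationAttribute( string pattern )
	{
		_pattern = pattern;
	}

	// The behavior lives inside the attribute itself.
	public bool IsValid( object value )
	{
		return Regex.IsMatch( (string)value, _pattern );
	}
}

public class Model
{
	[RegexValidation( "^test" )]
	public string Field { get; set; }
}

// The parser doesn't know about RegexValidationAttribute, only about the interface.
var handler = typeof( Model ).GetProperty( "Field" )
	.GetCustomAttributes( false )
	.OfType<IValidationHandler>()
	.Single();
bool valid = handler.IsValid( "testing" ); // true
```

New behaviors can thus be added by writing new attributes implementing the interface, without touching the parser.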

A regular expression validation of a dependency property can be used as follows:

[DependencyProperty( Property.RegexValidation, DefaultValue = "test" )]
[RegexValidation( "test" )]
public string RegexValidation { get; set; }

Coercing a value within a certain range, defined by two other properties can be used as follows:

[DependencyProperty( Property.StartRange, DefaultValue = 0 )]
public int StartRange { get; set; }

[DependencyProperty( Property.EndRange, DefaultValue = 100 )]
public int EndRange { get; set; }

[DependencyProperty( Property.CurrentValue )]
[RangeCoercion( typeof( int ), Property.StartRange, Property.EndRange )]
public int CurrentValue { get; set; }

The greatest thing about all this is that new behaviors can be implemented by extending from ValidationHandlerAttribute and CoercionHandlerAttribute respectively, albeit with some added complexities due to the limitations of attributes, which I will discuss later.

Aspect powered WPF

In some of my first posts – as it turns out I can almost celebrate one year of blogging! – I wrote about a factory approach to solve the complexities and overhead of creating dependency properties (DPs) for a WPF control, and notify properties and commands for a viewmodel. Both are essential when using WPF and following the MVVM pattern. To summarize: the approach was basically to delegate the creation of these complex components to a factory contained within the class, using annotations on fields (attributes) to clarify what should be created. To create DPs, this looked as follows:

public class HelloWorldControl : UserControl
{
    [Flags]
    public enum Properties
    {
        Hello,
        World
    }

    static readonly DependencyPropertyFactory<Properties> PropertyFactory
        = new DependencyPropertyFactory<Properties>( false );

    [DependencyProperty( Properties.Hello )]
    public string Hello
    {
        get { return (string)PropertyFactory.GetValue( this, Properties.Hello ); }
        set { PropertyFactory.SetValue( this, Properties.Hello, value ); }
    }

    [DependencyProperty( Properties.World )]
    public string World
    {
        get { return (string)PropertyFactory.GetValue( this, Properties.World ); }
        set { PropertyFactory.SetValue( this, Properties.World, value ); }
    }
}

Nice and concise, but as mentioned in my conclusion there is still room for improvement:

  • Requirement of an enum.
  • Having to manually add the factory to the class.
  • Every property has almost the exact same implementation.

As it turns out, PostSharp is the perfect candidate to leverage the concept of Aspect Oriented Programming (AOP) to solve the last two problems, resulting in the following solution:

[WpfControl( typeof( Properties ) )]
public class HelloWorldControl : UserControl
{
    [Flags]
    public enum Properties
    {
        Hello,
        World
    }

    [DependencyProperty( Properties.Hello )]
    public string Hello { get; set; }

    [DependencyProperty( Properties.World )]
    public string World { get; set; }
}

Where does the magic happen?

Simply put, PostSharp does a post-compilation step, inserting the required code where needed. Which code needs to be inserted where is determined by applying aspects (exposed as attributes) to the relevant elements. The WpfControlAttribute in the example above applies an aspect to HelloWorldControl. PostSharp makes it relatively easy by allowing you to implement these aspects by using plain C# classes. It’s a mature framework with an extensive feature set as will become evident from the following more in-depth description.

A core feature used in the implementation is the ability to dynamically ‘provide’ aspects to the elements of your choice. The solution is actually composed of two aspects, WpfControlAspect and DependencyPropertyAspect, applied to the user control and the dependency properties respectively.

  1. The WpfControlAttribute is actually an IAspectProvider, creating a WpfControlAspect and applying it to its target (HelloWorldControl in the example). This extra step is only required for generic aspects as a workaround since C# doesn’t support generic attributes. The generic type arguments are passed as an argument to the attribute, and reflection is used to instantiate the actual generic aspect.
  2. The WpfControlAspect introduces the factory into the user control, and in its turn provides a DependencyPropertyAspect to all members which have the DependencyPropertyAttribute applied to them.
  3. The aspects applied to the properties create the correct call to the DependencyPropertyFactory, where the actual logic is implemented.
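The workaround in step 1 boils down to closing an open generic type at runtime. Using List<> as a stand-in for the actual generic aspect class, the reflection involved looks roughly like this:

```csharp
// The attribute receives the type argument, since C# attributes can't be generic.
Type genericTypeArgument = typeof( int );

// Close the open generic type and instantiate it.
Type openAspectType = typeof( List<> ); // Stand-in for the actual generic aspect.
Type closedAspectType = openAspectType.MakeGenericType( genericTypeArgument );
object aspect = Activator.CreateInstance( closedAspectType );
// aspect is now a List<int>; likewise, the real code ends up with
// a WpfControlAspect closed over the passed enum type.
```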

This solution represents a personal guideline I prefer to follow when writing aspects.

Only resort to using aspects after applying encapsulation to its fullest extent.

It’s a gut feeling I have, and I can’t quite formulate thorough arguments for it yet. When I can, I’ll be sure to write about it! Perhaps the main advantage is you are less dependent on aspects, and let’s face it, minimizing dependencies is almost always a good thing. Basically I prefer to rely on .NET and its excellent debugging and testing tools as much as possible, and only use PostSharp where .NET fails to satisfy my DRY needs.

Why still use enums?

As explained before, they are a necessary means to identify the associated element which is created. They aren’t all that bad. As I will demonstrate in a next post, the enums can even be put to use in some interesting use cases. If you can’t wait, all source code and a particularly interesting unit test is already available for you to try out. If you haven’t installed PostSharp 2.1 RC2 or higher yet, you will need it! Actually I already discussed another use case in a previous post: binding from XAML to a command in the viewmodel by using a custom ‘CommandBinding‘ markup extension and passing the enum as a parameter.

It could be worthwhile to investigate an opt-in approach where only those elements which need to be identified from elsewhere need to be assigned an ID. At first sight it looks like the limitations for attribute arguments are a big deal breaker.

You’re breaking the conventions!

The only convention I am still breaking is that I’m not adding the DependencyProperty fields. As discussed before, I see no reason why these are required. The sole stated problem is that tools could rely on this convention to access the properties. Opening a simple user control enhanced with aspect goodness shows that both Blend and the Visual Studio designer work as expected. That’s good enough for me!

Casting to less generic types

… because yes, there are valid use cases for it! Finally I found the time to write some unit tests, followed by fixing the remaining bugs, and am now excited to report the result, which effectively allows you to break type safety for generic interfaces, if you so please. Previously I discussed the variance limitations for generics, and how to create delegates which overcome those limitations. Along the same lines I will now demonstrate how to overcome those limitations for entire generic interfaces.

Consider the following interface which is used to check whether a certain value is valid or not.

public interface IValidation<in T>
{
    bool IsValid( T value );
}

Notice how T is made contravariant by using the in keyword? Thanks to contravariance, a validator for a given type can also be used as a validator for any type extending it. This recent feature of C# won’t help a bit in the following scenario however. During reflection you can only use this interface when you know the complete type, including the generic type parameters. In order to support any type, you would have to check for every possible type and cast to the correct corresponding interface.

object validator;  // An object known to implement IValidation
object toValidate; // The object which can be validated by using the validator.

if ( validator is IValidation<string> )
{
    IValidation<string> validation = (IValidation<string>)validator;
    validation.IsValid( (string)toValidate );
}
else if ( validator is IValidation<int> )
{
    IValidation<int> validation = (IValidation<int>)validator;
    validation.IsValid( (int)toValidate );
}
else if ...

Hardly entertaining, nor maintainable. What we actually need is covariance in T, or at least something that looks like it. We want to treat a more concrete IValidation<T> as IValidation<object>. For perfectly good reasons covariance is only possible when the type parameter is only used as output, otherwise objects of the wrong type could be passed. In the given scenario however, where we know only the correct type will ever be passed, this shouldn’t be a problem.

The solution is using a proxy class which implements our less generic interface and delegates all calls to the actual instance, doing the required casts where necessary.

public class LessGenericProxy : IValidation<object>
{
    readonly IValidation<string> _toWrap;

    public LessGenericProxy( IValidation<string> toWrap )
    {
        _toWrap = toWrap;
    }

    public bool IsValid( object value )
    {
        return _toWrap.IsValid( (string)value );
    }
}

With the power of Reflection.Emit, such classes can be generated at runtime! RunSharp is a great library which makes writing Emit code feel like writing ordinary C#. It’s relatively easy (compared to using Emit) to generate the proxy class. The end result looks as follows:

object validator;  // An object known to implement IValidation
object toValidate; // The object which can be validated by using the validator.

IValidation<object> validation
    = Proxy.CreateGenericInterfaceWrapper<IValidation<object>>( validator );

validation.IsValid( toValidate ); // This works! No need to know about the type.

// Assuming the validator validates strings, this will throw an InvalidCastException.
//validation.IsValid( 10 );

Of course the proxy should be cached when used multiple times. Originally I also attempted to proxy classes instead of just interfaces by extending from them. This only works properly for virtual methods. Since non-virtual methods can’t be overridden there is no way to redirect the calls to the required inner instance.

Source code can be found in my library: Whathecode.System.Reflection.Emit.Proxy.