Sunday, August 30, 2009

Incremental find with Reactive Framework

This time we will implement "incremental find" using Reactive Framework. This is a very good example showing what the Rx is all about.

Lets say we have a Windows Forms application with a single TextBox. When the user stops typing, the application immediately sends the request to the remote web service to find all words that containt the text from the TextBox. We use Rx to handle two important issues:

  • how to find the moment when user has just stopped typing ?
  • how to ensure that the correct results will be displayed when calling web service is implemented as asynchronous operation (results for the most recent input should discard responses for all previous requests) ?

For the sake of simplicity we simulate calling a web service by a simple asynchronous operation returning results after 3 or 6 seconds. This allows us to easily check whether the correct results are displayed every time. Just type 'iobserv' then wait for about 2 seconds (user stopped writing) and append letter 'e'. After about 3 second the results for 'ibserve' text should be displayed and then never changed.

const string RxOverview =
@"
http://channel9.msdn.com/shows/Going+Deep/Expert-to-Expert-Brian-Beckman-and-Erik-Meijer-Inside-the-NET-Reactive-Framework-Rx/

Now, what is Rx?

The .NET Reactive Framework (Rx) is the mathematical dual of LINQ to Objects. It consists of a pair 
of interfaces IObserver / IObservable that represent push-based, or observable, collections, plus a 
library of extension methods that implement the LINQ Standard Query Operators and other useful 
stream transformation functions.
... 
Observable collections capture the essence of the well-known subject/observer design pattern, and 
are tremendously useful for dealing with event-based and asynchronous programming, i.e. AJAX-style 
applications. For example, here is the prototypical Dictionary Suggest written using LINQ query 
comprehensions over observable collections:

IObservable<Html> q = from fragment in textBox
               from definitions in Dictionary.Lookup(fragment, 10).Until(textBox)
               select definitions.FormatAsHtml();

q.Subscribe(suggestions => { div.InnerHtml = suggestions; })
";

var textBox = new TextBox();
var label = new Label
                {
                    Text = "results...",
                    Location = new Point(0, 40),
                    Size = new Size(300, 500),
                    BorderStyle = BorderStyle.FixedSingle,                                
                };
var form = new Form { Controls = { textBox, label } };

Func<string, IObservable<string[]>> search = (s) =>
{
    var subject = new Subject<string[]>();

    ThreadPool.QueueUserWorkItem((w) =>
    {
        Thread.Sleep(s.Length % 2 == 0 ? 3000 : 6000);

        var result = RxOverview.
            Split(new[] { " ", "\n", "\t", "\r" }, StringSplitOptions.RemoveEmptyEntries).
            Where(t => t.ToLower().Contains(s.ToLower())).
            ToArray();

        subject.OnNext(result);
        subject.OnCompleted();
    }, null);

    return subject;
};

IObservable<Event<EventArgs>> textChanged = textBox.GetObservableTextChanged();

var q =
    from e in textChanged
    let text = (e.Sender as TextBox).Text
    from x in Observable.Return(new Unit()).Delay(1000).Until(textChanged)  // first issue
    from results in search(text).Until(textChanged)                         // second issue
    select new { text, results };

var a1 = q.Send(SynchronizationContext.Current).Subscribe(r =>
{
    label.Text = string.Format(" Text: {0}\n Found: {1}\n Results:\n{2}",
        r.text,
        r.results.Length,
        string.Join("\n", r.results));
});


form.ShowDialog();

Now try to implement the same functionality without the Rx Framework.

Sources

"Drag and drop" with Reactive Framework

Just look how easily we can implement "drag and drop" functionality using Reactive Framework.

var form = new Form
{
    Controls =
       {
           new Label {Text = "label1", BorderStyle = BorderStyle.FixedSingle},                       
           new Button {Text = "button1"},
           new Label {Text = "label2", BorderStyle = BorderStyle.FixedSingle},
       }
};

Func<Control, IObservable<Event<MouseEventArgs>>> mouseDown = c => c.GetObservableMouseDown();
Func<Control, IObservable<Event<MouseEventArgs>>> mouseUp = c => c.GetObservableMouseUp();
Func<Control, IObservable<Event<MouseEventArgs>>> mouseMove = c => c.GetObservableMouseMove();

var q =
    from Control con in form.Controls
    select
    (
        from d in mouseDown(con)
        from u in mouseMove(con).Until(mouseUp(con))
        select u
    );

q.Merge().Subscribe(args =>
{
    var control = args.Sender as Control;
    control.Location = Point.Add(control.Location,
        new Size(args.EventArgs.X, args.EventArgs.Y));
});

form.ShowDialog();
Now all controls on the form can be moved from one place to another :)

drag_drop 
Sources.

Inside LINQ queries

This time we will dig a little bit deeper inside LINQ queries. We will see what C# 3.0 compiler is doing behind the scenes when it meets LINQ query in code and how we can use it. Lets look at simple LINQ query:

var q =
    from i in new[] {1, 2, 3, 4, 5, 6}
    where i < 5
    select i.ToString("C");

Compiler handles LINQ query in two phases. Firstly it translates the query into a chain of method calls, each operator in LINQ syntax like where, select, ... is translated respectively into the methods Where, Select, ... .

var q =
    new[] {1, 2, 3, 4, 5, 6}.
    Where(i => i < 5).
    Select(i => i.ToString("C"));

So operator where was translated into Where method and select into Select method. As you can see lambda expressions have been also generated and put as a methods' parameters. The second phase is a discovery phase. It's a phase of looking for the marching methods. In our sample the type "array of int" does not have any instance method named Where so compiler is analyzing suitable extensions methods. The closest extension method with matching parameters is defined in System.Linq.Enumerable class so this is the one that will be used. Where method is a generic method and in our case it returns IEnumerable<int> type. The implementation of Where method is quite simple and looks like this:

public static class Enumerable
{
    public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source,
        Func<TSource, bool> predicate)
    {
        foreach (var i in source)
            if(predicate(i))
                yield return i;
    }        
}

Now the compiler is looking for Select method. Once again type IEnumerable<int> does not contain instance Select method but but an Enumerable class contains matching one. Finally, our query returns type IEnumerable<string> and the code builds succeeded because all necessary methods has been found. If you are interested how more complicated queries are translated and you are using ReSharper, just point your mouse over a LINQ query, press Alt + Enter and choose "Convert LINQ to methods chain" option. ReSharper will automatically generate method chain for you.

Now we can see that LINQ query is just a chain of method calls, nothing more. The real power of LINQ is the implementation of these methods. .Net Framework 3.5 gives us Enumerable which contains about 50 methods with many different overloads such as Where, Select, Join, GoupBy, OrderBy, etc. Almost all methods extend IEnumerable<T> type so they can be used with all types implementing this interface, almost each collection in .Net Framework implementing this interface, which applies to almost all collections in the .NET Framework. I truly encourage you to familiarize with those methods because they are really powerful and only few of them are used in simple from ... where ... select LINQ queries.

Once we know how LINQ queries are handled by C# compiler we are ready to do some tricky stuff.

Lets say we would like to use the second overload of the Where method from Enumerable class inside LINQ query. This overload allows us to use the index of the iterated item and the the implementation looks like this:

public static class Enumerable
{
    public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source,
        Func<TSource, int, bool> predicate)
    {
        int index = 0;
        foreach (var i in source)
            if (predicate(i, index++))
                yield return i;
    }    
}

It means that we can write a condition based on the index of the items stored in queried sequence. Although we cannot extend syntax of the LINQ query but we can extend the IEnumerable<T> type with our own Where method implementation changing the second parameter's type.

static class EnhancedEnumerable
{
    public static IEnumerable<T> Where<T>(this IEnumerable<T> collection,
        Func<T, Func<int, bool>> superPredicate)
    {
        int index = 0;
        foreach (var item in collection)
            if (superPredicate(item).Invoke(index++))
                yield return item;
    }
}

Now we are able to write a query based on the index of an item.

var q =
    from i in new[] { 1, 2, 3, 4, 5, 6 }
    where index  => 
        {
            Console.WriteLine("{0}. {1}", index, i);
            return (index % 2 == 0) && (i < 5);
        }
    select i.ToString("C");

Another interesting thing we can do with LINQ query is that we can provide a set of objects describing some domain with appropriate "LINQ methods" called Where, Select, OrderBy and so on, so that we could write LINQ query over them. This approach give us very strong control over LINQ query syntax because just during compile time we can chose witch LINQ operators we support and which we don't. Lets look at the example.

Lets assume that we have a company and we have a simple application storing data about our employees. The data are stored in files on the disk, each employee is stored in separate file. Additionally files are grouped in folders by some common feature. For example employees working in particular department are placed in the same folder. Lets assume also that the main types describing our domain look like this:

Company

Lets start with a simple LINQ query:

Company company= new Company();
var q =
    from e in company
    select e.Name;

This code will not compile because compiler cannot find matching Select method so lets provide it.

public class Company
{
    public IEnumerable<T> Select<T>(Func<Employee, T> p) { ... }
}

Is it worth to mention that variable e in our query is an instance of Employee type because matched Select method takes an Employee type as a first parameter of lambda expression. Now lets provider support for where operator.

var q =
    from e in company
    where e.Salary > 100
    select e.Name;

public class Company
{
    public Employee Where(Func<Employee, bool> p)) 
}

public abstract class Employee<T>
{
    public IEnumerable<T2> Select<T2>(Func<T, T2> p) { ... }
}

public sealed class Employee : Employee<Employee> { }

As you can see this is a very powerful mechanism which gives us ability to provide only those operators that are supported in our "query language". We could say that it is an alternative mechanism to writing a custom LINQ provider when we want to integrate LINQ queries with our specific domain. LINQ provider is much more flexible during process of analyzing and execution of queries but this approach has one very important feature. We can choose which operators we provide and what parameters the particular operator can take. Our query is validated at compile time, which means that if we don't support for instance sort operator and some query contains that operator, compilation will fail with error "SortBy method can not be found". When writing custom LINQ provider we have to validate LINQ query at runtime ourselves. To make this clear see next queries.

var q =
    from Manager m in company
    select m.ManagedProjects;
    
// this query is translated into: q = company.Cast<Manager>().Select(m => m.ManagedProjects);

public class Company
{
    public T Cast<T>() where T : ICast { ... }
}

public interface ICast
{ }
    
public sealed class Developer : Employee<Developer>, ICast { }
public sealed class Manager : Employee<Manager>, ICast { }

We provide Cast operator for the Company type so we can use Developer or Manager type in from clause (if fact each type implementing ICast). Please notice that the type of variable m is Manager not Employee in our query. Next query will show even more interesting feature.

var q =
   from e in company
   where e.Salary > 100     
   where e.IsManager
   where e.ManagedProjects > 1
   select new { e.Name, e.ManagedProjects };
   
public sealed class Employee
{
    public Manager IsManager { get { return null; } }

    public Manager Where(Func<Employee, Manager> p) { ... }
}  

Here the type of variable e in the first part of the query is Employee, but after line "where e.IsManager" the type of e is Manager. As we saw we may provide support for any subset of LINQ operators but that's not all. We can also provide the particular operator only for specified type, for example "only managers can be sorted" or "we can group employees only by Department or Localization":

var q =
    from Manager m in company                  
    orderby m.ManagedProjects descending, m.Name
    select new { m.Name, m.ManagedProjects };

// q = company.Cast<Manager>().
//    OrderByDescending(m => m.ManagedProjects).ThenBy(m => m.Name).
//    Select(m => new {m.Name, m.ManagedProjects});

public sealed class Manager
{
    public Manager OrderBy<T>(Func<Manager, T> p) { ... }
    public Manager OrderByDescending<T>(Func<Manager, T> p) { ... }
    public Manager ThenBy<T>(Func<Manager, T> p) { ... }
    public Manager ThenByDescending<T>(Func<Manager, T> p) { ... }
}


var q =
    from Developer e in company
    group e by e.Location;
    
// q = company.Cast<Developer>().GroupBy(e => e.Location);

public abstract class Employee<T>    
{
    public Location Location { get; protected set; }
    public Department Department { get; protected set; }
    
    public IEnumerable<Group<Location,T>> GroupBy(Func<T, Location> p) { ... }
    public IEnumerable<Group<Department, T>> GroupBy(Func<T, Department> p) { ... }
}  

public class Group<TKey, T>
{
    public TKey Key { get; private set;}
    public IEnumerable<T> Items { get; private set;}
    
    internal Group(TKey key, IEnumerable<T> items)
    {
        Key = key;
        Items = items;
    }
}

Finally I'd like to show you how smart the C# 3.0 compiler is. When compiler meets a LINQ query with a simple select operator like this:

var q =
    from i in new[] {1, 2, 3, 4, 5, 6}
    where i < 5
    select i;

The small small optimization takes place. Because our query after translation looks like this:

var q =
    new[] {1, 2, 3, 4, 5, 6}.
    Where(i => i < 5).
    Select(i => i);

so probably calling Select method does not change anything (converting i into i) so this call is not generated and the final code looks like this:

var q =
    new[] {1, 2, 3, 4, 5, 6}.
    Where(i => i < 5);  

And of course in this particular case (LINQ to objects) it is true. But if we provide our own Select method returning DateTime type and if we place EnhancedEnumerable class "closer" to the above query (for instance in the same namespace) then our method will be used instead of standard LINQ operator (System.Linq.Enumerable.Where).

static class EnhancedEnumerable
{
    public static DateTime Select<TSource, TResult>(this IEnumerable<TSource> source,
        Func<TSource, TResult> predicate)
    {
        return DateTime.Now;
    }
}

In such case will the optimization take place or not ? Yes, it will. The optimization is performed in the first phase of query translation when we don't know yet which Select method will be used or even if any exits. There are more such optimizations so we should be careful when writing more advanced queries.

Treat this post as a prerequisite for next few posts where we will be talking more about LINQ internals.

The whole "queryable" solution with classes: Company, Employee, Developer and so on can be downloaded from here.

Saturday, August 15, 2009

Generating observable events with T4 template for Reactive Framework

[New version (2011.05.06)]

Reactive Framework (Rx) is what you definitely need to play with. Firstly watch this, this and this, then download Silverlight Toolkit where you can find System.Reactive.dll. If you don't want to use the Silverlight's version of Rx, convert it to the full .Net framework's version this way. Next read this, this, this and this.

Now you are ready to write you own LINQ queries over events :) Lets see a very simple query listening to 3 event of the System.Windows.Forms.Form class.

var form = new Form();

IObservable<Event<KeyEventArgs>> keyDown =
    Observable.FromEvent<KeyEventArgs>(form, "KeyDown");
IObservable<Event<KeyEventArgs>> keyUp =
    Observable.FromEvent<KeyEventArgs>(form, "KeyUp");
IObservable<Event<MouseEventArgs>> mouseMove =
    Observable.FromEvent<MouseEventArgs>(form, "MouseMove");

var downMoveUp =
    from key in keyDown
    from mouse in mouseMove.Until(keyUp)
    select " X = " + mouse.EventArgs.X + " Y = " + mouse.EventArgs.Y;

downMoveUp.Subscribe(x => Console.WriteLine(x));

Console.WriteLine("Press any key and move mouse until key up ... ");

form.ShowDialog();

When you create an event observer you need to put a hardcoded event name and appropriate event arguments type. This is quite uncomfortable, especially when you want to use this event in many different places so lets wrap this code into an extension methods.

public static class FormExtenions
{
    public static IObservable<Event<KeyEventArgs>> GetObservableKeyDown(this Form form)
    {
        return Observable.FromEvent<KeyEventArgs>(form, "KeyDown");
    }
    public static IObservable<Event<KeyEventArgs>> GetObservableKeyUp(this Form form)
    {
        return Observable.FromEvent<KeyEventArgs>(form, "KeyUp");
    }
    public static IObservable<Event<MouseEventArgs>> GetObservableMouseMove(this Form form)
    {
        return Observable.FromEvent<MouseEventArgs>(form, "MouseMove");
    }
}

var keyDown = form.GetObservableKeyDown();
var keyUp = form.GetObservableKeyUp();
var mouseMove = form.GetObservableMouseMove();

Writing extension methods for all events of all controls is obviously a very boring and error-prone task so lets use T4 templates to generate that code.

ObservableEventsT4

This produces:

using System.Linq;
using System.Collections.Generic;

namespace ObservableEvents
{
    public static class ExtensionMethods
    {
        public static IObservable<Event<System.Windows.Forms.ControlEventArgs>> GetObservableControlAdded(this System.Windows.Forms.Control source)
        {
            return Observable.FromEvent<System.Windows.Forms.ControlEventArgs>(source,"ControlAdded" );
        }
        public static IObservable<Event<System.Windows.Forms.ControlEventArgs>> GetObservableControlRemoved(this System.Windows.Forms.Control source)
        {
            return Observable.FromEvent<System.Windows.Forms.ControlEventArgs>(source,"ControlRemoved" );
        }
        public static IObservable<Event<System.Windows.Forms.DragEventArgs>> GetObservableDragDrop(this System.Windows.Forms.Control source)
        {
            return Observable.FromEvent<System.Windows.Forms.DragEventArgs>(source,"DragDrop" );
        }
        public static IObservable<Event<System.Windows.Forms.DragEventArgs>> GetObservableDragEnter(this System.Windows.Forms.Control source)
        {
            return Observable.FromEvent<System.Windows.Forms.DragEventArgs>(source,"DragEnter" );
        }
        public static IObservable<Event<System.Windows.Forms.DragEventArgs>> GetObservableDragOver(this System.Windows.Forms.Control source)
        {
            return Observable.FromEvent<System.Windows.Forms.DragEventArgs>(source,"DragOver" );
        }

Templates work also in Silverlight's projects. Get source code for this post from here and enjoy the Rx!.