I admit this is not a very efficient class for my purposes, but it was a quick and dirty fix for a personal project, so it didn't matter. The class presented here stores a string in a compressed byte array if the length of the string exceeds a configurable threshold. I used it to solve an annoying XmlSerializer OutOfMemoryException when deserializing a very large XML file (400MB) into a list of objects. My objects had a Content property that stored the content of HTML pages, and memory usage went completely overboard when it was all loaded at once. The class uses the System.IO.Compression.GZipStream class that was introduced in .Net 2.0 (you have to add a reference to System.IO.Compression.dll). Enjoy!

public class CompressedString
{
    private byte[] _content;
    private int _length;
    private bool _compressed;
    private int _maximumStringLength;

    public CompressedString() : this(0)
    {
    }

    public CompressedString(int maximumStringLengthBeforeCompress)
    {
        _length = 0;
        _maximumStringLength = maximumStringLengthBeforeCompress;
    }

    public string Value
    {
        get
        {
            if (_content == null) return null;
            if (!_compressed) return Encoding.UTF8.GetString(_content);
            using (var ms = new MemoryStream(_content))
            {
                using (var gz = new GZipStream(ms, CompressionMode.Decompress))
                {
                    using (var ms2 = new MemoryStream())
                    {
                        gz.CopyTo(ms2);
                        return Encoding.UTF8.GetString(ms2.ToArray());
                    }
                }
            }
        }
        set
        {
            if (value == null)
            {
                _content = null;
                _compressed = false;
                _length = 0;
                return;
            }
            _length = value.Length;
            var arr = Encoding.UTF8.GetBytes(value);
            if (_length <= _maximumStringLength)
            {
                _compressed = false;
                _content = arr;
                return;
            }
            using (var ms = new MemoryStream())
            {
                using (var gz = new GZipStream(ms, CompressionMode.Compress))
                {
                    gz.Write(arr, 0, arr.Length);
                    gz.Close();
                    _compressed = true;
                    _content = ms.ToArray();
                }
            }
        }
    }

    public int Length
    {
        get
        {
            return _length;
        }
    }
}
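
To show how it plugs in, here is a minimal usage sketch; the 100 character threshold and the file name are made-up values for illustration:

// hypothetical usage: strings longer than 100 characters get gzipped transparently
var content = new CompressedString(100);
content.Value = File.ReadAllText("somePage.html"); // compressed on assignment if longer than 100 chars
Console.WriteLine(content.Length); // the original string length, available without decompressing
Console.WriteLine(content.Value);  // decompressed back on demand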

This is something I have been banging my head against from the beginning of my programming career: just find the best match in a table for each row in another table through a single query.

There are solutions, but they are all very inefficient. To demonstrate the issue I will start with a simple structure: tables A and B, having the same columns id, x and y. I want to get, for each point in table A defined by the (x,y) coordinates, the closest point in table B. I only need one and it doesn't need to be exclusive (several points in A might be closest to the same point in B). There doesn't even have to be exactly one row in B for each row in A, in case two points in B are at the exact same distance from a point in A. The creation of the structure is done here:
CREATE TABLE A(id INT PRIMARY KEY IDENTITY(1,1), x FLOAT, y FLOAT)
INSERT INTO A (x,y) VALUES(10,20),(20,30),(20,10),(30,20),(30,20),(10,30)

CREATE TABLE B(id INT PRIMARY KEY IDENTITY(1,1), x FLOAT, y FLOAT)
INSERT INTO B (x,y) VALUES(11,20),(20,31),(21,10),(31,21),(30,20),(11,30)

To find the distance from A to the closest point in B is trivial:
SELECT a.id,
       a.x,
       a.y,
       Min((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y)) AS dist
FROM a
CROSS JOIN b
GROUP BY a.id, a.x, a.y
To get the id of the closest B point, not so easy.

The first naive solution would be to just find the row in B that corresponds to each row in A using nested selects, like this:
SELECT *
FROM a
JOIN b
  ON b.id = (SELECT TOP 1 b.id
             FROM b
             ORDER BY (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) ASC)

Looking at the execution plan we see what is going on: 86% of the query is spent on "Top N Sort".
Let's get some other solutions so we can compare them in the end in the same execution plan.

Another solution is to use the result of the query that computes the distance and join again on the distance. That means we compare each row in A with each row in B twice, once for the computation of the MIN function and once for the join:
SELECT j.*,
       b2.*
FROM (SELECT a.id,
             a.x,
             a.y,
             Min((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y)) AS m
      FROM a
      CROSS JOIN b
      GROUP BY a.id, a.x, a.y) j
INNER JOIN b b2
  ON j.m = (j.x - b2.x) * (j.x - b2.x) + (j.y - b2.y) * (j.y - b2.y)

Something that does the same thing, essentially, but looks a little better is joining table A with B (aliased B1) and then again with B (aliased B2), on the condition that the point in B2 is closer than the one in B1, then keeping only the rows for which no such B2 exists (in other words, B1 is the closest):
SELECT a.*,
       b1.*
FROM a
CROSS JOIN b b1
LEFT JOIN b b2
  ON (a.x - b1.x) * (a.x - b1.x) + (a.y - b1.y) * (a.y - b1.y) >
     (a.x - b2.x) * (a.x - b2.x) + (a.y - b2.y) * (a.y - b2.y)
WHERE b2.id IS NULL

None of these solutions scans B only once for each row in A. Their relative complexity is 75%, 11% and 14%, respectively. In other words, finding the minimum distance and then joining with the B table again on the points at exactly that distance is the best solution. However, given some assumptions and a weird structure, we can get to something that runs in half that time:
SELECT id      AS aId,
       x,
       y,
       m % 100 AS bId
FROM (SELECT a.id,
             a.x,
             a.y,
             Min(Cast((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) AS BIGINT) * 100 + b.id) AS m
      FROM a
      CROSS JOIN b
      GROUP BY a.id, a.x, a.y) j

These are the assumptions that must be true in order for this to work:
  • The function value can be converted to a BIGINT without problems (if the distances between points were smaller than 1, this would lose precision)
  • The maximum id in table B is under a certain value (in this case 100)
  • The converted function value multiplied by this maximum doesn't cause a BIGINT overflow
Basically I am mathematically creating a container for the value of the function and the id of the point in B, computing the minimum, then extracting the id back from the value. Neat.
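To see the trick with made-up numbers: say the closest point in B is at squared distance 34 and has id 5, with the maximum id in B under 100. The container value is 34 * 100 + 5 = 3405; MIN picks the smallest distance first (and, on ties, the smallest id), and the id comes back out with a modulo:

-- illustration only, with made-up numbers: squared distance 34, id 5
SELECT 34 * 100 + 5 AS container,          -- 3405: distance in the high part, id in the low part
       (34 * 100 + 5) % 100 AS extractedId -- 3405 % 100 = 5, the id of the closest point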

Another solution, one that makes the most apparent sense, is using a feature that was introduced in SQL Server 2005: RANK. We rank the points in B relative to each point in A, based on our function, then we only get the first by selecting on the rank being 1. Unfortunately, this doesn't work as expected. First of all, you cannot use RANK in the WHERE clause, so you must select the rank first, then select from that selection to add the condition. This might mean horrid temporary data tables if tables A and B are huge. Also, after running the query, it appears to be slower than the one that joins on the minimum distance. Here it is:
SELECT aId,
       bId
FROM (SELECT a.id AS aId,
             a.x,
             a.y,
             b.id AS bId,
             Rank() OVER (PARTITION BY a.id
                          ORDER BY (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) ASC) AS rnk
      FROM a
      CROSS JOIN b) x
WHERE rnk = 1

Comparing all the solutions so far, excluding the first naive one with the nested selects, we get these values:
  1. Mathematical container of function value and id: 14%
  2. Selection of the minimum distance to each point and then joining with table B for the second time to look for the point that is at that distance: 21%
  3. Joining twice on the same table with the condition that one is better than the other and that the better one doesn't actually exist: 29%
  4. Using RANK: 36%, surprisingly the worst solution

The final solution, adding some more computation in order to get rid of the constants and some of the assumptions, thus becomes:
DECLARE @MaxId BIGINT

SELECT @MaxId = Isnull(Max(id) + 1, 1)
FROM B;

WITH q AS (SELECT A.id,
                  A.x,
                  A.y,
                  Min(Cast(Power(A.x - B.x, 2) + Power(A.y - B.y, 2) AS BIGINT) * @MaxId + B.id) AS m
           FROM A
           CROSS JOIN B
           GROUP BY A.id, A.x, A.y)
SELECT id AS aId,
       x,
       y,
       m % @MaxId AS bId
FROM q;


I am still looking and there is now a question on StackOverflow that attempts to get the answer from the community, so far with limited results.

I work in this silly place where everything must be done according to some plan or procedure. They aren't even very good at it, but they are very proud of this bureaucracy. For example, I don't have Outlook installed on my work machine, but on a virtual one which is in a different network and can be accessed only by remote desktop protocol. Some admin with a God complex thought it was a good idea to make the computer lock itself after a few minutes of idleness and even close the entire virtual machine when no one accesses it for a while. This might make some sick sense in the admin's head, but I need to know when an email arrives, so I would like to have this virtual machine open on the second monitor without having to enter the password every 5 minutes. To add insult to injury, I cannot install any software on the virtual machine or use PowerShell to prevent the computer from going idle or anything useful like that. Good, a challenge! I need to find a way to keep the remote desktop session alive.

Enter the Windows Script Host. I've created a small JavaScript file that gets executed on the machine and periodically simulates pressing the Shift key. No more idleness and no need to access Group Policy or install anything. Just create a text file, paste in the following code, save it with a .js extension, then run it. It will keep the computer from going idle.
var WshShell = WScript.CreateObject("WScript.Shell");
for (var i = 0; i < 60; i++) // 60 minutes
{
    WshShell.SendKeys('+'); // '+' stands for the Shift key in SendKeys syntax
    WScript.Sleep(60000);   // wait one minute
}

Step by step instructions for non technical people:
  1. Press the Windows key and E to start the Windows Explorer
  2. In the Explorer, navigate to Desktop
  3. Remove the setting for "Hide extensions for known file types" - this is done differently from Windows version to Windows version, so google it
  4. Create a new text file on the desktop by right clicking in it and choosing "New Text Document"
  5. Paste the code above in it
  6. Save the file (if you successfully removed the setting at point 3, you should not only see the name, but also the .txt extension for the file)
  7. Rename the file to busybee.js (or any name, as long as it ends with .js)
  8. Double click it

The script will run once a minute, 60 times (so for an hour), and keep the machine it runs on from going idle. Enjoy!

The preferred method to display anything in Transact-SQL is PRINT. You can print a string, a variable, an expression. However, as anyone soon finds out, the messages all get cached in a buffer and displayed after the entire query ends. So if you have several long running queries in a single batch and you want to get real time messages from them, PRINT doesn't work. A quick search directs you to another MS SQL directive: RAISERROR (note the creative spelling that makes one think more of hearing Katy Perry RROR rather than a proper error raising directive). Also note that Microsoft recommends using a new construct called THROW, introduced in SQL Server 2012; however, it only looks like a lamer version of RAISERROR. They both send a message to the client instantly, but the problem they have is that they do not, as PRINT does, accept an expression. So if you want to print something like 'The computed key from the query is: '+@Key you are out of luck, as you need to declare a new nvarchar variable, fill it with the message value, then use it in RAISERROR.

But there is a better solution. RAISERROR not only throws something at the client, it also flushes the message cache. So something like this works: PRINT 'The computed key from the query is: '+@Key; RAISERROR('',0,1) WITH NOWAIT;.
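To make the difference visible, here is a small sketch of the pattern in a long running batch; the loop is made up just to simulate slow work:

DECLARE @i INT = 0;
WHILE @i < 5
BEGIN
    WAITFOR DELAY '00:00:02'; -- stand-in for a long running query
    SET @i = @i + 1;
    PRINT 'Finished step ' + CAST(@i AS NVARCHAR(10));
    RAISERROR('', 0, 1) WITH NOWAIT; -- flushes the buffer so the message appears immediately
END

Without the RAISERROR line, all five messages would show up together at the end of the batch.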

Of course, being the dev that I am, I wanted to encapsulate this into something reusable and also get rid of the need to use plus signs and conversions to NVARCHAR, so I created this procedure that almost works like PRINT should have:
CREATE PROCEDURE Write (@P1  NVARCHAR(max)=NULL, 
@P2 NVARCHAR(max)=NULL,
@P3 NVARCHAR(max)=NULL,
@P4 NVARCHAR(max)=NULL,
@P5 NVARCHAR(max)=NULL,
@P6 NVARCHAR(max)=NULL,
@P7 NVARCHAR(max)=NULL,
@P8 NVARCHAR(max)=NULL,
@P9 NVARCHAR(max)=NULL,
@P10 NVARCHAR(max)=NULL)
AS
PRINT Isnull(@P1, '') + Isnull(@P2, '')
+ Isnull(@P3, '') + Isnull(@P4, '')
+ Isnull(@P5, '') + Isnull(@P6, '')
+ Isnull(@P7, '') + Isnull(@P8, '')
+ Isnull(@P9, '') + Isnull(@P10, '')

RAISERROR('',0,1)

And you use it like this: DECLARE @now DATETIME = GetDate(); EXEC Write 'The date today is ', @now. Nice, huh? Of course what you would have liked to do is EXEC Write 'The date today is ', GetDate(), but apparently stored procedures do not accept functions as parameters, and functions do not accept PRINT inside them.

An unlikely blog post from me, about graphics; and not just any kind of graphics, but GDI+ graphics. It involves something that may seem simple at first: rotating text in a rectangular container so that it is readable when you turn the page to the left. It is useful for writing text in containers whose height is bigger than their width. This is not about writing vertically, that's another issue entirely.
So, the bird's eye view of the problem: I had to create a PDF that contains some generated images, a sort of chart with many colored rectangles that contain text. The issue is that some of them are a lot higher than they are wide, which means it is better to write text that is rotated, in this case to -90 degrees, or 270 degrees, if you like it more. To the left, as Beyoncé would put it.

I created the image using the Bitmap class, then got a Graphics instance from it, then started drawing things. It's trivial to draw a line, fill a rectangle, or draw an arc. Just as easy is writing some text, using the DrawString method of the Graphics object. I half expected there to be a parameter that would allow me to write rotated text, but there wasn't. How hard could it be?

Let's start with the basics. You want to draw a colored rectangle and write some text into it. This is achieved by:
var rectangle = new Rectangle(x, y, width, height); // reuse the dimensions
g.FillRectangle(new SolidBrush(Color.Blue), rectangle); // fill the rectangle with the blue color
g.DrawRectangle(new Pen(Color.Black), rectangle); // draw a black border
g.DrawString("This is my text", new Font("Verdana", 12, GraphicsUnit.Pixel), new SolidBrush(Color.Black), rectangle, new StringFormat
{
    LineAlignment = StringAlignment.Center,
    Alignment = StringAlignment.Center,
    Trimming = StringTrimming.None
}); // this draws a string in the middle of the rectangle, wrapping it as needed

All very neat. However, you might already notice some problems. One of them is that there is no way to "overflow" the container. If you have worked with HTML you know what I mean. If you use the method overload that takes a rectangle as a parameter, the resulting text will NOT go over the edges of that rectangle. This is usually a good thing, but not all the time. Another issue that might have caught your eye is that there is no way to control the way the text is wrapped. In fact, it will wrap the text in the middle of words or clip the text in order to keep it inside the container. If you don't use the container overload, there is no wrapping at all. In other words, if you want custom wrapping you're going to have to go another way.
Enter TextRenderer, a class that is part of the Windows.Forms library. If you decide that linking to that library is acceptable, even if you are using this in a web or console application, you will see that the parameters given to the TextRenderer.DrawText method contain information about wrapping. I did that in my web application and indeed it worked. However, besides drawing the text really thick and ugly, it completely ignores text rotation, even though it has a specific option to not ignore translation transforms (PreserveGraphicsTranslateTransform).

But let's not get into that at this moment. Let's assume we like the DrawString wrapping of text or we don't need it. What do we need to do in order to write at a 270 degrees angle? Basically you need to use two transformations, one translates and one rotates. I know it sounds like a bad cop joke, but it's not that complicated. The difficulty comes in understanding what to rotate and how.
Let's try the naive implementation, what everyone probably tried before going to almighty Google to find how it's really done:
// assume we already defined the rectangle and drew it
g.RotateTransform(-270);
g.DrawString("This is my text", new Font("Verdana", 12, GraphicsUnit.Pixel), new SolidBrush(Color.Black), rectangle, new StringFormat
{
    LineAlignment = StringAlignment.Center,
    Alignment = StringAlignment.Center,
    Trimming = StringTrimming.None
}); // and cross fingers
g.ResetTransform();
Of course it doesn't work. For one, the rotation transformation applies to the Graphics object and, in theory, the primitive drawing the text doesn't know what to rotate. Besides, how do you rotate it? Around a corner, around the center, the center of the text or of the container?
The trick with the rotation transformation is that it rotates on the origin, always. Therefore we need the translate transformation to help us out. Here is where it gets tricky.

g.TranslateTransform(rectangle.X + rectangle.Width / 2, rectangle.Y + rectangle.Height / 2); // we define the center of rotation in the center of the container rectangle
g.RotateTransform(-270); // this doesn't change
var newRectangle = new Rectangle(-rectangle.Height / 2, -rectangle.Width / 2, rectangle.Height, rectangle.Width); // notice that width is switched with height
g.DrawString("This is my text", new Font("Verdana", 12, GraphicsUnit.Pixel), new SolidBrush(Color.Black), newRectangle, new StringFormat
{
    LineAlignment = StringAlignment.Center,
    Alignment = StringAlignment.Center,
    Trimming = StringTrimming.None
});
g.ResetTransform();
So what's the deal? First of all, we changed the origin of the entire graphics object, and that means we have to draw everything relative to it. So if we had not rotated the text, the new rectangle would have had the same width and height, but its origin at 0,0.
But we want to rotate it, and therefore we need to think of the original bounding rectangle relative to the new origin and rotated 270 degrees. That's what newRectangle is, a rotated original rectangle in which to limit the drawn string.

So this works, but how do you determine whether the text needs to be rotated and what size it should be?
Here we have to use MeasureString, but it's not easy. It basically does the same thing as DrawString, only it returns a size rather than drawing things. This means you cannot measure the actual text size: you will always get either the size of the text or the size of the container rectangle, if the text is bigger. I created a method that attempts to get the maximum font size for normal text and for rotated text and then returns the best of the two. I do that by increasing the font size until the measured string no longer fits the bounding rectangle, then going one size down. But it isn't pretty.

We have a real problem in the way Graphics wraps the text. A simple but incomplete solution is to use TextRenderer to measure and Graphics.DrawString to draw. But it's not exactly what we need. The complete solution would determine its own wrapping, work with multiple strings and draw (and rotate) them individually. One interesting question is what happens if we try to draw a string containing new lines. And the answer is that it does render the text line by line. We can use this to create our own wrapping without working with individual strings.

So here is the final solution, a helper class that adds a new DrawString method to Graphics that takes the string, the font name, the text color and the bounding rectangle and writes the text as large as possible, with the orientation most befitting.

using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

namespace GraphicsTextRotation
{
    public static class GraphicsExtensions
    {
        public static void DrawString(this Graphics g, string text, string fontName, Rectangle rect, Color textColor, int minTextSize = 1)
        {
            var textInfo = getTextInfo(g, text, fontName, rect.Width, rect.Height); // get the largest possible font size and the necessary rotation and text wrapping
            if (textInfo.Size < minTextSize) return;
            g.TranslateTransform(rect.X + rect.Width / 2, rect.Y + rect.Height / 2); // translate for any rotation
            Rectangle newRect;
            if (textInfo.Rotation != 0) // a bit hackish, because we know the rotation is either 0 or -90
            {
                g.RotateTransform(textInfo.Rotation);
                newRect = new Rectangle(-rect.Height / 2, -rect.Width / 2, rect.Height, rect.Width); // switch height with width
            }
            else
            {
                newRect = new Rectangle(-rect.Width / 2, -rect.Height / 2, rect.Width, rect.Height);
            }
            g.DrawString(textInfo.Text, new Font(fontName, textInfo.Size, GraphicsUnit.Pixel), new SolidBrush(textColor), newRect, new StringFormat
            {
                Alignment = StringAlignment.Center,
                LineAlignment = StringAlignment.Center,
                Trimming = StringTrimming.None
            });
            g.ResetTransform();
        }

        private static TextInfo getTextInfo(Graphics g, string text, string fontName, int width, int height)
        {
            var arr = getStringWraps(text); // get all the symmetrical ways to split this string
            var result = new TextInfo();
            foreach (string s in arr) // for each of them find the largest size that fits in the provided dimensions
            {
                var nsize = 0;
                Font font;
                SizeF size;
                do
                {
                    nsize++;
                    font = new Font(fontName, nsize, GraphicsUnit.Pixel);
                    size = g.MeasureString(s, font);
                } while (size.Width <= width && size.Height <= height);
                nsize--;
                var rsize = 0;
                do
                {
                    rsize++;
                    font = new Font(fontName, rsize, GraphicsUnit.Pixel);
                    size = g.MeasureString(s, font); // measure the same wrapped candidate, but against the rotated dimensions
                } while (size.Width <= height && size.Height <= width);
                rsize--;
                if (nsize > result.Size)
                {
                    result.Size = nsize;
                    result.Rotation = 0;
                    result.Text = s;
                }
                if (rsize > result.Size)
                {
                    result.Size = rsize;
                    result.Rotation = -90;
                    result.Text = s;
                }
            }
            return result;
        }

        private static List<string> getStringWraps(string text)
        {
            var result = new List<string>();
            result.Add(text); // add the original text
            var indexes = new List<int>();
            var match = Regex.Match(text, @"\b"); // find all word breaks
            while (match.Success)
            {
                indexes.Add(match.Index);
                match = match.NextMatch();
            }
            for (var i = 1; i < indexes.Count; i++)
            {
                var pos = 0;
                string segment;
                var list = new List<string>();
                for (var n = 1; n <= i; n++) // for all possible splits 1 to indexes.Count+1
                {
                    var limit = text.Length / (i + 1) * n;
                    var index = closest(indexes, limit); // find the most symmetrical split
                    segment = index <= pos
                        ? ""
                        : text.Substring(pos, index - pos);
                    if (!string.IsNullOrWhiteSpace(segment))
                    {
                        list.Add(segment);
                    }
                    pos = index;
                }
                segment = text.Substring(pos);
                if (!string.IsNullOrWhiteSpace(segment))
                {
                    list.Add(segment);
                }
                result.Add(string.Join("\r\n", list)); // add the split by new lines to the list of possibilities
            }
            return result;
        }

        private static int closest(List<int> indexes, int limit)
        {
            return indexes.OrderBy(i => Math.Abs(limit - i)).First();
        }

        private class TextInfo
        {
            public int Rotation { get; set; }
            public float Size { get; set; }
            public string Text { get; set; }
        }
    }
}
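
A minimal usage sketch, assuming the extension class above is in scope; the bitmap size and file name are made up for illustration:

// hypothetical usage of the extension method above
using (var bmp = new Bitmap(200, 400))
using (var g = Graphics.FromImage(bmp))
{
    var rect = new Rectangle(10, 10, 180, 380); // taller than wide, so rotation will likely win
    g.FillRectangle(new SolidBrush(Color.Blue), rect);
    g.DrawString("This is my text", "Verdana", rect, Color.Black); // picks size, wrapping and rotation
    bmp.Save("rotated.png");
}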

I hope you like it.

I had this database table containing ranges (a start value and an end value). The challenge was creating a query that overlaps and transposes those ranges so I can say how many ranges cover any point in the total interval of values. As an example, "SELECT * FROM Ranges" would result in a table like:
Start  End
10     20
10     30
25     35
20     40
and I am looking for something like this:
Value  Count
0      0
1      0
...    ...
10     2
11     2
...    ...
24     2
25     3
26     3

A naive implementation would get the minimum Start (or start with 0, as I did) and the maximum End, create an in-memory or temporary table (Values) from min to max using an ugly WHILE block, then join it with the Ranges table, something like:
SELECT v.Val, Count(1) as Nr
FROM #Values v
INNER JOIN Ranges r
  ON r.Start <= v.Val AND r.[End] >= v.Val
GROUP BY v.Val

This kind of works, but for large ranges it becomes difficult. It takes a long time to create the Values table and the join, and for extreme cases, like the one I had with values from 0 to 6 billion, it becomes impossible. The bottleneck here is the Values table, which is pretty much a horror to create and maintain. But what if you don't need all the values?

Before I tell you the solution I found, be warned that you have to properly define what a range is. Is the range 10-20 actually 10-19? In my case it was, which is why there are some subtractions of 1 and strict less-than comparisons rather than less-than-or-equal conditions.

The solution is this:
SELECT DISTINCT Val
INTO #Values
FROM (
    SELECT 0 as Val
    UNION ALL
    SELECT Start FROM Ranges
    UNION ALL
    SELECT [End] - 1 FROM Ranges
) x
ORDER BY Val

The idea is that after you compute the range count for each of the start and end values, you know that between one such value and the next the count remains the same. The join is significantly faster, there is no ugly WHILE block and you don't need a 6 billion row table. It's easier to plot on a chart as well, with a variation like this:
SELECT v.Val,
       Count(r.Start) as Nr
FROM #Values v
LEFT JOIN Ranges r
  ON r.Start <= v.Val AND r.[End] > v.Val
GROUP BY v.Val

The result of the query above will be:
Value  Nr
0      0
10     2
19     2
20     2
25     3
29     3
34     2
39     1
Hope this helps you.

Being a beginner in both OpenLayers and AngularJS, it took me a long while to do this simple thing: add stuff on a map and make it show up the way I wanted. There were multiple gotchas and I intend to chronicle each and every one of those bastards.
First, while creating a map and doing all kinds of stuff with it using OpenLayers is a breeze, doing it "right" with AngularJS is not as simple. I thought I would not reinvent the wheel and looked for an existing integration of the two technologies, and I found AzimuthJS. In order to add a map with Azimuth all you have to do is:
<div ol-map controls="zoom,navigation,layerSwitcher,attribution,mousePosition" control-opts="{navigation:{handleRightClicks:true}}">
<az-layer name="Street" lyr-type="tiles"></az-layer>
<az-layer name="Airports" lyr-type="geojson" lyr-url="examples/data/airports.json" projection="EPSG:4326"></az-layer>
</div>
You may notice that it has a simple syntax, it offers the possibility of multiple layers and one of them is even loading features dynamically from a URL. Perfect so far.
First problem: the API that I am using does not return the GeoJSON format that Azimuth knows how to handle, and I cannot or will not change the API. I've tried a lot of weird crap, including adding a callback on the loadend layer event of a GeoJSON layer in order to reparse the data and configure what I wanted. It all worked, but it was incredibly ugly. I managed to put the entire logic in a JavaScript file and do it all in that event, but it wasn't any different from doing it all from scratch in JavaScript without any Angular syntax. So what I did was create my own OpenLayers.Format. It wasn't so complicated: basically I inherited from OpenLayers.Format.JSON and added my own read logic. Here is the result:
OpenLayers.Format.RSI = OpenLayers.Class(OpenLayers.Format.JSON, {

    read: function(json, type, filter) {
        type = (type) ? type : "FeatureCollection";
        var results = null;
        var obj = null;
        if (typeof json == "string") {
            obj = OpenLayers.Format.JSON.prototype.read.apply(this, [json, filter]);
        } else {
            obj = json;
        }
        if (!obj) {
            OpenLayers.Console.error("Bad JSON: " + json);
        }

        var features = [];
        for (var i = 0; i < obj.length; i++) {
            var item = obj[i];
            var point = new OpenLayers.Geometry.Point(item.Lon, item.Lat).transform('EPSG:4326', 'EPSG:3857');
            if (!isNaN(point.x) && !isNaN(point.y)) {
                var feature = new OpenLayers.Feature.Vector(point, item);
                features.push(feature);
            }
        }

        return features;
    },

    CLASS_NAME: "OpenLayers.Format.RSI"

});
All I had to do was load this in the page. But now the problem was that Azimuth only knows some types of layers, based on a switch block. I did not refactor the code to be plug and play; instead I shamelessly changed it to try to use the GeoJSON code with the format I provide as the lyr-type, if it exists in the OpenLayers.Format object. That settled that. Running the code so far, I see the streets layer and on top of it a lot of yellow circles for each of my items.
Next problem: too many items. The map was very slow because I was adding over 30000 items to it. I was in need of clustering. I wasted almost an entire day trying to figure out why it wouldn't work until I realised that it was an ordering issue. Duh! But still, in this new framework I didn't want to add configuration in a JavaScript event; I wanted to configure as much as possible via AngularJS parameters. I noticed that Azimuth already had support for strategy parameters. Unfortunately it only supported an actual strategy instance as the parameter rather than a string. I had, again, to change the Azimuth code to first search for the given strategy name in OpenLayers.Strategy and, if not found, to $parse the string. Yet it didn't work as expected. The clustering was not engaging. Wasting another half an hour, I realised that, at least in the case of this weirdly buggy Cluster strategy, I not only needed it, but also a Fixed strategy. I changed the code to add the strategy instead of replacing it and suddenly clustering was working fine. I still have to make it configurable, but that is a detail I don't need to go into right now. Anyway, remember that the loadend event was not fired when only the Cluster strategy was in the strategies array of the layer; I think you need the Fixed strategy to load data from somewhere.
The next thing I wanted to do was center the map on the existing features. The map also needed to be resized to the actual page size. I added a custom directive to expand a div's height down to an element which I styled to always be at the bottom of the page. The problem now was that the map was getting instantiated before the div was resized, which suggested I had to start with a big default height for the div. That actually caused a lot of problems, since the map remained as big as first defined and centering was not working as expected. What was needed was a simple map.updateSize(); call after the div was resized. In order to then center and zoom the map on the existing features I used this code:
var bounds = {
    minLon: 1000000000,
    minLat: 1000000000,
    maxLon: -1000000000,
    maxLat: -1000000000
};

for (var i = 0; i < layer.features.length; i++) {
    var feature = layer.features[i];
    var point = feature.geometry;
    if (!isNaN(point.x) && !isNaN(point.y)) {
        bounds.minLon = Math.min(bounds.minLon, point.x);
        bounds.maxLon = Math.max(bounds.maxLon, point.x);
        bounds.minLat = Math.min(bounds.minLat, point.y);
        bounds.maxLat = Math.max(bounds.maxLat, point.y);
    }
}
map.updateSize();
var extent = new OpenLayers.Bounds(bounds.minLon, bounds.minLat, bounds.maxLon, bounds.maxLat);
map.zoomToExtent(extent, true);

Now, while the clustering was working OK, I wanted to show stuff and make those clusters do things for me. I needed to style the clusters. This is done via:
layer.styleMap = new OpenLayers.StyleMap({
    "default": defaultStyle,
    "select": selectStyle
});

layer.events.on({
    "featureselected": clickFeature
});

var map = layer.map;

var hover = new OpenLayers.Control.SelectFeature(
    layer, { hover: true, highlightOnly: true }
);
map.addControl(hover);
hover.events.on({ "featurehighlighted": displayFeature });
hover.events.on({ "featureunhighlighted": hideFeature });
hover.activate();

var click = new OpenLayers.Control.SelectFeature(
    layer, { hover: false }
);
map.addControl(click);
click.activate();
I am adding two OpenLayers.Control.SelectFeature controls on the map, one activates on hover, the other on click. The styles that are used in the style map define different colors and also a dynamic radius based on the number of features in a cluster. Here is the code:
var defaultStyle = new OpenLayers.Style({
    pointRadius: "${radius}",
    strokeWidth: "${width}",
    externalGraphic: "${icon}",
    strokeColor: "rgba(55, 55, 28, 0.5)",
    fillColor: "rgba(55, 55, 28, 0.2)"
}, {
    context: {
        width: function(feature) {
            return (feature.cluster) ? 2 : 1;
        },
        radius: function(feature) {
            return feature.cluster && feature.cluster.length > 1
                ? Math.min(feature.attributes.count, 7) + 2
                : 7;
        }
    }
});
You see that the width and radius are defined as dynamic functions. But here we have an opportunity that I couldn't let pass. You see, in these styles you can also define the icons. How about defining the icon dynamically using canvas drawing and then toDataURL? And I did that! It's not really that useful, but it's really interesting:
function fIcon(feature, type) {
    var iconKey = type + 'icon';
    if (feature[iconKey]) return feature[iconKey];
    if (feature.cluster && feature.cluster.length > 1) {
        var canvas = document.createElement("canvas");
        var radius = Math.min(feature.cluster.length, 7) + 2;
        canvas.width = radius * 2;
        canvas.height = radius * 2;
        var ctx = canvas.getContext("2d");
        ctx.fillStyle = this.defaultStyle.fillColor;
        ctx.strokeStyle = this.defaultStyle.strokeColor;
        //ctx.fillRect(0,0,canvas.width,canvas.height);
        ctx.beginPath();
        ctx.arc(radius, radius, radius, 0, Math.PI * 2);
        ctx.fill();
        ctx.stroke();
        ctx.fillStyle = this.defaultStyle.strokeColor;
        var bounds = {
            minX: 1000000000,
            minY: 1000000000,
            maxX: -1000000000,
            maxY: -1000000000
        };
        for (var c = 0; c < feature.cluster.length; c++) {
            var child = feature.cluster[c];
            var x = feature.geometry.x - child.geometry.x;
            var y = feature.geometry.y - child.geometry.y;
            bounds.minX = Math.min(bounds.minX, x);
            bounds.minY = Math.min(bounds.minY, y);
            bounds.maxX = Math.max(bounds.maxX, x);
            bounds.maxY = Math.max(bounds.maxY, y);
        }
        var q = 0;
        q = Math.max(Math.abs(bounds.maxX), q);
        q = Math.max(Math.abs(bounds.maxY), q);
        q = Math.max(Math.abs(bounds.minX), q);
        q = Math.max(Math.abs(bounds.minY), q);
        q = radius / q;
        var zoom = 2;
        for (var c = 0; c < feature.cluster.length; c++) {
            var child = feature.cluster[c];
            var x = -(feature.geometry.x - child.geometry.x) * q + radius;
            var y = (feature.geometry.y - child.geometry.y) * q + radius;
            ctx.fillRect(parseInt(x - zoom / 2), parseInt(y - zoom / 2), zoom, zoom);
        }
        feature[iconKey] = canvas.toDataURL("image/png");
    } else {
        feature[iconKey] = OpenLayers.Marker.defaultIcon().url;
    }
    return feature[iconKey];
};

defaultStyle.context.icon = function(feature) {
    return fIcon.call(defaultStyle, feature, 'default');
};
selectStyle.context.icon = function(feature) {
    return fIcon.call(selectStyle, feature, 'select');
};
This piece of code builds a map of the features in the cluster, zooms it to the size of the cluster icon, then also draws a translucent circle as a background.
I will not bore you with the displayFeature and clickFeature code; suffice it to say that the first sets the html title on the layer element and the other either zooms and centers or displays the info card for a single feature. There is a gotcha here as well, probably caused initially by the difference in size between the map and the layer. In order to get the actual pixel based on latitude and longitude you have to use map.getLayerPxFromLonLat(lonlat), not map.getPixelFromLonLat(lonlat). The second will work, but only after zooming or moving the map once. Pretty weird.
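As a sketch, assuming a feature with a point geometry already transformed to the map projection:

// convert a feature's coordinates to a pixel position
var lonLat = new OpenLayers.LonLat(feature.geometry.x, feature.geometry.y);
var px = map.getLayerPxFromLonLat(lonLat); // correct from the start
// map.getPixelFromLonLat(lonLat) would only become correct after a zoom or move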

There are other issues that come to mind now, like making the URL for the data dynamic, based on specific parameters, but that's for another time.

As promised, even if somewhat delayed, I am starting programming posts again. This time it is about working with OpenLayers, a free JavaScript mapping library. The problem: after successfully creating and displaying a map and adding "features" to it (in this case a yellow circle for each physical location of a radio station), I wanted to add clustering; it didn't work. Quick solution: add the layer to the map BEFORE you add the features to the layer.

For the longer version, I have to first explain how OpenLayers operates. The first order of business is creating an OpenLayers.Map object, which receives the html element that will contain it and some options. To this map you add different layers, which represent streets, satellite imagery and so on. One can also add to any of these layers a number of features, which will be displayed on the map. A common problem is that with many data points the map becomes cluttered with the rendering of these features, making for an ugly and slow interface. Enter "clustering", which means using a layer strategy of type OpenLayers.Strategy.Cluster to clump close features together into a larger one. It should be as easy as setting the layer's 'strategies' array or adding an instance of a Cluster strategy to it. But for me it did not work at all.

After many attempts I finally realized that the clustering mechanism was dependent on the existence of an event handler for the 'beforefeaturesadded' event of the layer. Unfortunately for me and my time, this handler is only added when the layer is added to a Map! So all I had to do was to add the layer to the map before adding the features to it. Good thing that I recently shaved my head, or I would have tufts of bloody hair in my fists right now.
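In code, the fix looks something like this minimal sketch; the layer name and the features array are made up for illustration:

var layer = new OpenLayers.Layer.Vector("Stations", {
    strategies: [new OpenLayers.Strategy.Cluster()]
});
map.addLayer(layer);         // wires up the 'beforefeaturesadded' handler...
layer.addFeatures(features); // ...so clustering engages when the features are added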

I wrote a bunch of posts regarding my past employment, but said nothing about the new one. In fact, I was a bit superstitious and didn't want to jinx what was going to happen. Now it shall all be revealed! Well, long story short, I will be relocating to Italy (re-boot, get it?) and working at the European Commission's Joint Research Centre in Ispra. That's it, cheers!

Just kidding. How does one get to, first, have such an opportunity and, second, actually decide to go? For the first point I would have to say pure blind luck. I happen to have a LinkedIn profile that shows a lot of experience in the field of Microsoft .NET, so they called me, since they needed someone like that, and I turned out not to be a complete wacko (only a partial one) at the interview. The second point is actually the most complicated. Most Romanian developers with my experience are rooted, so to speak. Married, many with children and obligations, relatives and social circles, they often find it too hard or completely impossible to relocate to another country. Luckily for me, I have no children, I don't have any social circles to talk about, I will probably talk to and visit my relatives just as much from Italy as from Bucharest, and I have one of the most understanding wives one could want. She stays behind, at least temporarily, to mind the fort, continue her own career and take care of the dog, while I go on to the adventure of my lifetime.

I may be exaggerating, but I will check out several experiences that I have never had before:
  • living alone - I know it sounds strange, but in 36 years I have never lived alone. I was either living with my parents, with my business partner or with a girlfriend or wife
  • living in another country - I have worked in Italy before, a few disparate weeks, but never lived in another country for long enough to understand the local culture and experience the way locals see the world
  • living in a small town - Ispra is a 5000 people enclave, so it's not even a small town, more of a village
  • working for the European Commission or some other governmental organization like that - I am afraid of the bureaucracy, frankly; I hope there is some sort of separation between devs and that sort of thing
  • working with actually new technologies - I knew there were people who inflate their resumes in order to get jobs they don't really deserve, but I never imagined that most companies would misrepresent themselves to appear more attractive as a workplace. I've often heard about the great new project I would be working on, only to be relegated to some legacy crap that no manager wants to rewrite even when it's bankrupting their company. Oh, I really hope the JRC people didn't bullshit me about an ASP.Net MVC 4.5 web site with Web APIs, AngularJS and Google Maps.
  • staying separated from my wife, but not being mad at each other - not that I have ever stayed separated from her while being angry, but still. Our relationship started as a long distance one, since we were living in different cities, and only after a year we moved in together. I am curious as to how this reversal will affect us. I believe it will strengthen our bond, but there are alternative scenarios.
  • working and living in a truly multicultural environment - the place will have Italian, French, German, Swiss, Romanian and who knows how many other types of people. I will have the opportunity to relearn the European languages, express myself in them, and learn about other cultures from the horse's mouth, so to speak.

All in all, this is the gist of it. You can see that I am excited enough (setting the stage for future disappointment). My plane leaves Bucharest next Friday, on the , while actual work begins on the . Hopefully this will generate a deluge of technical blog posts to compensate for the lack experienced in the last two years.

When I first opened ASP.NET MVC 4 Recipes, by John Ciliberti, I was amazed. It seemed to transcend the reference book and go into a sort of interactive-path thing. You know interactive books, where you read along and at certain points you get to choose what the characters do by going to one page or another? This is what Recipes seemed to be. You get to a point where the author tells you which chapters to read and in which order based on your role in the organization. That is and will remain a wonderful concept and I would like to see more books steal it for themselves. However, the actual content of the book did not feel as great as its presentation, I am afraid to say. This is not to say it is a bad book, only that I expected a lot more from it after reading its "mission statement". The book is Microsoft centric, obviously, and it says very clearly that it will solve problems with Microsoft products as a rule. For example it favours KnockoutJS as a JavaScript framework. But that's not really annoying.

I think what bothered me most was that the content was all over the place. There are some chapters with specific problems: the problem is described, then the solution is provided. Very nice. But then there are some problems that are vague and general with a very specific solution, lending a lot of lines to some issues and moving past others in a hurry. Of course, I would have liked each of the problems to have its own book, and that was impossible, but the compromise here did not feel great; I thought some of the problems were not something someone would encounter more than once, if ever, so using the book as a reference helps only so much. Some examples of problems to be solved: You would like to begin working with ASP.NET MVC Framework, but you do not understand the MVC pattern and why it is beneficial. - why would you start reading an ASP.Net MVC book if you don't even understand the MVC pattern? You would google something first. Or: You have started using the new .NET asynchronous programming pattern and love its relative simplicity compared to other programming models. However, you would like to have a better understanding of the code generated by the compiler so that you can improve the designs of your asynchronous methods. So you jump from not knowing what MVC is to wanting to read IL. Maybe I am just mean, but it soon turned into a very hard to read book, jumping from one issue to another like that, from level to level. Not to mention some "loaded" problems that have a description several lines long, in the form of "you have found that your company strategy sucks, because of 1, 2 and 3, and you want 4, 5 and 6 because 7, 8 and 9". It doesn't sound like my problem at all :)

Bottom line: I have not started working with ASP.Net MVC yet, nor do I believe that my first job with it would be as an architect, so I will have an opinion on how it works in real life in a few months, probably. The book seems useful now, but it is not the start-to-end ASP.Net MVC tutorial that I wanted when I started reading it, and maybe that is why I had such a critical eye for it.

I have been hearing about the AngularJS library for a few months now, with people often praising it as the new paradigm of web development. It is basically a JavaScript MVC framework that makes heavy use of markup in order to declare the desired behaviour. Invented at Google by Miško Hevery, it uses cacheable templates, databinding and dependency injection to combine the various components that otherwise remain independent and testable. It also comes with its own testing framework (unit and end-to-end) and a way to describe unit tests Jasmine (BDD) style.

So I started reading about this new framework in the book intuitively called AngularJS, written by Brad Green and Shyam Seshadri. They start with an anecdote, discussing how they were working on a web application at Google. They had already written 17000 lines of code in about 6 months and it was almost finished, albeit with great frustration related to development speed and testability. This guy, Miško Hevery, tells everyone that by using a framework he wrote in his spare time (you gotta love devs!) they could rewrite the whole application in two weeks. He was wrong: they did it in three weeks, and at the end the whole thing had only 1500 lines of code and was fully testable. This was a great beginning for the book, as it starts with a promise and then (sorry, couldn't help the pun - you will see what I mean if you read the book or know AngularJS already) it describes how to achieve your goals. The book itself is not large, about 160 PDF pages, and can be used as both a primer and a reference. It describes the basic concepts of AngularJS and how they can be put to work, with some small app examples at the end. Of course, you get a link to where you can download all the code samples.

What do I think about the book? It was pretty good. It shows the authors' preference towards Linux setups, but it is not annoying. Each chapter is clear and to the point. The framework itself, though, is original enough that after a few chapters it is almost impossible to understand everything without tinkering with the code yourself. Unfortunately I didn't have the time and disposition to do that, so just because I've read the book doesn't mean I know how to work with Angular; but I am confident that when I actually start working with it, it will all come together in my mind. Also, as I was saying, the book can easily be used as a reference. It is not a complete overview - not every AngularJS feature and gotcha can be found in its pages - but it's good enough.

What do I think about the framework? It seems pretty spectacular. My only experience with JavaScript MVC frameworks is a short brush with BackboneJS. At one time I thought I would be working with it a lot and was boasting here that interesting posts would appear. Alas, it was not to be. Sorry about that; maybe better luck with Angular. Backbone was pretty interesting, but it had a horrendous way of working with data models and it was very easy to break something and not realize where the problem came from. There seems to be a lot more thought put into Angular. An interesting point is that the writers advertise TDD as a way of actually working and claim they do so themselves. I have seen many people try it and give up, but I have hopes for JavaScript. You don't need to compile things, you don't need complicated servers or time consuming deployment steps: just change stuff and run the tests and/or refresh a page. I like the fact that the creators of AngularJS put this much work into making everything testable.

So go ahead: read the book and try the framework!

Update 24 Aug 2013: I've started reading dev blogs again and I've stumbled upon a 70 minute video by Dan Wahlin presenting AngularJS. His explanations seemed a lot more down to Earth than those in the book, so I felt that his video complements rather well what is written there. Here it is:

ASP.NET MVC 4 and the Web API, by Jamie Kurtz, is one of the new breed of technical books that read like a blog entry, albeit a very long one. The book is merely 100 pages long, but to the point, with links to code on GitHub and references to other resources for details that are not the subject of the book. The principles behind the architecture are discussed and explained, the machine setup is described, then the configuration, then bam! all the pieces fit together. Even if I don't fully agree with some of Kurtz's recommendations, I have to admit this is probably a very, very useful book.

What is it about? It describes how to create a REST web API, complete with authentication, authorization, logging and unit testing. It discusses ORM (with OData), DI, Source control, the basics of REST and MVC, and all other tools required. But what I believe to be the strength of the approach in the book is the clear separation of modules. One can easily find fault with one of the pieces recommended by the author and just as easily replace only that component, leaving the others as is.

The structure of the book is as follows:

  • Chapter 1 - A quick introduction of ASP.Net MVC4 as a platform for REST services, via the Web API.
  • Chapter 2 - The basics of REST services. There are very subtle points described there, including the correct HTTP codes and headers in the response and discoverability. It also points to prerequisites of your API in order to be called REST, like the REST Maturity Model.
  • Chapter 3 - Modelling of an API. This includes the way URLs are formed, the conventions in use and how the API should look to the client.
  • Chapter 4 - The scaffolding of your Visual Studio project, the logging configuration, the folder structure, the API DTOs.
  • Chapter 5 - Putting components together: configuring NInject, designing your classes with DI and testability in mind.
  • Chapter 6 - Security: really simple implementation with a lot of power provided by the default Microsoft Membership Providers.
  • Chapter 7 - Actually building the API, making some smoke tests, seeing it all work.


The complete source of the project described in the book can be found on GitHub.

My personal opinion of the setup is that, while it all fits together, some technologies are a bit over the top. NInject, which I have personal experience with, is very good but very slow. The ASP.Net Membership scheme is very verbose. While I wouldn't really mind it as implemented in the book, I still cringe at the table names and the zillions of columns. Also, I am slightly opposed to ORMs, mostly because they attempt to mould you into a specific frame of thinking, that of CRUD, making any optimization or deviation from the plan rather difficult. I've had the experience of working on a project that had all of its database access in stored procedures. Finding what accessed a table and a column was a breeze, without knowing anything about the underlying implementation. But even so, as I was saying above, the fact that the author separates concerns so beautifully makes any component replaceable.

I highly recommend this book, especially now, when the world moves toward HTML and Javascript interfaces built on web APIs.

Today was my last day at the large corporation I was employed at. I quit for several reasons, but mainly because the project I was working on wasn't challenging at all. So one has to wonder: how did I get to be bored at work when only two years ago I was so happy to be hired by one of the best employers in Bucharest to work on an exciting new project? And the answer is: misrepresentation. I've titled the post thus because I sincerely think very few people, if any, wanted to harm me or lie to me or take advantage of me, and yet the thing I was hired for changed and shifted until I became an annoyance for proposing ways of improving the project and asking for work. Let me take you from the beginning and you will see what I mean.

At the end of March 2011 I was working at a medium sized company, a place with some interesting projects, but where the work ethic and methodology were really lacking. I had been working there for about two years and I was starting to get annoyed at not getting any recognition in the face of obvious personal improvement. And in my vulnerable state I was head-hunted by a human resources person from this major American corporation who wanted me to work on a project for them. I said I would give it a try, personally believing that I would either not like the place, not like the money or, even more probably, they would not think me worthy of the job. See where I am getting with this? I was already sold on the concept of a new job there and I didn't want to get disappointed, so I was playing down my chances. It seems that, after one telephone interview, a series of six consecutive face to face interviews and another one with the head of the company, I was good enough for them. All I had to do was negotiate the wage. I was rather disappointed with the way my then current place of employment was handling salary increases (I had gotten only one raise, 2.25% in size, in two years) so I had a sum in mind that I would consider the minimum I would accept at the new place of employment. You see, I am not a very good negotiator, I hate haggling, so I just drew a line in the sand and decided that no one would push me over it. I was so serious when I went to talk to them... and they proposed a sum that was more than 10% higher than what I was willing to fight for. Surprised, I accepted. Now all I had to do was wait for a call to tell me how we would proceed. It was near my birthday and I thought "Wow! What a nice present!".

At this point I'd had contact with the HR girl, who was very nice, been interviewed by a lot of people, both technical and not, also very nice, and even passed by the head honcho of the company, who played a little game with me when we met, pretending to be a very arrogant and annoying person while the top HR specialist watched me with a stern expression. I am kind of proud of myself for having seen through their ruse, but I think they did try too hard. No one can be such an idiot as to consider refactoring useless because you write good code from the beginning, right? Anyway, the guy who was supposed to be my manager gave me a call the first week, told me there was some restructuring going on in the company, that we were still on and that I had to wait. He was nice too. He called me every week for about three months to tell me we were still on, while I was sweating bullets because I had already announced in my company that I was going to leave so that they could prepare for my absence, and they were starting to look at me suspiciously: I wasn't going anywhere. At that time my soon-to-be manager said he was going on holiday and that another guy was going to call me. No one did for more than two weeks. So I called them!

Now, you see, the HR girl genuinely believed that she was offering me a better job, and my prospective manager was also convinced he wanted me in his team and that we would work great together; they all wanted what was best for them and me! So imagine my shock when I called and the new guy told me "Oh, you still wanted to get hired? I had understood that you refused the job". No, you moron, I did not. I politely asked for more information, only to find out that they had no more positions for full employees, a limitation that had come from the US corporate headquarters, and that they could only offer me a consultancy job. Devastated, I asked what that meant. It meant I didn't get company stock as a bonus, only the sum we had already negotiated. I hadn't known about employee benefits when I got the job and I didn't really care for them, so I said yes. It was going to be a temporary measure until more hiring positions were opened.

The company had gone through a "reorg" (I was to meet a lot of new acronyms and made-up words in the new job, much to my chagrin - this particular one meant "reorganizing") and I was not to work under the guy who had talked to me week after week, but under the new guy, the one who didn't know whether I wanted the job or not. But he seemed genuinely nice and motivated, very enthusiastic about the new project, an administration web UI made in ASP.Net MVC. He asked me if I knew anything about the project. I said no. Why would I care about a project if I didn't know whether I would be hired or not? He seemed disappointed, but proceeded to explain what the project did and how great it was going to be, as it was meant to replace the old thing, made in ASP.Net with Visual Basic... by monkeys.

You see, he had the best of intentions as well: he was technical, willing to create something exciting and challenging, and convinced that I would fit in their team and help with this new project made with new technology. When I finally got my hands on code and started actually working, the project was dead in the water. They had decided instead (and by they I mean some schmuck in the US, not the people actually working on it) to just refactor the old admin and continue on. Contrary to what you may think, I was actually excited. In my head I had this tool that I would be working on to transform all the VB.Net 2.0 code into C#.Net 4.0, become the hero of all, and create a formal framework for refactoring code from one language to the other. (If you don't know the terms here, just imagine I wanted to replace wood with stone so that the big bad wolf could not blow the house in.) Alas, it was not to be. "Too risky" they call it when they feel afraid. I had yet to understand that in a large corporation responsibility dilutes until it becomes nothing. The only tangible thing becomes blame, which replaces responsibility, exterminates creativity and stifles initiative.

You see, like all the actors in this play, I too had good intentions. The first code I wrote was to fix some bugs I had noticed in a bit of code related to online shopping. The customers had also noticed the bug and had found convoluted ways to work around it. While my fix solved the initial problem, it also broke all of those workarounds and, as I was a rookie in the formal ways of the new company, the fix wasn't even bug-free. From then on I was labelled "dangerous", partly because of my initial problems understanding the project and partly because of my vocal way of expressing what I thought of leaving a project unstructured and buggy. Well, in hindsight, I have to agree that I wouldn't have felt a lot of love for someone calling me an idiot. Even if I were... especially if I were. Anyway, from this little incident you might have already guessed that a complete overhaul of the code (wood to stone) was out of the question. The powers that be had decided that the starship Enterprise was to stay home; no bold missions for it.

I could go on with details, but you are probably already bored. Suffice it to say that I had my first real experience with Scrum there: a system in which every person had a role, each development cycle had phases that were followed in order and documented along the way, and which, in time, would collect enough statistical data about the team to predict development speed. All it needed for that was a team that remained constant. Due to repeated reorg-ing, my team never kept the same structure for more than two or three months. The general company policy (not per project, but overall) would shift radically, often in the complete opposite direction, every six months at most. Plans were set in motion, then discarded before reaching anywhere; performance metrics were created to measure project progress, only to be changed at the next strategic hiccup. It was clear that things were going nowhere like this and, instead of changing their habit of constantly shivering in fear, they decided to close the project.

Only, you see, the project earned money. Not a lot, but enough to count. There were tens of thousands of people paying for the service and hosting sites on it. You can't shut down a project like that. So they invented yet another expression, "sustainability mode", to describe the way they intended to zombify the project that they had advertised to clients and developers alike as the next cool thing that would solve all problems. I felt cheated, and I could only imagine how the clients, who paid money instead of receiving it, felt. There is a saying: "the road to Hell is paved with good intentions". All the people - there were 70 developers and testers on the project and God knows how many managers and support staff - had the best intentions. We achieved a highway to Hell. Oh, and by the way, I never really got hired as a full employee. I remained a "temporary" consultant for the full length of my work there.

So what is the outcome of all this? Two years of my life are gone. I have learned some things, but in the meantime lost a lot of my initial enthusiasm for development. I stopped reading technical blogs and only spent my days thinking of the tasks ahead, like a good little robot. I've earned a lot more money, much of which I saved in the bank. I gained ten kilos (that's about 20 pounds, for you metrically challenged folk). I almost made my wife divorce me once, but we got over it. I've made some good friends. I learned to play chess a little better. I am not yet sure whether the good balances the bad. Now I have found a new job opportunity, one that is even better paid. I only hope it will not be just as depressing.

Was I wrong to be so optimistic about getting hired (as I am now again, I guess), since it led to disappointment? Better to have loved and lost, I say. Were the people who misrepresented themselves and the project I was to work on wrong? I don't think so; I think they were equally optimistic and got equally disappointed. Was it wrong to have better expectations from the world? Prepare for the worst, but expect the best, I say. So yet again, I can't really blame anyone in the Romanian office, and it is difficult to point the finger at the guys in the US as well. And yet, this is the result...

Beginning HTML5 and CSS3 is a rather strange book. It is not a book for beginners, as the title would have you believe, but something that gives you a taste of the new HTML and CSS features. Some of the topics discussed are treated only superficially, while others are covered in minute detail (like a hard-to-spot bug in a specific browser version). The authors say very little about some often-used features, yet go into great depth on things that will probably not be used by many people, like data annotation.

What is immediately obvious, though, is that the authors are professionals with a lot of experience. They see things and think about them in a way that a person with no design experience, like myself, never has. Their explanations are backed by a lot of links and downloadable code, so the book can be used as a reference. I would say that about a third of it relates to HTML and the other two thirds to CSS. Awesome and weird things are discussed, from custom fonts to 3D transforms, from data annotation of HTML so that it is machine parseable (by Google crawlers and the like) to pagination control for layouts that need to look like books or be used in e-readers. It is also a modern book, the type that lets you know about various features but, instead of rehashing a subject, gives you a link to more information from someone else.

You can get example code from the book's site, as well as see the table of contents and details about the authors. What immediately jumps to mind is that the page is HTML5 and uses CSS3, but is not nearly as carefully crafted, data annotated or awesome as the book advises, which somewhat validates my view of it: an interesting read about features you will probably rarely use. It certainly made me experiment a little with my blog and think of ways of implementing many of the features, but in the end nobody wants something too over the top, so only small changes were made.

A while ago (geez, it's been 6 years!) I wrote an algorithm that was supposed to quickly and accurately find the distance between two strings. After a few iterations it became really simple to implement, understand and use, unlike more academic algorithms like Levenshtein, for example. I placed all the code on this blog and allowed everyone to use it in any way they saw fit. Let me make it clear that it is not the greatest invention since fire, but it is mine and I feel proud when people use it. And today I accidentally stumbled upon something called Mailcheck, by Derrick Ko and Wei Lu. Not only did they use my algorithm, but they also graciously linked to my blog. And, according to the description on their GitHub page, this JavaScript library is being used by the likes of Kicksend, Dropbox, Kickstarter, Minecraft and the Khan Academy. Talk about Sift going wild! Woohoo!

So I started to Google for other uses of Sift3. Here is a list:
  • Mailcheck, the software that I was talking about above.
  • Sift3 for AutoKey - AutoKey does "fast scriptable desktop automation with hotkeys". Toralf also published the result as a GitHub Gist, AutoHotkey: StrDiff(), and his implementation is now used in 7Plus, a piece of software that improves usability in Windows
  • Longest common substring problem - the Wikibooks variant vs the Sift3 variant - which at first glance seemed to make Wikibooks the winner. Drat! :) On second look I noticed that the values did not show time, but operations per second, so more is better and Sift3 wins after all. Also, looking at the implementation I noticed that it uses a maxOffset not of 5, but of the minimum length of the compared strings, which makes it more accurate, but much slower (and it still wins!) - see the sketch after this list for what maxOffset does
  • A Java implementation on BitBucket
  • A PDF document suggesting the algorithm is being used in an Italian piece of software called CRM Deduplica
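
For readers who haven't seen the original post, here is a minimal sketch of the Sift3 idea in JavaScript - roughly the shape I remember Mailcheck shipping, so treat it as illustrative rather than canonical (the reference code lives in the original post):

function sift3Distance(s1, s2, maxOffset) {
  // degenerate cases: one or both strings empty
  if (!s1 || !s1.length) return s2 ? s2.length : 0;
  if (!s2 || !s2.length) return s1.length;
  var c = 0, offset1 = 0, offset2 = 0, lcs = 0;
  while (c + offset1 < s1.length && c + offset2 < s2.length) {
    if (s1.charAt(c + offset1) === s2.charAt(c + offset2)) {
      lcs++; // matching character, count it as common
    } else {
      // mismatch: reset the offsets and scan ahead up to maxOffset
      // characters in either string for a point to re-synchronize
      offset1 = 0;
      offset2 = 0;
      for (var i = 0; i < maxOffset; i++) {
        if (c + i < s1.length && s1.charAt(c + i) === s2.charAt(c)) {
          offset1 = i;
          break;
        }
        if (c + i < s2.length && s1.charAt(c) === s2.charAt(c + i)) {
          offset2 = i;
          break;
        }
      }
    }
    c++;
  }
  // distance is the average length minus the common characters found
  return (s1.length + s2.length) / 2 - lcs;
}

As the Wikibooks comparison above suggests, maxOffset is the speed/accuracy dial: 5 is the fast default, while using the minimum of the two string lengths trades a lot of speed for some accuracy. For instance, sift3Distance("hannah", "hanna", 5) returns 0.5.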

All in all I am very satisfied with how Sift3 is being used in the wild and, I have to say, grateful to the people who trusted my work enough to include it in theirs. It took 6 years, but look how much it has grown!

Update: To celebrate the usage of my algorithm, I've added an improved JavaScript version to the original post, a form of the algorithm that I call "3B", since it contains only minor improvements.
Now I have a weird idea for an algorithm that would compute the similarity between lists of strings (which is the usual use case for string distance). Could it be done in a simple and straightforward manner, like Sift3? What do you think?