It is getting close to two years since I've been employed by a large international corporation and I've decided to write a blog entry describing my feelings towards this kind of work. This is not a rant against my employers, mind you, but an attempt to explain to folks what being in a corporation actually means. And the best analogy I could find... is marriage.

Now, while this may seem funny it is also very true, if you ask me. People start their career by looking for a direction, rather than an employer, but they use employment as a tool to find and validate that direction. You may have finished college or university, but it is not clear yet where you would like to work. You still nurture thoughts of starting your own company and selling it for billions. You don't know what exactly you are good for yet. How is that different from starting your love life? You don't know your "type", you look gratefully at any girl that would look back at you, you still hope you will find that "one true love" that will be pristine, beautiful, smart, good in bed and totally in love with you, and your greatest fear is that if you talk to her, all kinds of crap will spew out because you are not yet sure of yourself, or that in the blessed event you get her to bed, you will suck at it.

With experience comes enlightenment, though. You realize that some companies are not worth working for, that some bosses are just insufferable, that you are actually good at some things and you can pursue a career in that direction (while other directions would be a waste of everybody's time) and that being friends with your employer is nice, but not actually required. You realize you have a "type", a sort of working environment where you would like to work, while making yourself and everybody else happy. You are not working for a large company yet, you are just exploring your options and gaining confidence. "Listen", you say to your boss, "it was a great time working for you, but I need my space. It's not you, it's me. I can do better." You can do that because you start to know your worth (or at least your minimum worth).

After this period of dating, you start to have longer relationships. You start to like your working place, give your loyalty to it, enjoy it when your employers are praised for their product. At this stage, most companies are startups or small branches. Your boss is not far from where you could imagine yourself. Maybe he was a skilled worker too and, after "learning the ropes", he started his own company, thus effectively "graduating" to the next level. It's not so far from the "world as a school" view that many students have. This means that if you do something bad, you are likely to get the negative feedback from the clients thrown at you, get shouted at, be made to feel ashamed for failing "the family". No different from getting scolded by a parent or big brother or maybe a teacher. When you do something right, your boss might reward you, either with praise or with a bonus. You are also likely to find out what went well. You think of your employer as your peer and you start to get feelings of commitment towards the company, maybe even dream that as you evolve as a worker, you will be more and more appreciated, maybe even become an owner, pulling that company up with you.

So, at this stage you think your girlfriend likes you as you like her, that if there is anything wrong she will tell you, and that the relationship works great. Great disappointment awaits as you become more and more invested in the relationship and her response is just to smile at you more. At one point you tell her that you need more; that is when you face reality and see that she will agree with you and break it off, either directly or by remaining indifferent to your efforts. You could, of course, become complacent in this dysfunctional relationship, in which neither of you will make an effort and the results will be mediocre at best. You will not be happy.

Let's assume that it is over. You took the cold shower, realized that as an employee your role is to do your job and maybe take the blame for the bad things you do. No benevolent employer will come to you, pat you on the back, admit that his company has grown largely because of you and bring you up as a partner. You become a bit cynical, starting to look less at the interesting work and the smart people and more at objective benefits like salary and working conditions. You still have the passion for what you do; after all, it is the reason you went for this career in the first place: you think you are good at it. You also have a medium-large resume that proves it. You may not be your job, but you are your work, still.

This is the moment when the corporation becomes attractive and when they start to want you. The working conditions are great, the money is good, there is a system in place that guarantees quality and that allows you to continuously grow. They will even train you. It's like a new dream, now that the first two (having your own firm or being elevated to partner if you work well) have gone up in smoke. There are people that will pay you more to do the same thing and even support you in becoming better.

Once you go this way, you have a stable place to work, a great kitchen, good furniture, console games. You are even allowed to play them if you are doing your job well. Your remuneration is not increasing exponentially anymore, but you have good chances of getting a raise annually, after being reviewed by managers and peers. You hardly hear of any of your clients, unless the entire product line is bad. Occasionally you get a whiff of comments from clients that sound wonderful, but just feel fake: lots of "great" and "awesome" for work you know to be good at best and most likely mediocre. With a corporation the strategy may change at any moment: products abandoned, directions chosen arbitrarily, features appearing out of nowhere. You still remember what it meant to be an important part of a company and so you start suggesting ways of improving the end result, only to be met with condescending pats on the back. Not only is your idea not very good, but it is you who doesn't understand what the company wants. Yes, it would work, but best for you and your colleagues, not for the greater entity. If in a small company your superior was somebody like you, only better or richer or older, now you have people the same age as you that never did your work and that have completely different sets of values determining what you do and how you do your job. You really want to please them, but you feel misunderstood and you are incapable of understanding them in return.

You have reached the marriage stage. Your wife looked so good when you first met and she still does, only she walks around the house in loose pyjamas, doesn't smile so much, has completely different habits and a weird value system. She doesn't like what you like, wants you to "compromise" all the time and do things for "the couple". And while you "work on the relationship", the result is boring. You hardly hear her complain, but neither do you say anything; that would be rude, you would hurt her feelings. You read psychology books on how to improve your relationship or how to resolve conflicts without aggression. You do the things that you both like a little instead of doing the things you like a lot. Whenever you feel down, disappointed with your life, you consider all the years you gave to the marriage, the children, the way things might still get better in the future. Maybe it's your fault, or you can still fix it somehow.

Just as in a marriage, your company does not pay you for your efforts, but for the fact that any other company would pay the same for a man of your qualities. You don't get to choose the direction of your life, because it is not yours anymore. Your job is to do the things you are told, not less and certainly not more, integrate with the process (maybe with your team, but that's a bonus; process always beats people in a company), appear pleasant and always consider the consequences of your actions and words. Everybody is polite, even if they fire you or tell you your work sucked. If you somehow get noticed and start a real relationship with your betters, you are immediately despised by your peers. You go up in the hierarchy for social reasons. If you are good enough at that, you may join the management track and continually get raises, bonuses and promotions. Fail to do that and everyone will notice how hard working you are right until the moment they don't need you anymore.

And this is the point when you think "Wow, if I ever get out of this, I will start my own company!" only you spent too much time doing menial work and you lost your edge, you lost contact with any potential clients, you lost contact with your field of work, you got used to the comfort of getting a lot of money on a regular basis. You think "Oh, dear, if I quit this, I will have to go dating again and now I am old and forgot how it is done!". You are not really miserable, you are just not happy. You can live with that.

What do I think? You are not made for a single direction only. You may choose to change it at any time or even walk several at a time. Having chosen a direction, you must never stop. Others keep going forward and will leave you behind. The purpose of money is to allow you to survive. It's for housing, nourishment and maybe a bit of comfort, it is not a "level" you reach, or a unit of your value. Your value is determined by the results of your actions. Shitty actions, shitty value, no matter who's at fault. It's never too late to date, because it is worth finding someone you love, even if you end up marrying her afterwards :-)

Of course, you will all ask me now "Siderite, how come you are not following your own advice? Why do you stay?". There are several reasons. I am a little afraid. There is this economic crisis and no one is actually hiring. There are few startups. If I go somewhere else, I would only be switching one corporation for another. I may also be not "not happy" enough; complacency is in my blood. I certainly have that disgusting habit of watching a movie till the end, even if it is a bad one, just to be able to comment on it in full knowledge. I do feel that I have more things to learn here. Rest assured, though, if I spend too much time considering my options until none remain, you will read it in this blog and know not to follow my example.

I was trying to figure out an issue with our product and, in order to understand what was going on, I copied a class from the project into a small sandbox project to see how it would work. Lo and behold, a problem occurred in one of the utility functions that would have made the entire feature unusable. Yet the feature worked fine, except the little detail I was working on. What was going on?

Let me show you the code first (simplified for your pleasure):
Dim al As New ArrayList
al.Add("A")
al.Add("B")
Dim result As String = String.Join(":", al.ToArray(GetType(String)))

What do you think the result will hold?

In our production site the result was "A:B". In my sandbox project the result was "System.String[]". It took me a little while to understand what was going on. You see, the sandbox project was .Net 4.0 while the production site still worked with 3.5. New to .Net 4.0 are overloads for the String.Join method, including one that receives a params array of objects. Since ArrayList.ToArray(Type type) returns Array no matter the type boxed inside, this is the overload that is chosen. The whole string array is taken as a single element of the params array, stringified with ToString, and the result is what you saw.

Conclusion: be very careful of the types you send to methods. Even if Visual Basic automatically casts method parameters, you never know for sure which type it will choose to transform into. In this case an explicit cast, e.g. DirectCast(al.ToArray(GetType(String)), String()), removes the ambiguity and selects the same String.Join(String, String()) overload in both framework versions. And if you want to upgrade a VB project from .Net 2.0-3.5 to 4.0, be careful of the new overloads that have appeared.

This has happened to a colleague of mine: when trying to open a solution file with VS 2010, the program would start, successfully load the solution, open the few files that were open when it was last closed, then suddenly restart. No fancy dialog prompting for action, no message of any kind, just a total silent fail. He tried using the /log filename option of Visual Studio to gather more information; it did not help. He tried the /SafeMode switch to load only the essential bits of Visual Studio, to no avail. The only useful information was in the Windows Application log (run eventvwr.exe in the command line), which pointed to the module cslangsvc.dll failing.

The Windows Application log entry for the crash:
Faulting application name: devenv.exe, version: 10.0.40219.1, time stamp: 0x4d5f2a73
Faulting module name: cslangsvc.dll, version: 10.0.40219.1, time stamp: 0x4d5f3b95
Exception code: 0xc0000005
Fault offset: 0x00249be1
Faulting process id: 0x1af8
Faulting application start time: 0x01ce3c253d1db9e5
Faulting application path: C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\devenv.exe
Faulting module path: C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC#\VCSPackages\cslangsvc.dll
Report Id: f3522e88-a818-11e2-a285-14109fd5a416


The bit of the Visual Studio log that is relevant:
<entry>
<record>437</record>
<time>2013/04/18 11:17:35.042</time>
<type>Information</type>
<source>VisualStudio</source>
<description>Unexpected system error mode before loading package [Visual Studio XML Editor Package]</description>
<guid>{87569308-4813-40A0-9CD0-D7A30838CA3F}</guid>
</entry>

At this point, we thought that it was a problem caused by Visual Studio trying to parse the open files, probably related to Intellisense, which meant we needed to make the solution open while ignoring the files that were open the last time it was used. That means deleting the file with the .suo extension associated with the solution.

This is a common issue; the first Google result when searching for cslangsvc.dll is this: Visual Studio 2010 Crashing on Solution Load. The thing is, the poster claims he deleted his .suo file and that the problem still reproduced. There are also some Microsoft Connect items logged (Vs2010 crashes on cslangsvc.dll) that have no resolution.

Well, deleting the .suo file associated with the solution worked. This, of course, removes more than the list of last opened files, like the source control associations of the solution, various custom options for said solution, etc., but it shouldn't be a problem.

There is a bit of information in the Visual Studio log which points to the XML Editor Package. This means the crash could be caused by XML files or aspx/ascx files. However, it might not be. We did not pursue the issue any further. Hope this helps other people looking for a resolution.

I was trying to solve a problem on this blog, where opening links in their own fancy javascript window would fail if the server did not allow its pages to be opened in frames. The result would be an empty black window and an ugly javascript error in the browser console in the form of: Refused to display '[some URL]' in a frame because it set 'X-Frame-Options' to 'SAMEORIGIN'.

So I started looking for a way to detect these pesky URLs. First attempt: using jQuery.Ajax with the method 'HEAD', which requests only the HTTP headers of a given URL. There is no reason I can see to deny access to 'HEAD' requests, but the browser denies it anyway based on... HTTP headers! Not to mention that this solution fails for even more links than the frame does, because of Ajax cross-origin restrictions.
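For what it's worth, outside the browser, where cross-origin rules do not apply (say, in a server-side script fetching the headers itself), the check is trivial. A minimal sketch in Python; the helper name is mine and it assumes you already have the response headers as a dict with canonical casing:

```python
def frame_blocked(headers):
    """Return True if the X-Frame-Options response header would
    prevent this page from being framed by another origin."""
    value = headers.get('X-Frame-Options', '').strip().upper()
    # DENY blocks all framing; SAMEORIGIN blocks framing from other origins
    return value in ('DENY', 'SAMEORIGIN')

print(frame_blocked({'X-Frame-Options': 'SAMEORIGIN'}))  # True
print(frame_blocked({}))  # False
```

This is exactly the check the browser refuses to let client-side script perform.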

Second attempt: use an ad hoc hidden iframe to detect if the URL can be opened. This worked, but at a cost that prohibits me from using the solution on the blog. I will publish it, though, maybe it works for other scenarios. It uses jQuery, so you will have to translate it yourself into raw Javascript or make it use your favorite framework.

The code first:
var Result = {
    CouldNotLoadUrl: 1,
    UrlLoadedButContentCannotBeAccessed: 2,
    UrlLoadedContentCanBeAccessed: 3
};

function isAvailable(url, callback, timeout) {
    if (!+(timeout) || +(timeout) < 0) {
        timeout = 5000;
    }
    var timer = setTimeout(function() {
        ifr.remove();
        // report the timeout value as the third callback argument
        callback(Result.CouldNotLoadUrl, url, timeout);
    }, timeout);
    var ifr = $('<iframe></iframe>')
        .hide()
        .appendTo('body');
    ifr.on('load', function() {
        if (timer) clearTimeout(timer);
        var result, alt; // declare alt locally so it does not leak as a global
        try {
            var iframe = ifr[0];
            // accessing the location of a cross-origin frame throws an exception
            var href = (iframe.contentWindow || iframe.contentDocument).location.href;
            result = Result.UrlLoadedContentCanBeAccessed;
        } catch (ex) {
            result = Result.UrlLoadedButContentCannotBeAccessed;
            alt = ex;
        }
        ifr.remove();
        callback(result, url, alt);
    });
    ifr.attr('src', url);
}
You use it like this:
isAvailable('https://siderite.dev', function(result, url, alt) {
    switch (result) {
        case Result.CouldNotLoadUrl:
            alert('Could not load ' + url + ' in an iframe (timeout after ' + alt + ' milliseconds)');
            break;
        case Result.UrlLoadedButContentCannotBeAccessed:
            alert(url + ' loaded in an iframe, but content is inaccessible (' + alt + ')');
            break;
        case Result.UrlLoadedContentCanBeAccessed:
            alert(url + ' loaded in an iframe and content is accessible');
            break;
    }
}, 10000);

You will need to have jQuery loaded and to have an HTML body loaded in the DOM (so if you copy these into an empty HTML file to test, make sure you add <body></body> before the script or execute isAvailable on the DOM ready event).

And now the explanation.
First, it is imperative to append the iframe element to the body before binding the load event. That is because jQuery creates the element in a document fragment, and this process fires a load event by itself! Then, different browsers act differently. Google Chrome does not fire a load event for an iframe with a URL that has this problem. Internet Explorer does fire the event, but the iframe's content document is not accessible (and this can be caught in a try/catch block). FireFox does fire the event, but only the leaf properties of the content document throw an exception, like the href of the location. In order to fix all of these, I used a timeout for Chrome, to return a failure result after a time, and an access to the frame's location.href to make it throw an exception in both Internet Explorer and FireFox.

Finally, the reason why I cannot use it on the blog is that it would force the browser of the viewer to load all the URLs completely in the background in order to add a silly click event on the links. I have one more idea in mind, though, and that is to detect the frame loading problem when I open it and in that case to create the content of the iframe manually to contain a link to the URL. I will attempt it sometime soon.

Update: I found a solution that seems reasonable enough. When creating the iframe in which I want to nicely load the page that the link points to, I am not just creating an empty frame, but I also add content: a link that points to the same page. The SAMEORIGIN problem is still there, so the link opens the URL in target="_blank" and has a click handler that closes the dialog 100 milliseconds later. Thus, when changing the frame src, if the content of the frame does not change, the user will have the option to click the link and see the page open in a new tab/window.

I had this case today where I had to add a new column to a table with a gazillion rows and also create an index on this column. In my mind, having a column with only NULL values would make the creation of the index instantaneous. I was wrong! In order to create the index, the SQL engine still scans all the rows, and for tables with a lot of rows that takes a long while. It felt really stupid: I knew the column was filled with NULL values; I didn't need the index computed when creating it, only maintained on subsequent INSERT/UPDATE/DELETE operations. So I started to look into solutions.

Enter filtered indexes! Microsoft SQL Server 2008 introduced an option for filters on indexes. The index must not be clustered and the definition is just the same as before, only with a WHERE clause that applies the filter. This seemed to be the right solution for my scenario.

In order to test this I created a table called Test with two columns, a and b, both nvarchar(255). I filled the table with ten million rows having values for a and no values for b. Then I created an index on b; it took about 30 seconds. Then I created an index on a; it took 50 seconds. I removed the indexes and created a filtered index on b on the condition that b is not null. The operation was instantaneous. Success!
And here are the actual operations with more exact values (check out the comments for extra SQL tips on speed):
CREATE TABLE Test(a NVARCHAR(255),b NVARCHAR(255))

DECLARE @i INT = 625000 -- ten million divided by 16

BEGIN TRAN -- if not using a transaction, SQL will create a transaction per insert!

WHILE @i>0
BEGIN

SET @i=@i-1
INSERT INTO Test(a) VALUES(CAST(@i AS NVARCHAR(255)))

END

COMMIT TRAN -- 625000 rows with a filled and b empty [8 seconds]

-- insert selects are a lot faster than the while above
INSERT INTO Test(a)
SELECT a+'x1' FROM Test -- 1250000 total rows [3 seconds to create another 625000 rows]

INSERT INTO Test(a)
SELECT a+'x2' FROM Test -- 2500000 total rows [7 seconds]

INSERT INTO Test(a)
SELECT a+'x3' FROM Test -- 5000000 total rows [18 seconds]

INSERT INTO Test(a)
SELECT a+'x4' FROM Test -- 10000000 total rows [46 seconds]

CREATE INDEX IXa ON Test(a) -- [27 seconds]

DROP INDEX IXa ON Test

CREATE INDEX IXb ON Test(b) -- [25 seconds, approximately the same]

DROP INDEX IXb ON Test

CREATE INDEX IXa ON Test(a) -- there are no null values in the a column [56 seconds, double for a filtered index with a useless filter]
WHERE a IS NOT NULL

DROP INDEX IXa ON Test

CREATE INDEX IXb ON Test(b) -- [0 seconds - instant!]
WHERE b IS NOT NULL
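The concept is not exclusive to SQL Server: SQLite calls these partial indexes, and the same behavior can be reproduced in a few lines of Python with the built-in sqlite3 module. A sketch (the table and index names mirror the demo above, but this is SQLite, not T-SQL):

```python
import sqlite3

con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute("CREATE TABLE Test(a TEXT, b TEXT)")
# fill a, leave b NULL, as in the demo above
cur.executemany("INSERT INTO Test(a) VALUES (?)",
                [(str(i),) for i in range(10000)])
# partial (filtered) index: only rows where b IS NOT NULL are indexed
cur.execute("CREATE INDEX IXb ON Test(b) WHERE b IS NOT NULL")
# a query whose WHERE clause implies the filter can use the index
plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Test WHERE b = 'x'").fetchall()
print(plan[0][-1])  # the plan mentions index IXb
```

Note the condition: the optimizer only considers a partial/filtered index when the query's own WHERE clause implies the index filter.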

A lot of good things can come from using filtered indexes, like for example a unique index on non-null values (which was pretty much a pain in the ass to do otherwise), but there are also gotchas. One of the things you have to watch out for is using filtered indexes on numeric columns. In this case the SET ARITHABORT ON command must be used (or ensure in some other way that the option is on for all SQL sessions - SQL Management Studio and code both!), otherwise errors might occur or the index might be ignored. Also there seem to be some bugs that have not been addressed even in SQL 2012, like when using MERGE or when using filtered indexes on a column being or not being null.

A more detailed article on this feature can be found here: SQL University: Advanced Indexing – Filtered Indexes.

A couple of weeks ago I went for a job interview in order to see what is out there. It was a terrible phone interview and I failed to make a connection with the technical interviewers. I think I was just as disappointed in them as they were with me. However, what I believe killed it for them was an experiment I decided to conduct: to the usual question about software patterns I answered boldly that I didn't believe in software patterns and that I believed management techniques were what drove productivity and quality of work, not particular software commonalities. It was partly true, though, I do believe that, and this post is about my thoughts on the matter. Now, be warned: I may offend a few people that religiously pray in UML at Martin Fowler's shrine in the church of the Gang of Four.

Let's start with a brief history of software patterns. It started with inspiration from a building architecture book that explained that for similar problems there are similar solutions in architecture and that listing them would be a boon for the would-be architect. Someone applied this to software in the form of common practices to solve common problems. The idea was that, outside the main goal of cataloguing best practices, these software patterns would provide a sort of common language for software architects.

The problem is, of course, practice. The good part of a software pattern is that it provides a tested solution to a common problem. The bad part is that there are not that many common problems and most of the time software patterns are applied badly in practice. Invariably, at some point, the application of a software pattern leads to the Golden Hammer "antipattern". If a software pattern is designed to apply to as many situations as possible where a certain problem appears, then it is defined by a lot of flexibility. That may sound good, but a flexible architecture is usually low performing, overly complex or simply hard to understand when used in very specific circumstances. That is why for most requirements there isn't one software library, but many, each attempting to juggle the right amount of performance, complexity and ease of use. And, of course, if a pattern is not well thought out, why use it at all?

I guess the point I am trying to make is that current software patterns try to catalogue small issues, things that are, really, of little consequence, while other things are way more important to consider, like long term vision. What is the point of using Inversion of Control if you don't plan to ever make components modular? Why would you create an MVC application if the code monkeys that you have hired will riddle the view with business logic? In fact, why would you make any effort of standardizing your application if you don't plan anything? And that is the basis of my contention: planning an application is the bottleneck. I would go for Software Planning Patterns way before I even consider mid-level software patterns. The planning is where the need of the technician does battle with the need of the business owner. One strategy might be perfect when chosen only to become obsolete during implementation, I agree, but then you have an initial strategy and a current strategy, and the techs must find a way to transition from one to the other. Planning is where all the interested parties come together and need to reach a decision; the technical implementation, let's face it, must just work and then, hopefully, be reasonably maintainable.

And I dare say that in building architecture the long term plan for the building is already there. It must be, as it will last for decades. You don't start a skyscraper only to change your mind in the middle of the work and go for a stadium. You know the purpose of the building, you know how you will use it, you know the needs it has to cover, and all that is left is to determine the technical way to achieve this plan. Software is way more elastic than this and I believe this is why the concept of pattern does not easily transfer from the domain of construction to the one of software development. In a way, forcing these patterns on the software world is in itself like using a Golden Hammer: they don't fit exactly. Moreover, the word of the day in software is Agile, the management technique that assumes right from the word go that there will be change in the plans for the project and that the team must be ready for it. I submit that the current state of software patterns is too rigid, too inflexible, based on the assumption that there is a plan and that it will not change. Or worse, based on the assumption that there is no plan and that anything must be enabled by the software architecture. They either force you to lose flexibility or add so much of it that it makes the end product bloated and unproductive.

The answer is somewhere in the middle and that middle is different from project to project. No matter how well software patterns are designed and applied, in the end they must conform to (or end up hindering) the strategic plans for the software project, which are, in my view, the true bottleneck of software development. As a domain specific language between software architects, software patterns are good, but one has to acknowledge the extreme minority of architects in software. Even in that small guild I don't find there are a lot of discussions where the lingo of software patterns is used much. The complex patterns are invalidated by the many "flavours" that unavoidably appear to handle that complexity, while the simple patterns are invalidated by components that encapsulate them and relieve the developer from having to implement them. My conclusion is that the importance of software patterns is being exaggerated. Little more than a mini-atlas of common software practices, it serves as a pretty picture book rather than an instrument that promotes understanding of the field of software development.

We all know that the best way to prevent SQL injection is to use parameters, either in stored procedures or in parameterized queries. Yet on some occasions we meet code that replaces every single quote with two single quotes. The question arises: OK, it's ugly, but isn't it enough?

I have recently found out about something called "Unicode Smuggling", which uses the database against itself to bypass protections like the one described above. More details here: SQL Smuggling, but the basic idea is this: if the replacement scheme is implemented in the database and uses VARCHAR, or if the code uses some non-Unicode string type, then the protection is vulnerable to an attack leveraging what are known as Unicode homoglyphs. If you feel adventurous and want to examine thoroughly the ways Unicode can be used maliciously, check out UTR#36.

Here is an example:
CREATE PROC UpdateMyTable
@newtitle NVARCHAR(100)
AS
/*
Double up any single quotes
*/
SET @newtitle = REPLACE(@newtitle, '''','''''')

DECLARE @UpdateStatement VARCHAR(MAX)

SET @UpdateStatement = 'UPDATE myTable SET title=''' + @newtitle + ''''

EXEC(@UpdateStatement)

Note the use of VARCHAR as the type of @UpdateStatement. This procedure receives a string, doubles all single quotes, then builds an SQL string which is then executed. This procedure would be vulnerable to this:
EXEC UpdateMyTable N'ʼ;DROP TABLE myTable--'

The first character in the provided string is not a single quote, but the Unicode character U+02BC (MODIFIER LETTER APOSTROPHE). SQL Server will silently convert this into a single quote when it is stored in a VARCHAR. The injection will work.

Small demo in MS-SQL:
DECLARE @nhack NVARCHAR(100) = N'ʼ;DROP TABLE myTable--'
DECLARE @hack VARCHAR(100) = N'ʼ;DROP TABLE myTable--'
SELECT UNICODE(@nhack),UNICODE(@hack) -- Results: 700 39

More discussion on this here: Can I protect against SQL Injection by escaping single-quote and surrounding user input with single-quotes?
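To close the loop on the opening claim: parameters sidestep this whole class of problems, homoglyphs included, because the payload is never parsed as SQL. A sketch using Python's sqlite3 driver (not SQL Server, but the principle is identical for any parameterized API):

```python
import sqlite3

con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute("CREATE TABLE myTable(title TEXT)")
cur.execute("INSERT INTO myTable(title) VALUES ('original')")

# the payload from the example above, starting with U+02BC
evil = "\u02bc;DROP TABLE myTable--"
# parameterized update: the value is bound as data, never concatenated into SQL
cur.execute("UPDATE myTable SET title = ?", (evil,))

# the table survives and the payload is stored verbatim as the title
print(cur.execute("SELECT COUNT(*) FROM myTable").fetchone()[0])  # 1
print(cur.execute("SELECT title FROM myTable").fetchone()[0] == evil)  # True
```

No escaping, no doubling of quotes, nothing for a homoglyph to exploit.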

I've often encountered this situation: a stored procedure needs to display a list of records ordered by a dynamic parameter. In Transact-SQL, on Microsoft SQL Server, one cannot do this elegantly in any way. I will list the possible approaches and tell you what the problem with each is.

First of all, let's start with an example. Assume we have a table called Test with a lot of rows and an indexed datetime column. Let's call that column TheDate to avoid any SQL keywords. We want to do something like this:
SELECT TOP 10 * FROM Test ORDER BY TheDate ASC

Notice that I want to get the top 10 rows, which means I only need a small part of the total. I also order directly by TheDate. In order to release a piece of code we also need to test it for performance issues. Let's look at the execution plan:


Now, let's try to order it dynamically on a string parameter which determines the type of the sort:
SELECT TOP 10 * FROM Test ORDER BY CASE WHEN @sort='ASC' THEN TheDate END ASC, TheDate DESC

As you see, I've used CASE to determine the sort order. There is no option to give a parameter as the sort order. The execution plan is this:


Surprise! The execution plan for the second query shows it is ten times slower. What actually happens is that the entire table is sorted by the CASE expression into an intermediate result, and only then are the 10 items extracted from it.

There must be a solution, you think, and here is an ingenious one:
DECLARE @intSort INT = CASE WHEN @sort='ASC' THEN 1 ELSE -1 END
SELECT TOP 10 * FROM Test ORDER BY CAST(TheDate AS FLOAT)*@intSort ASC

I transform the datetime value into a float, then apply a mathematical expression to it, multiplying it by 1 or -1, the simplest expression possible under the circumstances. The execution plan is:


Bottom line, there is no exception to the rule: when you order by an expression, SQL Server does not use indexes, even if the expression is trivially reversible. Don't be misled by the apparently functional style of SQL syntax; the engine does not optimize the execution plan that way. Even if the column is an integer it will not work: ordering by TheInteger is fundamentally faster than ordering by -TheInteger.

And now the solution, ugly as it may be (imagine the select is a large one, with joins and where conditions):
IF @sort='ASC' 
BEGIN
SELECT TOP 10 * FROM Test ORDER BY TheDate ASC
END
ELSE
BEGIN
SELECT TOP 10 * FROM Test ORDER BY TheDate DESC
END

Yes, the dreaded duplication of code. But the execution plans are now equivalent: 50%/50%.

This post was inspired by real events, where the production SQL server went into overdrive trying to create and manage many temporary result tables from a stored procedure that wanted to avoid duplication using the CASE method.

Update: there is, of course, another option: building the SQL as a string inside the stored procedure, modifying it based on the sort parameter, then executing it dynamically. But I really dislike that method, for many reasons.
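
For completeness, the dynamic version might look something like this sketch; note that only a hard-coded keyword chosen by the CASE ends up in the string, never the raw parameter:

```sql
-- Illustrative only: pick the ORDER BY direction from a whitelist of keywords
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT TOP 10 * FROM Test ORDER BY TheDate '
    + CASE WHEN @sort = 'ASC' THEN N'ASC' ELSE N'DESC' END
EXEC sp_executesql @sql -- each variant gets its own, index-friendly plan
```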

More information from another blogger: Conditional Order By. They also explore ranking with window functions, and one of the comments references the SQL 2008 "Grouping Sets" feature, which I have not covered yet.

Oh, what a monster of a book! If you want to learn to do genetic programming, this is the book for you. If you only need an accessible introduction to what genetic programming is, this book is way too heavy.

Let's start at the beginning. Genetic Programming: On the Programming of Computers by Means of Natural Selection (Complex Adaptive Systems) is a scientific book written by John R. Koza explaining why, how and what to do to make your computer find solutions to problems, using natural selection algorithms to automatically create the programs that solve them. This is not a new field and a lot of research has been done in it, but this book takes it almost to the level of encyclopaedic knowledge.

First, Koza submits the idea that genetic programming can be applied to most problems where computers are being used. That's a bold claim, but he proceeds to demonstrate it. He takes classes of problems, provides the code to create the programs that solve them, shows results and statistical analysis of those results, and explains what the algorithm did at specific iterations to create said programs. That's a lot to take in. If you are working on a project in this area, you are likely to find the book extremely useful, both as a source of information and as a reference that can always be consulted.

However, if you are a casual reader like myself, reading all that code and statistical analysis in the subway can be difficult. And it's a lot of book, too. So, after some consideration, realising that I have no current project to which I could apply the knowledge within, I've decided to stop reading it. I got through about a quarter of it, so I can safely say it is a very thorough and well written book. You just have to need it in a certain way.

Spurred by the story of the OpenDyslexic font, the one that is now included in this blog as an option for dyslexic people, I started exploring this concept of custom fonts for a web page. First of all, I wanted the dyslexia option to work in all browsers, so I checked it in Chrome (my default), Firefox and Internet Explorer 8.

I was not surprised to see that Internet Explorer did not show the fonts as Chrome did, but I was a little surprised that Firefox did not render the font either. I had FF 3.0.11 installed at the time. I updated Firefox, checked again, and it seemed to work. Then, on a whim, I started to look into web fonts for Internet Explorer. And the option exists! Even better, web fonts have worked since Internet Explorer 6! The catch, though, is that Microsoft uses a proprietary font format which was only recently submitted to the W3C as an open standard.

Now about the dyslexic font people. I went to their site using Internet Explorer and got that annoyingly condescending message: "You are using an outdated browser.". In other words, if you are dyslexic they will make and host a free font for you, but if you use Internet Explorer, screw you! I found this behaviour weird at best, bordering on offensive.

So I downloaded their font, converted it from OTF to EOT format using the free online converter Font2Web, then went looking for a font hosting site that would let me upload and host a custom font. I found one called TypeFront. They offer for free only a single font that can be accessed 500 times daily, but that's still better than the other services, which ask for money for every single option. So close to making the dyslexia option cross-browser! And... FAIL. TypeFront only allows the upload of fonts in the OTF, TTF and WOFF formats; they don't even mention EOT. I wrote them an email, but I didn't expect much. Meanwhile, if you know of a decent free font hosting site, please let me know. Update: I was wrong. The TypeFront service only accepts uploads in a few formats, but it then converts and publishes all formats, including EOT and even SVG. I am happy to report that the dyslexic font now works in all browsers!

Update April 2016: It appears that the TypeFront site is completely down, the domain gone.

It occurred to me that, interestingly, the people at TypeFront and/or DyslexicFonts may simply not know that Internet Explorer supports web fonts. The people at Google do. Check out Google Web Fonts, a site that allows free embedding of 554 font families; all you have to do is add a script to the page. Incidentally, I've included a feature in this site that lets you set a custom Google font for yourself, though there is no visual interface for it yet. Google's only problem is that they have not adopted any font for dyslexics.

So, after documenting my journey, let me give you some links to resources that explain where all these formats come from, why some fonts are free and some are not, and so on:

  • TrueType font format (TTF) - you will not be surprised to hear that the standard was developed by Apple in the 1980s as a competitor to Adobe's Type 1 fonts used in PostScript. This format is supported through the CSS3 rule @font-face in all browsers but Internet Explorer.
  • OpenType (OTF) is the successor of TrueType. Paradoxically, OpenType is a registered trademark of Microsoft, so why they chose not to support it in the browser is beyond me. OTF is also supported in all other browsers.
  • Web Open Font Format (WOFF) - developed in 2009, it is a container with compression and additional metadata for all the other formats. The specification was submitted by Mozilla, Opera and Microsoft. And while all browsers support it (Internet Explorer 9+, FF 3.6+, Chrome and other WebKit browsers, Safari, etc.), they are still limited by which of the wrapped formats they can interpret. Thus, IE9 can open a WOFF file if it contains EOT inside.
  • Embedded OpenType (EOT) is Microsoft's attempt at a web standard for fonts. It supports compression and some protection against copying. These little files have worked (somehow) since Internet Explorer 6, back when CSS3 was a dream and all the "modern browsers" had nothing.
  • Comparison of layout engines (web typography) - a small article discussing browser support for web typography.
  • The history of fonts - how typefaces evolved over time.
  • Intellectual property protection of typefaces
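
To tie the formats together, this is roughly the shape of the cross-browser @font-face rule that was commonly recommended at the time (the file names and paths are placeholders for wherever the font happens to be hosted):

```css
@font-face {
  font-family: 'OpenDyslexic';
  src: url('opendyslexic.eot');                                    /* IE 6-8 */
  src: url('opendyslexic.eot?#iefix') format('embedded-opentype'), /* IE parsing fix */
       url('opendyslexic.woff') format('woff'),                    /* IE9+, FF 3.6+, Chrome */
       url('opendyslexic.ttf') format('truetype');                 /* the rest */
}
```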

We had a legacy import page in our application that took a very long time to perform its operation. Thus, the user was faced with a long loading empty page and no feedback. We wanted to do something to show the user the progress of the import without fundamentally changing the page. Of course, the best solution would have been to make the import an asynchronous background operation and then periodically get the status from the server via Ajax calls, but limited by the requirement to not change the page we came up with another solution: we would send bits of javascript while the import went on.

A first attempt was made, but it didn't work: all the scripts were loaded and executed at once, so the user would still see an empty page, then a progress bar that immediately jumped to 100%. Strange, since we knew that under certain circumstances scripts are executed as they are loaded. The answer was that browsers buffer a minimum amount of the page before they start interpreting it, about 1024 characters. The solution, then, was to send 1024 empty spaces before starting to send the progress. This value of 1024 is not documented or standard; it is a browser implementation detail.

Our design had the page loaded in an iframe, which allowed for scripts and html to not be loaded in the import page (thus making us stumble upon this behavior), and allowed for them to be loaded in the parent page. The scripts that we sent through the ASP.Net pipeline (using Response.Write and Response.Flush) accessed the resources from the parent page and showed a nice progress bar.

Had the page been a simple ASP.Net page, the HTML and CSS would have had to be sent first, perhaps instead of the 1024 spaces. There could have been problems when the import finished and the normal page output followed what had already been sent through the pipeline, but in our specific scenario the spaces and script blocks did not change the way browsers interpreted the rest of the page output.

A secondary side effect of this change was that we prevented the closing of the connection by some types of routers that need HTTP connections to have some traffic sent through them in an interval of time, providing a sort of "keep-alive". Before we made this change, these routers would simply cut the connection, leaving the user hanging.
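
For illustration, the core of the technique looks roughly like this (ImportBatch and updateProgress are placeholders standing in for our actual import step and the parent page's progress function):

```csharp
// Rough sketch, not our production code: stream progress scripts during a long import
Response.Write(new string(' ', 1024)); // fill the browser's initial buffer
Response.Flush();
for (int i = 1; i <= totalBatches; i++)
{
    ImportBatch(i); // one slow unit of work
    Response.Write("<script>parent.updateProgress(" + i + ");</script>");
    Response.Flush(); // push the script to the browser immediately
}
```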

A little known (at least among the people I've talked to) feature of Transact-SQL (the Microsoft SQL engine) is setting ROWCOUNT. Usually ROWCOUNT is used to get the number of rows an operation has returned or affected, and that is actually @@ROWCOUNT. Something like this:
UPDATE MyTable SET Value = 10 WHERE [Key]='MySetting'
SET @RowsUpdated = @@ROWCOUNT

Instead, setting ROWCOUNT tells the SQL engine to return (or affect) only a specified number of rows. So let's use the example before:
SET ROWCOUNT 1
UPDATE MyTable SET Value = 10 WHERE [Key]='MySetting'
SET @RowsUpdated = @@ROWCOUNT
In this case at most one row will be updated, no matter how many rows in the table have the value 'MySetting' in the Key column. Also, @@ROWCOUNT will correctly output 1 (or 0, if no such rows exist).

Now, you will probably think that setting ROWCOUNT is just a more confusing equivalent of TOP. I had a case at work where, during a code review, a colleague saw two SELECT statements one after the other: one getting all the values with a filter, and another selecting COUNT(*) with the same filter. He was justifiably confused as to why someone would select twice instead of just counting the rows returned (or using @@ROWCOUNT :) ). The reason was a SET ROWCOUNT @RowCount earlier in the procedure, which restricted the number of rows returned by the first SELECT statement.

Here comes the gotcha. Assuming that setting ROWCOUNT is equivalent to a TOP restriction in the SELECT statement (in SQL 2000 and lower you could not use a variable with TOP, and I thought that's why the first solution was used), I replaced SET ROWCOUNT @RowCount with SELECT TOP (@RowCount). And suddenly no rows were being selected. The difference: if you set ROWCOUNT to 0, the next statement is not restricted in any way, while TOP 0 returns exactly 0 rows. So, as usual, be careful with assumptions.
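
A minimal repro of the difference, assuming some table MyTable:

```sql
SET ROWCOUNT 0                -- 0 means "no limit" for ROWCOUNT
SELECT * FROM MyTable         -- returns every row

SELECT TOP (0) * FROM MyTable -- TOP 0 means literally zero rows
```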

There are other important differences between TOP and SET ROWCOUNT. TOP accepts both numeric and percentage values. Microsoft has also announced that in a release after SQL Server 2012, SET ROWCOUNT will no longer affect INSERT, UPDATE and DELETE statements, so it is effectively deprecated for those. Finally, the query optimizer can take the value of the expression in a TOP clause into account during query optimization; because SET ROWCOUNT is set outside the statement that executes the query, its value cannot be considered in the query plan.

Update: in SQL 2012 new options were added to the ORDER BY clause, called OFFSET and FETCH, which finally work like the LIMIT keyword in MySQL.
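
For reference, the new syntax looks like this (table and column names are placeholders):

```sql
-- Skip the first 10 rows, return the next 5 (SQL Server 2012+)
SELECT * FROM MyTable
ORDER BY TheDate
OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY
```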

I had an idea one of the previous days, an idea that seemed so great and inevitable that I thought about patenting it. You know, when you have a spark of inspiration and you tell no one about it or maybe a few friends and a few years later you see someone making loads of money with it? I thought I could at least "subscribe" to the idea somehow, make it partly my own. And so I asked a patent specialist about it.

He basically said two things. First of all, even if an idea is novel, if it is made of previously existing parts that can obviously be put together, it doesn't qualify for a patent. If the concept is obvious enough, in any way, it doesn't qualify; say, if someone wrote a scientific paper about one part of it and you can find the rest on some nutjob blog about alien conspiracies, then you can't patent it. The other thing he told me is that a serious patent application costs about $44,000 in filing and attorney fees. I imagine that's no small sum even for someone in the US or another rich country, but it is almost insane for anyone living anywhere else.

But there is a caveat here. What if the nutjob alien conspiracy blog would be this one? What if, by publishing my idea here, no one could ever patent it and the best implementation would be the one that would gather the most support? It's a bit of "no, fuck you!", but still, why the hell not? So here it is:

I imagine that, with the new climate of "do not track" and privacy concerns, search engines will have a tougher and tougher time gathering information about your personal preferences. Google will not know what you searched for before and therefore will not be able to show you the things it thinks you are most interested in. And that is a problem, since it probably would have guessed right and you would have been interested in those things. The user, seeing how the search engine does not find what they are looking for, will not be happy.

My solution, and something that is way simpler than storing cookies and analysing behaviour, is to give the responsibility (back) to the user. They would choose a "search profile" and, based on that, the search engine would filter and prioritize the results in a way specific to that profile. You can customize your profile and maybe save it in a list or you can use a standard one, but the results you get are the ones you intended to get.

A few examples, if you will: the "I want to download free stuff" profile would prioritize blogs and free sites and filter out commercial sites containing words like "purchase", "buy", "trial", "shareware", etc.; it would remove Amazon and other online shops from the result list and prioritize ThePirateBay, for example. Some of the smarter and more tech-savvy Googlers use the "-" filter to remove such words, but they still get the most commercially oriented sites there are. A search profile like this would try to analyse each site, see if it fits the "commercial" category and filter it out accordingly. Now, you might think sites would adapt and try to trick the engine into thinking they are not commercial in nature. No, they won't, because then the "I want to buy something" profile would not find them. Of course, they will adapt somehow and create two versions of the site, one that seems commercial and one that does not, but the extra effort would eat into their profit margin. Or try a search profile like "long tail", where the stories that get the most coverage and are reproduced on many sites get filtered out, letting you reach new information as it comes in.

Bottom line is, I need such a service, but at the moment I am unwilling to invest in making one. First of all, it would be a waste of time if it didn't work; second, it would get stolen and copied immediately by people with more money than me if it did. Guess what? It's in my free blog. If anyone builds it, they can't patent it; they can only use it because it is a good idea, and they should make it really nice and usable before other people make it better.

To my shame, I've long lived with the impression that an inline style attribute on an HTML element always beats any CSS rule that applies to that element. That is a fallacy. Here is an example:
<style>
div {
width:100px;
height:100px;
background-color:blue !important;
}
</style>
<div style="background-color:red;"></div>
What do you think the div's color will be? Based on my long-standing illusion, the div should be red, as it has that color defined inline, in the style attribute. But that is wrong: the !important keyword forces the CSS rule over the inline styling. The square will actually be blue! And it is not some new implementation branch for non-Internet-Explorer browsers either; it works consistently in all browsers.

Now, you might think this is the kind of information you absorb but that doesn't really matter. Well, it does. Let me extend the example and try to change the width of the div, which uses the same !important trick:
<style>
div {
width:100px !important;
height:100px;
background-color:blue !important;
}
</style>
<div style="background-color:red;"></div>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
<script>
$(function() {
// $('div').width(200); // the width remains 100!
// $('div').width('200px'); // the width remains 100!
// $('div')[0].style.width='200px'; // the width remains 100!

// $('div').width('200px !important'); //this is not a valid value for the width parameter! in IE it will even show an error
// $('div').css('width','200px !important'); //this is not a valid value for the width parameter! in IE it will even show an error
// $('div')[0].style.width='200px !important'; //this is not a valid value for the width parameter! in IE it will even show an error

var oldStyle=$('div').attr('style')||'';
$('div').attr('style',oldStyle+';width: 200px !important'); // this is the only thing I found to be working!

});
</script>
As you can see, the mighty jQuery failed, setting the width property on the style object failed, and the only working solution was to append a string to the style attribute, overriding the !important keyword from the CSS with an inline !important keyword!
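
For what it's worth, the plain DOM API can do this more cleanly through the third, priority argument of setProperty, although I have not verified it in old Internet Explorer versions:

```javascript
// Set an inline style with the 'important' priority (standard CSSOM API)
document.querySelector('div').style.setProperty('width', '200px', 'important');
```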

Update (via Dan Popa): And here is why it happens: CSS Specificity

I've heard of Dilbert as this very funny corporate satire comic. I've avoided it as much as possible, probably because I was afraid it would make fun of my way of life and that I would find it makes valid points. Today I succumbed to my curiosity and clicked on a YouTube Dilbert video.



Conclusion: I am Dilbert...