We had a legacy import page in our application that took a very long time to complete its operation, so the user was stuck looking at an empty, loading page with no feedback. We wanted to show the user the progress of the import without fundamentally changing the page. Of course, the best solution would have been to make the import an asynchronous background operation and then periodically poll the server for its status via Ajax calls, but, limited by the requirement not to change the page, we came up with another solution: we would send bits of javascript to the browser while the import went on.

A first attempt didn't work: all the scripts were loaded and executed at once, so the user would still see an empty page, then a progress bar that immediately jumps to 100%. That was strange, since we knew that in certain circumstances scripts are executed as they are loaded. The explanation was that browsers buffer a minimum amount of the page before they start interpreting it, about 1024 characters. The solution, then, was to send 1024 empty spaces before starting to send the progress scripts. This value of 1024 is not really documented or standard; it is a browser implementation detail.

Our design had the page loaded in an iframe, which meant the scripts and HTML for the progress display did not have to live in the import page itself (which is how we stumbled upon this behavior in the first place); they were loaded in the parent page instead. The script blocks that we sent through the ASP.Net pipeline (using Response.Write and Response.Flush) accessed the resources of the parent page and showed a nice progress bar.
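To make it concrete, here is a minimal sketch of the idea, not our actual code; the ImportStep method, the number of steps and the updateProgress function in the parent page are all placeholders:

protected void Page_Load(object sender, EventArgs e)
{
    // padding so the browser starts interpreting the page before the import ends
    Response.Write(new string(' ', 1024));
    Response.Flush();

    int totalSteps = 10; // placeholder; in reality the number of import batches
    for (int step = 1; step <= totalSteps; step++)
    {
        ImportStep(step); // placeholder for one chunk of the long-running import

        // a script block calling a progress function defined in the parent page
        int percent = step * 100 / totalSteps;
        Response.Write("<script>parent.updateProgress(" + percent + ");</script>");
        Response.Flush(); // push the chunk through the pipeline immediately
    }
}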

Had the page been a plain ASP.Net page instead, the HTML and the CSS would have had to be sent first, perhaps instead of the 1024 spaces. There could also have been problems when the import finished and the normal output of the page followed what had already been sent through the pipeline, but in our specific scenario mere spaces and script blocks did not change the way browsers interpreted the rest of the page output.

A secondary side effect of this change was that it prevented the closing of the connection by certain routers that cut HTTP connections unless some traffic passes through them within a given interval; the periodic script blocks acted as a sort of "keep-alive". Before we made this change, those routers would simply cut the connection, leaving the user hanging.

A little known (at least among the people I've talked to) feature of Transact-SQL (Microsoft's SQL dialect) is setting ROWCOUNT. Usually ROWCOUNT is used to get the number of rows an operation has returned or affected, in which case it is actually @@ROWCOUNT. Something like this:
UPDATE MyTable SET Value = 10 WHERE [Key]='MySetting'
SET @RowsUpdated = @@ROWCOUNT

Instead, setting ROWCOUNT tells the SQL engine to return (or affect) only a specified number of rows. So let's use the example before:
SET ROWCOUNT 1
UPDATE MyTable SET Value = 10 WHERE [Key]='MySetting'
SET @RowsUpdated = @@ROWCOUNT
In this case at most one row will be updated, no matter how many rows in the table have the value 'MySetting' in the Key column. Also, @@ROWCOUNT will correctly report 1 (or 0, if no such rows exist).

Now, you will probably think that setting ROWCOUNT is just a more confusing equivalent of TOP. I had a case at work where, during a code review, a colleague saw two SELECT statements one after the other: one was getting all the values with a filter, and the other was selecting COUNT(*) with the same filter. He was justifiably confused about why someone would select twice instead of simply counting the rows already returned (or using @@ROWCOUNT :) ). The reason was that there was a SET ROWCOUNT @RowCount above, which restricted the number of rows returned by the first SELECT statement.

Here comes the gotcha. Assuming that setting ROWCOUNT is equivalent to a TOP restriction in the SELECT statement (in SQL 2000 and lower you could not use a variable with TOP, and I thought that's why the first solution was used), I replaced SET ROWCOUNT @RowCount with SELECT TOP (@RowCount). And suddenly no rows were getting selected. The difference is that if you set ROWCOUNT to 0, the next statement will not be restricted in any way, while TOP 0 will return 0 rows. So, as usual, be careful with assumptions.
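To illustrate the gotcha, here is a minimal sketch on the same assumed table:

DECLARE @RowCount INT
SET @RowCount = 0

SET ROWCOUNT @RowCount
-- ROWCOUNT 0 means "no restriction": this returns ALL matching rows
SELECT Value FROM MyTable WHERE [Key]='MySetting'
SET ROWCOUNT 0 -- always reset it when you are done

-- TOP 0, on the other hand, returns no rows at all
SELECT TOP (@RowCount) Value FROM MyTable WHERE [Key]='MySetting'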

There are other important differences between TOP and SET ROWCOUNT. TOP accepts both numeric and percentage values. Also, Microsoft has deprecated the use of SET ROWCOUNT with UPDATE, DELETE and INSERT statements; in a future release it will no longer affect them, so it's basically obsolete. Finally, the query optimizer can consider the value of the expression in a TOP clause during query optimization; because SET ROWCOUNT is used outside the statement that executes the query, its value cannot be considered in a query plan.

Update: SQL Server 2012 added new options to the ORDER BY clause, OFFSET and FETCH, which finally work like the LIMIT keyword in MySQL.
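For reference, a small sketch of the new syntax on the same assumed table:

-- skip the first 20 matching rows and return the next 10 (requires an ORDER BY)
SELECT Value FROM MyTable
WHERE [Key]='MySetting'
ORDER BY Value
OFFSET 20 ROWS
FETCH NEXT 10 ROWS ONLY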

I had an idea a few days ago, an idea that seemed so great and inevitable that I thought about patenting it. You know how it goes: you have a spark of inspiration, you tell no one about it (or maybe a few friends), and a few years later you see someone making loads of money with it. I thought I could at least "subscribe" to the idea somehow, make it partly my own. And so I asked a patent specialist about it.

He basically said two things. First of all, even if it is a novel idea, if it is made of previously existing parts that can obviously be put together, then it doesn't qualify for a patent. If the concept is obvious enough in any way, it doesn't qualify. Say someone wrote a scientific paper about a part of it and you find the rest in some nutjob blog about alien conspiracies: then you can't patent it. The other thing he told me is that a real patent application costs about $44,000 in filing and attorney fees. I don't imagine that's a small sum for someone in the US or in another rich country, but it is almost insane for anyone living anywhere else.

But there is a caveat here. What if the nutjob alien conspiracy blog were this one? What if, by publishing my idea here, no one could ever patent it and the best implementation would be the one that gathered the most support? It's a bit of "no, fuck you!", but still, why the hell not? So here it is:

I imagine that, with the new climate of "do not track"ing and privacy concerns, search engines will have a tougher and tougher time gathering information about your personal preferences. Google will not know what you searched for before and therefore will not be able to show you the things it thinks you are most interested in. And that is a problem, since it probably would have been right and you would have been interested in those things. The user, seeing that the search engine does not find what they are looking for, will not be happy.

My solution, and something that is way simpler than storing cookies and analysing behaviour, is to give the responsibility (back) to the user. They would choose a "search profile" and, based on that, the search engine would filter and prioritize the results in a way specific to that profile. You can customize your profile and maybe save it in a list or you can use a standard one, but the results you get are the ones you intended to get.

A few examples, if you will: the "I want to download free stuff" profile would prioritize blogs and free sites and filter out commercial sites that contain words like "purchase", "buy", "trial", "shareware", etc.; it would remove Amazon and other online shops from the result list and prioritize ThePirateBay, for example. Some of the smarter and more tech-savvy Googlers use the "-" filter to remove such words, but they still get the most commercially oriented sites there are. A search profile like this would try to analyse the site, see if it fits the "commercial" category and then filter it out. Now, you might think that sites will adapt and try to trick the engine into thinking they are not commercial in nature. No, they won't, because then the "I want to buy something" profile would not find them. Of course, they will adapt somehow and create two versions of the site, one that seems commercial and one that does not, but the extra effort would eat into their profit margin. Or try a search profile like "long tail", where the stories that get the most coverage and are reproduced on a lot of sites would get filtered out, allowing one to access new information as it comes in.

Bottom line is, I need such a service, but at the moment I am unwilling to invest in making one. First of all, it would be a waste of time if it didn't work. Second of all, it would get stolen and copied immediately by people with more money than me if it did work. Guess what? It's in my free blog. If anyone does it, they can't patent it; they can only use it because it is a good idea, and they should make it really nice and usable before other people make it better.

To my shame, I've lived a long time under the impression that, for an HTML element, a style attribute written inline always beats any CSS rule that applies to that element. That is a fallacy. Here is an example:
<style>
  div {
    width: 100px;
    height: 100px;
    background-color: blue !important;
  }
</style>
<div style="background-color:red;"></div>
What do you think the div's color will be? Based on my long standing illusion, the div should be red, as it has that color defined inline, in the style attribute. But that is wrong, the !important keyword forces the CSS rule over the inline styling. The square will actually be blue! And it is not some new implementation branch for non-Internet Explorer browsers, either. It works consistently on all browsers.

Now, you might think that this is the kind of trivia you absorb but that doesn't really matter. Well, it does. Let me extend that example and change the width of the div, using the same !important trick:
<style>
  div {
    width: 100px !important;
    height: 100px;
    background-color: blue !important;
  }
</style>
<div style="background-color:red;"></div>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
<script>
  $(function() {
    // $('div').width(200); // the width remains 100!
    // $('div').width('200px'); // the width remains 100!
    // $('div')[0].style.width='200px'; // the width remains 100!

    // $('div').width('200px !important'); // this is not a valid value for the width parameter! in IE it will even show an error
    // $('div').css('width','200px !important'); // this is not a valid value for the width parameter! in IE it will even show an error
    // $('div')[0].style.width='200px !important'; // this is not a valid value for the width parameter! in IE it will even show an error

    var oldStyle = $('div').attr('style') || '';
    $('div').attr('style', oldStyle + ';width: 200px !important'); // this is the only thing I found to be working!
  });
</script>
As you can notice, the mighty jQuery failed, setting the width property on style directly failed; the only solution was to append a string to the style attribute and override the !important keyword from the CSS with an inline !important keyword!

Update (via Dan Popa): And here is why it happens: CSS Specificity

I've heard about Dilbert as this corporate satire comic that is very funny. I've avoided it as much as possible, probably because I was afraid it would make fun of my way of life and that I would find its points valid. Today, I succumbed to my curiosity and clicked on a YouTube Dilbert video.



Conclusion: I am Dilbert...

I know this method from the good old Internet Explorer 6 days. In order to force the browser to redraw an element, solving weird browser refresh issues, change its CSS class. I usually go for element.className+='';. So you see, you don't actually have to change it. Sometimes you need to do this after a bit of code has been executed, so put it all in a setTimeout.

More explicitly, I was trying to solve a weird bug where, using jQuery slideUp/slideDown, some elements in Internet Explorer 8 would disregard some CSS rules. Namely, the header of a collapsible panel would suddenly and intermittently seem to lose a margin-bottom: 18px !important; rule. In order to fix this, instead of panel.slideUp(); I used
panel.slideUp(400 /*the default value*/, function() {
  setTimeout(function() {
    header.each(function() {
      this.className += '';
    });
  }, 1);
});
where panel is the collapsible part and header is the clickable part. Same for slideDown.

I had to fix this weird bug today, where only in IE9 the entire page would freeze suddenly. The only way to get anything done was to select some text or scroll a scrollable area. I was creating an input text field, then, when pressing enter, I would do something with the value and remove the element.

It appears that in Internet Explorer 9 it is wrong to remove the element that holds the focus. Use window.focus(); before you do that.
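A minimal sketch of the fix, assuming jQuery and with hypothetical names for the input and its handler:

input.keypress(function (e) {
  if (e.which === 13) { // Enter
    doSomethingWith(input.val()); // hypothetical handler for the value
    window.focus(); // move the focus away from the element first...
    input.remove(); // ...so removing it no longer freezes IE9
  }
});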

I was having one of those Internet Explorer moments in Javascript: I wanted to use Array.isArray and I couldn't, because it was IE8. So, I thought, I would create my own isArray function and attach it to Array, so that it works cross-browser. The issue then was how to detect whether an object in Javascript is an Array.

The instanceof operator came to mind immediately. After all, don't you do the same thing in C#, check if an object "is" something? Luckily for me, I checked the Internet and reached the faithful StackOverflow with an answer. The interesting bit was the explanation of why instanceof does not work in all cases: objects that cross frame boundaries have their own version of each class.

Let's say that you have two pages and one is hosting the other in an iframe. Let's call them, innovatively, testParent and testChild. If you create an array instance in testChild, like x=new Array(); or x=[];, then the result of x instanceof Array will be true in testChild, but false in testParent. That's because the Array in one page is different from the Array in the other. And, damn it, it makes sense, too. Imagine you did what I did and added a function to the Array class. Would that class be the same as the Array in the iframe, without the function? What if I decide to add Array.prototype.indexOf myself?

So, bottom line: in Javascript, instanceof will not work in any meaningful way across frame boundaries.
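Here is a minimal sketch of the behavior as seen from the parent page, assuming the iframe's id is 'testChild' and that the child page defines a global x=[]:

var childWindow = document.getElementById('testChild').contentWindow;
var x = childWindow.x; // an array created inside the iframe
console.log(x instanceof Array); // false: the iframe has its own Array constructor
console.log(x instanceof childWindow.Array); // true
console.log(Object.prototype.toString.call(x)); // '[object Array]' no matter which frame created it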

Oh, and just so you do have a good way to check if an object is an array, do this:
var strArray = Object.prototype.toString.call(new Array()); // '[object Array]'
Array.isArray = function(obj) {
  // toString must be invoked with call, so that 'this' is the object being tested
  return Object.prototype.toString.call(obj) === strArray;
};

I had to do a very simple Microsoft SQL query in which I wanted to update some of the values in a row from a row in the same table. Actually, the query was already there, but was using two local variables to store the information, then make the update. Something like this:
DECLARE @Var1 INT
DECLARE @Var2 INT
SELECT @Var1=Column1,@Var2=Column2 FROM MyTable WHERE ID=1
UPDATE MyTable SET Column1=@Var1,Column2=@Var2 WHERE ID=2
I really hated that I was using two SQL statements and all that declaring to do a simple update, so I looked up the syntax for the UPDATE statement. It said that if I want to update a table from a source I need to use the FROM keyword, like this:
UPDATE MyTable 
SET Column1=Alias.Column1,Column2=Alias.Column2
FROM MyOtherTable AS Alias
WHERE ID=2
AND Alias.ID=1
As you can see, we use an alias to name another table or query, we prefix the conditions on that table with the alias name and leave the conditions on the table we update unprefixed. Easy, no? I even tested it and it worked. So I tried this:
UPDATE MyTable 
SET Column1=Alias.Column1,Column2=Alias.Column2
FROM MyTable AS Alias
WHERE ID=2
AND Alias.ID=1
I used the same table both as the update target and as the alias and it seemed to work. However, the number of updated rows was always 0. It is remarkable how difficult it is to find a straight answer on the net about a simple situation like this.

It turns out that, even with the alias, MSSql confuses the two references to the table. The solution is to use a subquery on your table rather than the table name itself. Here is how you do it:
UPDATE MyTable 
SET Column1=Alias.Column1,Column2=Alias.Column2
FROM (SELECT * FROM MyTable) AS Alias
WHERE ID=2
AND Alias.ID=1


SQL 2005 also introduced Common Table Expressions, which can be used to clarify a query. In this case, though, using a CTE results in the same execution plan and makes the entire query even more convoluted:
WITH Alias(Column1,Column2)
AS (
SELECT Column1, Column2 FROM MyTable
)
UPDATE MyTable
SET Column1=Alias.Column1,Column2=Alias.Column2
FROM Alias
WHERE ID=2
AND Alias.ID=1

Even though the documentation says you can specify a CTE without declaring the column names, I couldn't do it in this situation and I don't know why. I admit I only tried the CTE solution for a minute before discarding it as too verbose.

This will be a short blog post that shows my error in understanding what Javascript maps, or objects, are. I don't mean Google Maps, I mean dynamic objects that have properties that can be accessed via a key, not an index. Let me exemplify:
var obj = {
  property1: "value1",
  property2: 2,
  property3: new Date()
};
obj["property 4"] = "value 4";
obj.property5 = new MyCustomObject();
obj[6] = 'value 6';
console.log(obj.property1);
console.log(obj['property2']);
console.log(obj["property3"]);
console.log(obj['property 4']);
console.log(obj.property5);
console.log(obj[6]);
In this example, obj is an object instance declared with three properties, to which more are added. The declaration uses a JSON-like notation, and then values of any type can be assigned to a property via two notations: the '.' (dot) and the square brackets. Note that the values of 'property 4' and of '6' can only be accessed via square brackets; there is no dot notation that can escape the space, and obj.6 is invalid.

Now, the gotcha is that, coming from the C# world, I immediately associated this with the Hashtable class: something that can have any object as key and any object as value. Instead, a map is more like a Dictionary<string,object>.

Let me show you why that may be confusing. This is perfectly usable:
obj[new Date()]=true;
In this example I've used a Date object as a key. Or have I? In Javascript any object can be turned into a string with the toString() function. In fact, our Javascript map uses a key much like 'Sat Jul 14 2012 00:07:00 GMT+0300 (GTB Daylight Time)'. The translations from one type to another are seamless (and can generate quite a bit of righteous anger, too).

My point is that you can also use something like
obj[new MyObject()]=true;
only to see it blow up in your face. The key will most likely be '[object Object]'. Not at all what was expected.


So remember: javascript property names can be any string, no matter how strange, but not other types. obj[6] will return the value you have set in obj[6] because in both cases that 6 is first turned into the string '6' and then used. It has nothing to do with the '6th value' or '6th property'; that's what arrays are for. The same goes for a Date or some custom object with a toString() function that returns something unique for that object. I wouldn't rely on that, though, as you would probably want to use objects as keys and compare them by reference, not by string value.


Programming Game AI by Example is one of those books that would have changed my life had I read it when I was 15. Mat Buckland takes a really high tech portion of game making and turns it into child's play. With source code!

From the very beginning we are told that AI in games is different from what we would normally associate with Artificial Intelligence. AI in games is the thing that makes game agents look smart, while letting the user enjoy the game the most. In other words, something that seems smart, but is just stupid enough for you to continue playing.

The book comprises ten chapters, heavy with code, but very well structured. The main tool in use is the Finite State Machine, but we first get a lecture on game physics in chapter 1, where we learn what a vector is, how to normalize it and how to use it in the game mechanics. Moving to chapter 2, we learn what a state machine is, how to optimize memory by making each state a singleton, how to compose state machines, and why more exciting aspects of artificial intelligence, like neural networks, are not used more in games. Chapter 3 delves further into methods to optimize what we have learned and make it practical: prioritized dithering, partitioning, BSP, quad and oct trees, fuzzy-Q logic, cell space partitioning, all with code examples. Chapter 5 is reserved for graphs, Dijkstra, A* and such. Chapter 6 goes into integrating Lua into your games, a good tool to define and tweak the innards of your game before compiling it all into a single code base for performance. Raven, the example game engine, is detailed in chapter 7. Path planning is described in chapter 8, complete with many optimizations and tricks to make the algorithmic movement of units look natural and smart. Chapter 9 is about goal driven agent behaviour, where we learn how to make an agent define goals and act upon them; the composite pattern is suggested as a good solution for goals within goals. We end with a very interesting chapter about fuzzy logic, the basis of which is to fuzzify a situation, infer a behaviour, then defuzzify the result into a usable algorithmic value.

The bottom line is that this is a very easy book to read, explaining matter-of-factly how to create the intelligence in games like Fifa or Counter Strike. The code examples are extensive, but not necessary to understand the gist of things. In the end, it is both a fascinating and intriguing read and a good reference book for when you actually need this stuff.

I end this review with a quote from Dijkstra that was also mentioned in the book: The question of whether Machines Can Think... is about as relevant as the question of whether Submarines Can Swim. Very nice book and a recommended read.

Yesterday I wanted to upgrade the NUnit testing framework we use in our project to the latest stable version. We used 2.5.10 and it had reached 2.6.0. I simply removed the old version and replaced it with the new. Some of the tests failed.

Investigating revealed that all the failing tests had something in common: they were testing that two collections are not equal (meaning not the same instance), that the collections are not equivalent (meaning none of the items in one collection is found in the other), and yet that the values of the items are the same. In practice it was a test that checked whether a cloning operation was successful. And it failed because, from this version on, the two collections were considered both Equal and Equivalent.

That is at least strange and so I searched the release notes for some information about this and found this passage: EqualConstraint now recognizes and uses IEquatable<T> if it is implemented on either the actual or the expected value. The interface is used in preference to any override of Object.Equals(), so long as the other argument is of Type T. Note that this applies to all equality tests performed by NUnit.

Indeed, checking the failing tests I realized that the collections contained items of types implementing IEquatable&lt;T&gt;.
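To illustrate, here is a minimal sketch of the kind of test that changes behavior between the two versions; the Item class and the cloning are placeholders, not our actual code:

using System;
using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

public class Item : IEquatable<Item>
{
    public int Value { get; set; }

    public bool Equals(Item other)
    {
        return other != null && other.Value == Value;
    }
}

[TestFixture]
public class CloneTests
{
    [Test]
    public void Clone_CreatesNewInstancesWithTheSameValues()
    {
        var original = new List<Item> { new Item { Value = 1 }, new Item { Value = 2 } };
        var clone = original.Select(i => new Item { Value = i.Value }).ToList();

        // passed in NUnit 2.5.10 (element comparison fell back to reference equality),
        // fails in 2.6.0, where EqualConstraint uses IEquatable<T> and the collections compare as equal
        Assert.That(clone, Is.Not.EqualTo(original));
    }
}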

It was not a complete surprise, but I did not expect it either: the switch statement in Javascript compares using strict equality (type and value), meaning that a classic if block like this:
if (x == 1) {
  doSomething();
} else {
  doSomethingElse();
}
is not equivalent to
switch (x) {
  case 1:
    doSomething();
    break;
  default:
    doSomethingElse();
    break;
}
If x is a string with the value '1' the if will do something, while the switch will do something else (pardon the pun). The equivalent if block for the switch statement would be:
if (x === 1) {
  doSomething();
} else {
  doSomethingElse();
}
(Notice the triple equality sign, which compares both type and value)

Just needed to be said.

I found a bit of code today that tested whether a bunch of strings were found in another string. It used IndexOf for each of the strings and continued searching if one was not found. The code, a long list of ifs and elses, looked terrible, so I thought I would refactor it to use regular expressions. I created a big Regex object, joining the strings with the "|" regular expression operator, and I tested for speed.

(Actually, I took the code, encapsulated it into a method that then went into a new object, then created the automated unit tests for that object and only then did I proceed to write new code. I am very smug because usually I don't do that :) )

After the tests said the new code was good, I created a new test to compare the performance; it is always good to have a metric to justify the work you have been doing. The old code ran in about 3 seconds. The new code took 10! I was flabbergasted. Not only could I not understand how that was possible, how several scans of the same string could be faster than a single one, but I am the one who wrote the article saying IndexOf is slower than a Regex search (at least it was so in the .Net 2.0 days; I could not replicate the results in .Net 4.0). It was like a slap in the face, really.

I proceeded to change the method, now having a way to measure performance, until I finally figured out what was going on. The original code was first transforming the text into lowercase, then doing IndexOf. It was not even using IndexOf with StringComparison.OrdinalIgnoreCase, which was, of course, a "pfff" moment for me. My new method was, of course, using RegexOptions.IgnoreCase. No way this option would slow things down. But it did!

You see, when you search for two strings separated by the "|" regular expression operator, the engine builds a tree of states. Say you are searching for "abc|abd": it will look once for "a", then once for "b", then check the next character against "c" or "d". If any of these conditions fail, the match fails. However, if you do a case-insensitive match, there will be at least two comparisons per letter. Even so, I expected only a doubling of the processing time, not the whopping five-fold decrease in speed!

So I did the humble thing: I transformed the string into lowercase, then did a normal regex match. And the whole thing went from 10 seconds to under 3. I have yet to understand why this happens, but be careful when using the case-insensitive option for regular expressions in .Net.
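A rough sketch of the before and after, with a made-up pattern and input:

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string text = "Some long text to be scanned for a bunch of words..."; // made-up input

        // the slow version in my case: case-insensitive matching
        var slow = new Regex("abc|abd|xyz", RegexOptions.IgnoreCase);
        Console.WriteLine(slow.IsMatch(text));

        // the fast version: lowercase the input once, then match case-sensitively
        var fast = new Regex("abc|abd|xyz");
        Console.WriteLine(fast.IsMatch(text.ToLowerInvariant()));
    }
}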

A short post about an exception I've met today: System.InvalidOperationException: There was an error reflecting 'SomeClassName'. ---> System.InvalidOperationException: SomeStaticClassName cannot be serialized. Static types cannot be used as parameters or return types.

Obviously one cannot serialize a static class, but I wasn't trying to. There was an asmx service method returning an Enum, but the enum was nested in the static class. Something like this:
public static class Common {

  public enum MyEnumeration {
    Item1,
    Item2
  }

}

Therefore, take this as a warning. Even if compilation does not fail when a class is made static, serialization of types nested inside it may fail at runtime.