It is getting close to two years since I've been employed by a large international corporation and I've decided to write a blog entry describing my feelings towards this kind of work. This is not a rant against my employers, mind you, but an attempt to explain to folks what being in a corporation actually means. And the best analogy I could find... is marriage.

Now, while this may seem funny it is also very true, if you ask me. People start their career by looking for a direction, rather than an employer, but they use employment as a tool to find and validate that direction. You may have finished college or university, but it is not clear yet where you would like to work. You still nurture thoughts of starting your own company and selling it for billions. You don't know what exactly you are good for yet. How is that different from starting your love life? You don't know your "type", you look gratefully at any girl that would look back at you, you still hope you will find that "one true love" that will be pristine, beautiful, smart, good in bed and totally in love with you, and your greatest fear is that if you talk to her, all kinds of crap will spew out because you are not yet sure of yourself, or that in the blessed event you get her to bed, you will suck at it.

With experience comes enlightenment, though. You realize that some companies are not worth working for, that some bosses are just insufferable, that you are actually good at some things and you can pursue a career in that direction (while other directions would be a waste of everybody's time) and that being friends with your employer is nice, but not actually required. You realize you have a "type", a sort of working environment where you would like to work, while making yourself and everybody else happy. You are not working for a large company yet, you are just exploring your options and gaining confidence. "Listen", you say to your boss, "it was a great time working for you, but I need my space. It's not you, it's me. I can do better." You can do that because you start to know your worth (or at least your minimum worth).

After this period of dating, you start to have longer relationships. You start to like your working place, give your loyalty to it, enjoy it when your employers are praised for their product. At this stage, most companies are startups or small branches. Your boss is not far from where you could imagine yourself. Maybe he was a skilled worker too and, after "learning the ropes", he started his own company, thus effectively "graduating" to the next level. It's not so far from the "world as a school" view that many students have. This means that if you do something bad, you are likely to get the negative feedback from the clients thrown at you, get shouted at, be made to feel ashamed for failing "the family". No different from getting scolded by a parent or big brother or maybe a teacher. When you do something right, your boss might reward you, either with praise or with a bonus. You are also likely to find out what went well. You think of your employer as your peer and you start to get feelings of commitment towards the company, maybe even dream that as you evolve as a worker, you will be more and more appreciated, maybe even become an owner, pulling that company up with you.

So, at this stage you think your girlfriend likes you as you like her, that if there is anything wrong she will tell you, and that the relationship works great. Great disappointment awaits as you become more and more invested in the relationship and her response is just to smile at you more. At one point you tell her that you need more; that is when you face reality and see that she will agree with you and break it off, either directly or by remaining indifferent to your efforts. You could, of course, become complacent in this dysfunctional relationship, in which neither of you will make an effort and the results will be mediocre at best. You will not be happy.

Let's assume that it is over. You took the cold shower, realized that as an employee your role is to do your job and maybe take the blame for the bad things you do. No benevolent employer will come to you, pat you on the back, admit that his company has grown largely because of you and bring you up as a partner. You become a bit cynical, starting to look less at the interesting work and the smart people and more at objective benefits like salary and working conditions. You still have the passion for what you do; after all, it is the reason you went for this career in the first place: you think you are good at it. You also have a medium-large resume that proves it. You may not be your job, but you are your work, still.

This is the moment when the corporation becomes attractive and when they start to want you. The working conditions are great, the money is good, there is a system in place that guarantees quality and that allows you to continuously grow. They will even train you. It's like a new dream, now that the first two (having your own firm or being elevated to partner if you work well) have gone up in smoke. There are people that will pay you more to do the same thing and even support you in becoming better.

Once you go this way, you have a stable place to work, a great kitchen, good furniture, console games. You are even allowed to play them if you are doing your job well. Your remuneration is not increasing exponentially anymore, but you have good chances of getting a raise annually, after being reviewed by managers and peers. You hardly hear of any of your clients, unless the entire product line is bad. Occasionally you get a whiff of comments from clients that sound wonderful, but just feel fake: lots of "great" and "awesome" for work you know to be good at best and most likely mediocre. With a corporation the strategy may change at any moment: products abandoned, directions chosen arbitrarily, features appearing out of nowhere. You still remember what it meant to be an important part of a company and so you start suggesting ways of improving the end result, only to be met with condescending pats on the back. Not only is your idea not very good, but it is you who doesn't understand what the company wants. Yes, it would work, but best for you and your colleagues, not for the greater entity. If in a small company your superior was somebody like you, only better or richer or older, now you have people the same age as you that never did your work and that have completely different sets of values determining what you do and how you do your job. You really want to please them, but you feel misunderstood and you are incapable of understanding them in return.

You have reached the marriage stage. Your wife looked so good when you first met and she still does, only she walks around the house in loose pyjamas, doesn't smile so much, has completely different habits and a weird value system. She doesn't like what you like, wants you to "compromise" all the time and do things for "the couple". And while you "work on the relationship", the result is boring. You hardly hear her complain, but neither do you say anything; that would be rude, you would hurt her feelings. You read psychology books on how to improve your relationship or how to resolve conflicts without aggression. You do the things that you both like a little instead of doing the things you like a lot. Whenever you feel down, disappointed with your life, you consider all the years you gave to the marriage, the children, the way things might still get better in the future. Maybe it's your fault, or you can still fix it somehow.

Just as in a marriage, your company does not pay you for your efforts, but for the fact that any other company would pay the same for a man of your qualities. You don't get to choose the direction of your life, because it is not yours anymore. Your job is to do the things you are told, not less and certainly not more, integrate with the process (maybe with your team, but that's a bonus; process always beats people in a company), appear pleasant and always consider the consequences of your actions and words. Everybody is polite, even if they fire you or tell you your work sucked. If you somehow get noticed and start a real relationship with your betters, you are immediately despised by your peers. You go up in the hierarchy for social reasons. If you are good enough at that, you may join the management track and continually get raises, bonuses and promotions. Fail to do that and everyone will notice how hard working you are right until the moment they don't need you anymore.

And this is the point when you think "Wow, if I ever get out of this, I will start my own company!" only you spent too much time doing menial work and you lost your edge, you lost contact with any potential clients, you lost contact with your field of work, you got used to the comfort of getting a lot of money on a regular basis. You think "Oh, dear, if I quit this, I will have to go dating again and now I am old and forgot how it is done!". You are not really miserable, you are just not happy. You can live with that.

What do I think? You are not made for a single direction only. You may choose to change it at any time or even walk several at a time. Having chosen a direction, you must never stop. Others keep going forward and will leave you behind. The purpose of money is to allow you to survive. It's for housing, nourishment and maybe a bit of comfort, it is not a "level" you reach, or a unit of your value. Your value is determined by the results of your actions. Shitty actions, shitty value, no matter who's at fault. It's never too late to date, because it is worth finding someone you love, even if you end up marrying her afterwards :-)

Of course, you will all ask me now "Siderite, how come you are not following your own advice? Why do you stay?". There are several reasons. I am a little afraid. There is this economic crisis and no one is actually hiring. There are few startups. If I go somewhere else, I would only be switching one corporation for another. I may also be not "not happy" enough; complacency is in my blood. I certainly have that disgusting habit of watching a movie till the end, even if it is a bad one, just to be able to comment on it in full knowledge. I do feel that I have more things to learn here. Rest assured, though, if I spend too much time considering my options until none remain, you will read it in this blog and know not to follow my example.

I was trying to figure out an issue with our product and, in order to understand what was going on, I copied a class from the project into a small sandbox project to see how it would work. Lo and behold, a problem occurred in one of the utility functions that would have made the entire feature unusable. Yet the feature worked fine, except the little detail I was working on. What was going on?

Let me show you the code first (simplified for your pleasure):
Dim al As New ArrayList
al.Add("A")
al.Add("B")
Dim result As String = String.Join(":", al.ToArray(GetType(String)))

What do you think the result will hold?

In our production site the result was "A:B". In my sandbox project the result was "System.String[]". It took me a little while to understand what was going on. You see, the sandbox project was .Net 4.0 while the production site still worked with 3.5. New to .Net 4.0 are overloads for the String.Join method, including one that receives a params array of objects. Since ArrayList.ToArray(Type type) returns Array no matter the type boxed inside, this is the overload that is chosen. The whole string array is taken as a single element of the params array, stringified with ToString, and the result is what you saw.

Conclusion: be very careful of the types you send to methods. Even if Visual Basic automatically casts method parameters, you never know for sure which type it will choose to transform into. In this case an explicit cast, e.g. DirectCast(al.ToArray(GetType(String)), String()), removes the ambiguity and selects the same String.Join(String, String()) overload in both framework versions. And if you want to upgrade a VB project from .Net 2.0-3.5 to 4.0, be careful of the new overloads that have appeared.

This has happened to a colleague of mine: when trying to open a solution file with VS 2010, the program would start, successfully load the solution, open the few files that were open when it was last closed, then suddenly restart. No fancy dialog prompting for action, no message of any kind, just a total silent fail. He tried using the /log filename option of Visual Studio to gather more information; it did not help. He tried the /SafeMode switch to load only the essential bits of Visual Studio, to no avail. The only useful information was in the Windows Application log (run eventvwr.exe in the command line), which pointed to the module cslangsvc.dll failing.

The Windows Application log entry for the crash:
Faulting application name: devenv.exe, version: 10.0.40219.1, time stamp: 0x4d5f2a73
Faulting module name: cslangsvc.dll, version: 10.0.40219.1, time stamp: 0x4d5f3b95
Exception code: 0xc0000005
Fault offset: 0x00249be1
Faulting process id: 0x1af8
Faulting application start time: 0x01ce3c253d1db9e5
Faulting application path: C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\devenv.exe
Faulting module path: C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC#\VCSPackages\cslangsvc.dll
Report Id: f3522e88-a818-11e2-a285-14109fd5a416


The bit of the Visual Studio log that is relevant:
<entry>
<record>437</record>
<time>2013/04/18 11:17:35.042</time>
<type>Information</type>
<source>VisualStudio</source>
<description>Unexpected system error mode before loading package [Visual Studio XML Editor Package]</description>
<guid>{87569308-4813-40A0-9CD0-D7A30838CA3F}</guid>
</entry>

At this point, we thought that it was a problem caused by Visual Studio trying to parse the open files, probably related to Intellisense, which meant we needed to make the solution open while ignoring the files that were open the last time it was used. That means deleting the file with the .suo extension associated with the solution.

This is a common issue; the first Google result when searching for cslangsvc.dll is this: Visual Studio 2010 Crashing on Solution Load. The thing is, the poster claims he deleted his .suo file and that the problem still reproduced. There are also some Microsoft Connect items logged (Vs2010 crashes on cslangsvc.dll) that have no resolution.

Well, deleting the .suo file associated with the solution worked. This, of course, removes more than the list of last opened files, like the source control associations of the solution, various custom options for said solution, etc., but it shouldn't be a problem.

There is a bit of information in the Visual Studio log which points to the XML Editor Package. This means the crash could be caused by XML files or aspx/ascx files. However, it might not be. We did not pursue the issue any further. Hope this helps other people looking for a resolution.

I was trying to solve a problem on this blog, where opening links in their own fancy javascript window would fail if the server did not allow its pages to be opened in frames. The result would be an empty black window and an ugly javascript error in the browser console in the form of: Refused to display '[some URL]' in a frame because it set 'X-Frame-Options' to 'SAMEORIGIN'.

So I started looking for a way to detect these pesky URLs. First attempt: using jQuery.Ajax with the method 'HEAD', which requests only the HTTP headers of a given URL. There is no reason I can see to deny access to 'HEAD' requests, but the browser denies it anyway based on... HTTP headers! Not to mention that this solution fails for even more links than the frame does, because of Ajax cross-origin restrictions.
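For what it's worth, outside the browser, where cross-origin rules do not apply (say, in a server-side script fetching the headers itself), the check is trivial. A minimal sketch in Python; the helper name is mine and it assumes you already have the response headers as a dict with canonical casing:

```python
def frame_blocked(headers):
    """Return True if the X-Frame-Options response header would
    prevent this page from being framed by another origin."""
    value = headers.get('X-Frame-Options', '').strip().upper()
    # DENY blocks all framing; SAMEORIGIN blocks framing from other origins
    return value in ('DENY', 'SAMEORIGIN')

print(frame_blocked({'X-Frame-Options': 'SAMEORIGIN'}))  # True
print(frame_blocked({}))  # False
```

This is exactly the check the browser refuses to let client-side script perform.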

Second attempt: use an ad hoc hidden iframe to detect if the URL can be opened. This worked, but at a cost that prohibits me from using the solution on the blog. I will publish it, though, maybe it works for other scenarios. It uses jQuery, so you will have to translate it yourself into raw Javascript or make it use your favorite framework.

The code first:
var Result = {
    CouldNotLoadUrl: 1,
    UrlLoadedButContentCannotBeAccessed: 2,
    UrlLoadedContentCanBeAccessed: 3
};

function isAvailable(url, callback, timeout) {
    if (!+(timeout) || +(timeout) < 0) {
        timeout = 5000;
    }
    var timer = setTimeout(function() {
        ifr.remove();
        // report the timeout value as the third callback argument
        callback(Result.CouldNotLoadUrl, url, timeout);
    }, timeout);
    var ifr = $('<iframe></iframe>')
        .hide()
        .appendTo('body');
    ifr.on('load', function() {
        if (timer) clearTimeout(timer);
        var result, alt; // declare alt locally so it does not leak as a global
        try {
            var iframe = ifr[0];
            // accessing the location of a cross-origin frame throws an exception
            var href = (iframe.contentWindow || iframe.contentDocument).location.href;
            result = Result.UrlLoadedContentCanBeAccessed;
        } catch (ex) {
            result = Result.UrlLoadedButContentCannotBeAccessed;
            alt = ex;
        }
        ifr.remove();
        callback(result, url, alt);
    });
    ifr.attr('src', url);
}
You use it like this:
isAvailable('https://siderite.dev', function(result, url, alt) {
    switch (result) {
        case Result.CouldNotLoadUrl:
            alert('Could not load ' + url + ' in an iframe (timeout after ' + alt + ' milliseconds)');
            break;
        case Result.UrlLoadedButContentCannotBeAccessed:
            alert(url + ' loaded in an iframe, but content is inaccessible (' + alt + ')');
            break;
        case Result.UrlLoadedContentCanBeAccessed:
            alert(url + ' loaded in an iframe and content is accessible');
            break;
    }
}, 10000);

You will need to have jQuery loaded and to have an HTML body loaded in the DOM (so if you copy these into an empty HTML file to test, make sure you add <body></body> before the script or execute isAvailable on the DOM ready event).

And now the explanation.
First, it is imperative to append the iframe element to the body before binding the load event. That is because jQuery creates the element in a document fragment, and this process fires a load event by itself! Then, different browsers act differently. Google Chrome does not fire a load event for an iframe with a URL that has this problem. Internet Explorer does fire the event, but the iframe's content document is not accessible (and this can be caught in a try/catch block). FireFox does fire the event, but only the leaf properties of the content document throw an exception, like the href of the location. In order to fix all of these, I used a timeout for Chrome, to return a failure result after a time, and an access to the frame's location.href to make it throw an exception in both Internet Explorer and FireFox.

Finally, the reason why I cannot use it on the blog is that it would force the browser of the viewer to load all the URLs completely in the background in order to add a silly click event on the links. I have one more idea in mind, though, and that is to detect the frame loading problem when I open it and in that case to create the content of the iframe manually to contain a link to the URL. I will attempt it sometime soon.

Update: I found a solution that seems reasonable enough. When creating the iframe in which I want to nicely load the page that the link points to, I am not just creating an empty frame, but I also add content: a link that points to the same page. The SAMEORIGIN problem is still there, so the link opens the URL in target="_blank" and has a click handler that closes the dialog 100 milliseconds later. Thus, when changing the frame src, if the content of the frame does not change, the user will have the option to click the link and see the page open in a new tab/window.

I had this case today where I had to add a new column to a table with a gazillion rows and also create an index on this column. In my mind, having a column with only NULL values would make the creation of the index instantaneous. I was wrong! In order to create the index, the SQL engine still scans all the rows, and for tables with a lot of rows that takes a long while. It felt really stupid: I knew the column was filled with NULL values; I didn't need the index computed when creating it, only maintained on subsequent INSERT/UPDATE/DELETE operations. So I started to look into solutions.

Enter filtered indexes! Microsoft SQL Server 2008 introduced an option for filters on indexes. The index must not be clustered and the definition is just the same as before, only with a WHERE clause that applies the filter. This seemed to be the right solution for my scenario.

In order to test this I created a table called Test with two columns, a and b, both nvarchar(255). I filled the table with ten million rows having values for a and no values for b. Then I created an index on b; it took about 30 seconds. Then I created an index on a; it took 50 seconds. I removed the indexes and created a filtered index on b on the condition that b is not null. The operation was instantaneous. Success!
And here are the actual operations with more exact values (check out the comments for extra SQL tips on speed):
CREATE TABLE Test(a NVARCHAR(255),b NVARCHAR(255))

DECLARE @i INT = 625000 -- ten million divided by 16

BEGIN TRAN -- if not using a transaction, SQL will create a transaction per insert!

WHILE @i>0
BEGIN

SET @i=@i-1
INSERT INTO Test(a) VALUES(CAST(@i AS NVARCHAR(255)))

END

COMMIT TRAN -- 625000 rows with a filled and b empty [8 seconds]

-- insert selects are a lot faster than the while above
INSERT INTO Test(a)
SELECT a+'x1' FROM Test -- 1250000 total rows [3 seconds to create another 625000 rows]

INSERT INTO Test(a)
SELECT a+'x2' FROM Test -- 2500000 total rows [7 seconds]

INSERT INTO Test(a)
SELECT a+'x3' FROM Test -- 5000000 total rows [18 seconds]

INSERT INTO Test(a)
SELECT a+'x4' FROM Test -- 10000000 total rows [46 seconds]

CREATE INDEX IXa ON Test(a) -- [27 seconds]

DROP INDEX IXa ON Test

CREATE INDEX IXb ON Test(b) -- [25 seconds, approximately the same]

DROP INDEX IXb ON Test

CREATE INDEX IXa ON Test(a) -- there are no null values in the a column [56 seconds, double for a filtered index with a useless filter]
WHERE a IS NOT NULL

DROP INDEX IXa ON Test

CREATE INDEX IXb ON Test(b) -- [0 seconds - instant!]
WHERE b IS NOT NULL
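The concept is not exclusive to SQL Server: SQLite calls these partial indexes, and the same behavior can be reproduced in a few lines of Python with the built-in sqlite3 module. A sketch (the table and index names mirror the demo above, but this is SQLite, not T-SQL):

```python
import sqlite3

con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute("CREATE TABLE Test(a TEXT, b TEXT)")
# fill a, leave b NULL, as in the demo above
cur.executemany("INSERT INTO Test(a) VALUES (?)",
                [(str(i),) for i in range(10000)])
# partial (filtered) index: only rows where b IS NOT NULL are indexed
cur.execute("CREATE INDEX IXb ON Test(b) WHERE b IS NOT NULL")
# a query whose WHERE clause implies the filter can use the index
plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Test WHERE b = 'x'").fetchall()
print(plan[0][-1])  # the plan mentions index IXb
```

Note the condition: the optimizer only considers a partial/filtered index when the query's own WHERE clause implies the index filter.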

A lot of good things can come from using filtered indexes, like for example a unique index on non-null values (which was pretty much a pain in the ass to do otherwise), but there are also gotchas. One of the things you have to watch out for is using filtered indexes on numeric columns. In this case the SET ARITHABORT ON command must be used (or ensure in some other way that the option is on for all SQL sessions - SQL Management Studio and code both!), otherwise errors might occur or the index might be ignored. Also there seem to be some bugs that have not been addressed even in SQL 2012, like when using MERGE or when using filtered indexes on a column being or not being null.

A more detailed article on this feature can be found here: SQL University: Advanced Indexing – Filtered Indexes.

A couple of weeks ago I went for a job interview in order to see what is out there. It was a terrible phone interview and I failed to make a connection with the technical interviewers. I think I was just as disappointed in them as they were with me. However, what I believe killed it for them was an experiment I decided to conduct: to the usual question about software patterns I answered boldly that I didn't believe in software patterns and that I believed management techniques were what drove productivity and quality of work, not particular software commonalities. It was partly true, though, I do believe that, and this post is about my thoughts on the matter. Now, be warned: I may offend a few people that religiously pray in UML at Martin Fowler's shrine in the church of the Gang of Four.

Let's start with a brief history of software patterns. It started with inspiration from a building architecture book that explained that for similar problems there are similar solutions in architecture and that listing them would be a boon for the would-be architect. Someone applied this to software in the form of common practices to solve common problems. The idea was that, outside the main goal of cataloguing best practices, these software patterns would provide a sort of common language for software architects.

The problem is, of course, practice. The good part of a software pattern is that it provides a tested solution to a common problem. The bad part is that there are not that many common problems and most of the time software patterns are applied badly in practice. Invariably, at some point, the application of a software pattern leads to the Golden Hammer "antipattern". If a software pattern is designed to apply to as many situations as possible where a certain problem appears, then it is defined by a lot of flexibility. That may sound good, but a flexible architecture is usually low performing, overly complex or simply hard to understand when used in very specific circumstances. That is why for most requirements there isn't one software library, but many, each attempting to juggle the right amount of performance, complexity and ease of use. And, of course, if a pattern is not well thought out, why use it at all?

I guess the point I am trying to make is that current software patterns try to catalogue small issues, things that are, really, of little consequence, while other things are way more important to consider, like long term vision. What is the point of using Inversion of Control if you don't plan to ever make components modular? Why would you create an MVC application if the code monkeys that you have hired will riddle the view with business logic? In fact, why would you make any effort of standardizing your application if you don't plan anything? And that is the basis of my contention: planning an application is the bottleneck. I would go for Software Planning Patterns way before I even consider mid-level software patterns. The planning is where the need of the technician does battle with the need of the business owner. One strategy might be perfect when chosen only to become obsolete during implementation, I agree, but then you have an initial strategy and a current strategy, and the techs must find a way to transition from one to the other. Planning is where all the interested parties come together and need to reach a decision; the technical implementation, let's face it, must just work and then, hopefully, be reasonably maintainable.

And I dare say that in building architecture the long term plan for the building is already there. It must be, as it will last for decades. You don't start a skyscraper only to change your mind in the middle of the work and go for a stadium. You know the purpose of the building, you know how you will use it, you know the needs it has to cover, and all that is left is to determine the technical way to achieve this plan. Software is way more elastic than this and I believe this is why the concept of pattern does not easily transfer from the domain of construction to the one of software development. In a way, forcing these patterns on the software world is in itself like using a Golden Hammer: they don't fit exactly. Moreover, the word of the day in software is Agile, the management technique that assumes right from the word go that there will be change in the plans for the project and that the team must be ready for it. I submit that the current state of software patterns is too rigid, too inflexible, based on the assumption that there is a plan and that it will not change. Or worse, based on the assumption that there is no plan and that anything must be enabled by the software architecture. They either force you to lose flexibility or add so much of it that it makes the end product bloated and unproductive.

The answer is somewhere in the middle and that middle is different from project to project. No matter how well software patterns are designed and applied, in the end they must conform to (or end up hindering) the strategic plans for the software project, which are, in my view, the true bottleneck of software development. As a domain specific language between software architects, software patterns are good, but one has to acknowledge the extreme minority of architects in software. Even in that small guild I don't find there are a lot of discussions where the lingo of software patterns is used much. The complex patterns are invalidated by the many "flavours" that unavoidably appear to handle that complexity, while the simple patterns are invalidated by components that encapsulate them and relieve the developer from having to implement them. My conclusion is that the importance of software patterns is being exaggerated. Little more than a mini-atlas of common software practices, it serves as a pretty picture book rather than an instrument that promotes understanding of the field of software development.

We all know that the best way to prevent SQL injection is to use parameters, either in stored procedures or in parameterized queries. Yet on some occasions we meet code that replaces every single quote with two single quotes. The question arises: OK, it's ugly, but isn't it enough?

I have recently found out about something called "Unicode Smuggling", which uses the database against itself to bypass protections like the one described above. More details here: SQL Smuggling, but the basic idea is this: if the replacement scheme is implemented in the database and uses VARCHAR, or if the code uses some non-Unicode string type, then the protection is vulnerable to an attack leveraging what are known as Unicode homoglyphs. If you feel adventurous and want to examine thoroughly the ways Unicode can be used maliciously, check out UTR#36.

Here is an example:
CREATE PROC UpdateMyTable
@newtitle NVARCHAR(100)
AS
/*
Double up any single quotes
*/
SET @newtitle = REPLACE(@newtitle, '''','''''')

DECLARE @UpdateStatement VARCHAR(MAX)

SET @UpdateStatement = 'UPDATE myTable SET title=''' + @newtitle + ''''

EXEC(@UpdateStatement)

Note the use of VARCHAR as the type of @UpdateStatement. This procedure receives a string, doubles all single quotes, then builds an SQL string which is then executed. This procedure would be vulnerable to this:
EXEC UpdateMyTable N'ʼ;DROP TABLE myTable--'

The first character in the provided string is not a single quote, but the Unicode character U+02BC (MODIFIER LETTER APOSTROPHE). SQL Server will silently convert this into a single quote when it is stored in a VARCHAR. The injection will work.

Small demo in MS-SQL:
DECLARE @nhack NVARCHAR(100) = N'ʼ;DROP TABLE myTable--'
DECLARE @hack VARCHAR(100) = N'ʼ;DROP TABLE myTable--'
SELECT UNICODE(@nhack),UNICODE(@hack) -- Results: 700 39

More discussion on this here: Can I protect against SQL Injection by escaping single-quote and surrounding user input with single-quotes?
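To close the loop on the opening claim: parameters sidestep this whole class of problems, homoglyphs included, because the payload is never parsed as SQL. A sketch using Python's sqlite3 driver (not SQL Server, but the principle is identical for any parameterized API):

```python
import sqlite3

con = sqlite3.connect(':memory:')
cur = con.cursor()
cur.execute("CREATE TABLE myTable(title TEXT)")
cur.execute("INSERT INTO myTable(title) VALUES ('original')")

# the payload from the example above, starting with U+02BC
evil = "\u02bc;DROP TABLE myTable--"
# parameterized update: the value is bound as data, never concatenated into SQL
cur.execute("UPDATE myTable SET title = ?", (evil,))

# the table survives and the payload is stored verbatim as the title
print(cur.execute("SELECT COUNT(*) FROM myTable").fetchone()[0])  # 1
print(cur.execute("SELECT title FROM myTable").fetchone()[0] == evil)  # True
```

No escaping, no doubling of quotes, nothing for a homoglyph to exploit.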

I've often encountered this situation: a stored procedure needs to display a list of records ordered by a dynamic parameter. In Transact-SQL, on Microsoft SQL Server, one cannot do this elegantly in any way. I will list the possible approaches and tell you what the problem with each is.

First of all, let's start with an example. Assume we have a table called Test with a lot of rows and an indexed datetime column. Let's call that column TheDate to avoid any SQL keywords. We want to do something like this:
SELECT TOP 10 * FROM Test ORDER BY TheDate ASC

Notice that I want to get the top 10 rows, which means I only need a small part of the total. I also order directly by TheDate. In order to release a piece of code we also need to test it for performance issues. Let's look at the execution plan:


Now, let's try to order it dynamically on a string parameter which determines the type of the sort:
SELECT TOP 10 * FROM Test ORDER BY CASE WHEN @sort='ASC' THEN TheDate END ASC, TheDate DESC

As you see, I've used CASE to determine the sort order. There is no option to give a parameter as the sort order. The execution plan is this:


Surprise! The execution plan for the second query shows it is ten times slower. What actually happens is that the entire table is sorted by the CASE expression into an intermediate result, and only then are the 10 items extracted from it.

There must be a solution, you think, and here is an ingenious one:
DECLARE @intSort INT = CASE WHEN @sort='ASC' THEN 1 ELSE -1 END
SELECT TOP 10 * FROM Test ORDER BY CAST(TheDate AS FLOAT)*@intSort ASC

I transform the datetime value into a float, then apply a mathematical expression to it, multiplying it by 1 or -1, the simplest expression possible under the circumstances. The execution plan is:


Bottom line, there is no exception to the rule: when you order by an expression, SQL Server does not use indexes, even if the expression is trivially reversible. Don't be misled by the apparently functional style of SQL syntax; the engine does not optimize the execution plan that way. Even if the column is an integer it will not work: ordering by TheInteger is fundamentally faster than ordering by -TheInteger.

And now the solution, ugly as it may be (imagine the select is a large one, with joins and where conditions):
IF @sort='ASC' 
BEGIN
SELECT TOP 10 * FROM Test ORDER BY TheDate ASC
END
ELSE
BEGIN
SELECT TOP 10 * FROM Test ORDER BY TheDate DESC
END

Yes, the dreaded duplication of code. But the execution plans are now equivalent: 50%/50%.

This post was inspired by real events, where the production SQL server went into overdrive trying to create and manage many temporary result tables from a stored procedure that wanted to avoid duplication using the CASE method.

Update: there is, of course, another option: building the SQL as a string inside the stored procedure, modifying it based on the sort parameter, then executing it dynamically. But I really dislike that method, for many reasons.
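
For completeness, the dynamic version might look something like this sketch; note that only a hard-coded keyword chosen by the CASE ends up in the string, never the raw parameter:

```sql
-- Illustrative only: pick the ORDER BY direction from a whitelist of keywords
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT TOP 10 * FROM Test ORDER BY TheDate '
    + CASE WHEN @sort = 'ASC' THEN N'ASC' ELSE N'DESC' END
EXEC sp_executesql @sql -- each variant gets its own, index-friendly plan
```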

More information from another blogger: Conditional Order By. They also explore ranking with window functions, and one of the comments references the SQL 2008 "Grouping Sets" feature, which I have not covered yet.

Oh, what a monster of a book! If you want to learn to do genetic programming, this is the book for you. If you only need an accessible introduction to what genetic programming is, this book is way too heavy.

Let's start at the beginning. Genetic Programming: On the Programming of Computers by Means of Natural Selection (Complex Adaptive Systems) is a scientific book written by John R. Koza explaining why, how and what to do to make your computer find solutions to problems, using natural selection algorithms to automatically create the programs that solve them. This is not a new field and a lot of research has been done in it, but this book takes it almost to the level of encyclopaedic knowledge.

First, Koza submits the idea that genetic programming can be applied to most problems where computers are being used. That's a bold claim, but he proceeds to demonstrate it. He takes classes of problems, provides the code to create the programs that solve them, shows results and statistical analysis of those results, and explains what the algorithm did at specific iterations to create said programs. That's a lot to take in. If you are working on a project in this area, you are likely to find the book extremely useful, both as a source of information and as a reference that can always be consulted.

However, if you are a casual reader like myself, reading all that code and statistical analysis in the subway can be difficult. And it's a lot of book, too. So, after some consideration, realising that I have no current project to which I could apply the knowledge within, I've decided to stop reading it. I got through about a quarter of it, so I can safely say it is a very thorough and well written book. You just have to need it in a certain way.

Spurred by the story of the OpenDyslexic font, the one that is now included in this blog as an option for dyslexic people, I started exploring this concept of custom fonts for a web page. First of all, I wanted the dyslexia option to work in all browsers, so I checked it in Chrome (my default), Firefox and Internet Explorer 8.

I was not surprised to see that Internet Explorer did not show the fonts as Chrome did, but I was a little surprised that Firefox did not render the font either. I had FF 3.0.11 installed at the time. I updated Firefox, checked again, and it seemed to work. Then, on a whim, I started to look into web fonts for Internet Explorer. And the option exists! Even better, web fonts have worked since Internet Explorer 6! The catch, though, is that Microsoft uses a proprietary font format which was only recently submitted to the W3C as an open standard.

Now about the dyslexic font people. I went to their site using Internet Explorer and got that annoyingly condescending message: "You are using an outdated browser.". In other words, if you are dyslexic they will make and host a free font for you, but if you use Internet Explorer, screw you! I found this behaviour weird at best, bordering on offensive.

So I downloaded their font, converted it from OTF to EOT format using the free online converter Font2Web, then went looking for a font hosting site that would let me upload and host a custom font. I found one called TypeFront. They offer for free only a single font that can be accessed 500 times daily, but that's still better than the other services, which ask for money for every single option. So close to making the dyslexia option cross-browser! And... FAIL. TypeFront only allows the upload of fonts in the OTF, TTF and WOFF formats; they don't even mention EOT. I wrote them an email, but I didn't expect much. Meanwhile, if you know of a decent free font hosting site, please let me know. Update: I was wrong. The TypeFront service only accepts uploads in a few formats, but it then converts and publishes all formats, including EOT and even SVG. I am happy to report that the dyslexic font now works in all browsers!

Update April 2016: It appears that the TypeFront site is completely down, the domain gone.

It occurred to me that, interestingly, the people at TypeFront and/or DyslexicFonts may simply not know that Internet Explorer supports web fonts. The people at Google do. Check out Google Web Fonts, a site that allows free embedding of 554 font families; all you have to do is add a script to the page. Incidentally, I've included a feature in this site that lets you set a custom Google font for yourself, though there is no visual interface for it yet. Google's only problem is that they have not adopted any font for dyslexics.

So, after documenting my journey, let me give you some links to resources that explain where all these formats come from, why some fonts are free and some are not, and so on:

  • TrueType font format (TTF) - you will not be surprised to hear that the standard was developed by Apple in the 1980s as a competitor to Adobe's Type 1 fonts used in PostScript. This format is supported through the CSS3 rule @font-face in all browsers but Internet Explorer.
  • OpenType (OTF) is the successor of TrueType. Paradoxically, OpenType is a registered trademark of Microsoft, so why they chose not to support it in the browser is beyond me. OTF is also supported in all other browsers.
  • Web Open Font Format (WOFF) - developed in 2009, it is a container with compression and additional metadata for all the other formats. The specification was submitted by Mozilla, Opera and Microsoft. And while all browsers support it (Internet Explorer 9+, FF 3.6+, Chrome and other WebKit browsers, Safari, etc.), they are still limited by which of the wrapped formats they can interpret. Thus, IE9 can open a WOFF file if it contains EOT inside.
  • Embedded OpenType (EOT) is Microsoft's attempt at a web standard for fonts. It supports compression and some protection against copying. These little files have worked (somehow) since Internet Explorer 6, back when CSS3 was a dream and all the "modern browsers" had nothing.
  • Comparison of layout engines (web typography) - a small article discussing browser support for web typography.
  • The history of fonts - how typefaces evolved over time.
  • Intellectual property protection of typefaces
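
To tie the formats together, this is roughly the shape of the cross-browser @font-face rule that was commonly recommended at the time (the file names and paths are placeholders for wherever the font happens to be hosted):

```css
@font-face {
  font-family: 'OpenDyslexic';
  src: url('opendyslexic.eot');                                    /* IE 6-8 */
  src: url('opendyslexic.eot?#iefix') format('embedded-opentype'), /* IE parsing fix */
       url('opendyslexic.woff') format('woff'),                    /* IE9+, FF 3.6+, Chrome */
       url('opendyslexic.ttf') format('truetype');                 /* the rest */
}
```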

We had a legacy import page in our application that took a very long time to perform its operation. Thus, the user was faced with a long loading empty page and no feedback. We wanted to do something to show the user the progress of the import without fundamentally changing the page. Of course, the best solution would have been to make the import an asynchronous background operation and then periodically get the status from the server via Ajax calls, but limited by the requirement to not change the page we came up with another solution: we would send bits of javascript while the import went on.

A first attempt was made, but it didn't work: all the scripts were loaded and executed at once, so the user would still see an empty page, then a progress bar that immediately jumped to 100%. Strange, since we knew that under certain circumstances scripts are executed as they are loaded. The answer was that browsers buffer a minimum amount of the page before they start interpreting it, about 1024 characters. The solution, then, was to send 1024 empty spaces before starting to send the progress. This value of 1024 is not documented or standard; it is a browser implementation detail.

Our design had the page loaded in an iframe, which allowed for scripts and html to not be loaded in the import page (thus making us stumble upon this behavior), and allowed for them to be loaded in the parent page. The scripts that we sent through the ASP.Net pipeline (using Response.Write and Response.Flush) accessed the resources from the parent page and showed a nice progress bar.

Had the page been a simple ASP.Net page, the HTML and CSS would have had to be sent first, perhaps instead of the 1024 spaces. There could have been problems when the import finished and the normal page output followed what had already been sent through the pipeline, but in our specific scenario the spaces and script blocks did not change the way browsers interpreted the rest of the page output.

A secondary side effect of this change was that we prevented the closing of the connection by some types of routers that need HTTP connections to have some traffic sent through them in an interval of time, providing a sort of "keep-alive". Before we made this change, these routers would simply cut the connection, leaving the user hanging.
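
For illustration, the core of the technique looks roughly like this (ImportBatch and updateProgress are placeholders standing in for our actual import step and the parent page's progress function):

```csharp
// Rough sketch, not our production code: stream progress scripts during a long import
Response.Write(new string(' ', 1024)); // fill the browser's initial buffer
Response.Flush();
for (int i = 1; i <= totalBatches; i++)
{
    ImportBatch(i); // one slow unit of work
    Response.Write("<script>parent.updateProgress(" + i + ");</script>");
    Response.Flush(); // push the script to the browser immediately
}
```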

A little known (at least among the people I've talked to) feature of Transact-SQL (the Microsoft SQL engine) is setting ROWCOUNT. Usually ROWCOUNT is used to get the number of rows an operation has returned or affected, and that is actually @@ROWCOUNT. Something like this:
UPDATE MyTable SET Value = 10 WHERE [Key]='MySetting'
SET @RowsUpdated = @@ROWCOUNT

Instead, setting ROWCOUNT tells the SQL engine to return (or affect) only a specified number of rows. So let's use the example before:
SET ROWCOUNT 1
UPDATE MyTable SET Value = 10 WHERE [Key]='MySetting'
SET @RowsUpdated = @@ROWCOUNT
In this case at most one row will be updated, no matter how many rows in the table have the value 'MySetting' in the Key column. Also, @@ROWCOUNT will correctly output 1 (or 0, if no such rows exist).

Now, you will probably think that setting ROWCOUNT is just a more confusing equivalent of TOP. I had a case at work where, during a code review, a colleague saw two SELECT statements one after the other: one getting all the values with a filter, and another selecting COUNT(*) with the same filter. He was justifiably confused as to why someone would select twice instead of just counting the rows returned (or using @@ROWCOUNT :) ). The reason was a SET ROWCOUNT @RowCount earlier in the procedure, which restricted the number of rows returned by the first SELECT statement.

Here comes the gotcha. Assuming that setting ROWCOUNT is equivalent to a TOP restriction in the SELECT statement (in SQL 2000 and lower you could not use a variable with TOP, and I thought that's why the first solution was used), I replaced SET ROWCOUNT @RowCount with SELECT TOP (@RowCount). And suddenly no rows were being selected. The difference: if you set ROWCOUNT to 0, the next statement is not restricted in any way, while TOP 0 returns exactly 0 rows. So, as usual, be careful with assumptions.
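
A minimal repro of the difference, assuming some table MyTable:

```sql
SET ROWCOUNT 0                -- 0 means "no limit" for ROWCOUNT
SELECT * FROM MyTable         -- returns every row

SELECT TOP (0) * FROM MyTable -- TOP 0 means literally zero rows
```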

There are other important differences between TOP and SET ROWCOUNT. TOP accepts both numeric and percentage values. Microsoft has also announced that in a release after SQL Server 2012, SET ROWCOUNT will no longer affect INSERT, UPDATE and DELETE statements, so it is effectively deprecated for those. Finally, the query optimizer can take the value of the expression in a TOP clause into account during query optimization; because SET ROWCOUNT is set outside the statement that executes the query, its value cannot be considered in the query plan.

Update: in SQL 2012 new options were added to the ORDER BY clause, called OFFSET and FETCH, which finally work like the LIMIT keyword in MySQL.
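
For reference, the new syntax looks like this (table and column names are placeholders):

```sql
-- Skip the first 10 rows, return the next 5 (SQL Server 2012+)
SELECT * FROM MyTable
ORDER BY TheDate
OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY
```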

I had an idea one of the previous days, an idea that seemed so great and inevitable that I thought about patenting it. You know, when you have a spark of inspiration and you tell no one about it or maybe a few friends and a few years later you see someone making loads of money with it? I thought I could at least "subscribe" to the idea somehow, make it partly my own. And so I asked a patent specialist about it.

He basically said two things. First of all, even if an idea is novel, if it is made of previously existing parts that can obviously be put together, it doesn't qualify for a patent. If the concept is obvious enough, in any way, it doesn't qualify; say, if someone wrote a scientific paper about one part of it and you can find the rest on some nutjob blog about alien conspiracies, then you can't patent it. The other thing he told me is that a serious patent application costs about $44,000 in filing and attorney fees. I imagine that's no small sum even for someone in the US or another rich country, but it is almost insane for anyone living anywhere else.

But there is a caveat here. What if the nutjob alien conspiracy blog would be this one? What if, by publishing my idea here, no one could ever patent it and the best implementation would be the one that would gather the most support? It's a bit of "no, fuck you!", but still, why the hell not? So here it is:

I imagine that, with the new climate of "do not track" and privacy concerns, search engines will have a tougher and tougher time gathering information about your personal preferences. Google will not know what you searched for before and therefore will not be able to show you the things it thinks you are most interested in. And that is a problem, since it probably would have guessed right and you would have been interested in those things. The user, seeing how the search engine does not find what they are looking for, will not be happy.

My solution, and something that is way simpler than storing cookies and analysing behaviour, is to give the responsibility (back) to the user. They would choose a "search profile" and, based on that, the search engine would filter and prioritize the results in a way specific to that profile. You can customize your profile and maybe save it in a list or you can use a standard one, but the results you get are the ones you intended to get.

A few examples, if you will: the "I want to download free stuff" profile would prioritize blogs and free sites and filter out commercial sites containing words like "purchase", "buy", "trial", "shareware", etc.; it would remove Amazon and other online shops from the result list and prioritize ThePirateBay, for example. Some of the smarter and more tech-savvy Googlers use the "-" filter to remove such words, but they still get the most commercially oriented sites there are. A search profile like this would try to analyse each site, see if it fits the "commercial" category and filter it out accordingly. Now, you might think sites would adapt and try to trick the engine into thinking they are not commercial in nature. No, they won't, because then the "I want to buy something" profile would not find them. Of course, they will adapt somehow and create two versions of the site, one that seems commercial and one that does not, but the extra effort would eat into their profit margin. Or try a search profile like "long tail", where the stories that get the most coverage and are reproduced on many sites get filtered out, letting you reach new information as it comes in.

Bottom line is, I need such a service, but at the moment I am unwilling to invest in making one. First of all, it would be a waste of time if it didn't work; second, it would get stolen and copied immediately by people with more money than me if it did. Guess what? It's in my free blog. If anyone builds it, they can't patent it; they can only use it because it is a good idea, and they should make it really nice and usable before other people make it better.

To my shame, I've long lived with the impression that an inline style attribute on an HTML element always beats any CSS rule that applies to that element. That is a fallacy. Here is an example:
<style>
div {
width:100px;
height:100px;
background-color:blue !important;
}
</style>
<div style="background-color:red;"></div>
What do you think the div's color will be? Based on my long-standing illusion, the div should be red, as it has that color defined inline, in the style attribute. But that is wrong: the !important keyword forces the CSS rule over the inline styling. The square will actually be blue! And it is not some new implementation branch for non-Internet-Explorer browsers either; it works consistently in all browsers.

Now, you might think this is the kind of information you absorb but that doesn't really matter. Well, it does. Let me extend the example and try to change the width of the div, which uses the same !important trick:
<style>
div {
width:100px !important;
height:100px;
background-color:blue !important;
}
</style>
<div style="background-color:red;"></div>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
<script>
$(function() {
// $('div').width(200); // the width remains 100!
// $('div').width('200px'); // the width remains 100!
// $('div')[0].style.width='200px'; // the width remains 100!

// $('div').width('200px !important'); //this is not a valid value for the width parameter! in IE it will even show an error
// $('div').css('width','200px !important'); //this is not a valid value for the width parameter! in IE it will even show an error
// $('div')[0].style.width='200px !important'; //this is not a valid value for the width parameter! in IE it will even show an error

var oldStyle=$('div').attr('style')||'';
$('div').attr('style',oldStyle+';width: 200px !important'); // this is the only thing I found to be working!

});
</script>
As you can see, the mighty jQuery failed, setting the width property on the style object failed, and the only working solution was to append a string to the style attribute, overriding the !important keyword from the CSS with an inline !important keyword!
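
For what it's worth, the plain DOM API can do this more cleanly through the third, priority argument of setProperty, although I have not verified it in old Internet Explorer versions:

```javascript
// Set an inline style with the 'important' priority (standard CSSOM API)
document.querySelector('div').style.setProperty('width', '200px', 'important');
```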

Update (via Dan Popa): And here is why it happens: CSS Specificity

I've heard of Dilbert as this very funny corporate satire comic. I've avoided it as much as possible, probably because I was afraid it would make fun of my way of life and that I would find it makes valid points. Today I succumbed to my curiosity and clicked on a YouTube Dilbert video.



Conclusion: I am Dilbert...