I had to investigate a situation where the message "Object moved to here", with "here" being a link, appeared in our ASP.Net application. First of all, we don't have that message in the app; it turns out it is an internal message in ASP.Net, more exactly in HttpResponse.Redirect. It is hardcoded HTML that is returned when the response status code is set to 302 and the redirect location is set to the given URL. The browser is expected to move to the redirect location anyway, so the displayed message should only be a temporary thing. However, if the URL is empty, the browser does not go anywhere.

In conclusion, if you get to a webpage that has the following content:
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="[url]">here</a>.</h2>
</body></html>
then you are probably trying to Response.Redirect to an empty URL.
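
A cheap way to catch the problem early is to guard the redirect yourself. Here is a minimal sketch (the SafeRedirect helper is my own naming, assumed to live in a Page or another class with access to Response):

// Hypothetical guard: fail loudly instead of letting ASP.Net render
// the "Object moved" page with an empty link.
private void SafeRedirect(string url)
{
    if (string.IsNullOrEmpty(url))
        throw new ArgumentException("Redirect URL must not be empty", "url");
    Response.Redirect(url);
}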

If I had written an article about Scrum two years ago (wait a minute, I did!) it would have been a cold enumeration of rules and my outsider opinion about them.

If I had written an article about Scrum two months ago, it would probably have been an insider rant, explaining how just following the rules of Scrum leads to blind bureaucracy and a lot of wasted time.

Well, now I am writing about Scrum as I understand it from a personal viewpoint, because I've had this epiphany: Scrum (or any other development process) is a personal process first and foremost. To use a metaphor, so that I can move it out of the way and talk shop, it is like driving a car on a straight road. You are the engine (strong and reliable, hopefully), the road is the development process and your speed is your development speed. It is easy to go fast and furious, it's a straight road after all, but what if there is fog? Then you would have to slow down, lest you fail to see a sudden obstacle.

To get real now, the fog is the lack of foreknowledge about what you are going to do. And I don't mean project vision or strategic planning, I am talking about your personal schedule, about how well you know what you are going to do. Scrum tries to achieve this by enforcing a timetable (the sprints), a schedule (the sprint planning) and a recurrent update mechanism (the daily meetings), but that is only the beginning. If you plan your sprint superficially, it is like adding fog to the road in front of you. If you (and I mean YOU!) do not update your schedule as you go, including documentation, estimated time, time spent and all useful metrics, you add fog to the road behind you. If you are surrounded by uncertainty, you cannot plan anything. You don't know where you are, where you are going or how fast you will get there.

After 6 months of badly implemented Scrum, I've experimented with using a simple text file to mark when I start a job, when I finish it and any breaks in between, updating the actual work time and estimated time in the Scrum tool. We are using a clearly defined specification document for each feature, including requirements, acceptance tests, implementation details, code reviews, definition of done and test plan, updated as we go. I've discovered that all this huge amount of information, instead of slowing me down, lifts the fog and allows me to push the pedal to the metal and go as fast as I can. I know when I am doing something, why, and who is doing everything else and why. At the end of the day I don't have to rack my brains to remember what I did; I just cut and paste the task names and times from my text file into an email, the Scrum master goes over the list, and we are free to talk about what really mattered in those tasks: new issues, dependencies, implementation details. The result is visibility of what we are doing and, no less important, I get to go home early.

Of course, I am writing this enthusiastic post after a single day of doing it well and 6 months of doing it badly, but I have a great feeling about this, something akin to fog being lifted from my eyes.

A Microsoft patch for ASP.Net released on the 29th of December 2011 adds new functionality that rejects HTTP POST requests with more than 1000 keys and any JSON HTTP request with more than 1000 members. That is pretty huge, and if you have encountered this exception:
Operation is not valid due to the current state of the object.

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.InvalidOperationException: Operation is not valid due to the current state of the object.

Source Error:
An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.

Stack Trace:
[InvalidOperationException: Operation is not valid due to the current state of the object.]
System.Web.HttpValueCollection.ThrowIfMaxHttpCollectionKeysExceeded() +2692302
System.Web.HttpValueCollection.FillFromEncodedBytes(Byte[] bytes, Encoding encoding) +61
System.Web.HttpRequest.FillInFormCollection() +148

[HttpException (0x80004005): The URL-encoded form data is not valid.]
System.Web.HttpRequest.FillInFormCollection() +206
System.Web.HttpRequest.get_Form() +68
System.Web.HttpRequest.get_HasForm() +8735447
System.Web.UI.Page.GetCollectionBasedOnMethod(Boolean dontReturnNull) +97
System.Web.UI.Page.DeterminePostBackMode() +63
System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +133


then your site has been affected by this patch.

Well, you probably know that something is wrong with the design of a page that sends 1000 POST values, but still, let's assume you are in a situation where you cannot change the design of the application and you just want the site to work. Never fear, use this:

<configuration xmlns="http://schemas.microsoft.com/.NetConfiguration/v2.0">
  <appSettings>
    <add key="aspnet:MaxHttpCollectionKeys" value="5000" />
    <add key="aspnet:MaxJsonDeserializerMembers" value="5000" />
  </appSettings>
</configuration>


More details:
Knowledge base article about it
The security advisory for the fixed vulnerability
The entire MS11-100 security update bulletin

Update 3rd of March 2016: I've opened this blog post in the latest versions of Internet Explorer, Firefox and Chrome and the bug is no longer present. You should consider this blog post obsolete.

The CSS standard allows selecting an element with a certain CSS class, using the dot notation (.myClass), but it also allows an element to have more than one CSS class, separated by spaces (class="myClass anotherClass"). Now, sometimes we would like to select an element that has two classes simultaneously, and the CSS syntax for this is to put two class selectors one after the other, with nothing to separate them (.myClass.anotherClass). This blog entry is about how you should avoid using this, at least for now, as it seems to be one of the buggiest parts of the CSS implementation in current browsers.

First of all, Internet Explorer simply fails with this: up to version 8, simultaneous class selectors just failed in a random way. I was surprised a few days ago to see that Chrome and Firefox also have issues with it. Even if the selector did appear to work, it did not register in the CSS rule lists of either Firebug or the Chrome developer tools when used with CSS3 content selectors. Remove the double class selector and they would magically appear.

The bug can be reproduced in this code:
<style type="text/css">
  .a.b span:before {
    content: "before ";
  }

  .b span:after {
    content: " after";
  }
</style>
<div class="a b">
  <span>Test</span>
</div>



In Chrome and Firefox the span element will show up with an after rule, but not a before rule, even though both are applied.

Update February 2016: Tested it on Visual Studio 2015, with the Roslyn compiler, and the problem seems to have vanished.

Here is the now obsolete post:

A constructor in .NET can have optional parameters, with default values specified in the constructor signature, like this:
public MyClass(int p1, int p2 = 0) {}

If the class is inheriting from the Attribute class, then one can also specify property values when using it to decorate something, like this:
public class MyTestAttribute:Attribute {
public int P3 { get;set; }
}

[MyTest(P3=2)]
public class MyClass {}

What do you think this code would do?
public class MyTestAttribute:Attribute {
public MyTestAttribute(int p1,int p2=0) {}
public int P3 { get;set; }
}

[MyTest(1,P3=2)]
public class MyClass {}


Well, I'll tell you what is going to happen. Visual Studio and ReSharper both see no problem with the syntax, but the compiler fails with "error CS0182: An attribute argument must be a constant expression, typeof expression or array creation expression of an attribute parameter type", without specifying any file or line.

My guess is that it is trying to interpret the P3=2 part as an expression to be calculated and passed as the second argument of the constructor. What I was expecting is for the default value of the second constructor parameter to be used, and then for the property P3 to be set. The vagueness of the error points to a possible compiler bug.
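
A workaround that should sidestep the bug on the old compiler (an assumption on my part, not something I verified on every compiler version) is to pass the optional parameter explicitly, so no default value needs to be filled in:

// Passing both constructor arguments explicitly, then the named property:
[MyTest(1, 0, P3 = 2)]
public class MyOtherClass {}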

It all started with this site that got stuck in Google Chrome's DNS cache so that any changes to the Windows/System32/drivers/etc/hosts file were ignored. I didn't want to close all Chrome windows (since the DNS cache is application wide in Chrome), so I googled for an answer. And there it was, a simple URL that, typed in the Chrome address bar, would allow me to clear the cache: chrome://net-internals#dns.

But there are a lot more cool things there: testing of failed sites, a log of browser network events, control over open connections and much more. That got me curious about other cool chrome:// URLs and I found some links listing a lot of them.

I don't have the time to parse all these cool hidden Chrome URLs and review them in this blog entry, so I will just list some links and let you explore the goodness:
Google Chrome’s Full List of Special about: Pages
12 Most Useful Google Chrome Browser chrome:// Commands
About and Chrome URLs

Update: The Chrome url containing all others can be found at chrome://about/.

I was trying to access http://localhost/Reports/Page.aspx, in other words an ASP.Net page in the Reports path of the local site. Instead, I was getting a Windows authentication prompt that had no business being there. At first I thought to debug the page, but it wouldn't even get there before I got the authentication prompt. I googled for it, but I didn't get far because I was looking for weird Windows authentication prompts, not for the specific location of my page: the Reports folder. Stranger yet, I stopped IIS and the authentication dialog still appeared!

In the end, a colleague told me the solution: SQL Server Reporting Services listens on the local Reports path! I stopped the service and voila: no more authentication prompt. Instead, a 503 Service Unavailable error. This article explained things quite clearly. Even if you stop the service, you have to delete the access control list entry for /Reports with the command netsh http delete urlacl url=http://+:80/Reports or, I guess, restart the system after you set the Reporting Services service to Manual or Disabled.

Update: It is even easier to go to Sql Server Configuration Tools (in the Start Menu), run the Reporting Service Configuration Manager, then change the URL for the Report Manager URL to something other than Reports.

But what is this strange Access Control List? You can get a clue by reading about Http.sys API in Windows Vista and above and about Namespace Reservation. Apparently, one can do similar things on Windows Server 2003 and maybe even XP with the Httpcfg utility.

Let's start with an example:
DECLARE @SiteId INT
SELECT @SiteId=isnull(SiteId,0) FROM Orders WHERE OrderID=15
UPDATE Order_Sites SET SiteID=@SiteId


Can you spot the problem? What if SiteID in Order_Sites is not nullable? What if there is no order with OrderId 15?

That's right: when you select into a variable, you must be certain that the query returns at least one row; if it returns none, the variable is not set at all and keeps its previous value (NULL, if it was never assigned).

The solution is to add another operation that sets the value correctly. Here are three possible options:
  • Set @SiteId to 0 before the select.
  • Set @SiteId to isnull(@SiteId,0) after the select and simplify the select to not contain the isnull.
  • Use the select as an argument of the isnull operation:
    SET @SiteId=isnull((SELECT SiteId FROM Orders WHERE OrderID=15),0)
    Yes, you can do that.


Either way, always pay attention to this gotcha in using SQL.

I had this operation on a Javascript object that was using a complex regular expression to test for something. Usually, when you want to do that, you use the regular expression inline or as a local variable. However, given the complexity of the expression, I thought it would be more efficient to cache the object and reuse it every time.

Now, there are two gotchas when using regular expressions in Javascript. One of them is that if you want to match on a string multiple times, you need to use the global flag. For example the code
var reg=new RegExp('a',''); //the same as: var reg=/a/;
alert('aaa'.replace(reg,'b'));
will alert 'baa', because without the global flag the replace operation stops after the first match. That is why I normally use the global flag on all my regular expressions, like this:
var reg=new RegExp('a','g'); //the same as: var reg=/a/g;
alert('aaa'.replace(reg,'b'));
(alerts 'bbb')

The second gotcha is that if you use the global flag, the lastIndex property of the RegExp object is not reset before the next match; it keeps the position where the previous match ended. So code like this:
var reg=new RegExp('a',''); //same as: /a/;
 
reg.test('aaa');
alert(reg.lastIndex);
 
reg.test('aaa');
alert(reg.lastIndex);
will alert 0 both times. Using the global flag will lead to alerting 1 and 2.

The problem is that the solution to the first gotcha triggers the second, as in my case. I used the RegExp object as a field in my object, then I used it repeatedly to test for a pattern in multiple strings. It would work once, then fail, then work again. Once I removed the global flag, it all worked like a charm.

The moral of the story is to be careful of constructs like _reg.test(input);
when _reg is a global regular expression. It will attempt to match from the index of the last match in any previous string.


Also, in order to use a global RegExp multiple times without redeclaring it every time, one can just manually reset the lastIndex property : reg.lastIndex=0;

Update: Here is a case that was totally weird. Imagine a Javascript function that returns an array of strings based on a regular expression match inside a for loop. In Firefox it would return half the number of items that it should have. If one placed a breakpoint in the loop in Firebug, the list would be OK! If the breakpoint was placed outside the loop, the bug would occur. Here is the code. Try to see what is wrong with it:
types.forEach(function (type) {
  if (type && type.name) {
    var m = /(\{tag_.*\})/ig.exec(type.name);
    // type is tag
    if (m && m.length) {
      typesDict[type.name] = m[1];
    }
  }
});
The answer: the regular expression has the global flag, and (at least in the Firefox of that time, where a regular expression literal was reused as the same object) its lastIndex persisted between exec calls, so every other match started from the end of the previous one and failed.

I've had a horrible week. It all started with a good Scrum sprint (or so I thought) followed by a period of quiet in which I could concentrate on my own ideas. And one of my ideas was to optimize the structure of the solution we work on, containing 48 projects, in order to save space and compilation time. In my eyes, I was a hero, considering that for a company with tens to hundreds of devs, even a one second increase in speed would be important. So, I set about doing that.

Of course, the sprint was not as good as I had imagined. A single stored procedure led to no less than four bugs in production, with me to blame for them all. People lost time reproducing the bugs, deploying the fix, code reviewing, etc. At long last I thought I was done with it and I could show everyone how great the solution looked now (on my computer) and atone for my sins.

So from a solution that took 700MB clean and 4GB after compilation, I managed to get it to a maximum of 1.4GB. In fact, it was so small I could put it all in a RAM disk, leading to enormous speeds. In comparison, a normal drive reads at about 30MB per second and an SSD drive (without encryption) at about 250MB/s, while my RAM disk was running at a whopping 3.6GB/s. That sped up the compilation and the parsing of files. Moreover, I had discovered that MsBuild has this /m parameter that makes it use more processors. A compilation would take about 40 seconds, down from two and a half minutes. Great! Alas, it was not to be so easy.

First of all, the steps I was considering were simple:
  • Take all projects and make them have a single output folder. That would decrease the size of the solution, since there would be no copies of the .dll files. The sheer speed of the compilation would also have to increase, since there would be less copying and less compilation.
  • More importantly, I was considering making a symlink to a RAM drive and using it instead of the destination folder.
  • Another step I was considering was making all references to the dll files in the output folder, not to the projects, allowing for projects to be opened independently.


At first I was amazed that the solution decreased in size so much and I just placed the entirety of it into a RAM drive. This fixed some of the issues with Visual Studio, because when I was selecting a file through a symlink to add as a reference, it would resolve to the target folder instead of the name of the symlink. And it wasn't easy either. Imagine removing all project references and replacing them with dll references for 48 projects. It took forever.

Finally I had the glorious compilation. Speed, power, size, no warnings either (since I also worked on that) and a few bug fixes thrown in there for good measure. I was a god! Then the problems appeared.

Problem 1: I had finished the previous sprint with a buggy stored procedure committed to production. Clients were losing money and complaining. That put a serious dent in my pride, especially since the problems ranged from insufficient attention to how I wrote the code to downright lack of knowledge of the flow of the application. For the last part I am not really the only one to blame, but it was my responsibility.

Problem 2: The application was throwing some errors about the target framework of a dll. It was enough to make me understand a major flaw in my design: there were .Net 3.5 and .Net 4.0 assemblies in the solution and placing them all in the same output folder would break some build scripts. Even worse, the 8 web projects in the solution needed to have their output in the bin folder, so that IIS would find them. I fixed it, only to see the size of the solution rise back to 3GB.

Problem 3: Visual Studio would not be smart enough to understand that if a project is loaded, going to the declaration of a member in the compiled assembly means I want to see the actual source, not the IL code. Well, sometimes it worked, and sometimes it didn't. As a result I restored the project references instead of the assembly references.

Problem 4: the MsBuild /m flag would do wonders on my machine, but it would not do much on the build server. Nor would it do its magic on slower computers with fewer processors than my own.

Problem 5: Facing a flood of problems coming from me, my colleagues lost faith and decided to not even try the modifications that removed the compilation warnings from the solution.

Conclusion: The build went marginally faster, but not enough to justify a whole week of work on it. The size decreased by 25%, making it feasible to put it all in a RAM Drive, so that was great, to the detriment of working memory. I still have to see if that is a good or a bad thing. The multiprocessor hacks didn't do much, the warnings are still there and even some of my bug fixes were problematic because someone else also worked on them and didn't tell anyone. All in a week's work.

Things I have learned from all this: baby steps. When I feel enthusiasm, I must take it as a sign of trouble. I must be as dispassionate as an ice cube and think things through. If I am working on a branch, integrate the trunk into it every day, so as to not make it harder to do at the end. When doing something, do it from start to finish, no matter what horrors I see while doing it. Move away from Sodom and don't look back at it. Someone else will fix that, maybe; you just do your task well. When finishing something, commit it into source control so it can easily be reverted through a single atomic operation.

It is difficult for me to adjust to something that involves this amount of planning and focus. I feel as if the chaotic development years of my youth were somewhat better, even if at the time I felt it was stupid and that focus and planning were needed. As a good Romanian, I am neurotic enough to see the worst side of everything, a master at complaining about it, but incapable of actually doing something. Yeah... this was a bad week.

No news from my personal or work fields. However, I've found two interesting news items just today and I wanted to share them.

First, the development of a camera that captures pictures you can focus later. Although I have heard of solid metal lenses that would cost less than $1 to make and would achieve the same effect, the only actually functioning system I've heard of so far is the Lytro Living Picture camera. Here is an Ars Technica article on it and here is a YouTube video demo.

The second news is more IT related. It involves the cryptographic standards for XML, as defined by W3C. They failed! Here is an article about how they were cracked by using a vulnerability in the Cipher Block Chaining and here is a link to their press release.

A question arose at the office today: what is faster? Using a Dictionary<string,object> with the StringComparer.OrdinalIgnoreCase constructor parameter, or using the default constructor and calling ToLower on the key before using it. The quick answer: using ToLower on the key.

The longer answer is that StringComparer.OrdinalIgnoreCase implements IEqualityComparer<string> using a native code function for GetHashCode(), which is very efficient. Unfortunately, it must perform the case insensitive comparison on both the input key and the stored keys on every operation, while calling ToLower on the keys before using them pays the normalization cost only once.
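
For reference, here is a minimal sketch of the two approaches, so you can time them against your own data (the class and key names are mine):

using System;
using System.Collections.Generic;

class CaseInsensitiveKeys
{
    static void Main()
    {
        // Option 1: the dictionary compares keys case-insensitively on every lookup.
        var withComparer = new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
        withComparer["MyKey"] = 1;
        Console.WriteLine(withComparer.ContainsKey("MYKEY")); // True

        // Option 2: normalize the key once with ToLower, keep the default comparer.
        var withToLower = new Dictionary<string, object>();
        withToLower["MyKey".ToLower()] = 1;
        Console.WriteLine(withToLower.ContainsKey("MYKEY".ToLower())); // True
    }
}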

Short version: here is the link to the uploaded .NET regular expression. (look in the comment for the updated version)

I noticed that the Javascript code that I am using to parse and display PGN chess games is rather slow, so I wanted to create my own PGN parser, one that would be optimal in speed. "It should be easy", I thought, as I was imagining getting the BNF syntax for PGN, copy-pasting it into a parser generator and effortlessly getting the Javascript parser that would spit out all the secrets of the game. It wasn't easy.

First of all, the BNF notation for Portable Game Notation was not complete. Sure, text was used to explain the leftovers, but there was no real information about them in any of the "official" PGN pages or on Wikipedia. Software and chess related FTPs and websites seemed to be terribly obsolete or missing altogether.

Then there was the parser generator. Wikipedia told me that ANTLR is pretty good, as it can spew Javascript code on the other end. I downloaded it (a .jar Java file - ugh!), ran it, pasted BNF into it... got a generic error. 10 minutes later I was learning that ANTLR does not support BNF, but only its own notation. Searches for tools that would do the conversion automatically led me to smartass RTFM people who explained how easy it is to do it manually. Maybe they should have done it for me, then.

After all this (and many fruitless searches on Google) I decided to use regular expressions. After all, it might make a lot of sense to have a parser in a language like C#, but the difference in speed between a Javascript implementation and a native regular expression should be pretty large, no matter how much they optimize the engine. Ok, let's define the rules of a PGN file then.

In a PGN file, a game always starts with some tags, explaining what the event is, who played, when, etc. The format of a tag is [name "value"]. There are PGN files that do not have this marker, but then there wouldn't be more than one game inside. The regular expression for a tag is: (\[\s*(?<tagName>\w+)\s*"(?<tagValue>[^"]*)"\s*\]\s*)+. Don't be scared, it only means some empty space maybe, then a word, some empty space again, then a quoted string that does not contain quotes, then some empty space again, all in square brackets and maybe followed by more empty space, all of this appearing at least once.

So far so good, now comes the list of moves. The simplest possible move looks like 1. e4, so a move number and a move. But there are more things that can be added to a move. For starters, the move for black could be following next (1. e4 e5) or a bit after, maybe if there are commentaries or variations for the move of the white player (1... e5). The move itself has a variety of possible forms:
  • e4 - pawn moved to e4
  • Nf3 - knight moved to f3
  • Qxe5 - queen captured on e5
  • R6xf6 - the rook on the sixth rank captured on f6
  • Raa8 - The rook on file a moved to a8
  • Na1xc2 - the knight on a1 captured on c2
  • f8=Q - pawn moved to f8 and promoted to queen
  • dxe8=B - pawn on the d file captured on e8 and promoted to bishop


There is more information about the moves. If you give check, you must end the move with a + sign; if you mate, you end with #; if the move is weird, special, very good or very bad, you can end it with stuff like !!, !?, ?, !, etc., which are the PGN version of WTF?!. And if that is not enough, there are some numbers called NAG which are supposed to represent a numeric, language independent status code. Also, the letters that represent the pieces are not language independent, so a French PGN might look completely different from an English one. So let's attempt a regular expression for the move only. I will not implement NAG or the piece letters of non-English languages: (?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?). I know, scary. But it means a letter from the list PNBRQK, one for each possible type of chess piece, which may or may not appear, then a letter between a and h, which would represent a file, then a number between 1 and 8 which would represent a rank. Both letter and number might not appear, since they represent hints on where the piece that moved was coming from. Then there is a possible letter x, indicating a capture, then, finally, the destination coordinates of the move. There follows an equals sign and another piece, in case of promotion. An astute reader might observe that this also matches a rook that promotes to something else, for example. This is not completely strict. If this long expression is not matched, maybe something that looks like OO, O-O, OOO or O-O-O could be matched, representing the two possible types of castling, when a rook and a king move at the same time around each other, provided neither has moved yet. And to top it off, we allow for some empty space and the characters ! and ? in order to let chess annotators express their feelings.
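
To sanity-check that expression, here is a small .NET sketch that runs it against the example moves from the list above:

using System;
using System.Text.RegularExpressions;

class MoveRegexTest
{
    static void Main()
    {
        // The move expression from the paragraph above.
        var move = new Regex(@"(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?");
        foreach (var m in new[] { "e4", "Nf3", "Qxe5", "R6xf6", "Raa8", "Na1xc2", "f8=Q", "dxe8=B", "O-O", "O-O-O", "Qxe5!?" })
            Console.WriteLine("{0} -> {1}", m, move.IsMatch(m));
    }
}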

It's not over yet. PGN notation allows for commentaries, which are bits of text inside curly brackets {what an incredibly bad move!} and also variations. The variations show possible outcomes from the main branch. They are lists of moves that are enclosed in round brackets. The branches can be multiple and they can branch themselves! Now, this is a problem, as regular expressions are not recursive. But we only need to match variations and then reparse them in code when found. So, let's attempt a regular expression. It is getting quite big already, so let's add some tokens that can represent already discussed bits. I will use a @ sign to enclose the tokens. Here we go:
  • @tags@ - we will use this as a marker for one or more tags
  • @move@ - we will use this as a marker for the complicated move syntax explained above
  • (?<moveNumber>\d+)(?<moveMarker>\.|\.{3})\s*(?<moveValue>@move@)(?:\s*(?<moveValue2>@move@))?\s* - the move number, 1 or 3 dots, some empty space, then a move. It can be followed directly by another move, for black. Lets call this @line@
  • (?:\{(?<varComment>[^\}]*?)\}\s*)? - this is a comment match, something enclosed in curly brackets; we'll call it @comment@
  • (?:@line@@variations@@comment@)* - wow, so simple! Multiple lines, each maybe followed by variations and a comment. This would be a @list@ of moves.
  • (?<endMarker>1\-?0|0\-?1|1/2\-?1/2|\*)?\s* - this is the end marker of a game. It should be there, but in some cases it is not. It shows the final score or an unfinished match. We'll call it @ender@
  • (?<pgnGame>\s*@tags@@list@@ender@) - The final tokenised regular expression, containing an entire PGN game.


But it is not over yet. Remember @variations@? We did not define it, and with good reason. A good approximation would be (?:\((?<variation>.*)\)\s*)*, which defines something enclosed in parentheses. But it would not work well. Regular expressions are greedy by default, so it would just match from the first round bracket to the last one found in the file! Using the non-greedy marker ? would not work either, as the match would stop after the first closing bracket inside a variation. Comments might contain parenthesis characters as well.

The only solution is to better match a variation so that some sort of syntax checking is performed. We know that a variation contains a list of moves, so we can use that, by defining @variations@ as (?:\((?<variation>@list@)\)\s*)*. @list@ already contains @variations@, though, so we can do this a number of times, up to the maximum supported branch depth, then replace the final variation with the generic "everything goes" approximation from above. When we read the results of the match, we just take the variation matches and reparse them with the list subexpression, programmatically, and check extra syntax features, like the move numbers being consecutive.

It is no wonder that at the Regular Expressions Library site there was no expression for PGN. I made the effort to upload it, so that maybe other people will refine it and make it even better. Here is the link to the uploaded regular expression. The complete regular expression is here:
(?<pgnGame>\s*(?:\[\s*(?<tagName>\w+)\s*"(?<tagValue>[^"]*)"\s*\]\s*)+(?:(?<moveNumber>\d+)(?<moveMarker>\.|\.{3})\s*(?<moveValue>(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?)(?:\s*(?<moveValue2>(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?))?\s*(?:\(\s*(?<variation>(?:(?<varMoveNumber>\d+)(?<varMoveMarker>\.|\.{3})\s*(?<varMoveValue>(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?)(?:\s*(?<varMoveValue2>(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?))?\s*(?:\((?<varVariation>.*)\)\s*)?(?:\{(?<varComment>[^\}]*?)\}\s*)?)*)\s*\)\s*)*(?:\{(?<comment>[^\}]*?)\}\s*)?)*(?<endMarker>1\-?0|0\-?1|1/2\-?1/2|\*)?\s*)


Note: the flavour of the regular expression above is .Net. Javascript does not support named groups, the things between the angle brackets, so if you want to make it work for js, remove the ?<name> constructs from it.
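
Since the expression is .Net flavoured, here is a sketch of how it could be consumed from C#, where, unlike in Javascript, a repeated group exposes all of its captures (pgnPattern is a placeholder for the full expression above, not repeated here):

using System;
using System.Text.RegularExpressions;

class PgnSketch
{
    static void Main()
    {
        string pgnPattern = "..."; // paste the full expression from this post here
        string pgn = "[Event \"Example\"]\n1. e4 e5 2. Nf3 Nc6 1-0";
        foreach (Match game in Regex.Matches(pgn, pgnPattern))
        {
            // .Net keeps every capture of a repeated group, not just the last one.
            foreach (Capture move in game.Groups["moveValue"].Captures)
                Console.WriteLine(move.Value);
        }
    }
}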

Now to work on the actual javascript (ouch!)

Update: I took my glorious regular expression and used it in Javascript code, only to find out that groups in Javascript do not act like collections of found items; they only hold the last match. In other words, if you match 'abc' with (.)* (match as many characters in a row as possible, capturing each character in turn) you will get an array that contains 'abc' as the first item and 'c' as the second. That's insane!

Update: As per Matty's suggestion, I've added the less used RxQ move syntax (I do have a hunch that it is not complete, for example stuff like RxN2, RxNa or RxNa2 might also be accepted, but they are not implemented in the regex). I also removed the need for at least one PGN tag. To avoid false positives you might still want to use the + versus the * notation after the tagName/tagValue construct. The final version is here:

(?<pgnGame>\s*(?:\[\s*(?<tagName>\w+)\s*"(?<tagValue>[^"]*)"\s*\]\s*)*(?:(?<moveNumber>\d+)(?<moveMarker>\.|\.{3})\s*(?<moveValue>(?:[PNBRQK]?[a-h]?[1-8]?x?(?:[a-h][1-8]|[NBRQK])(?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?)(?:\s*(?<moveValue2>(?:[PNBRQK]?[a-h]?[1-8]?x?(?:[a-h][1-8]|[NBRQK])(?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?))?\s*(?:\(\s*(?<variation>(?:(?<varMoveNumber>\d+)(?<varMoveMarker>\.|\.{3})\s*(?<varMoveValue>(?:[PNBRQK]?[a-h]?[1-8]?x?(?:[a-h][1-8]|[NBRQK])(?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?)(?:\s*(?<varMoveValue2>(?:[PNBRQK]?[a-h]?[1-8]?x?(?:[a-h][1-8]|[NBRQK])(?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?))?\s*(?:\((?<varVariation>.*)\)\s*)?(?:\{(?<varComment>[^\}]*?)\}\s*)?)*)\s*\)\s*)*(?:\{(?<comment>[^\}]*?)\}\s*)?)*(?<endMarker>1\-?0|0\-?1|1/2\-?1/2|\*)?\s*)

The Regexlib version has also been updated in a comment (I don't know how - or if it is possible - to edit the original).

There was this FTP surrogate program that used SQL as a filesystem. I needed to store the size of the file (which was an HTML template stored as NTEXT) in the row where the content was stored. The problem is that the size of a text in a Microsoft SQL Server NTEXT column is two bytes per character, while the actual size of the content, stored web-like in UTF-8, was almost half of that.

I thought that there must be an easy way to compute it, trying to cast the string to TEXT then using LEN, trying DATALENGTH, BINARY, etc. Nothing worked. In the end I made my own function, because the size of a string in UTF-8 is documented on the Wikipedia page of that encoding: 1 byte for ASCII characters (code below 128), 2 bytes for codes below 2048, 3 bytes for codes below 65536 and 4 bytes for the rest. So here is the SQL function that computes the size in UTF-8:

CREATE FUNCTION [fn_UTF8Size]
(
    @text NVARCHAR(max)
)
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @i INT=1
    DECLARE @size INT=0
    DECLARE @val INT
    WHILE (@i<=LEN(@text))
    BEGIN
        -- code of the current UTF-16 code unit
        SET @val=UNICODE(SUBSTRING(@text,@i,1))
        -- add the UTF-8 byte count for this code, per the ranges above
        SET @size=@size+
            CASE
                WHEN @val<128 THEN 1
                WHEN @val<2048 THEN 2
                WHEN @val<65536 THEN 3
                ELSE 4
            END
        SET @i=@i+1
    END
    RETURN @size
END


A similar approach would work for any other encoding.
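
If you want to double-check the function from .NET code, Encoding.UTF8.GetByteCount gives the same number (with one caveat: it counts a surrogate pair as the 4 bytes of a single code point, while the SQL loop above sees two separate UTF-16 code units):

using System;
using System.Text;

class Utf8SizeCheck
{
    static void Main()
    {
        string text = "abc\u00E9\u20AC"; // 'é' takes 2 bytes in UTF-8, '€' takes 3
        Console.WriteLine(Encoding.UTF8.GetByteCount(text)); // 3 + 2 + 3 = 8
    }
}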

I was updating the content of a span element via AJAX when I noticed that the content was duplicated. It was not rocket science: get the element, replace the content with the same thing that was rendered when the page was first loaded, just with updated values. How could it be duplicated?

Investigating the DOM in the browser (any browser, btw) I've noticed something strange: When the page was first loaded, the content was next to the container element, not inside it. I've looked at the page source, only to see that it was, by all counts, correct. It looked something like this:
<p><div><table>...</table></div></p>
The DOM would show the div inside the paragraph element and the table as the sibling of the paragraph element. The behavior was fixed if the page DOCTYPE was changed from XHTML into something else.

It appears that block elements, such as div or table, are not allowed inside paragraph elements. Browsers attempt to "fix" this problem and so they change the DOM, assuming that if a table starts inside the paragraph, then you must have forgotten to close the paragraph. When I added the content via AJAX, the browser did not try to fix it in any way, as I was manipulating the DOM directly and there was no parsing phase.