I had this operation on a Javascript object that was using a complex regular expression to test for something. Usually, when you want to do that, you use the regular expression inline or as a local variable. However, given the complexity of the expression I thought it would be more efficient to cache the object and reuse it anytime.

Now, there are two gotchas when using regular expressions in Javascript. One of them is that if you want to match on a string multiple times, you need to use the global flag. For example the code
var reg=new RegExp('a',''); //the same as: var reg=/a/;
alert('aaa'.replace(reg,'b'));
will alert 'baa', because without the global flag the replace operation stops after the first match. That is why I normally use the global flag on all my regular expressions, like this:
var reg=new RegExp('a','g'); //the same as: var reg=/a/g;
alert('aaa'.replace(reg,'b'));
(alerts 'bbb')
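For anyone who wants to try the two cases side by side, here is a minimal sketch that uses console.log instead of alert so it also runs outside a browser. The variable names are only illustrative.

```javascript
// Without the global flag only the first match is replaced.
var firstOnly = 'aaa'.replace(/a/, 'b');

// With the global flag every match is replaced.
var allMatches = 'aaa'.replace(/a/g, 'b');

console.log(firstOnly);  // baa
console.log(allMatches); // bbb
```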

The second gotcha is that if you use the global flag, the lastIndex property of the RegExp object is kept between calls and the next match starts from it. So code like this:
var reg=new RegExp('a',''); //same as: /a/;
 
reg.test('aaa');
alert(reg.lastIndex);
 
reg.test('aaa');
alert(reg.lastIndex);
will alert 0 both times. Using the global flag will lead to alerting 1 and 2.
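The global variant can be sketched like this, again with console.log instead of alert. It adds a third and fourth call to show what happens once the regex runs past the end of the string: the match fails and lastIndex goes back to 0.

```javascript
// A global regex remembers where the previous match ended.
var globalReg = /a/g;
var positions = [];

globalReg.test('aaa'); positions.push(globalReg.lastIndex); // after 1st match
globalReg.test('aaa'); positions.push(globalReg.lastIndex); // after 2nd match
globalReg.test('aaa'); positions.push(globalReg.lastIndex); // after 3rd match

// The search now starts at index 3, where nothing is left to match,
// so test() returns false and lastIndex is reset.
var fourthResult = globalReg.test('aaa');
positions.push(globalReg.lastIndex);

console.log(positions.join(','), fourthResult); // 1,2,3,0 false
```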

The problem is that the solution to the first gotcha leads to the second like in my case. I used the RegExp object as a field in my object, then I used it repeatedly to test for a pattern in more strings. It would work once, then fail, then work again. Once I removed the global flag, it all worked like a charm.

The moral of the story is to be careful of constructs like _reg.test(input);
when _reg is a global regular expression. It will attempt to match from the index of the last match in any previous string.


Also, in order to use a global RegExp multiple times without redeclaring it every time, one can just manually reset the lastIndex property: reg.lastIndex=0;
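Here is a rough sketch of the situation I described, with a regular expression cached as a field on an object. The validator object, its method names and the sample strings are made up for illustration; they are not the original code.

```javascript
// Hypothetical object that caches one global RegExp and reuses it.
var validator = {
  _reg: /\d+/g,

  // Naive version: the result depends on where the previous match ended.
  naiveTest: function (input) {
    return this._reg.test(input);
  },

  // Safe version: rewind the regex before every use.
  safeTest: function (input) {
    this._reg.lastIndex = 0;
    return this._reg.test(input);
  }
};

var inputs = ['abc123', 'abc123', 'abc123'];

var naiveResults = inputs.map(function (s) { return validator.naiveTest(s); });
var safeResults = inputs.map(function (s) { return validator.safeTest(s); });

console.log(naiveResults.join(',')); // true,false,true
console.log(safeResults.join(','));  // true,true,true
```

Removing the global flag from _reg would work just as well when all you need is test(); resetting lastIndex is for the cases where the same object also has to do global replaces.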

Update: Here is a case that was totally weird. Imagine a Javascript function that returns an array of strings based on a regular expression match inside a for loop. In Firefox it would return half the number of items that it should have. If one opened Firebug and placed a breakpoint in the loop, the list would be OK! If the breakpoint was placed outside the loop, the bug would occur. Here is the code. Try to see what is wrong with it:
types.forEach(function (type) {
    if (type && type.name) {
        var m = /(\{tag_.*\})/ig.exec(type.name);
        // type is tag
        if (m && m.length) {
            typesDict[type.name] = m[1];
        }
    }
});
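One likely explanation, given here as an assumption rather than as the original answer: engines that followed ECMAScript 3, as Firefox did at the time, created a single RegExp object for each regular expression literal and reused it on every evaluation. With the global flag, the lastIndex left by one successful exec made the next one start past the end of the next string and fail. Current engines create a new object each time the literal is evaluated, so the sketch below simulates the old behavior by passing one shared regex into a helper. The collectTags helper and the sample data are made up.

```javascript
// Made-up sample data standing in for the real `types` array.
var sampleTypes = [
  { name: '{tag_one}' },
  { name: '{tag_two}' },
  { name: '{tag_three}' },
  { name: '{tag_four}' }
];

// Same loop as above, but the regex is passed in so one object is reused.
function collectTags(types, reg) {
  var dict = {};
  types.forEach(function (type) {
    if (type && type.name) {
      var m = reg.exec(type.name);
      if (m && m.length) {
        dict[type.name] = m[1];
      }
    }
  });
  return dict;
}

// Shared global regex: every second exec starts too far into the string.
var withGlobal = collectTags(sampleTypes, /(\{tag_.*\})/ig);
// Without the global flag lastIndex is ignored and every item matches.
var withoutGlobal = collectTags(sampleTypes, /(\{tag_.*\})/i);

console.log(Object.keys(withGlobal).length);    // 2
console.log(Object.keys(withoutGlobal).length); // 4
```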

and has 0 comments

I was looking for autobiographies, since I liked quite a few of them lately and I felt like more, and so I got two. One is interesting because it is finally in print, 100 years after the author's death. I am talking about the first volume of Mark Twain's autobiography. However, I really could not make myself read it: the language was so pompous and the content so lame that I felt pain trying to.

Not so the second book, which seemed even less likely to appeal to me, but which I did like: THE PLEASURES OF STATISTICS: The Autobiography of Frederick Mosteller. It started with a few projects that Fred Mosteller participated in, explaining the day to day concerns and situations of a statistician while working on them. I thought at first that the book was going to be all like this, so after about a third I was about to abandon the read. You see, it was all very interesting from a professional statistician's point of view, but I wanted the more personal viewpoint of the man. And so I got it. Suddenly the book changed pace and went with the early life and education of Mosteller. The end of the book again covered some cases of work, but this time with a personal touch that explained the motivation behind the acts. And finally, the editor's epilogue, written from testimonies of friends and colleagues.

In this review, a Theodore M. Porter argues that the autobiography was flawed, as it covered little of his family life and couldn't reconcile the different viewpoints that appeared in the book, like the scientific and personal. But I disagree. The autobiography was unfinished and I guess the editor did the best he could with what he had, but it couldn't have been a lot different from what Mosteller himself intended. You start with the actual work: statistics, explained in layman's terms, then you continue with the actual man, explaining the origins and education, then you get back to statistics, but examining the work from the personal viewpoint of the man described. Yes, he could have written about his family more, but it wouldn't have been about statistics. The little he does write about his wife is about how supportive she was throughout his career. And yes, the tone of the book is a bit clinical, but this is how the writer actually thought; he was a scientist in the true sense of the word and I liked this book exactly because it made me understand how such a man thinks and feels.

Even more than the structure of the book and the insight into the mind of a conscientious and brilliant scientist, what I liked most is the peek at the world in the middle of the 20th century and how strikingly different it was from what we see today. The concerns of a teacher towards the best method to get his students to understand and like the subject, the way people were getting together to solve problems and worked for years on a book or bunch of science papers, the way academia was also supportive, not only political, and most of all, to see how people can be both brilliant and empathic, both clinical in science and warm in person.

I wouldn't recommend this book to everyone. I had a hard time reading it to the end and paying attention to every bit. Nor should one study it like a school manual, because as far as I see, the book is about a man's soul and you only have to understand and feel that. Whether it is because of my autobiography fad or because I resonated with the man or for some other reason, the bottom line is that I enjoyed reading the book. Maybe you will too.

Finally I have found the chess game viewer I wanted in order to publish my own PGN games in the blog! The name is Chess Tempo PGN Viewer and it is well written, fast, supports annotations and variations and is very configurable. Most of all, it is all Javascript (sorry for the occasional Java prompts. I almost caved in and did what I swore I wouldn't ever do: have Java applets in my blog).

Please tell me if you have issues with the new chess viewer.

Luckily for me, some chess videos are for beginners like me. Here is one from OnlineChessLessons, describing a simple, clear opening called The Stonewall Attack. After watching the video I played a game with my trusted Nokia phone and managed to create a game starting with this opening. I then analysed it using chess engines Houdini and Rybka and annotated it manually. Check it out, including the variations.

Video first:

Make sure you also follow the article attached to the video.

And now my game:
[Event "29/10/2011 1:08:49 pm"]
[Date "29/10/2011"]
[White "Siderite"]
[Black "Nokia Easy5"]
[Result "1-0"]
[ECO "D05"]
[Opening "Colle"]
[Variation "5.c3 Nc6 6.Nbd2 Bd6 7.O-O O-O"]
[TimeControl "600"]
[Termination "normal"]
[PlyCount "59"]
[WhiteType "human"]
[BlackType "computer"]


1.d4 e6 2.e3 d5 3.Bd3 Bd6 4.f4 Nf6 5.Nd2 O-O 6.Ngf3 c5 7.c3 Nc6 8.O-O c4 9.Bc2 Ng4
{At this point, the engines suggest Bxh7, a classic sacrifice.}
10.Qe2
{However, I moved the queen to defend e3.}
( 10.Bxh7+ Kxh7 11.Ng5+ Kg8 12.Qxg4
{This variation wins the h7 pawn, but moves away from the spirit of the original game.}
12...Qf6 13.e4 ) 10...f5
{At this point, engines suggest Ne5, followed by knight exchange from black.}
11.h3
{I chose to shoo the knight away, but weakening g3, where the knight would love to come later.}
( 11.Ne5 Ngxe5 12.dxe5 Be7 13.b4 ) 11...Nf6 12.Ne5 Qa5 13.g4 fxg4 14.hxg4 Qb6
{engines would want me to attack the knight on f6 before moving the rook out.}
15.Rf2 ( 15.g5 Bxe5 16.dxe5 Ne8 17.Kg2 g6 18.b3
{Engines decide to try a queen side attack as well, in order to weaken the black pawn chain. I was not interested in that.}
) 15...h6
{Engines suggest attacking the rook with Ng6, which is a natural attacking move and a lovely outpost.}
16.Rh2 ( 16.Ng6 Rf7 17.g5 Nh7 18.Qh5 Ne7 19.Nxe7+ Rxe7 20.gxh6 Nf8 21.Rg2 Qc7 22.Qg5
{However at this point the game moves into queen side attacks and a slower attacking pace.}
) 16...Ne7 17.g5 hxg5 18.fxg5 Bxe5 19.dxe5 Nd7
{engines suggest now a beautiful move: Rh8, followed by a munching of black pieces or/and mate. Make sure you check out the variation.}
20.Nf3 ( 20.Rh8+ Kf7 ( 20...Kxh8
{Taking the rook leads to a quick mate.}
21.Qh5+ Kg8 22.Bh7+ Kh8 23.Bg6+ Kg8 24.Qh7# ) 21.Qh5+ g6 22.Bxg6+ Nxg6 23.Qh7+ Ke8 24.Qxg6+ Kd8 25.Rxf8+ Nxf8 26.Qf6+ Ke8 27.Nf3
{Try this variation on a chess engine to see it to the end. White is only one pawn up, but it is a passed one. The king is safe as well.}
) 20...Rb8 21.g6 Nf5 22.Rh3 Nh6 23.Kh1
{I felt like the pin on e3 was annoying and stopping me from using the black bishop. The engines recommend moving Qh2 instead, which is much better.}
( 23.Qh2 Nc5 24.Ng5 Ne4 25.Bxe4 dxe4 26.Nh7 Qc7 ( 26...Rd8
{If you wanted to know why black did not move the rook when attacked by the knight, follow this variation through.}
27.Rxh6 Rd1+ 28.Kg2 Qd8 29.Nf6+ Kf8 30.Rh8+ Ke7 31.Qh4 Rd2+ 32.Bxd2 b6 33.a4 Qxd2+ 34.Kh1 Qe1+ 35.Rxe1 Bb7 36.Rxb8 Bc8 37.Rxc8
gxf6 38.Qh7# ) 27.Nxf8 Qd8 28.Rxh6 Qg5+ 29.Kf2 Qxh6 30.Qxh6 gxh6 31.Nh7
{At this point white is a knight up, but what a boring continuation.}
) 23...Nf5 24.Qh2 Ng3+
{Engines suggest I move the king and concentrate on the attack, but I took the knight with the rook.}
25.Rxg3 ( 25.Kg1 Qxe3+ 26.Bxe3 Ne2+ 27.Kg2 Nf4+ 28.Bxf4 Rf5 29.Rh8# ) 25...Re8 26.Qh7+ Kf8 27.e4
{At this point, a mate in 8 is found.}
27...Qc7
{But with this move, mate will happen in 4.}
28.Bg5 Nf6 29.exf6 Rd8 30.Qh8# 1-0


Enjoy!

P.S. I am currently looking for a method of displaying the game as I want it on the blog: dynamic, with annotation and variation support, preferably something that is not Java and optimally something that reads the PGN from a span and replaces it with a nice looking chess interface. Right now, the usual game engine fails for some reason I need to analyse.

I've had a horrible week. It all started with a good Scrum sprint (or so I thought) followed by a period of quiet in which I could concentrate on my own ideas. And one of my ideas was to optimize the structure of the solution we work on, containing 48 projects, in order to save space and compilation time. In my eyes, I was a hero, considering that for a company with tens to hundreds of devs, even a one second increase in speed would be important. So I set about doing that.

Of course, the sprint was not as good as I had imagined. A single stored procedure led to no fewer than four bugs in production, with me being to blame for them all. People lost more time working on reproducing the bugs, deploying the fix, code reviewing, etc. At long last I thought I was done with it and I could show everyone how great the solution looked now (on my computer) and atone for my sins.

So from a solution that took 700MB clean and 4GB after compilation, I managed to get it to a maximum of 1.4GB. In fact, it was so small I could put it all in a RAM disk, leading to enormous speeds. In comparison, a normal drive goes to about 30MB per second, an SSD drive (without encryption) goes to about 250MB/s, while my RAM disk was running at a whopping 3.6GB/s. That sped up the compilation and parsing of files. Moreover, I had discovered that MsBuild has this /m parameter that makes it use more processors. A compilation would go to about 40 seconds, down from two minutes and a half. Great! Alas, it was not to be so easy.
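To put those figures in perspective, here is a back-of-the-envelope calculation. It is only a sketch: it assumes one purely sequential pass over the whole 1.4GB (taken as 1400MB) at the throughputs quoted above, which a real build never does.

```javascript
// Figures quoted in the post; 1.4GB is treated as 1400MB for simplicity.
var solutionMB = 1400;
var throughputMBps = { hardDrive: 30, ssd: 250, ramDisk: 3600 };

// Seconds needed to read the whole solution once at each throughput.
var secondsToRead = {};
Object.keys(throughputMBps).forEach(function (name) {
  secondsToRead[name] = solutionMB / throughputMBps[name];
});

// Two and a half minutes (150 seconds) down to 40 seconds.
var buildSpeedup = 150 / 40;

console.log(secondsToRead.hardDrive.toFixed(1)); // 46.7
console.log(secondsToRead.ssd.toFixed(1));       // 5.6
console.log(secondsToRead.ramDisk.toFixed(2));   // 0.39
console.log(buildSpeedup);                       // 3.75
```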

First of all, the steps I was considering were simple:
  • Take all projects and make them have a single output folder. That would decrease the size of the solution since there would be no copies of the .dll files. Then the sheer speed of the compilation would have to increase, since there would be less copying and less compilation.
  • More importantly, I was considering making a symlink to a RAM drive and using it instead of the destination folder.
  • Another step I was considering was making all references to the dll files in the output folder, not to the projects, allowing for projects to be opened independently.


At first I was amazed the solution decreased in size so much and I just placed the entirety of it into a RAM drive. This fixed some of the issues with Visual Studio, because when I was selecting a file through a symlink to add as a reference, it would resolve to the target folder instead of the name of the symlink. And it wasn't easy either. Imagine removing all project references and replacing them with dll references for 48 projects. It took forever.

Finally I had the glorious compilation. Speed, power, size, no warnings either (since I also worked on that) and a few bug fixes thrown in there for good measure. I was a god! Then the problems appeared.

Problem 1: I had finished the previous sprint with a buggy stored procedure committed to production. Clients were losing money and complaining. That put a serious dent in my pride, especially since there were multiple problems coming from both less attention to how I wrote the code to downright lack of knowledge of the flow of the application. For the last part I am not really the only one to blame, but it was my responsibility.

Problem 2: The application was throwing some errors about the target framework of a dll. It was enough to make me understand a major flaw in my design: there were .Net 3.5 and .Net 4.0 assemblies in the solution and placing them all in the same output folder would break some build scripts. Even worse, the 8 web projects in the solution needed to have their output in the bin folder, so that IIS would find them. Fixed it only to see the size of the solution rise back to 3GB.

Problem 3: Visual Studio would not be so smart as to understand that if a project is loaded, going to the declaration of a member in the compiled assembly means I want to see the actual source, not the IL code. Well, sometimes it worked, but sometimes it didn't. As a result I restored the project references instead of the assembly references.

Problem 4: the MsBuild /m flag would do wonders on my machine, but it would not do much on the build server. Nor would it do its magic on slower, less multiprocessor computers than my own.

Problem 5: Facing a flood of problems coming from me, my colleagues lost faith and decided to not even try the modifications that removed the compilation warnings from the solution.

Conclusion: The build went marginally faster, but not enough to justify a whole week of work on it. The size decreased by 25%, making it feasible to put it all in a RAM Drive, so that was great, to the detriment of working memory. I still have to see if that is a good or a bad thing. The multiprocessor hacks didn't do much, the warnings are still there and even some of my bug fixes were problematic because someone else also worked on them and didn't tell anyone. All in a week's work.

Things I have learned from all this: Baby steps. When I feel enthusiasm, I must take it as a sign of trouble. I must be dispassionate as an ice cube and think things through. If I am working on a branch, integrate the trunk into it every day, so as to not make it harder to do at the end. When doing something, do it from start to finish, no matter what horrors I see while doing it. Move away from Sodom and not look back at it. Someone else will fix that, maybe, you just do your task well. When finishing something, commit it into the source control so it can easily be reverted through a single atomic operation.

It is difficult for me to adjust to something that involves this amount of planning and focus. I feel as if the chaotic development years of my youth were somewhat better, even if at the time I felt that it was stupid and that focus and planning were needed. As a good Romanian, I am neurotic enough to see the worst side of everything, a master at complaining about it, but incapable of actually doing something. Yeah... this was a bad week.

No news from my personal or work fields. However, I've found two interesting news items just today and I wanted to share them.

First, the development of a camera to capture pictures that you can focus later. Although I have heard of solid metal lenses that would cost less than $1 to make and would achieve the same effect, the only actually functioning system I've heard of so far is the Lytro Living Picture camera. Here is an Ars Technica article on it and here is a YouTube video demo.

The second news item is more IT related. It involves the cryptographic standards for XML, as defined by W3C. They failed! Here is an article about how they were cracked by using a vulnerability in the Cipher Block Chaining and here is a link to their press release.


The complete title of the book is How Life Imitates Chess: Making the Right Moves, from the Board to the Boardroom, which is a mouthful, but very precise. It does explain how principles of chess, economics and politics apply in all three fields and how Garry Kasparov has evolved from chess player to world champion and, nowadays, into a political anti-Putin figure.

If you ask me, abandoning chess to get into business and, worse, politics is a complete loss. However I do understand the guy, he got bored. Someday I may abandon computer programming.

Back to the book, though, it felt a lot like The Art of Learning, also written by a brilliant chess player (who incidentally also abandoned chess... hmm). It was more precise, more logical, though, looking at things from more of a clinical perspective. I would have wanted to learn more about Kasparov's relationship with Karpov, for example, since he is always calling him his nemesis, but never says anything about how he felt about the guy.

This book is peppered with good advice, historic comparisons and great quotes from chess players and great men. Also, short descriptions of the relationship between famous "chess pairs" are giving the book an extra chess dimension. All in all I recommend it highly, although it felt more like a useful reference than a soul book like The Art of Learning.

I really wanted to wait a little longer before writing this post, but the sheer number of series that appeared forced my hand. So here it goes:

  • Doctor Who - As I was saying before, the show is slightly darker now, with the Doctor dying and River Song being somehow involved... It's kind of fun. The new actor, Matt Smith, has matured in the show and exudes more confidence on the set.

  • Torchwood - Continuing the format of the third "season", that of a continuous miniseries, season 4 explores dark cabals that manipulate the world and use a device to make everyone immortal. Fascist like camps for people who would normally be dead are created, complete with crematoria. Some interesting ideas there, but the acting and scripting were pretty poor.

  • The Sarah Jane Adventures - I was surprised to see that the new season of Sarah Jane Smith started with Elisabeth Sladen still in the lead role. As you may know, the actress succumbed to cancer, so these are previously filmed episodes. Sarah Jane adopted a new alien child, a girl this time, and she has the curious ability to grow spontaneously. Maybe she will become the next Sarah Jane? The show is pretty nice and doesn't seem so childish anymore. It is almost on par with its progenitor show, Doctor Who.

  • Eureka - It continues to be brainless fun, a show basically held together by the lead actor, Colin Ferguson. As a funny side note, The Guild's very own (and smoking hot) Felicia Day shows up, together with show colleague (and Wesley Crusher from Star Trek) Wil Wheaton. They both act like their Guild characters, too :)

  • House MD - The eighth season gets rid of useless Cuddy and gains a small Asian girl prodigy who stands up to House and gains his respect. We know she is going to leave dejected and heartbroken, as all the others did, but at least she's a change from the stuck up bimbo role that usually filled the gap. She is not smoking hot, though, and that is an issue that needs resolving!

  • Criminal Minds - I couldn't get myself to watch any of the episodes in season 7. It is basically the lack of time that is the cause. I will get to it eventually.

  • Dexter - The show is back on track with the usual Dexter, lovable and deadly. As opposed to the fourth book in the series, this season seems to explore religious cultism rather than cannibalism. Works for me. BSG's Edward James Olmos is the bad guy here, joined by Colin Hanks, as his acolyte.

  • Fringe - Peter Bishop has healed the wound between the two universes, but erased himself from existence in the process. Or has he? His father and Olivia keep getting glimpses of him. Same old show, in rest. The weird experimentation at the end of the last season seems to have ended. I don't know if that is good or bad.

  • True Blood - In the end it was still satisfying to watch. Not great, but certainly entertaining. Tara (finally) dies, as well as the annoying werewolf bitch, the pathetic old witch and the ridiculous gay nurse guy. A culling of all obnoxious characters can be good.

  • Men of a Certain Age - I keep nagging the wife to keep watching the show, as we started watching it together, but she keeps delaying. She doesn't want me to watch it alone either. Girls!

  • Weeds - After 7 years, the hot MILF Mary-Louise Parker is still hot! Anything else about the show kind of sucks, though. Same old convoluted plots that somehow revolve around marijuana, but have ceased being about the light drug for quite some time. I am still watching it because of the MILF factor, naturally.

  • The Good Wife - As Aron Nimzowitsch used to say, the threat is stronger than the execution. The fact that Alicia is finally banging Will removes the thrill of the sexual tension between them. The personal issues of the characters also begin to take a larger role than actual court cases. That is not good. The show is still entertaining, but has lost some of its appeal (pun not intended)

  • Haven - still silly, but now has gotten more confrontational, with "troubled" being discriminated against by normal people. Also we learn that Audrey is not really Audrey, nor is she the woman that looked like her and was in the town a few decades ago. Instead, it seems she appears whenever the troubles start and she always has the memories of someone else.

  • Royal Pains - I haven't watched it and I don't miss it. The wife doesn't seem to consider it either. It's personally cancelled.

  • Lost Girl - A new Ash has been chosen, one that is less lenient than the one before and who is quite annoying. There is a sort of cold war between him and our lovely succubus. An old girlfriend of Dyson also appears. Wrraawr! (or is it Growl! ?) The show is fun enough.

  • Nikita - The second season has started with gorgeous Amanda now heading Division. Except for the ridiculously good looking actors, the show's got nothing for me anymore. I am discontinuing it from my show list.

  • Falling Skies - Still waiting for season 2.

  • South Park - I've just watched the third episode from season 15. It is the only one that was funny from the season. I hope this is the beginning of a trend.

  • The Killing - Still on my watch list, didn't get around to it.

  • Mortal Kombat Legacy - The web show stopped at episode 9, just when it was getting interesting. Oh, well...

  • Suits - The very light courtroom male show is still fun to watch, although it has become sort of repetitive. The betrayed friend that turns on the main character is both exaggerated and an obvious season end cliff hanger.

  • Camelot - Claire Forlani and Eva Green, free boobs or not, are not enough to make me watch this series. Sorry, girls!

  • Wilfred - Unwatched episodes are rotting in my view list and I could not force myself to watch them. I guess this is another show I will stop watching.
  • Breaking Bad - A full season is waiting for me to watch. Whenever I had the time, I didn't have the mood to watch it. Maybe it is still fun, but I don't know yet. I fear disappointment.

  • Californication - Waiting for the next season.


And now for new shows:

  • King - Beautiful police woman... bla bla bla. I can google the actress and drool at her pictures instead of watching this average police drama. Not a new show, because I mentioned it in the last post, but it is part of a larger group of "do not want"s

  • The Protector - Blonde police woman protects the city. Yeah, right.

  • Against the Wall - Blonde police woman takes a job in Internal Affairs, much to the dismay of her policeman father and three policeman brothers, who feel she switched sides. Some funny situations and an interesting concept kept me watching this for a few episodes, but that's about it.

  • Prime Suspect - Blonde police woman fights to gain acceptance in a misogynistic crime unit. The cases are brutal, so is her mistreatment from her colleagues. That would make for an interesting premise if it weren't for my lack of time and ... because she is a police woman! And like it or not, this is no misogyny, either. There are just too many shows about that at a time when I've had it up to my ears with police dramas.

  • Unforgettable - Redhead ex police woman (and smoking hot, too), can't forget anything she sees. She teams up with ex boyfriend to... solve police cases. Yes, he is a cop. If not for the lead actress, a completely forgettable show.

  • Ringer - Sarah Michelle Gellar is NOT a police woman. I like that. Instead she is a police witness that runs away from testifying and/or getting killed by a drug kingpin. Her solution is to take the role of her twin sister who mysteriously commits suicide. She finds herself married to a gold digging con artist, having an interior decorator best friend, having an affair with the husband of her best friend and so on and so on. I didn't buy it. Sorry.

  • Suburgatory - The short plotline sounded interesting: a single dad moves into the suburbs with urban, mischievous teenage daughter. The execution, though, was ridiculous. One of those obvious shows that spoon feed you anything you need to think or enjoy. Pass!

  • How to be a Gentleman - Drama from Entourage was one of the lead actors. Unfortunately for him, the show got cancelled almost immediately. It was also one of those background "ha ha" things. Awful!

  • Beavis&Butthead - Yes! My childhood show is coming back. I don't know when, but when it will start it will either be disappointing or epic. I am betting on epic.

  • American Horror Story - Interesting TV series remake of The Shining. I didn't really enjoy the film and I don't like how King writes his books. I tried to watch the first episode, but stopped in the middle. Not my type of show. Also, the "horror" is one forced by sound effects and obvious film tricks. I hated it.

  • Homeland - Haven't started to watch it yet, but it sounds promising.

  • Hung - Season 2 started, but I couldn't get myself to watch it. I guess I won't watch it.

  • Lie to Me - A lot of people recommended this show. I finally started to watch it. The insights in human behaviour are very interesting, especially since they are showing pictures or videos of celebrities in similar situations to the fictive situations in the episodes. There is much to learn there. However the structure of the show is a combination of CSI and House. All too fast, trying to seem intelligent and smart, but creating a soulless mechanical feel. I will be watching it occasionally for the science alone.

  • Terra Nova - Finally a sci-fi show! Humanity has botched the future, so they come back into the past, during the age of the dinosaurs. The main character is a cop. WHY?!?! OH, WHY?!?! The show is ok, but full of clichés and the idea of a father of three fending for his family in a colony of humans in the Cretaceous... brrrr, who the hell thought that up? The science is nil and the family issues override any decent sci-fi in the show. Too bad. I will watch it, still, but only because of lack of alternatives.

  • The Fades - By a long shot, this is the best of the new shows. It is a British horror series involving dead people trying to get back to life by eating living flesh. They are not zombies, they are Fades, and they make a lot more sense than other undead concepts. The main character is a clueless kid who happens to have special powers. Watch this, it is very nice, indeed.

  • The Field of Blood - Well, it appears as a series, but for now it was a two parter film. A heavy Glasgow accent makes this hard to watch with no translation, but a nice film anyways. A sort of journalist drama in the 80's, with crimes and family issues and rampant religious ideology.

  • The Secret Circle - A tv series remake of The Craft, as far as I see, with good looking teens being... witches! At least they are not vampires. The out of control member of the circle doesn't come even close to the beautiful craziness of Fairuza Balk. Baby, I still love you! Btw, the show is not worth watching, better rewatch The Craft a few times.

  • Hidden - Not watching it yet, but it is on my list


Upcoming:

  • Grimm - A fantasy series related to fairy tales. To my chagrin this is what Wikipedia says: The show has been described as "a cop drama—with a twist...a dark and fantastical project about a world in which characters inspired by Grimm's Fairy Tales exist". Cop drama? Really?!

  • Once Upon a Time - Another fairy tale series, with actresses from House and Big Love and whatever shows ended or from which people left.

  • Hell on Wheels - a Western tv series. Could that work? We'll have to see



I have been pruning away at the TV series I have been watching, in order to gain some control over my personal life. But whenever I do that, more TV shows appear. It's like I need a hot redhead police woman to investigate these serial murders of my time and bring the culprits down in a climactic ending. But with my luck, I will probably find myself in cliff hangers at the end of each season!

A question arose at the office today: What is faster? Using a Dictionary<string,object> with a StringComparer.OrdinalIgnoreCase constructor parameter or using a normal constructor call and calling ToLower on the key before using it? The quick answer: using ToLower on the key.

The longer answer is that StringComparer.OrdinalIgnoreCase implements IEqualityComparer<string> by using a native code function for GetHashCode(), which is very efficient. Unfortunately, it must use the case insensitive string comparison on both the input key and the stored keys, while calling ToLower on the keys before using them means the case handling is done only once per key.
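The "normalize the key once" idea is not specific to .NET. As a rough illustration only, here it is in Javascript, to stay with the language used in the other posts. The LowerCaseKeyMap wrapper is hypothetical; it is not a translation of Dictionary and it says nothing about the .NET timings.

```javascript
// Hypothetical wrapper: lower-cases the key once, then does exact lookups.
function LowerCaseKeyMap() {
  this._map = new Map();
}

LowerCaseKeyMap.prototype.set = function (key, value) {
  this._map.set(key.toLowerCase(), value);
  return this;
};

LowerCaseKeyMap.prototype.get = function (key) {
  return this._map.get(key.toLowerCase());
};

LowerCaseKeyMap.prototype.has = function (key) {
  return this._map.has(key.toLowerCase());
};

var settings = new LowerCaseKeyMap();
settings.set('ConnectionString', 'server=local');

console.log(settings.get('CONNECTIONSTRING')); // server=local
console.log(settings.has('connectionString')); // true
console.log(settings.has('other'));            // false
```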

and has 1 comment
Geek Love is another title that misled me. Geek, in this book, refers to the original definition of the word "A carnival performer who does wild or disgusting acts" and not a cool lovable geek like myself. The subject is "controversial": the life of an albino hunchbacked dwarf woman as a member of a carnival family.

There is a lot going on in the book. Carnival people poison themselves in order to make their children as freaky as possible. Said children are then raised only if they are strange enough. The failures are either preserved in glass jars if they are too mutated, or given away if they are too normal. The successes range from the main character, to a sociopath hairless cult leader with flippers instead of members, to siamese sisters that have the same lower body to a telekinetic God like child who only wants to be loved and gets manipulated into doing stuff for others. The sisters later give birth to a grotesquely obese child, fathered by a man with half his face blown up who squirts inside their vagina in his death moment. The death and the face blowing are unrelated. There is more, like a rich heiress who pays beautiful women to mutilate themselves in order to have a better life, unencumbered by the sexual desire of her subjects or of the people surrounding them.

The book itself was pretty innovative, but rather boring. If I had an alternative, I wouldn't have finished it, but as I had not, I am a bit proud of having finished it. If nothing else, the book is strange enough to be interesting. Also the writing is pretty good, introverted, taking the reader inside the mind of a person who considers normal people too bland and treasures her deformity and that of her daughter, fathered through telekinesis with the sperm of her brother.

Don't get me wrong, it was not painful reading the book and I do not regret having read it. However, I wish there was something more interesting in the story other than the strangeness of one's thoughts.

Short version: here is the link to the uploaded .NET regular expression. (look in the comment for the updated version)

I noticed that the javascript code that I am using to parse PGN chess games and display them is rather slow and I wanted to create my own PGN parser, one that would be optimal in speed. "It should be easy", I thought, as I was imagining getting the BNF syntax for PGN, copy pasting it into a parser generator and effortlessly getting the Javascript parser that would spit out all secrets of the game. It wasn't easy.

First of all, the BNF notation for Portable Game Notation was not complete. Sure, text was used to explain the leftovers, but there was no real information about it in any of the "official" PGN pages or Wikipedia. Software and chess related FTPs and websites seemed to be terribly obsolete or missing altogether.

Then there was the parser generator. Wikipedia tells me that ANTLR is pretty good, as it can spew Javascript code on the other end. I downloaded it (a .jar Java file - ugh!), ran it, pasted BNF into it... got a generic error. 10 minutes later I was learning that ANTLR does not support BNF, but only its own notation. Searches for tools that would do the conversion automatically led me to smartass RTFM people who explained how easy it is to do it manually. Maybe they should have done it for me, then.

After all this (and many fruitless searches on Google) I decided to use regular expressions. After all, it might make a lot of sense to have a parser in a language like C#, but the difference in speed between a Javascript implementation and a native regular expression should be pretty large, no matter how much they optimize the engine. Ok, let's define the rules of a PGN file then.

In a PGN file, a game always starts with some tags, explaining what the event is, who played, when, etc. The format of a tag is [name "value"]. There are PGN files that do not have this marker, but then there wouldn't be more than one game inside. The regular expression for a tag is: (\[\s*(?<tagName>\w+)\s*"(?<tagValue>[^"]*)"\s*\]\s*)+. Don't be scared, it only means some empty space maybe, then a word, some empty space again, then a quoted string that does not contain quotes, then some empty space again, all in square brackets and maybe followed by more empty space, all of this appearing at least once.
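As a rough illustration of the tag expression above, here is a minimal sketch in Javascript. It assumes an engine that supports named capture groups (ECMAScript 2018 or later), and the names TAG_PATTERN, parseTags and the sample game are made up for this example. It matches a single tag at a time and iterates, rather than using the trailing + of the original expression.

```javascript
// Sketch: extract PGN tag pairs such as [White "Alice"].
// Assumption: the engine supports named capture groups (ES2018+).
const TAG_PATTERN = /\[\s*(?<tagName>\w+)\s*"(?<tagValue>[^"]*)"\s*\]/g;

// Hypothetical helper: returns an object mapping tag names to tag values.
function parseTags(pgnText) {
  const tags = {};
  // matchAll visits every non-overlapping match in the string
  for (const match of pgnText.matchAll(TAG_PATTERN)) {
    tags[match.groups.tagName] = match.groups.tagValue;
  }
  return tags;
}

// Made-up sample data
const sample = '[Event "Test Game"]\n[White "Alice"]\n[Black "Bob"]\n\n1. e4 e5 *';
const tags = parseTags(sample);
console.log(tags);
```

Matching one tag per iteration sidesteps the problem that a repeated group only keeps its last capture in Javascript.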

So far so good, now comes the list of moves. The simplest possible move looks like 1. e4, so a move number and a move. But there are more things that can be added to a move. For starters, the move for black could be following next (1. e4 e5) or a bit after, maybe if there are commentaries or variations for the move of the white player (1... e5). The move itself has a variety of possible forms:
  • e4 - pawn moved to e4
  • Nf3 - knight moved to f3
  • Qxe5 - queen captured on e5
  • R6xf6 - the rook on the 6th rank captured on f6
  • Raa8 - the rook on the a file moved to a8
  • Na1xc2 - the knight on a1 captured on c2
  • f8=Q - pawn moved to f8 and promoted to queen
  • dxe8=B - pawn on the d file captured on e8 and promoted to bishop


There is more information about the moves. If you give check, you must end the move with a + sign, if you mate you end it with #, if the move is weird, special, very good, very bad, you can end it with stuff like !!, !?, ?, !, etc which are the PGN version of WTF?!. And if that is not enough, there are some numbers called NAG which are supposed to represent a numeric, language independent, status code. Also, the letters that represent the pieces are not language independent, so a French PGN might look completely different from an English one. So let's attempt a regular expression for the move only. I will not implement NAG or piece letters for non-English languages: (?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?. I know, scary. But it means a letter in the list PNBRQK, one for each possible type of chess piece, which may appear or it may not, then a letter between a and h, which would represent a file, then a number between 1 and 8 which would represent a rank. Both letter and number might not appear, since they represent hints on where the piece that moved was coming from. Then there is a possible letter x, indicating a capture, then, finally, the destination coordinates for the move. There optionally follows an equal sign and another piece, in case of promotion. An astute reader might observe that this also matches a rook that promotes to something else, for example. This is not completely strict. If this long expression is not matched, maybe something that looks like OO, O-O, OOO or O-O-O could be matched, representing the two possible types of castling, when a rook and a king move at the same time around each other, provided neither has moved yet. And to top it off, we allow for some empty space and the characters ! and ? in order to let chess annotators express their feelings.
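To get a feeling for what the move expression accepts, here is a small Javascript sketch. The anchors ^ and $, the non-capturing groups and the names MOVE and isMove are choices made for this example only; the move samples are the ones discussed in the text, plus two strings that should be rejected.

```javascript
// The move pattern described above, anchored so that it must match a whole token.
// Assumption of this sketch: groups are non-capturing and ^...$ anchors are added.
const MOVE = /^(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:=[PNBRQK])?|O(?:-?O){1,2})[+#]?(?:\s*[!?]+)?$/;

// Hypothetical helper: true when the token looks like a single move.
function isMove(token) {
  // No global flag is used, so test() keeps no state between calls
  return MOVE.test(token);
}

const samples = ['e4', 'Nf3', 'Qxe5', 'R6xf6', 'Raa8', 'Na1xc2',
                 'f8=Q', 'dxe8=B+', 'O-O', 'O-O-O#', 'Nf3!?', 'Zz9', 'e9'];
for (const s of samples) {
  console.log(s, isMove(s));
}
```

As noted in the text, the pattern is permissive: something like Re8=Q (a rook "promoting") is also accepted.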

It's not over yet. PGN notation allows for commentaries, which are bits of text inside curly brackets {what an incredibly bad move!} and also variations. The variations show possible outcomes from the main branch. They are lists of moves that are enclosed in round brackets. The branches can be multiple and they can branch themselves! Now, this is a problem, as regular expressions are not recursive. But we only need to match variations and then reparse them in code when found. So, let's attempt a regular expression. It is getting quite big already, so let's add some tokens that can represent already discussed bits. I will use a @ sign to enclose the tokens. Here we go:
  • @tags@ - we will use this as a marker for one or more tags
  • @move@ - we will use this as a marker for the complicated move syntax explained above
  • (?<moveNumber>\d+)(?<moveMarker>\.|\.{3})\s*(?<moveValue>@move@)(?:\s*(?<moveValue2>@move@))?\s* - the move number, 1 or 3 dots, some empty space, then a move. It can be followed directly by another move, for black. Let's call this @line@
  • (?:\{(?<varComment>[^\}]*?)\}\s*)? - this is a comment match, something enclosed in curly brackets; we'll call it @comment@
  • (?:@line@@variations@@comment@)* - wow, so simple! Multiple lines, each maybe followed by variations and a comment. This would be a @list@ of moves.
  • (?<endMarker>1\-?0|0\-?1|1/2\-?1/2|\*)?\s* - this is the end marker of a game. It should be there, but in some cases it is not. It shows the final score or an unfinished match. We'll call it @ender@
  • (?<pgnGame>\s*@tags@@list@@ender@) - The final tokenised regular expression, containing an entire PGN game.


But it is not over yet. Remember @variations@ ? We did not define it and with good reason. A good approximation would be (?:\((?<variation>.*)\)\s*)*, which defines something enclosed in parentheses. But it would not work well. Regular expressions are greedy by default, so it would just get the first round bracket and everything till the last one found in the file! Using the non greedy marker ? would not work either, as the match will stop after the first closing bracket inside a variation. Comments might contain parentheses as well.

The only solution is to better match a variation so that some sort of syntax checking is being performed. We know that a variation contains a list of moves, so we can use that, by defining @variations@ as (?:\((?<variation>@list@)\)\s*)*. @list@ already contains @variations@, though, so we can do this a number of times, to the maximum supported branch depth, then replace the final variation with the generic "everything goes" approximation from above. When we read the results of the match, we just take the variation matches and reparse them with the list subexpression, programmatically, and check extra syntax features, like the move numbers being consecutive.
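The expansion of the tokens to a fixed branch depth can be sketched in Javascript like this. It is only an illustration under a few assumptions: the groups are left unnamed, the whole text is anchored with ^ and $, tags are left out, and the names MOVE, LINE, COMMENT, listSource and buildMovetextRegex are invented for the example.

```javascript
// Building blocks, written as regular expression source strings
const MOVE = String.raw`(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:=[PNBRQK])?|O(?:-?O){1,2})[+#]?(?:\s*[!?]+)?`;
const LINE = String.raw`\d+(?:\.{3}|\.)\s*${MOVE}(?:\s*${MOVE})?\s*`;
const COMMENT = String.raw`(?:\{[^}]*\}\s*)?`;

// Returns the source of a list of moves in which `depth` levels of variations
// are parsed as lists themselves; below that, a variation is the greedy
// "everything goes" approximation.
function listSource(depth) {
  const inner = depth > 0 ? listSource(depth - 1) : '.*';
  const variations = String.raw`(?:\(\s*${inner}\s*\)\s*)*`;
  return `(?:${LINE}${variations}${COMMENT})*`;
}

// Hypothetical helper: a regex for the move text of one game (no tags),
// followed by an optional end marker.
function buildMovetextRegex(depth) {
  return new RegExp(`^\\s*${listSource(depth)}(?:1-0|0-1|1/2-1/2|\\*)?\\s*$`);
}

const game = '1. e4 e5 (1... c5 2. Nf3 (2. c3 d5) 2... d6) 2. Nf3 Nc6 {main line} *';
const movetext = buildMovetextRegex(2);
console.log(movetext.test(game));
console.log(movetext.test('1. e4 e5 (1... c5'));
```

The generated expression grows quickly with the depth, so a small maximum depth is probably the practical choice.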

It is no wonder that at the Regular Expressions Library site there was no expression for PGN. I made the effort to upload it, maybe other people refine it and make it even better. Here is the link to the uploaded regular expression. The complete regular expression is here:
(?<pgnGame>\s*(?:\[\s*(?<tagName>\w+)\s*"(?<tagValue>[^"]*)"\s*\]\s*)+(?:(?<moveNumber>\d+)(?<moveMarker>\.|\.{3})\s*(?<moveValue>(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?)(?:\s*(?<moveValue2>(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?))?\s*(?:\(\s*(?<variation>(?:(?<varMoveNumber>\d+)(?<varMoveMarker>\.|\.{3})\s*(?<varMoveValue>(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?)(?:\s*(?<varMoveValue2>(?:[PNBRQK]?[a-h]?[1-8]?x?[a-h][1-8](?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?))?\s*(?:\((?<varVariation>.*)\)\s*)?(?:\{(?<varComment>[^\}]*?)\}\s*)?)*)\s*\)\s*)*(?:\{(?<comment>[^\}]*?)\}\s*)?)*(?<endMarker>1\-?0|0\-?1|1/2\-?1/2|\*)?\s*)


Note: the flavour of the regular expression above is .NET. At the time of writing, Javascript did not support named groups, the things between the angle brackets (they were only added in ECMAScript 2018), so if you want to make it work in older engines, remove the ?<name> constructs from it.

Now to work on the actual javascript (ouch!)

Update: I took my glorious regular expression and used it in a javascript code only to find out that groups in Javascript do not act like collections of found items, but only the last match. In other words, if you match 'abc' with (.)* (match as many characters in a row, and capture each character in part) you will get an array that contains 'abc' as the first item and 'c' as the second. That's insane!
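A short Javascript sketch of the behaviour described above, and of one possible workaround: match a single item per iteration with the global flag instead of repeating a group. The helper name allMatches and the sample strings are made up for the example.

```javascript
// A repeated capturing group only keeps its last capture
const m = 'abc'.match(/(.)*/);
console.log(m[0], m[1]); // abc c

// Workaround: let the pattern match one item and iterate over all matches
const chars = Array.from('abc'.matchAll(/./g), (x) => x[0]);
console.log(chars); // [ 'a', 'b', 'c' ]

// Hypothetical helper: collects the first group of every match.
// The pattern must carry the g flag, otherwise matchAll throws a TypeError.
function allMatches(pattern, text) {
  return Array.from(text.matchAll(pattern), (x) => x[1]);
}

const moveNumbers = allMatches(/(\d+)\.\s*\S+/g, '1. e4 e5 2. Nf3 Nc6');
console.log(moveNumbers); // [ '1', '2' ]
```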

Update: As per Matty's suggestion, I've added the less used RxQ move syntax (I do have a hunch that it is not complete, for example stuff like RxN2, RxNa or RxNa2 might also be accepted, but they are not implemented in the regex). I also removed the need for at least one PGN tag. To avoid false positives you might still want to use the + versus the * notation after the tagName/tagValue construct. The final version is here:

(?<pgnGame>\s*(?:\[\s*(?<tagName>\w+)\s*"(?<tagValue>[^"]*)"\s*\]\s*)*(?:(?<moveNumber>\d+)(?<moveMarker>\.|\.{3})\s*(?<moveValue>(?:[PNBRQK]?[a-h]?[1-8]?x?(?:[a-h][1-8]|[NBRQK])(?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?)(?:\s*(?<moveValue2>(?:[PNBRQK]?[a-h]?[1-8]?x?(?:[a-h][1-8]|[NBRQK])(?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?))?\s*(?:\(\s*(?<variation>(?:(?<varMoveNumber>\d+)(?<varMoveMarker>\.|\.{3})\s*(?<varMoveValue>(?:[PNBRQK]?[a-h]?[1-8]?x?(?:[a-h][1-8]|[NBRQK])(?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?)(?:\s*(?<varMoveValue2>(?:[PNBRQK]?[a-h]?[1-8]?x?(?:[a-h][1-8]|[NBRQK])(?:\=[PNBRQK])?|O(-?O){1,2})[\+#]?(\s*[\!\?]+)?))?\s*(?:\((?<varVariation>.*)\)\s*)?(?:\{(?<varComment>[^\}]*?)\}\s*)?)*)\s*\)\s*)*(?:\{(?<comment>[^\}]*?)\}\s*)?)*(?<endMarker>1\-?0|0\-?1|1/2\-?1/2|\*)?\s*)

The Regexlib version has also been updated in a comment (I don't know how - or if it is possible - to edit the original).

There is this childish game called "cordless phone", which funnily enough is older than any possible concept of wireless telephony, where in a large group of people a message is sent to someone else by whispering it to your neighbour. Since humans are not network routers, small mistakes creep into the message as it is copied and resent (hmm, there should be a genetic reference here somewhere as well).

The point is that, given enough people with their own imperfections and/or agendas, a message gets distorted as the number of middle men increases. It also happens in the world of news. Some news company invests in news by paying investigative reporters. The news is created by a human interpreting things from eye witness accounts to scientific papers, but then it is reported by other news agencies, where the original information is not the main source, but the previous news report. Then marketing rears its ugly head, as the titles need to be more shocking, more impressive, forcing the hapless reader to open that link, pick up that paper, etc. Occasionally there are translation errors, but mostly it is about idiots who don't and can't understand what they are reporting on, so the original message gets massacred!

So here is one of the news items of today, re-reported by Romanian media, after translation and obfuscation and marketization (and retranslation by me, sorry): "Einstein was wrong? A particle that is travelling at more than the speed of light has been discovered". In the body, written a little better, "Elementary subatomic particle" got translated as "Elementary particle of matter". Dear "science" reporters, the neutrino is not a particle that needed discovering and it is not part of normal matter, with which it interacts very little. What is new is just the strange behaviour of the faster than light travel, which is only hinted at by some data that may or may not be correct and is refuted by other data, like that from supernova explosions, information that you haven't even bothered to copy paste into your article. And, as if this was not enough, the comments of the readers, kind of like myself ranting here probably, are making the reporter seem brilliant in comparison.

Is there a solution? Not really. People should try to find the original source of messages as much as possible, or at least a reporting source that is professional enough to not skew the information too much when summarizing it for the general public. A technical solution could work that would analyse news reports, group them per topic, then remove copies and translations, red flag emotional language or hidden divergent messages and ignore the titles altogether, maybe generate new ones. And while I know this is possible to do, it would be very difficult (but possibly rewarding) as software goes. One thing is for certain: reading the titles and assuming that they correctly summarize the complete articles is a terrible mistake, alas, one that is very common.

There was this FTP surrogate program that used SQL as a filesystem. I needed to store the size of the file (which was an HTML template and was stored as NTEXT) in the row where the content was stored. The problem is that a text in a Microsoft SQL Server NTEXT column takes about two bytes per character, while the actual size of the content, stored web like in UTF8, was different, down to almost half.

I thought that there must be an easy way to compute it, trying to cast the string to TEXT then using LEN, trying DATALENGTH, BINARY, etc. Nothing worked. In the end I made my own function, because the size of a string in UTF8 is documented on the Wikipedia page of that encoding: 1 byte for ASCII characters (character code<128), 2 bytes for codes below 2048, 3 for codes below 65536 and 4 for the rest. So here is the sql function that computes the size in UTF8:

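CREATE FUNCTION [fn_UTF8Size]
(
    @text NVARCHAR(max)
)
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @i INT=1
    DECLARE @size INT=0
    DECLARE @val INT
    -- LEN ignores trailing spaces, so append a sentinel character and subtract it
    DECLARE @len INT=LEN(@text+N'.')-1
    WHILE (@i<=@len)
    BEGIN
        -- code of the current character (a UTF-16 code unit under non-SC collations)
        SET @val=UNICODE(SUBSTRING(@text,@i,1))
        SET @size=@size+
            CASE
                WHEN @val<128 THEN 1
                WHEN @val<2048 THEN 2
                -- half of a surrogate pair: the two halves together take 4 bytes
                WHEN @val BETWEEN 55296 AND 57343 THEN 2
                WHEN @val<65536 THEN 3
                ELSE 4
            END
        SET @i=@i+1
    END
    RETURN @size
END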


A similar approach would work for any other encoding.
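For comparison, the same byte counting rule can be sketched in Javascript. The function name utf8Size is made up for this example; it is a counterpart of the SQL function above for illustration, not a replacement for it.

```javascript
// Sketch: number of bytes a string takes when encoded as UTF-8.
function utf8Size(text) {
  let size = 0;
  // for...of iterates by code point, so a surrogate pair is visited once
  for (const ch of text) {
    const code = ch.codePointAt(0);
    if (code < 0x80) size += 1;         // ASCII
    else if (code < 0x800) size += 2;   // below 2048
    else if (code < 0x10000) size += 3; // below 65536
    else size += 4;                     // the rest
  }
  return size;
}

console.log(utf8Size('abc')); // 3
console.log(utf8Size('é'));   // 2
console.log(utf8Size('€'));   // 3
console.log(utf8Size('😀'));  // 4
```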

I was updating the content of a span element via AJAX when I noticed that the content was duplicated. It was no rocket science: get element, replace content with the same thing that was rendered when the page was first loaded, just with updated values. How could it be duplicated?

Investigating the DOM in the browser (any browser, btw) I've noticed something strange: When the page was first loaded, the content was next to the container element, not inside it. I've looked at the page source, only to see that it was, by all counts, correct. It looked something like this:
<p><div><table>...</table></div></p>
The DOM would show the div inside the paragraph element and the table as the sibling of the paragraph element. The behavior was fixed if the page DOCTYPE was changed from XHTML into something else.

It appears that block elements should not be placed inside elements like p, which may only contain inline content. The browsers attempt to "fix" this problem and so they change the DOM, assuming that if a table starts inside the paragraph, then you must have forgotten to close the paragraph. When I was adding the content via AJAX, the browser did not seem to want to fix it in any way, as I was manipulating the DOM directly and there was no parsing phase.

I have been reviewing my blog posts for the last few months and I noticed a troubling trend: a lot more social commentary and hobby related stuff than actual tech work. Check out this statistic of posts in the last three months:
  • TV and Movie: 5
  • Books: 6
  • Personal or hobby: 6
  • Social commentary: 1
  • Tech: 8
8 is marginally more than 6, but split them between misc and programming and you get 18 misc for 10 programming (with some overlapping). And consider that two of the tech posts were attempts to fix something that did not work so well.

What does this mean? Do I not learn new stuff at work? Am I not interested in tech work anymore? Am I working too much and not having time to blog? Well, it is a bit of all. I am interested in tech work, but right now I am fighting to adapt to the new job. I am learning new stuff, but that is more office related than new frontiers of programming. And I am a bit tired as well.

I have been thinking of cool tech stuff to share with you at least in this post, but I could find none. I am reading a lot of blogs with new information about stuff ranging from Windows 8, .Net 5, the future of C# and Visual Studio to videos of Vesta, things that verge on proving the dark matter model is wrong and amazing BIOS rootkits, but that is not what I am doing.

So let me summarize the technical state of my work so far:
  • Scrum - my workplace uses Scrum as a development practice and invests a lot in maintaining the quality of its implementation. I've learned a lot about the advantages, but also the disadvantages of the practice (there is nothing as annoying as an Outlook alert that you need to do the daily scrum meeting when you are concentrating on a task)
  • Visual Basic - as the original application that was bought by my employing company 5 years ago was written in Visual Basic, large portions of it are still VB. That only proves my point that refactoring code should be a priority, not a nice to have option. I wonder how many developing hours, research hours and hair roots could have been saved if the company had invested in moving the application to a readable and canonical code form. I also wonder if the guy that invented Visual Basic is now burning in hell, as so many devs with whom I've talked about VB seem to want.
  • Visual Basic - it just deserves two bullet points, for the bullet reason only at least. Also, try converting C# generic and lambda expression code to Visual Basic. Hilarious!
  • Computing power - I am now working on a laptop that has a quad core i7 processor, 8GB of RAM and a Solid State Drive. And I still want it 10 times faster. It seems to me that computing power is only keeping up with the size of the software projects and the complexity of the tools used to develop them, so that the total compile time for a project remains constant. Also, if for some reason the company issues you with a computer powerful enough to break the constant, they also need to enforce drive encryption so as to compensate.
  • Continuous Integration and Unit Testing - it gives one a good feeling of comfort to know that after "it works on my machine", the source control server can compile, test and run the software successfully (while you are working at something else, no less).
  • Software Patterns - there are people who can think and visualize software patterns. They can architect any piece of code and make it really neat. However, it now seems to me that over-architected software is just as hard to read and follow as non-architected software. Fortunately for me, my colleagues are more the smart "let's make it work" type.


That is about it. No magical silver bullet practices, no amazing software, no technological edge code, just plain software shop work.