In my own quest to find interesting books that would help me understand my place as a software developer, I stumbled upon Dreaming in Code, a book I knew nothing about other than that it featured the word "code" in the title. It had to be good!

In the end the book surpassed my expectations, describing software from a totally different point of view than the programming books I am used to. Dreaming in Code is not a technical book; it can be read by software developers and bored housewives alike. It has a kind, professional tone, and the three years spent documenting the project help put the whole story in perspective.

The storyline is simple: a software visionary decides to start a new project, one that would be open source, innovative and revolutionary, a replacement for lumbering Outlook and Exchange type software. Scott Rosenberg documents the development process, trying to answer the decades-old question: why is software hard? What starts out very ambitious, with no financial or time constraints, ends up taking more than three years to reach a modest 0.6 release, which is where the book ends. The project is still ongoing. They make a lot of mistakes and change their design often, but they keep at it, trying to learn from their errors and adapt to a constantly changing world.

For me that is both a source of inspiration and of concern. If Americans with a long history of software spend millions of dollars and years of work to create software that might just as well not work, what chance do I stand trying to answer the same questions? On the other hand, the spirit of the team is inspirational; they look like a bunch of heroes battling the boring and pointless world of software development I am used to. And of course, there is that little bit of smugness: "Hey, I would have done this better. Give a million dollars to a Romanian and he will build you anything within a month". The problem, of course, is when you try to hire two Romanians! :)

Anyway, I loved this book. It ended before it had any chance of getting boring, it detailed the developers' quest while at the same time putting everything in the context of great software thinkers and innovators, and it explained the origin and motivation behind the most common and taken-for-granted technologies and IT ideas. It is a must-read for devs, IT managers and even people who try to understand programmers, like their wives.

Here are some links:
Official book site
Scott Rosenberg's own blog
The official site of the Chandler software project

and has 0 comments
Now this is one cool song. If only the quality had been a little bit better... Anyway, I have been searching the entire videosphere for something even remotely similar. Björk is weird, but usually restrained in her singing. In this video she shows the power in that tiny Icelandic body. Still searching... Skin sounds great as always, but I feel she can't really keep up :)



You might want to see this video as well. High quality and a collection of weird instruments, but not my favourite way of remixing a song.

and has 2 comments
I was browsing the HanselMinutes site for interesting podcasts to listen to on my way to work and I found one entitled Windows Home Server. At first I thought it was one of those Home versions, like Windows XP Home, which ended up being total crap. But a home server? So I got curious and listened to it.

Apparently, Windows Home Server is meant to act as a central point for your data, providing easy backup solutions and storage management well above what RAID can do. There is also a central console that you can use to manage and connect to the computers in your home. I found especially interesting the way they plan to combine Microsoft Passport with a dynamic DNS for your computer, allowing you to connect to your home via browser, waking up computers that are shut down and accessing them as well.

But the most interesting technology seems to be Windows Home Server Drive Extender, which takes all available drives, of any type, and adds all of their storage to a single namespace that you can access. You select which part of the data should be duplicated, which means the server will choose multiple drives to store your important data, leaving downloaded movies and music alone and saving space. Even more interesting is that the server backup system stores each cluster only once. So, in my understanding, if you have 10 computers with Windows XP on them, all the common files will share the same clusters and will be stored a single time!
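The single-instance idea itself is simple enough to sketch: key each cluster by a hash of its content, and identical clusters collapse into one stored copy. Below is a toy Java illustration of the concept only, not how Windows Home Server actually implements it; the class and method names are mine.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class ClusterStore {
    private final Map<String, byte[]> clusters = new HashMap<>();

    // Store a cluster; identical content hashes to the same key,
    // so the bytes are kept only once no matter how many machines share them.
    public String put(byte[] cluster) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(cluster);
            String key = Base64.getEncoder().encodeToString(digest);
            clusters.putIfAbsent(key, cluster.clone());
            return key;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }

    public int storedClusters() {
        return clusters.size();
    }
}
```

Ten machines backing up the same Windows system files would, under this scheme, contribute the common clusters exactly once.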

This technology seems more useful and powerful than Windows Vista, and considering it is based on the Windows Server 2003 codebase, which itself was based on Windows 2000 Server, the minimum requirements are really low, something like an old 1 GHz computer.

and has 1 comment
Another game in the Vampire universe, this time set in the present, with a lowly human being Embraced all of a sudden by a rebellious vampire. The outcome is that your "sire" is killed and you are left alone to discover what a vampire really is.



Now, at the start of the game I thought to myself "Redemption was way better". It's true, Bloodlines is truly 3D, but you have only one character. It does employ a lot of elements from first-person shooters, but the only thing the main character seems able to do is take orders from everybody. Then the story really became interesting. The annoying lack of free will remained a thorn in my side for the rest of the game, but with each quest there appeared more ways to solve problems and more and more innovative quests (not just go, kill, exit like in Redemption).

What I found really cool is the little quiz you get to answer at the beginning of the game. The answers determine the vampire "clan" you belong to and give you special abilities accordingly. It guessed me right, too. I got to be Gangrel and killed almost all enemies with an axe or sword, even when they had Steyr AUGs :). That means they had a way of ending any situation according to the chosen clan, which is very cool indeed as far as the programming goes. Also, the 3D characters are really well defined. Chicks are sexy, movements are natural and facial expressions are almost realistic!

As with Redemption, the story is really well defined and tight at the start of the game, then it goes faster and faster as the end approaches. I didn't like that. Also, when you are close to the end and you want to escape a collapsing cave using an inflatable boat, the game crashes to the Windows desktop. What you need to do, actually from the very start of the game, is download the patch and install it. Here is the link. Also, the minimum requirement is 512MB of memory, but you will find yourself unable to control your character from time to time, in which case you must tweak the game a little. That provides only modest help; the only real solution I found is to save the game when this happens, which frees some memory or whatever. You can also try all kinds of console commands, provided you start the game with the -console command line option.

My conclusion: another great game, probably plagued by deadline management. They were even planning a multiplayer option, but they scrapped it. The ending also allowed for at least a dozen continuations, but they didn't pursue them. Bottom line: if you liked Redemption but thought it wasn't "killy" enough, or you wanted it to be more FPS-like, you will love this one.

and has 0 comments
This book is a short and easy read and it describes the way the web will change the world in Hamilton's vision. Part of the Web series, it tells the story of a quest of online friends in the Realworld. At the end, both the virtual and real worlds mingle in an interesting way. A nice read from Hamilton, a relaxing quick pocket-book read.

and has 0 comments
I wondered about this book, since it combines Hamilton's later style with a nearly marginal subject. Also, Misspent Youth bears the title Magic Memories on my PDA. The bottom line is that this is the story of the beginnings of the rejuvenation technology heavily featured in the Pandora/Void universe, but with other details that link it to Night's Dawn. However, if you completely ignore this science-fiction limbo status and the few social issues that Peter F. Hamilton raises in the book, the story is no more than a soap opera!

I mean, you have it all: young upper-class people interchanging partners like they're researching combinatorics, puppy romance broken by an experienced charmer, broken homes, even parent and son on opposite political sides. For someone who has read the writer's more monumental sci-fi, this is like a break from the science fiction of it all towards a more personal point of view. For someone else, it may feel simply mediocre.

My conclusion: even if the book is well written, it is plagued by the lack of a proper subject, by the positive outcome of every single thing (remember Fallen Dragon? I said I couldn't possibly relate to the passive philosophy of the main character there; same here) and by the quick, under-detailed ending that one can also observe in the Commonwealth Saga.

and has 0 comments
The concept of Open Courses is not so new. You've probably stumbled across some course package that is both free and online, but that just doesn't do anything for you. A good example is the Microsoft courses: they need that annoying Passport registration, you have to take the HTML-based courses within a specific amount of time, and they spam you with email reminders. What you actually wanted was information, quickly summarized, maybe indexed, and a video/audio stream that would demonstrate what the theory is all about. You don't want to register, deal with restrictions or even do it online. You want to download stuff and run it locally whenever you feel like it.

I am glad to say I found exactly what I wanted on the MIT OpenCourseWare site. They have a huge list of classes; most have only PDF materials, but some have video recordings of the actual class! With PDF notes! Even MP3 materials for your MP3 player! No registration required, and everything there you can also find on YouTube! And the videos are professionally shot, not some webcam-in-the-back thing.

Interested yet? Access the site and browse around. You might want to use this link to get to the audio/video courses, or use Google to find only the courses that have video. You won't get MIT to say you studied with them, but you will learn what they teach if you make the effort!

One thing you need in order to play the .RM files is Real Alternative, a package that allows you to play Real Media without installing the annoying and non-free Real Player.

And MIT is not the only one doing this. You can also check out:
Open Courseware Consortium
OpenContentOnline
Open Courseware finder

and has 0 comments
There are a lot of things I want to blog about right now, but I'll start with this, since it is at the root of all the other articles I want to write. What is time management all about? It's about making use of those bits and small crumbs of free time!

I mean, you wake up in the morning and you do what you need to do: make the bed, wash, pet the pet, wife the wife, make the meal, eat the meal, read your emails and blogs, etc... and you are left with about 17 minutes of extra time. If you leave home now, you get to the office too early; if you watch a TV series episode, you leave in 50 minutes and get there too late. You can't read a book, because that would mean making yourself comfortable, opening the book, getting into the atmosphere of it, then getting out of it and leaving through the front door in 17 minutes. It can't be done in a way that leaves you feeling good about it. So, what to do?

My first choice is audio podcasts. I download a few (related to programming, but that's my personal interest), all in MP3 format, I choose one every morning and copy it to my cell phone. I use the handsfree to listen to it and that's it. In an ideal world I could do that while washing, making the bed, cooking the meal and eating it, but then I would mess up the wife and pet part. Even so, audio files will carry you through the 17 minutes, through leaving the house, walking to the car or bus or tram or even directly to the office, during the trip with any means of transportation, right up until you get to the office.

Ok, you're in the tram, right in the middle of the distance between home and office, and the podcast ends. You have a small cell phone with capacity for only one podcast, or you ran out of battery. What to do?

My second choice: a PDA. I am using a 5-year-old PDA, one that is black and white and has no screen backlight; it was a present from my uncle (Thanks, Alex!). I use a simple text reader like HandStory to read text books. The advantage is that the battery lasts a long time (no cell or Bluetooth capability or graphics or lights to consume it) and the book stays in the exact place where you stopped reading. Yes: turn the PDA on, continue reading, turn the PDA off. That means zero time spent finding the book, the page, the paragraph. The PDA comes with its own protective casing and it fits perfectly in my jeans pocket. I can have tens of books on it and I need to recharge it only once every week or two.
Of course, a new PDA could be used both as an audio player and a book reader.

So we've covered staying in touch with the world's evolution via audio files and staying literate by reading text books. What's next? Video! In theory you could use the PDA for video feeds, but that would be pointless, in my opinion, since you cannot watch video while walking and something really small cannot give the output required for comfort. But that's me.

Anyway, I plan to write an entire article about open courseware next, but you may already know what I am hinting at. Use open courseware videos or video presentations on your computer while you are waiting for programs to compile, in the office lunch break or even in the background while you work. If something really needs watching, you can switch quickly, rewind a little and see it done. You can watch parts of it before you leave for work or while you are eating when you come back.

And there you have it. Every single moment of your day can be used to absorb information. What are the downsides? Besides the obvious medical issues of reading or watching a screen every single moment (I don't have problems with this yet, and I've been kind of glued to the screen for at least 10 years), there is the absorption capacity problem. At times you will feel totally wasted, tired, stupid; nothing works, you don't get the world around you, it's like your IQ has dropped a few stories to a neighbour below. Well, in that case TAKE A BREAK! Yes, that's all that is required. Take a day off or spend a weekend day doing absolutely nothing. With so many distractions around you it will be difficult, especially after you've trained yourself to gobble down information, but think of it like jogging. Even if you manage to do it for half an hour, there is a small moment when you need to stop in order to continue running. Walking fast doesn't cut it; you need to stop.

Happy gobbling! :)

and has 0 comments
Update: Damn YouTube and their lawsuits! I had to replace the song with another live performance, but it isn't as good. Well, the quality of the video is better, but it's not as emotional.

This is a song that I found myself liking while listening to my music in the background. That's a pretty difficult feat, since I usually type away and don't really pay attention to what I'm listening to. I am putting up the live show from Spain, because I think the official video for this song is crap. Enjoy!


and has 0 comments
This is not a single story, but many short ones from my latest favourite writer: Peter F. Hamilton. A Second Chance at Eden is set in the Night's Dawn universe, but before that story unfolded. We have Marcus Calvert, father of Joshua, the hero of Night's Dawn; we have the birth of Eden, affinity-bond stories, zero-tau, psychic abilities, even a party assassin turned good (who would provide the template for a character in Pandora's Star).

I think the collection is best read after you've read the lengthy stories. It rings so many bells that would otherwise mean nothing more than sci-fi speculation.

Bottom line: great writing from Hamilton. It's nice that you can read one story, take a break and do something else :). I guess if you are not sure you want to read the sagas, starting with this one will whet your appetite and you will find the same connections I did, only backwards.

When one wants to indicate clearly that a control should perform an asynchronous or a synchronous postback, one should use the Triggers collection of the UpdatePanel. Of course, I am assuming you have an ASP.NET Ajax application and you are stuck on how to indicate the same thing for controls that are inside templated controls like DataGrid, DataList, GridView, etc.

The solution is to get a reference to the page ScriptManager, then use its RegisterPostBackControl method on your postback control. You get the reference with the static ScriptManager.GetCurrent(Page) method, and you get the control you need inside the templated control's ItemCreated/RowCreated event with e.Item.FindControl("postbackControlID") or e.Row.FindControl("postbackControlID").

So, the end result, inside the RowCreated (or ItemCreated) event handler, is:

// get the current ScriptManager and register the control for a full postback
ScriptManager sm = ScriptManager.GetCurrent(Page);
Control ctl = e.Row.FindControl("MyControl"); // or e.Item.FindControl(...)
sm.RegisterPostBackControl(ctl);


Of course, if you want it the other way around (to register the controls as Ajax async postback triggers), use the RegisterAsyncPostBackControl method instead.

Special thanks to Sim Singh from India for asking me to research this.

I was reading this post in which Jeff Atwood complained about too many shiny tools that only waste our time; there are so many of them that the whole shininess becomes old.

Of course, I went to all the links for tools in the post that I could find, and then some. I will probably finish reading it after I try them all :)

Here are my refinements on the lists that I've accessed, specific with .NET programming in mind and free tools:
  • Nregex.com - a nice site that tests your regular expressions online and lets you explore the results. Unfortunately, it has no profiling, or at least a display of how long it took to match your text
  • PowerShell - Great tool once you get to know it. It comes complete with blog, SDK and Community Extensions
  • PowerTab - adds Tab expansion in PowerShell
  • Lutz Roeder's Reflector - the .NET decompiler and its many add-ons
  • Highlight - a tool to format and colorize source code for any flavour of operating system and output file format.


There are a lot more, but I am lazy and don't find a use for many of them; you might, though. Here is Scott Hanselman's list of developer tools, from which I am quite amazed he excluded ReSharper, my favourite Visual Studio add-on.

Warning: this is going to be one long and messy article. I will also update it from time to time, since it contains work in progress.

Update: I've managed to uncover something new (to me, at least) called lookbehinds! They try to match text that is behind the regular expression engine's cursor. Using lookbehinds, one might construct a regular expression that only matches up to a certain maximum length, fixing the problem of huge mismatch times in situations like CSV-parsing a big file that has no commas inside.

Update 2: It wouldn't really work, since lookbehinds check a match AFTER it was matched, so it doesn't optimize anything. It would have been great to have support for multiple regular expressions run in parallel on the same string.

What started me up was a colleague of mine complaining about the ever-changing format of import files. She isn't the only one complaining, mind you, since it has happened to me on at least one project before. Basically, what you have is a simple text file, either comma separated, semicolon separated, fixed width, etc., and you want to map it to a table. But after you make a beautiful little method to take care of that, the client sends a slightly modified file in an email attachment, with an accompanying angry message like: "The import is not working anymore!".

Well, I have been fumbling with the finer aspects of regular expressions for about two weeks. This seemed like the perfect application of Regex: just save the regular expression in a configuration string, then change it as the mood and IQ of the client wildly fluctuate. What I needed was:
  • a general format for parsing the data
  • a way to mark the different matched groups with meaningful identifiers
  • performance and resource economy


The format is clear: the regular expression language. The .NET flavour allows me to mark any matched group with a name. The performance should be as good as the time spent on the theory and practice of regular expressions (about 50 years) can make it.

There you have it. But I noticed a few problems. First of all, if the file is big (as client data usually is), translating the entire content into a string and parsing it afterwards would take gigantic amounts of memory and processing power. Regular expressions don't work with streams, at least not in .NET. What I needed was a Regex.Match(Stream stream, string pattern) method.

Without too much explanation (except the in-code comments), here is a class that does that. I made it today in a few hours, tested it, and it works. I'll detail my findings after the code box below.

StreamRegex (code sample)


One issue I had was that I kept translating a StringBuilder into a string. I know it is somewhat optimized, but the content of the StringBuilder was constantly changing. A Regex class that would work at least on a StringBuilder would have been a boost. A second problem was that if the input file was not even close to my Regex pattern, the matching would take forever, as the algorithm kept adding more and more bytes to the string and trying to match it.
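For illustration, the chunk-and-retry idea behind such a class can be sketched like this. This is a Java sketch of the concept only, not my actual StreamRegex code; the class name, the 4096-character chunk size and the hitEnd heuristic are all choices made for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StreamMatcher {
    // Append chunks from the stream to a buffer, emit the matches that are
    // certainly complete, and drop the consumed prefix so memory stays
    // proportional to one record instead of the whole file.
    public static void matchStream(Reader reader, Pattern pattern,
                                   Consumer<Matcher> onMatch) throws IOException {
        StringBuilder buffer = new StringBuilder();
        char[] chunk = new char[4096];
        int read;
        while ((read = reader.read(chunk)) != -1) {
            buffer.append(chunk, 0, read);
            Matcher m = pattern.matcher(buffer);
            int consumed = 0;
            // hitEnd() is true when the engine touched the end of the buffer,
            // meaning more input might change the match: defer those matches
            while (m.find() && !m.hitEnd()) {
                onMatch.accept(m);
                consumed = m.end();
            }
            buffer.delete(0, consumed);
        }
        // no more input: the remaining matches are final
        Matcher m = pattern.matcher(buffer);
        while (m.find()) {
            onMatch.accept(m);
        }
    }
}
```

Note that it still has the problem described above: input that never matches just accumulates in the buffer.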

And of course, there was my blunt and inelegant approach to writing regular expressions. What does one do when in Regex hell? Read Steven Levithan's blog, of course! That's when I decided to write this post and also document my regular expression findings.

So, let's summarize a bit, then add a bunch of links.
  • the .NET regular expression flavour supports marking a group with a name like this
    (?<nameOfGroup>someRegexPattern)
  • it also supports non capturing grouping:
    (?:pattern)
    This will not appear as a Group in any match, although you can apply quantifiers to it
  • also supported are atomic (non-backtracking) groups:
    "(?>.+)"
    The pattern above fails to match "abc" entirely: .+ inside the atomic group greedily swallows the closing quote and, since atomic groups give nothing back once they exit, the final " can never match. Plain ".+" would backtrack and match. Atomic groups save time by cutting off backtracking, at the price of possibly skipping matches
  • one can also use lazy quantifiers: ab+? will match ab in the string abbbbbb
  • possessive quantifiers are not supported, but they can be substituted with atomic groups:
    ab*+ in some regex flavours is (?>ab*) in .NET
  • let's not forget the
    (?#this is a comment)
    notation to add comments to a regular expression
  • Lookbehinds! - a great new discovery of mine that can check text an already advanced match has passed over. I am not sure how it affects speed, though. Quick example: I want to match "This is a string", but not "This is a longer string, that I don't want to match, since it is ridiculously long and it would make my regex run really slow when I really need only a short string" :), both as separate lines in a text file.
    ([^\r\n]+)(?:$|[\r\n])(?<=(?:^|[\r\n]).{1,21})
    This expression matches any string that does not contain line breaks, then looks behind to check whether there is a string start or a line-break character at most 21 characters back, effectively limiting the maximum length of the matched string to 20. Unfortunately, this would slow the search down even more, since it only checks a match AFTER the match has completed.
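The constructs above can be tried out without firing up .NET: Java's regex flavour shares the same syntax for named groups, lazy quantifiers, atomic groups and lookbehinds. A small demo (the patterns are my own toy examples, not from the article above):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexConstructs {
    public static void main(String[] args) {
        // named group: (?<name>pattern), read back by name
        Matcher named = Pattern.compile("(?<ZipCode>\\d{5})").matcher("zip 12345");
        named.find();
        System.out.println(named.group("ZipCode")); // 12345

        // lazy quantifier: +? stops at the shortest possible match
        Matcher lazy = Pattern.compile("ab+?").matcher("abbbbbb");
        lazy.find();
        System.out.println(lazy.group()); // ab

        // atomic group: "(?>.+)" swallows the closing quote and refuses to
        // give it back, so it fails where the backtracking ".+" succeeds
        System.out.println(Pattern.compile("\"(?>.+)\"").matcher("\"abc\"").matches()); // false
        System.out.println(Pattern.compile("\".+\"").matcher("\"abc\"").matches());     // true

        // lookbehind: match digits only when preceded by a dollar sign
        Matcher look = Pattern.compile("(?<=\\$)\\d+").matcher("$42 and 99");
        look.find();
        System.out.println(look.group()); // 42
    }
}
```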


What does that mean? Well, first of all, an increase in performance: using non-capturing groups will save memory and using atomic groups will speed up processing. Then there is the "unrolling the loop" trick, using atomic grouping to optimize repeated alternations like (that|this)*. Group names and comments ease the reading and reuse of regular expressions.

Now for the conclusion: using the optimizations described above (and in the following links), one can write a regular expression that can be changed, understood and used in order to break the input file into matches, each one having named groups. A CSV file and a fixed-length record file would be treated exactly the same. Let's say something like (?<ZipCode>\w*),(?<City>\w*)\r\n or (?<ZipCode>\w{5})(?<City>\w{45})\r\n, or use lookbehinds to limit the maximum line size. All the program has to do is parse the file and create objects with the ZipCode and City properties (if present), maybe using the new C# 3.0 anonymous types. Also, I have read about the DFA versus NFA types of regular expression implementations. DFAs are a lot faster, but cannot support many features that NFA implementations do. The .NET regex flavour is NFA, but using atomic grouping and other such optimizations bridges the gap between the two.
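To make the idea concrete, here is a sketch of how the same parsing code serves both a CSV and a fixed-width format, with only the configured pattern changing. It is in Java (whose named-group syntax matches .NET's) and the patterns are made-up examples; in a real importer the pattern string would come from configuration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImportFormats {
    // The pattern is the only thing that changes between file formats;
    // the program only ever sees the named groups.
    static String[] parse(String configuredPattern, String line) {
        Matcher m = Pattern.compile(configuredPattern).matcher(line);
        if (!m.matches()) {
            return null; // the record does not fit the configured format
        }
        return new String[] { m.group("ZipCode"), m.group("City").trim() };
    }
}
```

So parse("(?<ZipCode>\\d{5}),(?<City>\\w+)", "12345,Cluj") and parse("(?<ZipCode>\\d{5})(?<City>.{1,45})", "12345Cluj") both yield the same ZipCode/City pair from two different file formats.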

There is more to come as I come to understand these things. I will probably keep rereading my own post in order to keep my thoughts together, so you should also stay tuned, if interested. Now the links:

.NET Framework General Reference Grouping Constructs
.NET Framework General Reference Quantifiers
Steven Levithan's blog
Regular Expression Optimization Case Study
Optimizing regular expressions in Java
Atomic Grouping
Look behinds
Want faster regular expressions? Maybe you should think about that IgnoreCase option
Scott Hanselman's .NET Regular Expression Tool list
Compiling regular expressions (also worth noting is that the static method Regex.Match will cache about 15 used regular expressions so that they can be reused. There is also the Regex.CacheSize property that can be used to change that number)
Regular expressions at Wikipedia
Converting a Regular Expression into a Deterministic Finite Automaton
From Regular Expressions to DFA's Using Compressed NFA's


There is still work to be done. The optimal StreamRegex would not need StringBuilders and strings, but would work directly on the stream. There are a lot of properties that I didn't expose from the standard Regex and Match objects. The GroupCollection and Group objects that my class exposes are normal Regex objects, and some of their properties do not make sense (like Index). Normally I would have inherited from Regex and Match, but Match doesn't have a public constructor, even though it is not sealed. Then again, I've read somewhere that one should prefer composition over inheritance whenever possible. Also, there are some rules still to be implemented in my grand importing scheme, like some fields not being allowed to be null, or having to fall within a range of values, or having to be in some relation to other values in the same record, and so on. But that is beyond the scope of this article.

Any opinions or suggestions would be really appreciated, even if they are not positive. As a friend of mine said, every kick in the butt is a step forward or a new and interesting anal experience.

Update:

I've taken the Reflector-decompiled sources of System.Text.RegularExpressions from the System.dll file and made my own library to play with. I might still get somewhere, but the concepts in that code are way beyond my ability to comprehend in the two hours I allowed myself for this project.

What I've gathered so far:
  • the Regex class is not sealed
  • Regex calls on a RegexRunner class, which is also public and abstract
  • RegexRunner asks you to implement the FindFirstChar, Go and InitTrackCount methods, while all its other methods are protected but not virtual. In the MSDN documentation, this text seals the fate of the class: "This API supports the .NET Framework infrastructure and is not intended to be used directly from your code."
  • the RegexRunner that the Regex class actually calls on is the RegexInterpreter class, which is a lot of extra code and, of course, internal sealed.

The conclusion I draw from these points, and from the random experiments I did on the code itself, is that there is no convenient way of inheriting from Regex or any other class in the System.Text.RegularExpressions namespace. It would be easy, once the code is freely distributed with comments and everything, to change it to allow for custom Go or ForwardCharNext methods that would read from a stream when reaching the end of the buffered string, or cause a mismatch once the running match exceeds a certain maximum length. Actually, this last point is the reason why regular expressions cannot be used as freely as my original post idea suggested, since trying to parse a completely different file than the one intended would result in huge time consumption.

Strike that! I've compiled a regular expression into an assembly (in case you don't know what that is, check out this link) and then used Reflector on it! Here is how to make your own regular expression object:
  • Step 1: inherit from Regex and set some protected base values. One that is essential is base.factory = new YourOwnFactory();
  • Step 2: create said YourOwnFactory by inheriting from RegexRunnerFactory, override the CreateInstance() method and return a YourOwnRunner object. Like this:
    class YourOwnFactory : RegexRunnerFactory
    {
        protected override RegexRunner CreateInstance()
        {
            return new YourOwnRunner();
        }
    }

  • Step 3: create said YourOwnRunner by inheriting from the abstract class RegexRunner. You must now implement FindFirstChar, Go and InitTrackCount.

You may recognize a Factory design pattern here! However, consider that Microsoft's own implementation (the internal sealed RegexInterpreter) is about 36 KB / 1100 lines of highly optimised code. This abstract class is available to us poor mortals for the single reason that it was needed to implement regular expressions compiled into separate assemblies.

I will end this article with my X-mas wish list for regular expressions:
  • An option to match two or more regular expressions in parallel on the same string. This would allow me to check for a really complicated expression and at the same time validate it (for length, format, or whatever)
  • Stream support. The hack in the code above works, but does not really tap into the power of regular expressions. The support should be included in the engine itself
  • Extensibility support. Maybe all this would have been a lot easier if there were some support for adding custom expressions, perhaps hidden in the .NET (?#comment) syntax.

I've stumbled upon a link to the SQL Server 2000 Best Practices Analyzer. Apparently, it is a program that scans your SQL Server and tells you what you did wrong. It worked, somewhat: some tests failed with a software exception, but then I searched the Microsoft site for other best practices analyzers and found a whole bunch of them!

Here are a few links that seemed interesting:

The last link is for a framework that loads all the analyzers so you can run them together. It's a pretty basic tool and there is still work to be done on it, but you can also write your own analyzers, and the source code for the program and the included ASP.NET plugin is available as well.

I was following a link from someone trying to translate my blog into Japanese. Apparently Siderite = シデライト. As you can see, it's like a little story: first there was this smiling guy, then his smile got smaller and smaller as his eyes got bigger and bigger. In the end, he was completely abstracted. Poor old Siderite :(

Update: Babylon says Siderite in Japanese is 菱鉄鉱. Is there any Japanese reader who can help me with this dilemma?