So I was trying to optimize a sorting algorithm, only the metrics did not make any sense. On the test page I had amazing performance, on the other it was slow as hell. What could possibly be the problem?

The difference between the two tests was that one was sorting inline (normal array reading and writing) and the other was using a more complex function and iterated a data source. So I went to test the performance of iterating itself.

The code tests the speed of adding all the items of a large array in three cases:

  • a classic for...in loop that increments an index and reads the array at that index
  • a for...of loop that iterates the items of the array directly
  • a for...of loop that iterates over a generator function that yields the values of the array
time(()=>{ let sum=0; for (let i=0; i<arr.length; i++) sum+=arr[i]; },'for in');
time(()=>{ let sum=0; for (const v of arr) sum+=v; },'iterator for of');
time(()=>{ let sum=0; for (const v of (function*(){ for (let i=0; i<arr.length; i++) yield arr[i]; })()) sum+=v; },'generator for of');

time is a function I used to compute the speed of the execution. The array is 100 million integers. And here are the results:

for in: 155.12999997008592
for of: 1105.7250000303611
for of: 2309.88499999512

I haven't yet processed what it means, but I really thought using an iterator was going to be at least as fast as a for loop that uses index accessing to read values. Instead there is a whooping 7 to 14 times decrease in speed.

So from now on I will avoid for...of in high performance scenarios.

Just a thing I learned today: using the bitwise not operator (~) on a number in Javascript ignores its fractional part (it converts it to integer first), therefore using it twice gives you the integer part of original number. Thanks to fetishlace for clarifications.

Notes:

  • this is equivalent to (int)number in languages that support the int type
  • this is equivalent to Math.trunc for numbers in the integer range
  • this is equivalent to Math.floor only for positive numbers in the integer range

Examples:
~~1.3 = 1
~~-6.5432 = -6
~~(2 ** 32 + 0.5) = 0
~~10000000000 = 1410065408

P.S. If you don't know what ** is, it's called the Exponentiation operator and it came around in Javascript ES2016 (ES7) and it is the equivalent of Math.pow. 2 ** 3 = Math.pow(2,3) = 8

  Sometimes a good regular expression can save a lot of time and lead to a robust, yet flexible system that works very efficiently in terms of performance. It may feel like having superpowers. I don't remember when exactly I've decided they were useful, but it happened during my PHP period, when the Linux command line system and its multitude of tools made using regular expressions a pretty obvious decision. Fast forward (waaaay forward) and now I am a .NET developer, spoiled by the likes of Visual Studio and the next-next approach to solving everything, yet I still use regular expressions, well... regularly, sometimes even when I shouldn't. The syntax is concise, flexible and easy to use most of the time.

  And yet I see many senior developers avoiding regular expressions like they were the plague. Why is that? In this post I will argue that the syntax makes a regular pattern be everything developers hate: unreadable and hard to maintain. Having many of the characters common in both XML and JSON have special meaning doesn't help either. And even when bent on learning them, having multiple flavors depending on the operating system, language and even the particular tool you use makes it difficult. However, using small incremental steps to get the job done and occasionally look for references to less used features is usually super easy, barely an inconvenience. As for using them in your code, new language features and a few tricks can solve most problems one has with regular expressions.

 The Syntax Issue

The most common scenario for the use of regular expressions is wanting to search, search and replace or validate strings and you almost always start with a bunch of strings that you want to match. For example, you want to match both dates in the format 2020-01-21 and 21/01/2020. Immediately there are problems:

  • do you need to escape the slashes?
  • if you match the digits, are you going for two digit month and day segments or do you also accept something like 21/1/2020?
  • is there the possibility of having strings in your input that look like 321/01/20201, in which case you will still match 21/01/2020, but it's clearly not a date?
  • do you need to validate stuff like months being between 1-12 and days between 1-31? Or worse, validate stuff like 30 February?

But all of these questions, as valid as they are, can be boiled down to one: given my input, is my regular expression matching what I want it to match? And with that mindset, all you have to do is get a representative input and test your regular expression in a tester tool. There are many out there, free, online, all you have to do is open a web site and you are done. My favourite is RegexStorm, because it tests .NET style regex, but a simple Google search will find many and varied tools for reading and writing and testing regular expressions.

The syntax does present several problems that you will hit every time:

  • you cannot reuse parts of the pattern
    • in the example above, even if you have clearly three items that you look for - year, month, day - you will need to copy/paste the pattern for each variation you are looking for
  • checking the same part of the input string multiple times is not what regular expressions were created for and even those that support various methods to do that do it in a hackish way
    • example: find a any date in several formats, but not the ones that are in March or in 2017
    • look behind and look ahead expressions are usually used for scenarios like this, but they are not easy to use and reduce a lot of the efficiency of the algorithm
  • classic regular expression syntax doesn't support named groups, meaning you often need to find and maintain the correct index for the match
    • what index does one of the capturing groups have?
    • if you change something, how do other indexes change?
    • how do you count groups inside groups? Do they even count if they match nothing?
  • the "flags" indicating how the regular engine should interpret the pattern are different in every implementation
    • /x/g will look for all x characters in the string in Javascript
    • C# doesn't even need a global flag and the flags themselves, like CaseInsensitive, are .NET enums in code, not part of the regular pattern string
  • from the same class of issues as the two above (inconsistent implementation), many a tool uses a regular expression syntax that is particular to it
    • a glaring example is Visual Studio, which does not use a fully compatible .NET syntax

  The Escape Issue

The starting point for writing any regular expression has to be a regex escaped sample string that you want to match. Escaping means telling the regular expression engine that characters from your sample string that have meaning to it are just simple characters. Example 12.23.34.56, which is an IP address, if used exactly like that as a regular pattern, will match 12a23b34c56, because the dot is a catch all special character for regular expressions. The pattern working as expected would be 12\.23\.34\.56. Escaping brings several severe problems:

  • it makes the pattern less humanly readable
    • think of a phrase in which all white space has been replaced with \s+ to make it more robust (it\s+makes\s+the\s+pattern\s+less\s+humanly\s+readable)
  • you only escape with a backslash in every flavor of regular expressions, but the characters that are considered special are different depending on the specific implementation
  • many characters found in very widely used data formats like XML and JSON are special characters in regular expressions and the escaping character for regex is a special character in JSON and also string in many popular programming languages, forcing you to "double escape", which magnifies the problem
    • this is often an annoyance when you try to store regular patterns in config files

  Readability and Maintainability

Perhaps the biggest issue with regular expressions is that they are hard to maintain. Have you ever tried to understand what a complex regular expression does? The author of the code started with some input data and clear objectives of what they wanted to match, went to regular expression testers, found what worked, then dumped a big string in the code or configuration file. Now you have neither the input data or the things that they wanted matched. You have to decompile the regular pattern in your head and try to divine what it was trying to do. Even when you manage to do that, how often do developers redo the testing step so they verify the changes in a regular expressions do what was intended?

Combine this with the escape issue and the duplication of subpatterns issue and you get every developer's nightmare: a piece of code they can't understand and they are afraid to touch, one that is clearly breaking every tenet of their religion, like Don't Repeat Yourself or Keep It Simple Silly, but they can't change. It's like an itch they can't scratch. The usual solution for code like that is to unit test it, but regular expression unit tests are really really ugly:

  • they contain a lot of text variables, on many lines
  • they seem to test the functionality of the regular expression engine, not that of the regular expression pattern itself
  • usually regular expressions are used internally, they are not exposed outside a class, making it difficult to test by themselves

  Risk

Last, but not least, regular expressions can work poorly in some specific situations and people don't want to learn the very abstract computer science concepts behind regular expression engines in order to determine how to solve them.

  • typical example is lazy modifiers (.*? instead of .*) which tell the engine to not be greedy (get the least, not the most)
    • ex: for input "ineffective" the regular expression .*n will work a lot worse than .*?n, because it will first match the entire word, then see it doesn't end with n, then backtrack until it gets to "in" which it finally matches. The other syntax just stops immediately as it finds the n.
  • another typical example is people trying to find the HTML tag that has a an attribute and they do something like \<.*href=\"something\".*\/\> and what it matches is the entire HTML document up to a href attribute and the end of the last atomic tag in the document.
  • the golden hammer strikes again
    • people start with a simple regular expression in the proof of concept, they put it in a config file, then for the real life application they continuously tweak the configured pattern instead of trying to redesign the code, until they get to some horrible monstrosity
    • a regex in an online library that uses look aheads and look behinds solves the immediate problem you have, so you copy paste it in the code and forget about it. Then the production app has serious performance problems. 

  Solutions

  There are two major contexts in which to look for solutions. One is the search/replace situation in a tool, like a text editor. In this case you cannot play with code. The most you can hope for is that you will find a tester online that supports the exact syntax for regular expressions of the tool you are in. A social solution would be to throw shade on lazy developers that think only certain bits of regular expressions should be supported and implemented and then only in one particular flavor that they liked when they were children.

  The second provides more flexibility: you are writing the code and you want to use the power of regular expressions without sacrificing readability, testability and maintainability. Here are some possible solutions:

  • start with simple stuff and learn as you go
    • the overwhelming majority of the time you need only the very basic features of regular expressions, the rest you can look up when you need them
    • if the regular expression becomes too complex it is an indication that maybe it's not the best approach
  • store subpatterns in constants than you can then reuse using templated strings
    • ex: var yearPattern = @"(?<year>\d{4})"; var datePattern = $@"\b(?:{yearPattern}-(?<month>\d{2})-(?<day>\d{2})|(?<month>\d{2})\/(?<day>\d{2})\/{yearPattern})\b";
    • the example above only stores the year in another variable, but you can store the two different formats, the day, the month, etc
    • in the end your code might look more like the Backus-Naur (BNF) syntax, in which every separate component is described separately
  • use verbatim strings to make the patterns more readable by avoiding double escaping
    • in C# use @"\d+" not "\\d+"
    • in Javascript they are called template literals and use backticks instead of quotes, but they have the same mechanism for escaping characters as normal strings, so they are not a solution
  • use simple regular expressions or, if not, abstract their use
    • a good solution is using a fluent interface (check this discussion out) that allows the expressivity of human language for something that ends up being a regular expression
    • no, I am not advocating creating your own regular expression syntax... I just think someone probably already did it and you just have to find the right one for you :)
  • look for people who have solved the same problem with regular expression and you don't have to rewrite the wheel
  • always test your regular expressions on valid data and pay attention to the time it took to get the matches
  • double check any use of the string methods like IndexOf, StartsWith, Contains, even Substring. Could you use regular expressions?
    • note that you cannot really chain these methods. Replacing a regular expression like ^http[s]?:// with methods always involves several blocks of code and the introduction of cumbersome constants:
      if (text.StartsWith("http")) {
        // 4 works now, but when you change the string above, it stops working
        // you save "http" to a constant and then use .Length, you write yet another line of code
        text=text.Substring(4);
      } else {
        return false;
      }
      if (text.StartsWith("s")) {
        text=text.Substring(1);
      }
      return text.StartsWith("://");
      
      // this can be easily refactored to
      return text
               .IfStartsWith("http")
               .IfStartsWithOrIgnore("s")
               .IfStartsWith("://");
      // but you have to write your own helper methods
      // and you can't save the result in a config file​

  Conclusion

Regular expressions look daunting. Anyone not familiar with the subject will get scared by trying to read a regular expression. Yet most regular expression patterns in use are very simple. No one actually knows by heart the entire feature set of regular expressions: they don't need to. Language features and online tools can help tremendously to make regex readable, maintainable and even testable. Regular expressions shine for partial input validation and efficient string search and replace operations and can be easily stored in configuration files or data stores. When regular expressions become too complex and hard to write, it might be a sign you need to redesign your feature. Often you do not need to rewrite a regular expression, as many libraries with patterns to solve most common problems already exist.

  I was looking into the concept of partial sorting, something that would help in a scenario where you want the k smaller or bigger items from an array of n items and k is significantly smaller than n. Since I am tinkering with LInQer, my implementation for LINQ methods in Javascript, I wanted to tackle the OrderBy(...).Take(k) situation. Anyway, doing that I found out some interesting things.

  First, the default Javascript array .sort function has different implementations depending on browser and by that I mean different sort algorithms. Chrome uses insertion sort and Firefox uses merge sort. None of them is Quicksort, the one that would function best when the number of items is large, but that has worst case complexity O(n^2) when the array is already sorted (or filled with the same value).

I've implemented a custom function to do Quicksort and after about 30000 items it becomes faster than the default one.  For a million items the default sort was almost three times slower than the Quicksort implementation. To be fair, I only tested this on Chrome. I have suspicions that the merge sort implementation might be better.

  I was reporting in a previous version of this post that QuickSort, implemented in Javascript, was faster than the default .sort function, but it seems this was only an artifact of the unit tests I was using. Afterwards, I've found an optimized version of quicksort and it performed three times faster on ten million integers. I therefore conclude that the default sort implementation leaves a lot to be desired.

  Second, the .sort function is by default alphanumeric. Try it: [1,2,10].sort() will return [1,10,2]. In order to do it numerically you need to hack away at it with [1,2,10].sort((i1,i2)=>i1-i2). In order to sort the array based on the type of item, you need to do something like: [1,2,10].sort((i1,i2)=>i1>i2?1:(i1<i2?-1:0)).

  Javascript has two useful functions defined on the prototype of Object: toString and valueOf. They can also be overwritten, resulting in interesting hacks. For some reason, the default .sort function is calling toString on the objects in the array before sorting and not valueOf. Using the custom comparers functions above we force using valueOf. I can't test this on primitive types though (number, string, boolean) because for them valueOf cannot be overridden.

  And coming back to the partial sorting, you can't do that with the default implementation, but you can with Quicksort. Simply do not sort any partition that is above and below the indexes you need to get the items from. The increases in time are amazing!

  There is a difference between the browser implemented sort algorithms and QuickSort: they are stable. QuickSort does not guarantee the order of items with equal sort keys. Here is an example: [1,3,2,4,5,0].sort(i=>i%2) results in [2,4,0,1,3,5] (first even numbers, then odd, but in the same order as the original array). The QuickSort order depends on the choice of the pivot in the partition function, but assuming a clean down the middle index, the result is [4,2,0,5,1,3]. Note that in both cases the requirement has been met and the even numbers come first.

  A little gem I stumbled upon today that I didn't know about: https://placeholder.com/. You use it really simple like https://via.placeholder.com/300x150?text=Placeholder, but there is also a shorter URL that looks like this: https://placehold.it/1280x800/8020A0/FFFFFF?text=Siderite's blog.

  The simplest syntax is https://placehold.it/300 (a 300x300 image that displays "Placeholder" and "Powered by HTML.COM").

Update: The npm package can now be used in Node.js and Typescript (with Intellisense). Huge optimizations on the sorting and new extra functions published.

Update: Due to popular demand, I've published the package on npm at https://www.npmjs.com/package/@siderite/linqer. Also, while at it, I've moved the entire code to Typescript, fixed a few issues and also create the Linqer.Slim.js file, that only holds the most common methods. The minimized version can be found at https://siderite.github.io/LInQer/LInQer.slim.min.js and again you can use it freely on your websites. Now, I hope you can use it in Node JS as well, I have no idea how to test it and frankly I don't want to. I've also added a tests.slim.html that only tests the slim version of Linqer. This version is now official and I will try to limit any breaking changes from now on. WARNING: Now the type is Linqer.Enumerable, not just Enumerable as before. If you are already using it in code, just do const Enumerable = Linqer.Enumerable; and you are set to go.

Update: LInQer can now be found published at https://siderite.github.io/LInQer/LInQer.min.js and https://siderite.github.io/LInQer/LInQer.extra.min.js and you can freely use it in your web sites. Source code on GitHub.

Update: All the methods in Enumerable are now implemented, and some extra ones as well. Optimizations have been developed for operations like orderBy().take() and .count().

  This entire thing started from a blog post that explained something that seems obvious in retrospect, but you have to actually think about it to realise it: Array functions in Javascript may seem similar to LInQ methods in C#, but they work with arrays! Every time you filter or you map something you create a new array! The post suggested creating functions that would use the modern Javascript features of iterators and generator functions and supplant the standard Array functions. It was brilliant!

  And then the post abruptly veered and swerved into a tree. The author suggested we add the pipe operator to Javascript, just like they have in functional programming languages, so that we can use those static functions with hash characters as placeholders and... ugh! It was horrid! So I decided to solve the problem in a way that was more compatible with my own sensibilities: make it work just like LInQ (or at least like the Java streams).

This is how LInQer was born. You can find the sources on GitHub at /Siderite/LInQer. The library introduces a class called Enumerable, which can wrap any iterable (meaning arrays, but also generator functions or anything that supports for...of), then expose the methods that the Enumerable class in .NET exposes. There is even a unit test file called tests.html. Open it in the browser and see the supported methods. As you can see in the image of the blog post, the same logic (filtering, mapping and slicing a 10 million item array) took 2000ms using standard Array functions and 150ms using the Enumerable class.

"But wait! It says there that Enumerable exposes static methods! What exactly did you improve?", you will ask. In .NET we have something called "extension methods", which are indeed static methods that the language understands apply to specific classes or interfaces and allows you to call them like you would instance methods. So a method that looks like public static int Sum(this IEnumerable<int> enumerable) can be used statically Enumerable.Sum(arr) or like an instance method arr.Sum(). It's syntactic sugar. But enough about C#, in Javascript I've implemented Enumerable as a wrapper, as Javascript doesn't know about interfaces or static extension methods. This class then exposes prototype functions that work the same way as in LInQ.

Here is where the magic happens:

Enumerable.prototype = {
  [Symbol.iterator]() {
    const iterator = this._src[Symbol.iterator].bind(this._src);
    return iterator();
  },

In other words I wrap _src (the original iterable object) in my Enumerable by exposing the same iterator for both! Now both initial object and its wrapper can be used in for...of loops. The difference is that the wrapper now exposes all that juicy functionality. Let's see a simple example: select (which is the logical equivalent of Array.map):

select(op) {
  _ensureFunction(op);
  const gen = function* () {
    for (const item of this) {
      yield op(item);
    }
  }.bind(this);
  return new Enumerable(gen());
},

Here I am constructing a generator function which I bind to the Enumerable instance so that inside it 'this' is always that instance. Then I am returning the wrapper over the generator. In the function I am yielding the result of the op function on each item. Because of how generator functions work, only while iterating will it need to do its work, meaning that if I use the Enumerable into a for...of loop and breaking after three items, the op function will only be applied on those items, not on all of them. In order to use the Enumerable object in regular code, that works with arrays, you just use .toArray and you are done!

Let's see the code in the tests used to compare the standard Array function use with Enumerable:

// this is the large array we are using in both code blocks:
  const largeArray = Array(10000000).fill(10);

// Standard Array functions
  const someCalculation = largeArray
                             .filter(x=>x===10)
                             .map(x=>'v'+x)
                             .slice(100,110);
// Enumerable
  const someCalculation = Linqer.Enumerable.from(largeArray)
                             .where(x=>x===10)
                             .select(x=>'v'+x)
                             .skip(100)
                             .take(10)
                             .toArray();

There are differences from the C# Enumerable, of course. Some things don't make sense, like thenBy after orderBy. It can be done, but it's complicated and in my career I've only used thenBy twice! toDictionary, toHashSet, toLookup, toList have no meaning in Javascript. Use instead toMap, toSet, toObject, toArray. Last, but not least... join. Join is so complicated to use in LInQ that I almost never used it. Also, joining is usually done in the database, so I rarely needed it. I didn't see a point in implementing it. Cast and AsEnumerable also didn't make sense. But I implemented them anyway! :) Cast and OfType are using either a class or a string to determine if an item is "of type" and join works just like in C#.

But don't fret. Any function that you want to add to this can simply be added by your own code into Enumerable.prototype! So if you really need something custom, it's easy to add without modifying the original library.

There is one disadvantage for the prototype approach and the reason why the initial article suggested to use standalone functions: tree-shaking, the fancy word used to express the automatic elimination of unused code. But I think there is a solution for that, one that I won't implement since I believe the library is small enough and minified and compressed it will be very small indeed. The solution would involve separating each of the LINQ methods (or categories of methods, like first, firstOrDefault, last and lastOrDefault, for example) in different files. Then you can use only what you need and the files would attach the functions to the Enumerable prototype. 

In fact, there are a lot of LINQ methods I have never had use for, stuff like intersect and except or append and prepend. It makes sense to create a core Enumerable class and then add other stuff only when required. But as I said, it's small enough to not matter. As such I separated the functionality in Typescript files that contain the basic functionality, the complete functionality and some extra functions. The Javascript result is a Linqer, a Linqer.slim, a Linqer.extra and a Linqer.all, covering the entire spectrum of needs. It would have been user unfriendly to put every little thing in its own file.

Bottom line, I hope you find this library useful or at least it inspires you as it did me when I've read the original blog post from Dan Shappir.

P.S. I am not the only one that had this idea, you might also want to look at linq.js which uses a completely different approach, using regular expressions. I think the closest to what LINQ should be is Manipula, that uses custom iterators for each operation rather than use the same class. Another possibility is linq-collections, which does the work via TypeScript, apparently. Also linq.es6... I am sure there are more examples if one looks closely. Feel free to let me know if you did or know of similar work and also please give me whatever feedback you see fit. I would like this library to be useful, not just a code example for my blog.

I hope you at least heard of the concept of Unit Testing as it is one of the principal pillars of software development. Its purpose is to automatically check the functionality of isolated components of your code, but it adds a lot of benefits:

  • your code becomes more modular - you only worry about the code you change
  • your code becomes more readable - clear dependency chain and tests demonstrate in code what was the purpose of particular features
  • you gain confidence that your code works as you are making changes to it - refactoring without unit tests in place is usually dangerous
  • changing any component with another implementation does not affect the overall application

This post is a companion to the Programming a simple game in pure HTML and Javascript in the sense that it is that game that we will be testing. Also, as per the requirements of that project, we will not be using any of the numerous unit testing frameworks available for Javascript code.

When we left off the development of the game we had three files:

  • complementary.html - the structure of the page
  • complementary.css - the visual design of the page components
  • complementary.js - all the code of the game, including the initialization and the game starting bit

In order to test our individual components, we need to separate them. So let's split complementary.js in four files:

  • complementary.js - just the game start (instantiating Game and initializing it)
  • game.js - the Game class
  • color.js - the Color class
  • elements.js - the custom HTML elements and their registration

Obviously the HTML will change to load all of these Javascript files. When we get to modules, this will become a non issue.

There are things that we could test on the custom HTML elements, but let's leave that aside. Also the two lines in complementary.js will not need testing. The simplest component should be the first to be tested and that is Color.

We start by creating a new HTML file (color tests.html) and we fill it with code that checks the Color class works as expected. First the code and then the discussion:

<html>

<head>
    <script src="color.js"></script>
    <script>
        // this can easily be changed to display a nice report in this page
        const assert ={
            true:(value, message)=> {
                if (value) {
                    console.log('Test PASSED ('+message+')');
                } else {
                    console.warn('Test FAILED');
                    alert(message);
                    throw new Error(message);
                }
            }
        };
    </script>
</head>

<body>
    <script>
        // Arrange
        const color1 = new Color(123);
        const color2 = new Color(123);
        const color3 = new Color(234);
        // Act
        const equalsWorks = color1.equals(color2);
        const notEqualsWorks = !color1.equals(color3);
        // Assert
        assert.true(equalsWorks,'Expected two colors initialized with the same value to be equal');
        assert.true(notEqualsWorks,'Expected two colors initialized with different values not to be equal');
    </script>

    <script>
        // Arrange
        const color = new Color();
        const doubleInvertedColor = color.complement().complement();
        // Act
        const complementAndEqualsWork = color.equals(doubleInvertedColor);
        // Assert
        assert.true(complementAndEqualsWork,'Expected the complementary or a complementary color to be the original color');
    </script>

    <script>
        // Arrange
        const acolor = new Color(0x6789AB);
        const stringColor = acolor.toString();
        // Act
        const toStringWorks = stringColor==='#6789ab';
        // Assert
        assert.true(toStringWorks,'Expected the HTML representation of the color to be #6789ab and it was '+stringColor);
    </script>

</body>

</html>

A test should follow the AAA structure:

  • Arrange - sets up the necessary items for the test to run
    • instantiate classes to be tested
    • mock functionality of dependencies - we will see this when we test Game
  • Act - executes the code intended to be tested and acquires results
  • Assert - verifies that the test results are the ones expected

Normally, a framework would take tests written in a certain way and then produce some sort of report, with nice green and red rows, with messages, with information on where errors occurred and so on. However, I intend to demonstrate the basics of unit testing, so no framework is actually needed.

A unit test:

  • is a piece of code (who unit tests the unit test?!)
  • it requires effort, it is just as prone to bugs as normal code and requires maintenance just like any other code (shit doesn't just happen, it takes time and effort)
  • it requires infrastructure that uses it to periodically test your changes (either someone does it manually or there is some setup to run it automatically and display/email the report)

In the code above I created a script tag for each test in which I am following AAA to create Color instances and then check their functionality does what it should. A very basic assert object is handling the reporting part. It remains homework for the reader to update that part or to plug in an existing unit testing framework.

The Color class is very simple:

  • it supports initializing with a color integer representing the RGB values of the color
  • toString method that returns the HTML representation of the color
  • complement method returns the complement of the color
  • equals method checks if two colors are equal

There are no external dependencies, meaning that it needs nothing from the outside in order to work. Game, for example, requires Color, which is a dependency for Game.

Open the color tests.html file in the browser and open the development tools (F12 or Ctrl-Shift-I) and refresh the page. You should see in the console that all the tests passed. Change something in a test so it fails and it should both throw an error in the console and show you a dialog with the failing test.

Now, let's test Game. The code of the game tests.html file:

<html>

<head>
    <script src="game.js"></script>
    <script>
        // this can easily be changed to display a nice report in this page
        const assert ={
            true:(value, message)=> {
                if (value) {
                    console.log('Test PASSED ('+message+')');
                } else {
                    console.warn('Test FAILED');
                    alert(message);
                    throw new Error(message);
                }
            }
        };
    </script>
</head>

<body>
    <script>
        // Arrange
        const game = new Game();
        // Act
        // Assert
    </script>

</body>

</html>

I used the same structure, the same assert object, only I loaded game.js instead of color.js. Then I wrote a test in which I do nothing than instantiate a Game. If you execute this page it will work just fine. No errors because we have not, in fact, executed anything. We need to execute .init(document), remember?

And now it becomes apparent why I chose to initialize the document from init instead of using window.document in my code. window.document is now the document of the test page, it has no complementary-board element in it. We haven't even defined any custom HTML elements or a Color class. In fact, we can now test that calling init with no parameter will fail:

    <script>
        // Arrange
        const game = new Game();
        // Act
        let anyError = null;
        try {
            game.init();
        } catch(error) {
            anyError = error;
        }
        // Assert
        assert.true(anyError!=null,'Expected the init method of a Game class to fail if run with no parameters ('+anyError+')');
    </script>

And, indeed, if we open the page now we get a console log like this: 

Test PASSED (Expected the init method of a Game class to fail if run with no parameters (TypeError: Cannot read property 'addEventListener' of undefined))

You just learned another important characteristic of a unit test: it tests both what should work and what shouldn't. This is one of the reasons why unit testing is hard. For each piece of code you need to test when it works and when it fails as expected. Now we have to test how Game should work, and that means we get into mocking.

Mocking is when you replace a dependency with something that looks exactly the same, but does something else. For unit tests mock objects need to do the simplest things and those things must be predictable.

Let's see how one of these tests would look:

    <script>
        // Arrange
        const mockDoc = {
            addEventListener:function() {
            }
        };
        const game2 = new Game();
        // Act
        let anyError2 = null;
        try {
            game2.init(mockDoc);
        } catch(error) {
            anyError2 = error;
        }
        // Assert
        assert.true(anyError2==null,'Expected the init method of a Game class to not fail if run with correct parameters ('+anyError2+')');
    </script>

Just by providing an object with an addEventListener function, the initialization of the game now works. mockDoc is a mocked document and this is called mocking.

Let's look at how would a "happy path" test look, one that assumes everything goes correctly and moves through and entire flow:

    <script>
        // Arrange
        const mockDoc3 = {
            testData : {},
            addEventListener:function(eventName,eventHandler) {
                this.testData.eventName = eventName;
                this.onLoad = eventHandler;
            },
            getElementsByTagName:function(tagName) {
                this.testData.tagName = tagName;
                return [this.mockBoard];
            },
            mockBoard: {
                setChoiceHandler:function(handler) {
                    this.choiceHandler=handler;
                }
            }
        };
        const game3 = new Game();
        // Act
        let anyError3 = null;
        try {
            game3.init(mockDoc3);
        } catch(error) {
            anyError3 = error;
        }
        Math = {
            random:function() {
                return 0.4;    
            },
            floor:function(x) { return x; },
            round:function(x) { return x; },
            pow:function(x,p) { if (p==1) return x; else throw new Error('Expected 1 as the exponent, not '+p); }
        };
        Color = {
            index:0,
            new:function(value) {
                return {
                    val: value,
                    equals: function(x) { return x.val==this.val; },
                    complement: function() { return Color.new(1000-this.val); }
                };
            },
            random:function() {
                this.index++;
                return Color.new(this.index*10);
            }
        }
        mockDoc3.onLoad();
        mockDoc3.mockBoard.choiceHandler(3);
        // Assert
        assert.true(anyError3==null,'Expected the init method of a Game class to not fail if run with correct parameters ('+anyError3+')');
        assert.true(mockDoc3.testData.eventName==='DOMContentLoaded','DOMContentLoaded was not handled!');
        assert.true(mockDoc3.testData.tagName==='complementary-board','Game is not looking for a complementary-board element');
        assert.true(game3._roundData.guideColor.val === 10,'Guide color object expected to have val=10 ('+JSON.stringify(game3._roundData.guideColor)+')');
        assert.true(game3._roundData.tries.size === 1,'Expected 1 unsuccessful try ('+JSON.stringify(game3._roundData.tries)+')');
        assert.true(game3._roundData.tries.has(3),'Expected unsuccessful try with value of 3 ('+JSON.stringify(game3._roundData.tries)+')');
        assert.true(game3._log.length === 0,'Expected no score after one unsuccessful try ('+JSON.stringify(game3._log)+')');
        // Act 2 (heh!)
        mockDoc3.mockBoard.choiceHandler(2);
        // Assert 2
        assert.true(game3._roundData.guideColor.val === 60,'Guide color object expected to have val=60 ('+JSON.stringify(game3._roundData.guideColor)+')');
        assert.true(game3._roundData.tries.size === 0,'Expected no unsuccessful tries after correct response ('+JSON.stringify(game3._roundData.tries)+')');
        assert.true(game3._log.length === 1,'Expected one item of score after correct answer ('+JSON.stringify(game3._log)+')');
        assert.true(game3._log[0] === 50,'Expected 50 as the score after one fail and one correct answer ('+JSON.stringify(game3._log)+')');
        
    </script>

There is a lot to unpack here, including the lazy parts of the test. Here are some of the issues with it:

  • it doesn't follow the AAA pattern, it asserts, then acts again, then asserts again
    • while this works, it doesn't encapsulate testing of a single feature
    • it is the equivalent of the classes or methods with multiple responsibilities from normal code
    • the correct way to do it is to write another test, have two choiceHandler calls in the Act part and assert only that particular path
  • it tests internal functionality
    • in order to write the test I had to look at how the Game class works internally and then tailor the test so it works
    • it accesses private data like _log and _roundData
  • the Game class did not abstract all of its dependencies
    • this is painfully obvious when mocking the Math object - in Javascript this is easy, but in other languages the Math functionality comes as an interface (a declaration of available members with no implementation)
    • this is not a test problem, but a Game problem which makes testing tedious
  • test contains logic
    • look at the Math and Color mocks, how they compute values and return objects
  • some values there are particularly chosen to "work"
    • have you noticed how random returns 0.4 so that multiplied with 5 (number of choices) it returns 2, which then is equal with the correct choice?
  • test is not independent
    • Math and Color are replaced with some static objects, thus changing the environment for following tests

But it works, even if it's not very well written. Here are some of its good features:

  • it tests a full game round with one bad and one good choice
  • it doesn't try to go around randomness by adding more code in the assertion part
    • it could have easily gone that way in order to determine the correct color in a random list, thus replicating or reverse engineering the Game logic into the test
  • all the expected results are completely predictable (the score values, the indexes, etc)
  • it tests only Game functionality, not others
    • I could have not mocked Color, loading color.js and thus relying in a single test on functionality from a dependency

The lessons learned from this tell us that both the Game code and the test code could have been written better in regard to maintainability, with dependencies clearly declared, easy to mock or replace. In fact, the greatest gain of a company with hiring a senior developer is not on how well code works, but how much time is saved and how much risk is avoided when the code is written with separation of concerns and unit testing in mind.

It is funny, but when a piece of code is not testable, writing a single unit test forces you to refactor it into a good shape. The rest of the tests only test functionality, as the class has been rewritten with testing in mind. So that is my advice: whenever you write something, even a silly game like Complementary, write one "happy path" unit test per class.

Homework for you: rewrite the Game class and its test class so that testing becomes easier and more correct.

This is a side post to Programming a simple game in pure HTML and Javascript in the sense that we are going to add that project to source control in GitHub using Visual Studio Code.

The first step is to create a repository. You need to create a GitHub account first, then either go to Your Repositories

or click the green New button on the top of your repositories list in the main page.

Make sure you give the repository a name and a decent description. I recommend adding a README and a licence (I use MIT), too.

Now that you have a remote repo on GitHub, you need to remember its URL, you will need it later.

Then download Git from their website. The Windows installer is simple to find, download and Next-next-next to success.

Open Visual Studio Code in your project's folder. On the left side menu there are a bunch of vertical buttons, click on the Source Control one 

Then initialize your local repository by clicking on the plus sign and selecting your project folder: 

Now you need to connect your local repo to your remote repo and you do this by typing some stuff in the Visual Studio Code Terminal window:

  1. set the remote by typing git remote add origin <your GitHub repo URL>
  2. verify the remote by typing git remote -v
  3. run a git pull to make sure your project loads the readme and license file by typing git pull origin master

Last step is to push your local files to the remote repository. You do this by

  1. clicking on the check mark at the top of the Source Control window and giving a description to this first commit

  2. pushing/publishing the commit, by clicking on the ... mark and choosing Publish Branch

There you have it. Your project is on source control.

You do this in order to not lose your changes, in order to track your changes forwards and backwards in time and to publish your work for others to see it (like other devs or potential employers)

Verify that all went well by refreshing your GitHub project page. The files in your local folder should now be visible in the file list on the page.

Now, every time you make changes to your code you push the changes on GitHub by committing (the check mark) and then pushing/syncing/publishing (from the ... menu)

The code for this series of posts can be found at https://github.com/Siderite/Complementary

  I was helping a friend with basic programming and I realized that I've been so caught up with the newest fads and development techniques that I've forgotten about simple programming, for fun, with just the base principles and tools provided "out of the box". This post will demonstrate me messing up writing a game using HTML and Javascript only.

Mise en place

This French phrase is used in professional cooking to represent the preparation of ingredients and utensils before starting the actual cooking. We will need this before starting developing our game:

  • description: the game will show a color and the player must choose from a selection of other colors the one that is complementary
    • two colors are complementary if when they are mixed, they cancel each other out, resulting in a grayscale "color" like white, black or some shade of gray. Wait! Was that the metaphor in Fifty Shades of Grey?
  • technological stack: HTML, Javascript, CSS
    • flavor of Javascript: ECMAScript 2015 (also known as ES6)
    • using modules: no - this would be nice, but modules obey CORS, so you won't be able to run it with the browser from the local file system.
    • unit testing: yes, but we have to do it as simply as possible (no external libraries)
  • development IDE: Visual Studio Code
    • it's free and if you don't like it, you can just use Notepad to the same result
  • source control: Git (on GitHub)

Installing Visual Studio Code

Installing VS Code is just as simple as downloading the installer and running it.

Then, select the Open Folder option, create a project folder (let's call it Complementary), then click on Select Folder.

The vanilla installation will help you with syntax highlighting, code completion, code formatting.

Project structure

For starters we will need the following files:

  • complementary.html - the actual page that will be open by the browser
  • complementary.js - the Javascript code
  • complementary.css - the CSS stylesheet

Other files will be added afterwards, but this is the most basic separation of concerns: code and data in the .js file, structure in .html and presentation in .css.

Starting to code

First, let's link the three files together by writing the simplest HTML structure:

<html>
    <head>
        <link rel="stylesheet" href="complementary.css"/>
        <script src="complementary.js"></script>
    </head>
    <body>
        
    </body>
</html>

This instructs the browser to load the CSS and JS files. 

In the Javascript file we encapsulate out logic into a Game class:

"use strict";
class Game {
  init(doc) {
    this._document = doc;
    this._document.addEventListener('DOMContentLoaded',this.onLoad.bind(this),false);
  }
  onLoad() {

  }
}

const game=new Game();
game.init(document);

We declared a class (a new concept in Javascript ES6) and a method called init that receives a doc. The idea here is that when the script is loaded, a new Game will be created and the initialization function will receive the current document so it can interact with the user interface. We used the DOMContentLoaded event to call onLoad only when the page document object model (DOM) has been completely loaded, otherwise the script would run before the elements have been loaded.

Also, not the use of the bind method on a function. addEventListener expects a function as the event handler. If we only specify this.onLoad, it will run the function, but with the this context of the event, which would be window, not our game object. this.onLoad.bind(this), on the other hand, is a function that will be executed in the context of our game.

Now, let's consider how we want to game to play out:

  • a guide color must be shown
    • this means the color needs to be generated
  • a list of colors to choose from must be displayed
    • colors need to be generated
    • one color needs to be complementary to the guide color
    • color elements need to respond to mouse clicks
  • a result must be computed from the chosen color
    • the outcome of the user choice must be displayed
    • the score will need to be calculated

This gives us the structure of the game user interface. Let's add:

  • a guide element
  • a choice list element
  • a score element
<html>
    <head>
        <link rel="stylesheet" href="complementary.css"/>
        <script type="module" src="complementary.js"></script>
    </head>
    <body>
        <div id="guideColor"></div>
        <div id="choiceColors"></div>
        <div id="score"></div>
    </body>
</html>

Note that we don't need to choose how they look (that's the CSS) or what they do (that's the JS).

This is a top-down approach, starting from user expectations and then filling in more and more details until it all works out.

Let's write the logic of the game. I won't discuss that too much, because it's pretty obvious and this post is about structure and development, not the game itself.

"use strict";
class Game {
    constructor() {
        // how many color choices to have
        this._numberOfChoices = 5;
        // the list of user scores
        this._log = [];
    }
    init(doc) {
        this._document = doc;
        this._document.addEventListener('DOMContentLoaded', this.onLoad.bind(this), false);
    }
    onLoad() {
        this._guide = this._document.getElementById('guideColor');
        this._choices = this._document.getElementById('choiceColors');
        // one click event on the parent, but event.target contains the exact element that was clicked
        this._choices.addEventListener('click', this.onChoiceClick.bind(this), false);
        this._score = this._document.getElementById('score');
        this.startRound();
    }
    startRound() {
        // all game logic works with numeric data
        const guideColor = this.randomColor();
        this._roundData = {
            guideColor: guideColor,
            choiceColors: this.generateChoices(guideColor),
            tries: new Set()
        };
        // only this method transforms the data into visuals
        this.refreshUI();
    }
    randomColor() {
        return Math.round(Math.random() * 0xFFFFFF);
    }
    generateChoices(guideColor) {
        const complementaryColor = 0xFFFFFF - guideColor;
        const index = Math.floor(Math.random() * this._numberOfChoices);
        const choices = [];
        for (let i = 0; i < this._numberOfChoices; i++) {
            choices.push(i == index
                ? complementaryColor
                : this.randomColor());
        }
        return choices;
    }
    refreshUI() {
        this._guide.style.backgroundColor = '#' + this._roundData.guideColor.toString(16).padStart(6, '0');
        while (this._choices.firstChild) {
            this._choices.removeChild(this._choices.firstChild);
        }
        for (let i = 0; i < this._roundData.choiceColors.length; i++) {
            const color = this._roundData.choiceColors[i];
            const elem = this._document.createElement('span');
            elem.style.backgroundColor = '#' + color.toString(16).padStart(6, '0');
            elem.setAttribute('data-index', i);
            this._choices.appendChild(elem);
        }
        while (this._score.firstChild) {
            this._score.removeChild(this._score.firstChild);
        }
        const threshold = 50;
        for (let i = this._log.length - 1; i >= 0; i--) {
            const value = this._log[i];
            const elem = this._document.createElement('span');

            elem.className = value >= threshold
                ? 'good'
                : 'bad';
            elem.innerText = value;
            this._score.appendChild(elem);
        }
    }
    onChoiceClick(ev) {
        const elem = ev.target;
        const index = elem.getAttribute('data-index');
        // just a regular expression test that the attribute value is actually a number
        if (!/^\d+$/.test(index)) {
            return;
        }
        const result = this.score(+index);
        elem.setAttribute('data-result', result);
    }
    score(index) {
        const expectedColor = 0xFFFFFF - this._roundData.guideColor;
        const isCorrect = this._roundData.choiceColors[index] == expectedColor;
        if (!isCorrect) {
            this._roundData.tries.add(index);
        }
        if (isCorrect || this._roundData.tries.size >= this._numberOfChoices - 1) {
            const score = 1 / Math.pow(2, this._roundData.tries.size);
            this._log.push(Math.round(100 * score));
            this.startRound();
        }
        return isCorrect;
    }
}

const game = new Game();
game.init(document);

This works, but it has several problems, including having too many responsibilities (display, logic, handling clicks, generating color strings from numbers, etc).

And while we have the logic and the structure, the display leaves a lot to be desired. Let's fix this first (I am terrible with design, so I will just dump the result here and it will be a homework for the reader to improve on the visuals).

First, I will add a new div to contain the three others. I could work directly with body, but it would be ugly:

<html>

<head>
    <link rel="stylesheet" href="complementary.css" />
    <script src="complementary.js"></script>
</head>

<body>
    <div class="board">
        <div id="guideColor"></div>
        <div id="choiceColors"></div>
        <div id="score"></div>
    </div>
</body>

</html>

Then, let's fill in the CSS:

body {
    width: 100vw;
    height: 100vh;
    margin: 0;
}
.board {
    width:100%;
    height:100%;
    display: grid;
    grid-template-columns: 50% 50%;
    grid-template-rows: min-content auto;
}
#score {
    grid-column-start: 1;
    grid-column-end: 3;
    grid-row: 1;
    display: flex;
    flex-direction: row;
    flex-wrap: nowrap;
}
#score span {
    display: inline-block;
    padding: 1rem;
    border-radius: 0.5rem;
    background-color: darkgray;
    margin-left: 2px;
}
#score span.good {
    background-color: darkgreen;
}
#score span.bad {
    background-color: red;
}
#guideColor {
    grid-column: 1;
    grid-row: 2;
}
#choiceColors {
    grid-column: 2;
    grid-row: 2;
    display: flex;
    flex-direction: column;
}
#choiceColors span {
    flex-grow: 1;
    cursor: pointer;
}
#choiceColors span[data-result=false] {
    opacity: 0.3;
}

I used a lot of flex and grid to display things.

The game should now do the following:

  • displays a left side color
  • displays five rows of different colors in the right side
  • clicking on any of them modifies the score (each wrong choice halves the maximum score of 100)
  • when there are no more moves left or the correct choice is clicked, the score is added to a list at the top of the board
  • the score tiles are either green (score>=50) or red

However, I am dissatisfied with the Javascript code. If Game has too many responsibilities it is a sign that new classes need to be created.

Refactoring the code

First, I will encapsulate all color logic into a Color class.

class Color {
    constructor(value = 0  /* black */) {
        this._value = value;
    }
    toString() {
        return '#' + this._value.toString(16).padStart(6, '0');
    }
    complement() {
        return new Color(0xFFFFFF - this._value);
    }
    equals(anotherColor) {
        return this._value === anotherColor._value;
    }
    static random() {
        return new Color(Math.round(Math.random() * 0xFFFFFF));
    }
}

This simplifies the Game class like this:

class Game {
    constructor() {
        // how many color choices to have
        this._numberOfChoices = 5;
        // the list of user scores
        this._log = [];
    }
    init(doc) {
        this._document = doc;
        this._document.addEventListener('DOMContentLoaded', this.onLoad.bind(this), false);
    }
    onLoad() {
        this._guide = this._document.getElementById('guideColor');
        this._choices = this._document.getElementById('choiceColors');
        // one click event on the parent, but event.target contains the exact element that was clicked
        this._choices.addEventListener('click', this.onChoiceClick.bind(this), false);
        this._score = this._document.getElementById('score');
        this.startRound();
    }
    startRound() {
        // all game logic works with numeric data
        const guideColor = Color.random();
        this._roundData = {
            guideColor: guideColor,
            choiceColors: this.generateChoices(guideColor),
            tries: new Set()
        };
        // only this method transforms the data into visuals
        this.refreshUI();
    }
    generateChoices(guideColor) {
        const complementaryColor = guideColor.complement();
        const index = Math.floor(Math.random() * this._numberOfChoices);
        const choices = [];
        for (let i = 0; i < this._numberOfChoices; i++) {
            choices.push(i == index
                ? complementaryColor
                : Color.random());
        }
        return choices;
    }
    refreshUI() {
        this._guide.style.backgroundColor = this._roundData.guideColor.toString();
        while (this._choices.firstChild) {
            this._choices.removeChild(this._choices.firstChild);
        }
        for (let i = 0; i < this._roundData.choiceColors.length; i++) {
            const color = this._roundData.choiceColors[i];
            const elem = this._document.createElement('span');
            elem.style.backgroundColor = color.toString();
            elem.setAttribute('data-index', i);
            this._choices.appendChild(elem);
        }
        while (this._score.firstChild) {
            this._score.removeChild(this._score.firstChild);
        }
        const threshold = 50;
        for (let i = this._log.length - 1; i >= 0; i--) {
            const value = this._log[i];
            const elem = this._document.createElement('span');

            elem.className = value >= threshold
                ? 'good'
                : 'bad';
            elem.innerText = value;
            this._score.appendChild(elem);
        }
    }
    onChoiceClick(ev) {
        const elem = ev.target;
        const index = elem.getAttribute('data-index');
        // just a regular expression test that the attribute value is actually a number
        if (!/^\d+$/.test(index)) {
            return;
        }
        const result = this.score(+index);
        elem.setAttribute('data-result', result);
    }
    score(index) {
        const expectedColor = this._roundData.guideColor.complement();
        const isCorrect = this._roundData.choiceColors[index].equals(expectedColor);
        if (!isCorrect) {
            this._roundData.tries.add(index);
        }
        if (isCorrect || this._roundData.tries.size >= this._numberOfChoices - 1) {
            const score = 1 / Math.pow(2, this._roundData.tries.size);
            this._log.push(Math.round(100 * score));
            this.startRound();
        }
        return isCorrect;
    }
}

But it's still not enough. Game is still doing a lot of UI stuff. Can we fix that? Yes, with custom HTML elements!

Here is the code. It looks verbose, but what it does is completely encapsulate UI logic into UI elements:

class GuideColor extends HTMLElement {
    set color(value) {
        this.style.backgroundColor = value.toString();
    }
}

class ChoiceColors extends HTMLElement {
    connectedCallback() {
        this._clickHandler = this.onChoiceClick.bind(this);
        this.addEventListener('click', this._clickHandler, false);
    }
    disconnectedCallback() {
        this.removeEventListener('click', this._clickHandler, false);
    }
    onChoiceClick(ev) {
        const elem = ev.target;
        if (!(elem instanceof ChoiceColor)) {
            return;
        }
        const result = this._choiceHandler(elem.choiceIndex);
        elem.choiceResult = result;
    }
    setChoiceHandler(handler) {
        this._choiceHandler = handler;
    }
    set colors(value) {
        while (this.firstChild) {
            this.removeChild(this.firstChild);
        }
        for (let i = 0; i < value.length; i++) {
            const color = value[i];
            const elem = new ChoiceColor(color, i);
            this.appendChild(elem);
        }
    }
}

class ChoiceColor extends HTMLElement {
    constructor(color, index) {
        super();
        this.color = color;
        this.choiceIndex = index;
    }
    get choiceIndex() {
        return +this.getAttribute('data-index');
    }
    set choiceIndex(value) {
        this.setAttribute('data-index', value);
    }
    set choiceResult(value) {
        this.setAttribute('data-result', value);
    }
    set color(value) {
        this.style.backgroundColor = value.toString();
    }
}

class Scores extends HTMLElement {
    set scores(log) {
        while (this.firstChild) {
            this.removeChild(this.firstChild);
        }
        for (let i = log.length - 1; i >= 0; i--) {
            const value = log[i];
            const elem = new Score(value);
            this.appendChild(elem);
        }
    }
}

class Score extends HTMLElement {
    constructor(value) {
        super();
        this.innerText = value;
        this.className = value > 50
            ? 'good'
            : 'bad';
    }
}

class Board extends HTMLElement {
    constructor() {
        super();
        this._guide = new GuideColor();
        this._choices = new ChoiceColors();
        this._score = new Scores();
    }
    connectedCallback() {
        this.appendChild(this._guide);
        this.appendChild(this._choices);
        this.appendChild(this._score);
    }
    setChoiceHandler(handler) {
        this._choices.setChoiceHandler(handler);
    }
    set guideColor(value) {
        this._guide.color = value;
    }
    set choiceColors(value) {
        this._choices.colors = value;
    }
    set scores(value) {
        this._score.scores = value;
    }
}

window.customElements.define('complementary-board', Board);
window.customElements.define('complementary-guide-color', GuideColor);
window.customElements.define('complementary-choice-colors', ChoiceColors);
window.customElements.define('complementary-choice-color', ChoiceColor);
window.customElements.define('complementary-scores', Scores);
window.customElements.define('complementary-score', Score);

With this, the HTML becomes:

<html>

<head>
    <link rel="stylesheet" href="complementary.css" />
    <script src="complementary.js"></script>
</head>

<body>
    <complementary-board>
    </complementary-board>
</html>

and the CSS:

body {
    width: 100vw;
    height: 100vh;
    margin: 0;
}
complementary-board {
    width:100%;
    height:100%;
    display: grid;
    grid-template-columns: 50% 50%;
    grid-template-rows: min-content auto;
}
complementary-scores {
    grid-column-start: 1;
    grid-column-end: 3;
    grid-row: 1;
    display: flex;
    flex-direction: row;
    flex-wrap: nowrap;
}
complementary-score {
    display: inline-block;
    padding: 1rem;
    border-radius: 0.5rem;
    background-color: darkgray;
    margin-left: 2px;
}
complementary-score.good {
    background-color: darkgreen;
}
complementary-score.bad {
    background-color: red;
}
complementary-guide-color {
    grid-column: 1;
    grid-row: 2;
}
complementary-choice-colors {
    grid-column: 2;
    grid-row: 2;
    display: flex;
    flex-direction: column;
}
complementary-choice-color {
    flex-grow: 1;
    cursor: pointer;
}
complementary-choice-color[data-result=false] {
    opacity: 0.3;
}

Next

In the next blog posts we will see how we can test our code (we have to make it more testable first!) and how we can use Git as source control. Finally we should have a working game that can be easily modified independently: the visual design, the working code, the structural elements.

First of all, what is TaskCompletionSource<T>? It's a class that returns a task that does not finish immediately and then exposes methods such as TrySetResult. When the result is set, the task completes. We can use this class to turn an event based programming model to an await/async one.

In the example below I will use a Windows Forms app, just so I have access to the Click handler of a Button. Only instead of using the normal EventHandler approach, I will start a thread immediately after InitializeComponent that will react to button clicks.

Here is the Form constructor. Note that I am using Task.Factory.StartNew instead of Task.Run because I need to specify the TaskScheduler in order to have access to a TextBox object. If it were to log something or otherwise not involve the UI, a Task.Run would have been sufficient.

    public Form1()
    {
        InitializeComponent();
        Task.Factory.StartNew(async () =>
        {
            while (true)
            {
                await ClickAsync(button1);
                textBox1.AppendText($"I was clicked at {DateTime.Now:HH:mm:ss.fffff}!\r\n");
            }
        },
        CancellationToken.None,
        TaskCreationOptions.DenyChildAttach,
        TaskScheduler.FromCurrentSynchronizationContext());
    }

What's going on here? I have a while (true) block and inside it I am awaiting a method then write something in a text box. Since await is smart enough to not use CPU and not block threads, this approach doesn't have any performance drawbacks.

Now, for the ClickAsync method:

    private Task ClickAsync(Button button1)
    {
        var tcs = new TaskCompletionSource<object>();
        void handler(object s, EventArgs e) => tcs.TrySetResult(null);
        button1.Click += handler;
        return tcs.Task.ContinueWith(_ => button1.Click -= handler);
    }

Here I am creating a task completion source, I am adding a handler to the Click event, then I am returning the task, which I continue with removing the handler. The handler just sets the result on the task source, thus completing the task.

The flow comes as follows:

  1. the source is created
  2. the handler is attached
  3. the task is returned, but does not complete, thus the loop is halted in await
  4. when the button is clicked, the source result is set, then the handler is removed
  5. the task completed, the await finishes and the text is appended to the text box
  6. the loop continues

It would have been cool if the method to turn an event to an async method would have worked like this: await button1.Click.MakeAsync(), but events are not first class citizens in .NET. Instead, something more cumbersome can be used to make this more generic (note that there is no error handling, for demo purposes):

    public Form1()
    {
        InitializeComponent();
        Task.Factory.StartNew(async () =>
        {
            while (true)
            {
                await EventAsync(button1, nameof(Button.Click));
                textBox1.AppendText($"I was clicked at {DateTime.Now:HH:mm:ss.fffff}!\r\n");
            }
        },
        CancellationToken.None,
        TaskCreationOptions.DenyChildAttach,
        TaskScheduler.FromCurrentSynchronizationContext());
    }

    private Task EventAsync(object obj, string eventName)
    {
        var eventInfo = obj.GetType().GetEvent(eventName);
        var tcs = new TaskCompletionSource<object>();
        EventHandler handler = delegate (object s, EventArgs e) { tcs.TrySetResult(null); };
        eventInfo.AddEventHandler(obj, handler);
        return tcs.Task.ContinueWith(_ => eventInfo.RemoveEventHandler(obj, handler));
    }

Notes:

  • is this a better method of doing things? That depends on what you want to do.
  • If you were to use Reactive Extensions, you can turn an event into an Observable with Observable.FromEventPattern.
  • I see it useful not for button clicks (that while true loop scratches at my brain), but for classes that have Completed events.
  • obviously the EventAsync method is not optimal and has no exception handling

  You are writing some code and you find yourself needing to call an async method in your event handler. The event handler, obviously, has a void return type and is not async, so when using await in it, you will get a compile error. I actually had a special class to execute async method synchronously and I used that one, but I didn't actually need it.

  The solution is extremely simple: just mark the event handler as async.

  You should never use async void methods, instead use async Task or async Task<T1,...>. The exception, apparently, is event handlers. And it kind of makes sense: an event handler is designed to be called asynchronously.

  More details here: Tip 1: Async void is for top-level event-handlers only

  And bonus: but what about constructors? They can't be marked as async, they have no return type!

  First, why are you executing code in your constructor? And second, if you absolutely must, you can also create an async void method that you call from the constructor. But the best solution is to make the constructor private and instead use a static async method to create the class, which will execute whatever code you need and then return new YourClass(values returned from async methods).

  That's a very good pattern, regardless of asynchronous methods: if you need to execute code in the constructor, consider hiding it and using a creation method instead.

As you possibly know, I am sending automated Tweets whenever I am writing a new blog post, containing the URL of the post. And since I've changed the blog engine, I was always seeing the Twitter card associated with the URL having the same image, the blog tile. So I made changes to display a separate image for each post, by populating the og:image meta property. It didn't work. The Twitter card now refused to show the image. And it was all because the URL I was using was relative to the blog site.

First of all, in order to test how a URL will look in Twitter, use the Twitter Card validator. Second, the Open Graph doesn't specify that the URL in og:image should be absolute, but in practice both Twitter and Facebook expect it to be.

So whenever populating Open Graph meta tags with URLs, make sure they are absolute, not relative.

Recently I found out about custom task schedulers and I wanted to blog about all the wonderful things you can do with them. I also imagined new ways of doing await/async by tweaking task schedulers. After hours of attempts, I've come to the conclusion that custom task schedulers are incompatible with await/async and should not be used. Here is why:

  • a task scheduler is used to execute synchronous code inside tasks while async/await code is already asynchronous
  • while async/await code is transformed by the compiler into a state machine with the code that follows being turned into a task that is scheduled on TaskScheduler.Current, the state machine has nothing to do with the task scheduler (see Dissecting the async methods in C#)
  • there are no methods that are both aware of await/async code and a custom task scheduler; by design they are incompatible (see Task.Run vs Task.Factory.StartNew)
  • while a stubborn developer could reproduce the functionality of Task.Run and specify a custom task scheduler, or detect tasks that return tasks and Unwrap them, there are easier and safer ways of doing the same thing without a custom task scheduler
  • as the scheduler will be used not only by the tasks run by the developer, but also by the code separated by await boundaries, the results will be unpredictable except the most simple of scenarios

And a pretty diagram from Microsoft representing the order of the operations and how complex they are. It's not just a case of method executed somewhere, but a complex flow that uses the ThreadPoolTaskScheduler as the default task scheduler as a fundamental low level functionality that should not be changed.

If you need more convincing, consider that the code after an await instruction may not even execute on the same thread (or indeed thread pool) as the one before, even if as written appears part of the same method (see async - stay on the current thread? for more details). More on thread pools from Jon Skeet here: The Thread Pool and Asynchronous Methods.

I have been asking this of people at the interviews I am conducting and I thought I should document the correct answer and the expected behavior. And yes, we've filled the position in for this interview question, so you can't cheat :)

The question is quite banal: given two tables (TableA and TableB) both having a column ID, select the rows in TableA that don't have any corresponding row in TableB with the same ID.

Whenever you are answering an interview question, remember that your thinking process is just as important as the answer. So saying nothing, while better than "so I am adding 1 and 1 and getting 2", may not be your best option. Assuming you don't know the answer, a reasonable way of tackling any problem is to take it apart and try to solve every part separately. Let's do this here.

As the question requires the rows in A, select them:

SELECT * FROM TableA

Now, a filter should be applied, but which one? Here are some ideas:

  1. WHERE ID NOT IN (SELECT ID FROM TableB)
  2. WHERE NOT EXISTS (SELECT * FROM TableB WHERE TableA.ID=TableB.ID)
  3. EXCEPT SELECT ID FROM TableB -- this requires to select only ID from TableA, as well (EXCEPT and INTERSECT are new additions to SQL 2019)

Think about it. Any issues with any of them? Any other options?

To test performance, I've used two tables with approximately 35 million rows. Here are the results:

  1. After 17 minutes I had to stop the query. Also, NOT IN has issues with NULL as a value is nether equal or unequal to NULL. SELECT * FROM Table WHERE Value NOT IN (NULL) for example, will always return no rows.
  2. It finished within 4 seconds. There are still issues with NULL, though, as a simple equality would not work with NULL. Assuming we wanted the non-null values of TableA, we're good.
  3. It finished within 5 seconds. This doesn't have any issues with NULL. SELECT NULL EXCEPT SELECT NULL will return no rows, while SELECT 1 EXCEPT SELECT NULL will return a row with the value 1. The syntax is pretty ugly though and works badly if the tables have other columns

What about another solution? We've exhausted simple filtering, how about another avenue? Whenever we want to combine information from two tables we use JOIN, but is that the case here?

SELECT * FROM TableA a
JOIN TableB b
ON a.ID = b.ID -- again, while I would ask people in the interview about null values, we will assume for this post that the values are not nullable

I've used a JOIN keyword, which translates to an INNER JOIN. The query above will select rows from A, but only those that have a correspondence in B. A funny solution to a slightly different question: count the items in A that do not have corresponding items in B:

SELECT (SELECT COUNT(*) FROM TableA) - (SELECT COUNT(*) FROM TableA a JOIN TableB b ON a.ID = b.ID)

However, we want the inverse of the INNER JOIN. What other types of JOINs are there? Bonus interview question! And the answers are:

  • INNER JOIN - returns only rows that have been successfully joined
  • OUTER JOIN (LEFT AND RIGHT) - returns all rows of one table joined to the corresponding values of the other table or NULLs if none
  • CROSS JOIN - returns all the rows in A joined with all the rows in B and all the rows in B that have no match in A

INNER would not work, as demonstrated, so what about a CROSS JOIN? Clearly not, as it will generate 100 trillion rows before filtering anything. SQL Server would optimize a lot of the query, but it would look really weird anyway.

Is there a solution with OUTER JOIN? RIGHT OUTER JOIN will get the rows in B, not in A, so LEFT OUTER JOIN, by elimination, is the only remaining possible solution.

SELECT a.* FROM TableA a 
LEFT OUTER JOIN TableB b
ON a.ID=b.ID

This returns ALL the rows in table A and for each of them, rows in table B that have the same id. In case of a mismatch, though, for a row in table A with no correspondence in table B, we get a row of NULL values. So all we have to do is filter for those. We know that there are no NULLs in the tables, so here is another working solution, solution 4:

SELECT a.* FROM TableA a 
LEFT OUTER JOIN TableB b
ON a.ID=b.ID
WHERE b.ID IS NULL

This solves the problem, as well, in about 4 seconds. However, the other working solution within the same time (solution 2 above) only works as well because newer versions of SQL server are optimizing the execution. Maybe it's a personal preference from the times solution 4 was clearly the best in terms of performance, but I would chose that as the winner.

Summary

  • You can either use NOT EXISTS (and not NOT IN!) or a LEFT OUTER JOIN with a filter on NULL b values.
  • It's important to know if you have NULL values in the joining columns and it's extra points for asking that from your interviewer
  • If not asking, I would penalize solutions that do not take NULL values in consideration. Extra complexity of code, as one cannot simply check for NULL for solution 4. Also a decision has to be made on the expected behavior when working with NULL values
  • When trying to find the solution to a problem in an interview:
    • think of concrete examples of the problem so you can test your solutions
    • break the problems into manageable bits if possible
    • think aloud, but to the point
  • Also, there is nothing more annoying than doing that thing pupils in school do: looking puppy eyed at the teacher while listing solutions to elicit a response. You're not in school anymore. Examples are dirty, time is important, no one cares about your grades.
  • Good luck out there!

I got this exception that doesn't appear anywhere on Google while I was debugging a .NET Core web app. You just have to enable Windows Authentication in the project properties (Debug tab). Duh!

System.InvalidOperationException: The Negotiate Authentication handler cannot be used on a server that directly supports Windows Authentication. Enable Windows Authentication for the server and the Negotiate Authentication handler will defer to it.
   at Microsoft.AspNetCore.Authentication.Negotiate.PostConfigureNegotiateOptions.PostConfigure(String name, NegotiateOptions options)
   at Microsoft.Extensions.Options.OptionsFactory`1.Create(String name)
   at Microsoft.Extensions.Options.OptionsMonitor`1.<>c__DisplayClass11_0.<Get>b__0()
   at System.Lazy`1.ViaFactory(LazyThreadSafetyMode mode)
   at System.Lazy`1.ExecutionAndPublication(LazyHelper executionAndPublication, Boolean useDefaultConstructor)
   at System.Lazy`1.CreateValue()
   at System.Lazy`1.get_Value()
   at Microsoft.Extensions.Options.OptionsCache`1.GetOrAdd(String name, Func`1 createOptions)
   at Microsoft.Extensions.Options.OptionsMonitor`1.Get(String name)
   at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1.InitializeAsync(AuthenticationScheme scheme, HttpContext context)
   at Microsoft.AspNetCore.Authentication.AuthenticationHandlerProvider.GetHandlerAsync(HttpContext context, String authenticationScheme)
   at Microsoft.AspNetCore.Authentication.AuthenticationService.ChallengeAsync(HttpContext context, String scheme, AuthenticationProperties properties)
   at Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(HttpContext context)

This translates to a change in Properties/launchSettings.json like this:

{
  "iisSettings": {
    "windowsAuthentication": true,
    "anonymousAuthentication": true,
    //...
  },
  //...
}