The point of regular expression character classes is to simplify your expressions, but they can introduce subtle bugs or efficiency issues.

Let's check out this StackOverflow answer to question \d less efficient than [0-9]

\d checks all Unicode digits, while [0-9] is limited to these 10 characters. For example, Persian digits, ۱۲۳۴۵۶۷۸۹, are an example of Unicode digits which are matched with \d, but not [0-9].

This makes sense, only it has never occurred to me until this very moment. I would never use a [0-9] notation and I would replace it with a \d if found in code.

What does that mean?

One simple consequence of such a class would be performance: searching for a large list of characters is less efficient. Another would be introducing the possibility for bugs or even malicious attacks. Let's see the code for a calculator that adds two numbers. It's a silly piece of code, but imagine that a more complex one would take the user content and save it into a database, try to process it or display it.

static void Main(string[] args)
{
    Console.InputEncoding = Encoding.Unicode;
    var firstNumber = GetNumberString();
    var secondNumber = GetNumberString();
    Console.WriteLine("Sum = "+(int.Parse(firstNumber) + int.Parse(secondNumber)));
}

private static string GetNumberString()
{
    string result=null;
    var isNumber = false;
    while (!isNumber)
    {
        Console.Write("Enter a number: ");
        result = Console.ReadLine();
        isNumber = Regex.IsMatch(result, @"^\d+$");
        if (!isNumber)
        {
            Console.WriteLine($"{result} is not a number! Try again.");
        }
    }
    return result;
}

This will try to get numbers as a string and test it using the regular expression ^\d+$, which means the string has to consist of one or more digits. Note that I had to set the console input encoding to Unicode in order to be able to paste Persian numbers. This code works fine until I use Arabic or Persian digits, where it breaks in the int.Parse method. Using ^[0-9]$ as the regular expression pattern would solve this issue.

Same issue will occur with \w (warning: \w is letters AND digits) and [a-zA-Z] (or just [a-z] and using RegexOptions.IgnoreCase).

If one uses code to determine the number of matches for each regular expression pattern

var regexPattern = @"\d";
var nr = 0;
for (int i = 0; i < ushort.MaxValue; i++)
{
    string str = Convert.ToChar(i).ToString();
    if (Regex.IsMatch(str, regexPattern))
        nr++;
}
Console.WriteLine(nr);

we get this:

  • for \d : 370
    • ALL digits
  • for \w : 50320
    • ALL word characters (including digits) 
  • for [^\W\d] : 49950
    • ALL word characters, but not the digits 
  • for \p{L} : 48909
    • ALL letters
  • for [A-Za-z] : 52
    • letters from a to z
  • for [0-9] : 10
    • digits from 0 to 9

I hope this helps.

Let's start with a simple requirement and the obvious first draft of the code. The requirement is "build a method that receives a string and writes to the console a URL using that string as a parameter".

The obvious first attempt and working in every version of C# would be:

private static void ShowUrl(string something)
{
    var url = "https://siderite.dev/blog/search/formattablestring?param1="+something;
    Console.WriteLine(url);
}

But then you get some problems. You want to use a parameter that has strange characters in it, like an email, another URL, arithmetic symbols, etc. So you realize that you need to use an URL encode class. Next version is:

private static void ShowUrl(string something)
{
    var url = "https://siderite.dev/blog/search/formattablestring?param1="+Uri.EscapeDataString(something);
    Console.WriteLine(url);
}

Now that works for a second. One might argue about the difference between System.Web.HttpUtility.UrlEncode, System.Net.WebUtility.UrlEncode and System.Net.Uri.EscapeDataString. Yet, his is not the post to do that. Follow the above links for more information.

Then someone decides to use a null for the parameter and you get a ArgumentNullException and you rage in frustration "This is complicated and ugly! The first version looked much better and handled nulls, too. And now I have to add this long Uri.EscapeDataString to every URL parameter in every URL building code I write and also check for null!".

But... what if you could write the method just like the first version and get rid of every problem?

First of all, let's get rid of that plus sign and use interpolated strings. They've been around for awhile:

private static void ShowUrl(string something)
{
    var url = $"https://siderite.dev/blog/search/formattablestring?param1={something}";
    Console.WriteLine(url);
}

Now, this has the same problem of the first version: no URL encoding. And you don't want to add an escape function to every parameter (imagine there would be 10 parameters in this URL). How about we use the fact that the URL is an interpolated string?

Check out this code:

private static void ShowUrl(string something)
{
    var url = UrlHelper.Url($"https://siderite.dev/blog/search/formattablestring?param1={something}");
    Console.WriteLine(url);
}

It uses an UrlHelper class to fix the problems with the URL in the string we generate. Or is it a string? I was writing a long time ago a post about FormattableString and how I didn't see a real use scenario for it. Well, this is it! Let's see what UrlHelper looks like:

public static class UrlHelper
{
    public static string Url(FormattableString url)
    {
        return string.Format(
            url.Format, 
            url.GetArguments()
                .Select(a => Uri.EscapeDataString(a?.ToString()??""))
                .ToArray());
    }
}

The Url method receives not a string, but a FormattableString. When an interpolated string is used as a FormattableString, it gives the developer access to a Format string and a list of arguments in GetArguments. In order to convert it to a string, one must only do String.Format(s.Format,url.GetArguments()).  In our case, the Format property will have the value "https://siderite.dev/blog/search/formattablestring?param1={0}" and the arguments list will contain one value: the value of the something parameter.

Therefore, the Url method will be able to format the string, but first URL encode every single parameter in it.

I admit, this might not be the most readable code in the world, which is my pet peeve with FormattableString. Reading the ShowUrl method you have no clue that UrlHelper.Url receives a FormattableString as parameter. One might even look at that string and say "Hey, wait a minute! That's a bug waiting to happen!" and then proceed to fix it. Perhaps the name of the method could be made more intuitive like EncodeUrlParameters.

But wait! We can do better. I don't know how it might help in the future, but perhaps someone might want to process the parameters of the URL further. By returning a string we eliminate some of the information of the incoming parameter and we don't want to do that.

Also, careful people will notice that using String.Format introduces a subtle bug (or feature): we format the string without specifying a culture. This might be fine, using the current one by default, but it might also cause some issues. The FormattableString class does provide the static CurrentCulture and Invariant methods and, of course, the ToString method would work just fine as well, but then we lose the ability to process the parameters.

So here is the final version of the UrlHelper class:

public static class UrlHelper
{
    public static FormattableString EncodeUrlParameters(FormattableString url)
    {
        return FormattableStringFactory.Create(
            url.Format,
            url.GetArguments()
                .Select(a => Uri.EscapeDataString(a?.ToString() ?? ""))
                .ToArray());
    }
}

By using FormattableStringFactory from System.Runtime.CompilerServices we get a FormattableString as an argument and we return one, with parameters URL encoded. Now, if the result of the method is used as a string, it will be automatically handled by ToString and if it will be used as a FormattableString it will contain all the information provided by the original parameters.

Hope that helps!

Bonus time! There's more!

What if you could make this behavior built in by introducing a new type? Methods from, let's say, a REST client class could use FormattableStrings as URL parameters. But what if we could specify the type of the parameters and get the implicit (hint, hint!) result already URL encoded?

public class UrlString
{
    private FormattableString _formattableString;

    public UrlString(FormattableString formattableString)
    {
        _formattableString = formattableString;
    }

    public static implicit operator FormattableString(UrlString urlString)
    {
        if (urlString == null) return null;
        return UrlHelper.EncodeUrlParameters(urlString.ToFormattableString());
    }

    public static implicit operator string(UrlString urlString)
    {
        if (urlString == null) return null;
        return ((FormattableString)urlString).ToString();
    }

    public static implicit operator UrlString(FormattableString formattableString)
    {
        return new UrlString(formattableString);
    }

    private FormattableString ToFormattableString()
    {
        return _formattableString;
    }
}

This UrlString class will implicitly be converted into a FormattableString or a string and a FormattableString will be converted into a UrlString. So now your code might look like this:

// used like this
RestClient.Get($"https://test.com?x={something}");

// this method does not URL encode anything
public RestResponse Get(string url) {
  ...
}

// this method does URL encode everything, just by changing the type
public RestResponse Get(UrlString url) {
  ...
}

Intro

Dependency injection is baked in the ASP.Net Core projects (yes, I still call it Core), but it's missing from console app templates. And while it is easy to add, it's not that clear cut on how to do it. I present here three ways to do it:

  1. The fast one: use the Worker Service template and tweak it to act like a console application
  2. The simple one: use the Console Application template and add dependency injection to it
  3. The hybrid: use the Console Application template and use the same system as in the Worker Service template

Tweak the Worker Service template

It makes sense that if you want a console application you would select the Console Application template when creating a new project, but as mentioned above, it's just the default template, as old as console apps are. Yet there is another default template, called Worker Service, which almost does the same thing, only it has all the dependency injection goodness baked in, just like an ASP.Net Core Web App template.

So start your Visual Studio, add a new project and choose Worker Service:

It will create a project containing a Program.cs, a Worker.cs and an appsettings.json file. Program.cs holds the setup and Worker.cs holds the code to be executed.

Worker.cs has an ExecuteAsync method that logs some stuff every second, but even if we remove the while loop and add our own code, the application doesn't stop. This might be a good thing, as sometimes we just want stuff to work until we press Ctrl-C, but it's not a console app per se.

In order to transform it into something that works just like a console application you need to follow these steps:

  1. inject an IHost instance into your worker
  2. specifically instruct the host to stop whenever your code has finished

So, you go from:

public class Worker : BackgroundService
{
    private readonly ILogger<Worker> _logger;

    public Worker(ILogger<Worker> logger)
    {
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            _logger.LogInformation("Worker running at: {time}", DateTimeOffset.Now);
            await Task.Delay(1000, stoppingToken);
        }
    }
}

to:

public class Worker : BackgroundService
{
    private readonly ILogger<Worker> _logger;
    private readonly IHost _host;

    public Worker(ILogger<Worker> logger, IHost host)
    {
        _logger = logger;
        _host = host;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        Console.WriteLine("Hello world!");
        _host.StopAsync();
    }
}

Note that I did not "await" the StopAsync method because I don't actually need to. You are telling the host to stop and it will do it whenever it will see fit.

If we look into the Program.cs code we will see this:

public class Program
{
    public static void Main(string[] args)
    {
        CreateHostBuilder(args).Build().Run();
    }

    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureServices((hostContext, services) =>
            {
                services.AddHostedService<Worker>();
            });
}

I don't know why they bothered with creating a new method and then writing it as an expression body, but that's the template. You see that there is a lambda adding dependencies (by default just the Worker class), but everything starts with Host.CreateDefaultBuilder. In .NET source code, this leads to HostingHostBuilderExtensions.ConfigureDefaults, which adds a lot of stuff:

  • environment variables to config
  • command line arguments to config (via CommandLineConfigurationSource)
  • support for appsettings.json configuration
  • logging based on operating system

That is why, if you want these things by default, your best bet is to tweak the Worker Service template

Add Dependency Injection to an existing console application

Some people want to have total control over what their code is doing. Or maybe you already have a console application doing stuff and you just want to add Dependency Injection. In that case, these are the steps you want to follow:

  1. create a ServiceCollection instance (needs a dependency to Microsoft.Extensions.DependencyInjection)
  2. register all dependencies you want to use to it
  3. create a starting class that will execute stuff (just like Worker above)
  4. register starting class in the service collection
  5. build a service provider from the collection
  6. get an instance of the starting class and execute its one method

Here is an example:

class Program
{
    static void Main(string[] args)
    {
        var services = new ServiceCollection();
        ConfigureServices(services);
        services
            .AddSingleton<Executor,Executor>()
            .BuildServiceProvider()
            .GetService<Executor>()
            .Execute();
    }

    private static void ConfigureServices(IServiceCollection services)
    {
        services
            .AddSingleton<ITest, Test>();
    }
}

public class Executor
{
    private readonly ITest _test;

    public Executor(ITest test)
    {
        _test = test;
    }

    public void Execute()
    {
        _test.DoSomething();
    }
}

The only reason we register the Executor class is in order to get an instance of it later, complete with constructor injection support. You can even make Execute an async method, so you can get full async/await support. Of course, for this example appsettings.json configuration or logging won't work until you add them.

Mix them up

Of course, one could try to get the best of both worlds. This would work kind of like this:

  1. use Host.CreateDefaultBuilder() anyway (needs a dependency to Microsoft.Extensions.Hosting), but in a normal console app
  2. use the resulting service provider to instantiate a starting class
  3. start it

Here is an example:

class Program
{
    static void Main(string[] args)
    {
        Host.CreateDefaultBuilder()
            .ConfigureServices(ConfigureServices)
            .ConfigureServices(services => services.AddSingleton<Executor>())
            .Build()
            .Services
            .GetService<Executor>()
            .Execute();
    }

    private static void ConfigureServices(HostBuilderContext hostContext, IServiceCollection services)
    {
        services.AddSingleton<ITest,Test>();
    }
}

The Executor class would be just like in the section above, but now you get all the default logging and configuration options from the Worker Service section.

Conclusion

What the quickest and best solution is depends on your situation. One could just as well start with a Worker Service template, then tweak it to never Run the builder and instead configure it as above. One can create a Startup class, complete with Configure and ConfigureServices as in an ASP.Net Core Web App template or even start with such a template, then tweak it to work as a console app/web hybrid. In .NET Core everything is a console app anyway, it's just depends on which packages you load and how you configure them.

  We have been told again and again that our code should be split into smaller code blocks that are easy to understand and read. The usual example is a linear large method that can easily be split into smaller methods that just execute one after the other. But in real life just selecting a logical bit of our method and applying an Extract Method refactoring doesn't often work because code rarely has a linear flow.

  Let's take a classic example of "exiting early", another software pattern that I personally enjoy. It says that you extract the functionality of retrieving some data into a method and, inside it, return from the method as soon as you fail a necessary condition for extracting that data. Something like this:

public Order OrderCakeForClient(int clientId) {
  var client = _clientRepo.GetClient(clientId);
  if (client == null) {
    _logger.Log("Client doesn't exist");
    return null;
  }
  var preferences = _clientRepo.GetPreferences(client);
  if (preferences == null) {
    _logger.Log("Client has not submitted preferences");
    return null;
  }
  var existingOrder = _orderRepo.GetOrder(client, preferences);
  if (existingOrder != null) {
    _logger.Log("Client has already ordered their cake");
    return existingOrder;
  }
  var cake = _inventoryRepo.GetCakeByPreferences(preferences);
  if (cake == null) {
    cake = MakeNewCake(preferences);
  }
  if (cake == null) {
    _logger.Log("Cannot make cake as requested by client");
    return null;
  }
  var order = _orderRepo.GenerateOrder(client, cake);
  if (order == null) {
    _logger.Log("Generating order failed");
    throw new OrderFailedException();
  }
  return order;
}

This is a long method that appears to be linear, but in fact it is not. At every step there is the opportunity to exit the method therefore this is like a very thin tree, not a line. And it is a contrived example. A real life flow would have multiple decision points, branching and then merging back, with asynchronous logic and so on and so on.

If you want to split it into smaller methods you will have some issues:

  • the most of the code is not actual logic code, but logging code - one could make a method called getPreferences(client), but it would just proxy the _clientRepo method and gain nothing
  • one could try to extract a method that gets the preferences, logs and decides to exit - one cannot do that directly, because you cannot make an exit decision in a child method

Here is a way that this could work. I personally like it, but as you will see later on, it's not sufficient. Let's encode the exit decision as a boolean and get the information we want as out parameters. Then the code would look like this:

public Order OrderCakeForClient(int clientId) {
  if clientExists(clientId, out Client client)
    && clientSubmittedPreferences(client, out Preferences preferences)
    && orderDoesntExist(client, preferences, out Order existingOrder)
    && getACake(preferences, out Cake cake) {
    return generateOrder(client,cake);
  }
  return existingOrder;
}

private bool clientExists(clientId, out Client client) {
  client = _clientRepo.GetClient(clientId);
  if (client == null) {
    _logger.Log("Client doesn't exist");
    return false;
  }
  return true; 
}

private bool clientSubmittedPreferences(Client client, out Preferences preferences) {
  preferences = _clientRepo.GetPreferences(client);
  if (preferences == null) {
    _logger.Log("Client has not submitted preferences");
    return false;
  }
  return true;
}

private bool orderDoesntExist(Client client, Preferences preferences,
                               out Order existingOrder) {
  existingOrder = _orderRepo.GetOrder(client, preferences);
  if (existingOrder != null) {
    _logger.Log("Client has already ordered their cake");
    return false;
  }
  return true;
}

private bool getACake(Preferences preferences, out Cake cake)
  cake = _inventoryRepo.GetCakeByPreferences(preferences);
  if (cake == null) {
    cake = MakeNewCake(preferences);
  }
  if (cake == null) {
    _logger.Log("Cannot make cake as requested by client");
    return false;
  }
  return true;
}

private Order generateOrder(Client client, Cake cake)
  var order = _orderRepo.GenerateOrder(client, cake);
  if (order == null) {
    _logger.Log("Generating order failed");
    throw new OrderFailedException();
  }
  return order;
}

Now reading OrderCakeForClient is a joy! Other than the kind of weird assignment of existingOrder in a condition method, which is valid in C# 7, but not that easy to read, one can grasp the logic of the method from a single glance as a process that only runs if four conditions are met.

However, there are some problems here. One of them is that the out syntax only works for non-async methods and it is generally frowned upon these days. The method assumes a cake can just be made instantly, when we know it takes time, care, effort and kitchen cleaning. Also, modern code rarely features synchronous database access.

So how do we fix it? The classic solution for getting rid of the out syntax is to create custom return types that contain everything the method returns. Let's try to rewrite the method with async in mind:

public async Task<Order> OrderCakeForClient(int clientId) {
  var clientExistsResult = await clientExists(clientId);
  if (!clientExistsResult.Success) return null;
  var orderDoesntExistResult = await orderDoesntExist(client, preferences)
  if (!orderDoesntExistResult.Success) return orderDoesntExistResult.Order;
  var getACakeResult = await getACake(preferences);
  if (!getACakeResult.Success) return null;
  return generateOrder(client,cake);
}

private async Task<ClientExistsResult> clientExists(clientId) {
  client = await _clientRepo.GetClient(clientId);
  if (client == null) {
    _logger.Log("Client doesn't exist");
    return new ClientExistsResult { Success = false };
  }
  return new ClientExistsResult { Client = client, Success = false }; 
}

// the rest ommitted for brevity

Wait... this doesn't look good at all! We wrote a ton of extra code and we got... something that is more similar to the original code than the easy to read one. What happened? Several things:

  • because we couldn't return a bool, he had to revert to early exiting - not bad in itself, but adds code that feels to duplicate the logic in the condition methods
  • we added a return type for each condition method - extra code that provides no extra functionality, not to mention ugly
  • already returning bool values was awkward, returning these result objects is a lot more so
  • the only reason why we needed to return a result value (bool or something more complex) was to move the logging behavior in the smaller methods

"Step aside!" the software architect will say and quickly draw on the whiteboard a workflow framework that will allow us to rewrite any flow as a combination of classes that can be configured in XML. He is dragged away from the room, spitting and shouting.

The problem in the code is that we want to do three things at the same time:

  1. write a simple, easy to read, flow-like code
  2. make decisions based on partial results of that code
  3. store the partial results of that code

There is a common software pattern that is normally used for flow like code: the builder pattern. However, it feels unintuitive to use it here. First of all, the builder pattern needs some masterful code writing to make it work with async/await. Plus, most people, when thinking about a builder, they think of a separate reusable class that handles logic that is independent from the place where it is used. But it doesn't need to be so. Let's rewrite this code using a builder like logic:

public async Task<Order> OrderCakeForClient(int clientId)
{
    SetClientId(clientId);
    await IfClientExists();
    await IfClientSubmittedPreferences();
    await IfOrderDoesntExist();
    await IfGotACake();
    return await GenerateOrder();
}

private bool _continueFlow = true;
private int _clientId;
private Client _client;
private Preferences _preferences;
private Order _existingOrder;
private Cake _cake;

public void SetClientId(int clientId)
{
    _clientId = clientId;
    _existingOrder = null;
    _continueFlow = true;
}

public async Task IfClientExists()
{
    if (!_continueFlow) return;
    _client = await _clientRepo.GetClient(_clientId);
    test(_client != null, "Client doesn't exist");
}

public async Task IfClientSubmittedPreferences()
{
    if (!_continueFlow) return;
    _preferences = await _clientRepo.GetPreferences(_client);
    test(_preferences != null, "Client has not submitted preferences");
}

public async Task IfOrderDoesntExist()
{
    if (!_continueFlow) return;
    _existingOrder = await _orderRepo.GetOrder(_client, _preferences);
    test(_existingOrder == null, "Client has already ordered their cake");
}

public async Task IfGotACake()
{
    if (!_continueFlow) return;
    _cake = await _inventoryRepo.GetCakeByPreferences(_preferences)
        ?? await MakeNewCake(_preferences);
    test(_cake != null, "Cannot make cake as requested by client");
}

public async Task<Order> GenerateOrder()
{
    if (!_continueFlow) return _existingOrder;
    var order = await _orderRepo.GenerateOrder(_client, _cake);
    if (order == null)
    {
        _logger.Log("Generating order failed");
        throw new OrderFailedException();
    }
    return order;
}

private void test(bool condition, string logText)
{
    if (!condition)
    {
        _continueFlow = false;
        _logger.Log(logText);
    }
}

No need for a new builder class, just add builder like methods to the existing class. No need to return the instance of the class, since it's only used inside it. No need for chaining because we don't return anything and chaining with async is a nightmare.

The ugly secret is, of course, using fields of the class instead of local variables. That may add other problems, like multithread safety issues, but those are not in the scope of this post.

I've shown a pretty stupid piece of code that anyway was difficult to properly split into smaller methods. I've demonstrated several ways to do that, the choice of which depends on the actual requirements of your code. Hope that helps!

  Every month or so I see another article posted by some dev, usually with a catchy title using words like "demystifying" or "understanding" or "N array methods you should be using" or "simplify your Javascript" or something similar. It has become so mundane and boring that it makes me mad someone is still trying to cache on these tired ideas to try to appear smart. So stop doing it! There is no need to explain methods that were introduced in 2009!

  But it gets worse. These articles are partially misleading because Javascript has evolved past the need to receive or return data as arrays. Let me demystify the hell out of you.

  First of all, the methods we are discussing here are .filter and .map. There is of course .reduce, but that one doesn't necessarily return an array. Ironically, one can write both .filter and .map as a reduce function, so fix that one and you can get far. There is also .sort, which for performance reasons works a bit differently and returns nothing, so it cannot be chained as the others can. All of these methods from the Array object have something in common: they receive functions as parameters that are then applied to all of the items in the array. Read that again: all of the items.

  Having functions as first class citizens of the language has always been the case for Javascript, so that's not a great new thing to teach developers. And now, with arrow functions, these methods are even easier to use because there are no scope issues that caused so many hidden errors in the past.

  Let's take a common use example for these methods for data display. You have many data records that need to be displayed. You have to first filter them using some search parameters, then you have to order them so you can take just a maximum of n records to display on a page. Because what you display is not necessarily what you have as a data source, you also apply a transformation function before returning something. The code would look like this:

var colors = [
  {    name: 'red',    R: 255,    G: 0,    B: 0  },
  {    name: 'blue',   R: 0,      G: 0,    B: 255  },
  {    name: 'green',  R: 0,      G: 255,  B: 0  },
  {    name: 'pink',   R: 255,    G: 128,  B: 128  }
];

// it would be more efficient to get the reddish colors in an array
// and sort only those, but we want to discuss chaining array methods
colors.sort((c1, c2) => c1.name > c2.name ? 1 : (c1.name < c2.name ? -1 : 0));

const result = colors
  .filter(c => c.R > c.G && c.R > c.B)
  .slice(page * pageSize, (page + 1) * pageSize)
  .map(c => ({
      name: c.name,
      color: `#${hex(c.R)}${hex(c.G)}${hex(c.B)}`
  }));

This code takes a bunch of colors that have RGB values and a name and returns a page (defined by page and pageSize) of the colors that are "reddish" (more red than blue and green) order by name. The resulting objects have a name and an HTML color string.

This works for an array of four elements, it works fine for arrays of thousands of elements, too, but let's look at what it is doing:

  • we pushed the sort up, thus sorting all colors in order to get the nice syntax at the end, rather than sorting just the reddish colors
  • we filtered all colors, even if we needed just pageSize elements
  • we created an array at every step (three times), even if we only needed one with a max size of pageSize

Let's write this in a classical way, with loops, to see how it works:

const result = [];
let i=0;
for (const c of colors) {
	if (c.R<c.G || c.R<c.B) continue;
	i++;
	if (i<page*pageSize) continue;
	result.push({
      name: c.name,
      color: `#${hex(c.R)}${hex(c.G)}${hex(c.B)}`
    });
	if (result.length>=pageSize) break;
}

And it does this:

  • it iterates through the colors array, but it has an exit condition
  • it ignores not reddish colors
  • it ignores the colors of previous pages, but without storing them anywhere
  • it stores the reddish colors in the result as their transformed version directly
  • it exits the loop if the result is the size of a page, thus only going through (page+1)*pageSize loops

No extra arrays, no extra iterations, only some ugly ass code. But what if we could write this as nicely as in the first example and make it work as efficiently as the second? Because of ECMAScript 6 we actually can!

Take a look at this:

const result = Enumerable.from(colors)
  .where(c => c.R > c.G && c.R > c.B)
  //.orderBy(c => c.name)
  .skip(page * pageSize)
  .take(pageSize)
  .select(c => ({
      name: c.name,
      color: `#${hex(c.R)}${hex(c.G)}${hex(c.B)}`
  }))
  .toArray();

What is this Enumerable thing? It's a class I made to encapsulate the methods .where, .skip, .take and .select and will examine it later. Why these names? Because they mirror similar method names in LINQ (Language Integrated Queries from .NET) and because I wanted to clearly separate them from the array methods.

How does it all work? If you look at the "classical" version of the code you see the new for..of loop introduced in ES6. It uses the concept of "iterable" to go through all of the elements it contains. An array is an iterable, but so is a generator function, also an ES6 construct. A generator function is a function that generates values as it is iterated, the advantage being that it doesn't need to hold all of the items in memory (like an array) and any operation that needs doing on the values is done only on the ones requested by code.

Here is what the code above does:

  • it creates an Enumerable wrapper over array (performs no operation, just assignments)
  • it filters by defining a generator function that only returns reddish colors (but performs no operation) and returns an Enumerable wrapper over the function
  • it ignores the items from previous pages by defining a generator function that counts items and only returns items after the specified number (again, no operation) and returns an Enumerable wrapper over the function
  • it then takes a page full of items, stopping immediately after, by defining a generator function that does that (no operation) and returns an Enumerable wrapper over the function
  • it transforms the colors in output items by defining a generator function that iterates existing items and returns the transformed values (no operation) and returns an Enumerable wrapper over the function
  • it iterates the generator function in the current Enumerable and fills an array with the values (all the operations are performed here)

And here is the flow for each item:

  1. .toArray enumerates the generator function of .select
  2. .select enumerates the generator function of .take
  3. .take enumerates the generator function of .skip
  4. .skip enumerates the generator function of .where
  5. .where enumerates the generator function that iterates over the colors array
  6. the first color is red, which is reddish, so .where "yields" it, it passes as the next item in the iteration
  7. the page is 0, let's say, so .skip has nothing to skip, it yields the color
  8. .take still has pageSize items to take, let's assume 20, so it yields the color
  9. .select yields the color transformed for output
  10. .toArray pushes the color in the result
  11. go to 1.

If for some reason you would only need the first item, not the entire page (imagine using a .first method instead of .toArray) only the steps from 1. to 10. would be executed. No extra arrays, no extra filtering, mapping or assigning.

Am I trying too hard to seem smart? Well, imagine that there are three million colors, a third of them are reddish. The first code would create an array of a million items, by iterating and checking all three million colors, then take a page slice from that (another array, however small), then create another array of mapped objects. This code? It is the equivalent of the classical one, but with extreme readability and ease of use.

OK, what is that .orderBy thing that I commented out? It's a possible method that orders items online, as they come, at the moment of execution (so when .toArray is executed). It is too complex for this blog post, but there is a full implementation of Enumerable that I wrote containing everything you will ever need. In that case .orderBy would only order the minimal number of items required to extract the page ((page+1) * pageSize). The implementation can use custom sorting algorithms that take into account .take and .skip operators, just like in LiNQer.

The purpose of this post was to raise awareness on how Javascript evolved and on how we can write code that is both readable AND efficient.

One actually doesn't need an Enumerable wrapper, and can add the methods to the prototype of all generator functions, as well (see LINQ-like functions in JavaScript with deferred execution). As you can see, this was written 5 years ago, and still people "teach" others that .filter and .map are the Javascript equivalents of .Where and .Select from .NET. NO, they are NOT!

The immense advantage for using a dedicated object is that you can store information for each operator and use it in other operators to optimize things even further (like for orderBy). All code is in one place, it can be unit tested and refined to perfection, while the code using it remains the same.

Here is the code for the simplified Enumerable object used for this post:

class Enumerable {
  constructor(generator) {
	this.generator = generator || function* () { };
  }

  static from(arr) {
	return new Enumerable(arr[Symbol.iterator].bind(arr));
  }

  where(condition) {
    const generator = this.generator();
    const gen = function* () {
      let index = 0;
      for (const item of generator) {
        if (condition(item, index)) {
          yield item;
        }
        index++;
      }
    };
    return new Enumerable(gen);
  }

  take(nr) {
    const generator = this.generator();
    const gen = function* () {
      let nrLeft = nr;
      for (const item of generator) {
        if (nrLeft > 0) {
          yield item;
          nrLeft--;
        }
        if (nrLeft <= 0) {
          break;
        }
      }
    };
    return new Enumerable(gen);
  }

  skip(nr) {
    const generator = this.generator();
    const gen = function* () {
      let nrLeft = nr;
      for (const item of generator) {
        if (nrLeft > 0) {
          nrLeft--;
        } else {
          yield item;
        }
      }
    };
    return new Enumerable(gen);
  }

  select(transform) {
    const generator = this.generator();
    const gen = function* () {
      for (const item of generator) {
		yield transform(item);
      }
    };
    return new Enumerable(gen);
  }

  toArray() {
	return Array.from(this.generator());
  }
}

The post is filled with links and for whatever you don't understand from the post, I urge you to search and learn.

I like Dave Farley's video channel about the software industry quality, Continuous Deployment, so I will share this video about how tech interviews should be like. Not a step by step tutorial as much as asking what is the actual purpose of a tech interview and how to get to achieve it:

[youtube:osnOY5zgdMI]

Top Software Engineering Interview Tips - Dave Farley

  As any software developer with any self respect, I periodically search the Internet for interesting things that would help me become better as a professional and as a human being. I then place those things in a TODO list and play some silly game. Here is a list of things that I want to pursue sooner or later, possibly later, probably never:

  • Astro: Ship Less JavaScript - static site builder that delivers lightning-fast performance with a modern developer experience.
    In the design of Astro:
    • Bring Your Own Framework (BYOF): Build your site using React, Svelte, Vue, Preact, web components, or just plain ol’ HTML + JavaScript.
    • 100% Static HTML, No JS: Astro renders your entire page to static HTML, removing all JavaScript from your final build by default.
    • On-Demand Components: Need some JS? Astro can automatically hydrate interactive components when they become visible on the page. If the user never sees it, they never load it.
    • Fully-Featured: Astro supports TypeScript, Scoped CSS, CSS Modules, Sass, Tailwind, Markdown, MDX, and any of your favorite npm packages.
    • SEO Enabled: Automatic sitemaps, RSS feeds, pagination and collections take the pain out of SEO and syndication.
  • https://brython.info/ - Brython is a Python 3 implementation adapted to the HTML5 environment
    • Designed to replace Javascript as the scripting language for the Web
    • You can take it for a test drive through a web console
    • Speed of execution is similar to CPython for most operations
  • https://github.com/yggdrasil-network/yggdrasil-go - an early-stage implementation of a fully end-to-end encrypted IPv6 network
    • It is lightweight, self-arranging, supported on multiple platforms and allows pretty much any IPv6-capable application to communicate securely with other Yggdrasil nodes.
    • Yggdrasil does not require you to have IPv6 Internet connectivity - it also works over IPv4.
  • React-Table - Build and design powerful datagrid experiences while retaining 100% control over markup and styles.
    • Designed to have zero design
    • Built to materialize, filter, sort, group, aggregate, paginate and display massive data sets using a very small API surface.
    • Has its very own plugin system allowing you to override or extend any logical step, stage or process happening under the hood.
  • Implement a DisposeAsync method - this implementation allows for asynchronous cleanup operations. The DisposeAsync() returns a ValueTask that represents the asynchronous dispose operation.
  • https://dev.to/neethap/how-to-publish-a-static-react-node-js-app-using-cpanel-the-easy-way-199o - Forget the Setup Node.js App feature on your cPanel. Instead, you want to focus your attention on the build folder
  • https://blog.m3o.com/2021/06/24/micro-apis-for-everyday-use.html - Micro has evolved from an open source framework to a full blown API platform, one that continues to focus on the developer experience first and foremost. The majority of users are building APIs for end public consumption but having to rebuild many of the building blocks they need wherever they go.
  • itsi Meet - Secure, Simple and Scalable Video Conferences
    • The Jitsi Meet client runs in your browser, without installing anything on your computer. You can try it out at https://meet.jit.si.
    • Jitsi Meet allows for very efficient collaboration. Users can stream their desktop or only some windows. It also supports shared document editing with Etherpad.
  • Mumble - Open Source voice-chat software
    • The client works on Windows, Linux, FreeBSD and macOS, while the server should work on anything Qt can be installed on.
  • Swagger UI with login form and role-based api visibility - This post show how to customize Swagger UI in a Blazor WASM project using Swashbuckle library, implement a custom authentication UI and manage api visibility based on user roles.
  • Complete Introduction to React-Redux - third part of a series by Kumar Harsh
  • Wav2vec: Semi and Unsupervised Speech Recognition

Tell me, what will this Visual Basic .NET code display?

For x as Integer = 1 to 2
  Dim b as Boolean
  Console.WriteLine("b is " & b)
  b = true
Next

 Even if you never coded in Visual Basic it's a relatively easy code to translate to something like C#:

for (var x = 1; x <= 2; x++)
{
	bool b;
	Console.WriteLine("b is " + b);
	b = true;
}

Now what will the code display? Let's start with this one. This code will not compile and instead you will get: Use of unassigned local variable 'b'.

But what about the VB one? Even with Option Strict ON it will not even warn you. And the output will be: False, True. What?!

In conclusion, VB.Net will let you declare variables without assigning any value to them, but their value will be inconsistent. Always set the variables you declare. 

Learning from React series:

  • Part 1 - why examining React is useful even if you won't end up using it
  • Part 2 - what Facebook wanted to do with React and how to get a grasp on it
  • Part 3 - what is Reactive Programming all about?
  • Part 4 - is React functional programming?
  • Part 5 - Typescript, for better and for worse
  • Part 6 (this one) - Single Page Applications are not where they wanted to be

We cannot discuss React without talking about Single Page Applications, even if one can make a React based web site that isn't a SPA and SPAs that don't use a framework or library. What are SPAs? Let's start with what they are not.

SPAs are not parallax background, infinite scroll pages where random flashy things jump at you from the top and bottom and the sides like in a bloody ghost train ride! If you ever considered doing that, this is my personal plea for you to stop. For the love of all that is decent, don't do it!

SPAs are desktop applications for the web. They attempt to push the responsive, high precision actions, high CPU usage and fancy graphics to the client while maintaining the core essentials on the server, like security and sensitive data, while trying to assert full control over the interface and execution flow. In case connectivity fails, the cached data allows the app to work just fine offline until you reconnect or you need uncached data. And with React (or Angular and others), SPAs encapsulate UI in components, just like Windows Forms.

You know who tried (and continues to try) to make Windows Forms on the web? Microsoft. They started with ASP.Net Web Forms, which turned into ASP.Net MVC, which turned into ASP.Net Web API for a while, then turned to Blazor. At their heart, all of these are attempts to develop web applications like one would desktop applications.

And when they tried to push server side development models to the web they failed. They might succeed in the future and I wish them all the luck, but I doubt Microsoft will make it without acknowledging the need to put web technologies first and give developers full and direct access to the browser resources.

Ironically, SPAs (and modern web development in general) put web technologies first to a degree that makes them take over functionality already existing in the browser, like location management, URL handling and rendering components, but ignore server technologies.

It is relevant to make the comparison between SPAs and desktop applications because no matter how much they change browsers to accommodate this programming style, there are fundamental differences between the web and local systems.

For one, the way people have traditionally been trained to work on the web is radically different from how modern web development is taught.

Remember Progressive Enhancement? It was all about serving as much of the client facing, relevant content to the browser first, then enhancing the page with Javascript and CSS. It started from the idea that Javascript is slow and might not be enabled. Imagine that in 2021! When first visiting a page you don't want to keep the users waiting for all the fancy stuff to load before they can do anything. And SEO, even if nowadays the search engine(s?) know how to execute Javascript to get the content as a user would, still cares a lot about the first load experience.

Purely client tools like React, Angular, Vue, etc cannot help with that. All they can do is optimize the Javascript render performance and hope for the best. There are solutions cropping up: check out SSR and ReactDomServer and React Server Components. Or Astro. Or even Blazor. The takeaway here is that a little bit of server might go a long way without compromising the purity of the browser based solution.

Remember jQuery and before? The whole idea back then was to access the DOM as a singular UI store and select or make changes to any element on the entire page. Styling works the same way. Remember CSS Zen Garden? You change one global CSS file and your website looks and feels completely different. Of course, that comes with horrid things like CSS rule precedence or !important [Shudder], yet treating the page as a landscape that one can explore and change at will is a specifically browser mindset. I wasn't even considering the possibility when I was doing Windows Forms.

In React, when I was thinking of a way to add help icons to existing controls via a small script, the React gurus told me to not break encapsulation. That was "not the way". Well, great, Mandalorian! That's how you work a lot more to get to the same thing we have done for years before your way was even invented! In the end I had to work out wrapper elements that I had to manually add to each form control I wanted to enhance.

In the same app I used Material Design components, which I thought only needed a theme to change the way they look and feel, only to learn that React controls have to be individually styled and that the theme itself controls very few things. Even if there is support for theming, if you want to significantly change the visuals and behaviour you will have to create your own controls that take what they need (much more than what Material UI controls do) from the theme provider.

A local desktop application is supposed to take most of the resources that are available to it. You can talk about multitasking all you want, but normal people focus on one complex application at a time. At its core a SPA is still a browser tab, using one thread. That means even with the great performance of React, you still get only one eighth (or something, based on the number of processors) from the total computer resources. There are ways of making an application use multiple threads, but that is not baked in React either. Check out Neo.js for an attempt to do just that.

You can't go too far in the other direction either. Web user experience is opening many tabs and switching from one to the other, refreshing and closing and opening others and even closing the browser with all the tabs open or restoring an entire group of bookmarks at once. And while we are at the subject of URLs and bookmarks, you will find that making a complex SPA consistently alter the address location so that a refresh or a bookmark gets you to the same place you were in is really difficult.

A local Windows app usually has access to a lot of the native resources of the computer. A browser is designed to be sandboxed from them. Moreover, some users don't have correct settings or complete access to those settings, like in corporate environments for example. You can use the browser APIs, but you can't fully rely on them. And a browser tab is subject to firewall rules and network issues, local policies, browser extensions and ad blockers, external ad providers and so on.

You may think I am taking things to an unreasonable extreme. You will tell me that the analogy to desktop apps breaks not despite, but because of all of the reasons above and thus SPAs are something else, something more light, more reusable, webbier, with no versioning issues and instant access and bookmarkable locations. You will tell me that SPAs are just normal web sites that work better, not complex applications. I will cede this point.

However! I submit that SPAs are just SPAs because that's all they could be. They tried to replace fully fledged native apps and failed. That's why React Native exists, starting as a way to do more performant apps for mobiles and now one can write even Windows applications with it.

Single Page Applications are great. I am sure they will become better and better with time until we will forget normal HTML pages exist and that servers can render and so on. But that's going in the wrong direction. Instead of trying to emulate desktop or native apps, SPAs should embrace their webbiness.

Is Javascript rendering bad? No. In fact it's just another type of text interpreted by the browser, just like HTML would be, but we can do better.
Is Javascript URL manipulation bad? No. It's the only way to alter the address location without round trips to the server, but sometimes we need the server. Perhaps selective loading of component resources and code as needed will help.
Is single threaded execution bad? No, but we are not restricted to it.
Is component encapsulation bad? Of course not, as long as we recognize that in the end it will be rendered in a browser that doesn't care about your encapsulation.
The only thing that I am still totally against is CSS in Javascript, although I am sure I haven't seen the best use of it yet.

React is good for SPAs and SPAs are good for React, but both concepts are trying too hard to take things into a very specific direction, one that is less and less about the browser and more about desktop-like components and control of the experience. Do I hate SPAs? No. But as they are now and seeing where they are going, I can't love them either. Let's learn from them, choose the good bits and discard the chaff.  

A year or so ago I wrote a Javascript library that I then ported (badly) to Typescript and which was adding the sweet versatility and performance of LINQ to the *script world. Now I've rewritten the entire thing into a Typescript library. I've abandoned the separation into three different Javascript files. It is just one having everything you need.

I haven't tested it in the wild, but you can get the new version here:

Github: https://github.com/Siderite/LInQer-ts

NPM: https://www.npmjs.com/package/@siderite/linqer-ts

Documentation online: https://siderite.github.io/LInQer-ts

Using the library in the browser directly: https://siderite.github.io/LInQer-ts/lib/LInQer.js and https://siderite.github.io/LInQer-ts/lib/LInQer.min.js

Please give me all the feedback you can! I would really love to hear from people who use this in:

  • Typescript
  • Node.js
  • browser web sites

The main blog post URL for the project is still https://siderite.dev/blog/linq-in-javascript-linqer as the official URL for both libraries.

Have fun using it!

  Learning from React series:

  • Part 1 - why examining React is useful even if you won't end up using it
  • Part 2 - what Facebook wanted to do with React and how to get a grasp on it
  • Part 3 - what is Reactive Programming all about?
  • Part 4 - is React functional programming?
  • Part 5 (this one) - Typescript, for better and for worse
  • Part 6 - Single Page Applications are not where they wanted to be

  Typescript is a programming language developed by Microsoft. It is a superset of Javascript that allows a lot of type checking and manipulation, hence the name. React and Vue fully support it while Angular requires it. So what is the reason for the adoption of this new language? What are its advantages and disadvantages?

  First of all, what is it? I would start metaphorically, if you can forgive that. Imagine a vast jungle, grown organically since time immemorial, chaotic and wild. Many developers went in, but few have come out unscathed, some never to be seen again. That's Javascript for you. It was released in 1995 as a basic scripting language for browsers, but it was designed as so flexible and complete that it could be used as a programming language in any context with minor modifications. For a very long time tightly coupled with its (very inefficient) browser implementations, it was dismissed from being a proper programming language. But that ended pretty much when V8 was launched, a performant Javascript engine that could be used separately to run the language in whatever situation the developer wanted. With V8, Chrome was launched and soon enough Node.js, which ran Javascript on the server as a proper language.

  The worst and best feature of Javascript is flexibility. You can do pretty much whatever you want in it, as it is a dynamic language unencumbered by such silly things as encapsulation, classes, types and so on. So if you started in a structured way, you could do a lot, if not - like most people unfamiliar with the language - you created a mess that no one could understand, including yourself. So if Javascript is a jungle, Typescript is Duke Nukem coming to cut the trees, wall off vast swathes of forest and only allow a narrow path for life to exist. Only, on that narrow path you get the same chaotic and wild proliferation. A lot fewer software developers traverse the forest and come out with PTSD, but a lot more people go through than before and mistakes can and will be made.

  I guess what I am trying to say is that Typescript sometimes feels like a square peg forced into a round hole. It is not a bad language. In fact, it is amazing in some parts. The type system introduced by Microsoft acts like a kind of system of annotations that inform on what you are actually doing. Tools are aware now of the types of values you use, can optimize code, find errors, warn devs, autocomplete code, help with development, etc. And I am willing to bet that people working on the language are having the time of their lives, because it has to be fun to work on abstract computer science and getting paid, too.

  But what does that mean for the frontend industry? It means that people are getting pushed on that narrow jungle path, for better or for worse. As a small business, you will have to either accept a shitty website created by cheap Javascript and vanilla HTML cavemen or get a lot out of your pocket to hire people who spend time and effort to understand Typescript and some, if not most, of the frontend frameworks that are fashionable at the moment. As a large company you will get tectonic shifts in technology, leaving a large part of your workforce in limbo, while having to spend a lot on hiring and redesigning flows. As an industry, we become dependent on several companies that spend the effort of keeping their frameworks up to date and documented. 

  Let me give you some Typescript questions (that I will not answer) to test your knowledge:

  • can you tell me what all of these types are and how they differ from each other: undefined, null, any, unknown, never, void ?
  • how can you tell if a Typescript object is of a specific form (the equivalent of the .NET 'is' or 'as' functionality)?
  • what is the difference between a union of literal types and an enum?
  • what are and how can you use BigInt, ReadOnlyArray, Partial, NonNullable, Required?
  • what is the difference between a private member of a Typescript class and one starting with #?
  • do you know how to use unions in interpolated strings?
  • what is the difference between interface, type, class, type intersection, class expression and module?

 I could go on and on. On how the possibility of null is now something you have to declare explicitly, for example. I didn't (dare to) ask about type guards and how narrowing works and what conditional types are. And there are so many gotchas for developers coming from other languages, because the language features have been added by people who worked on C#, so they are kind of the same, but actually not. Type meaning and conversion is a large bit of confusing difference between Typescript and C#/Java. For example you can define a class and then cast some data to it, but you don't get what you expect:

class MyClass {
  public name: string='';
  public sayHello() { console.log(`Hello ${this.name}`); }
}

const my:MyClass = { name: 'Siderite' } as MyClass;
console.log(my); // { "name": "Siderite" }
console.log(typeof(my)); // "object"
console.log(my instanceof MyClass) // false
console.log(my.sayHello()); // ERR: my.sayHello is not a function 

There are still web sites dedicated to the inconsistencies of Javascript. Typescript doesn't solve these issues, it mostly hides them. I am sure it's fun to play with types, but is that the optimal solution for the problem at hand, mainly the many ways you can do Javascript wrong? I would argue no. It's fun to work in, but there is a clear dependency between Typescript and Javascript, which forced so many changes in Typescript from Javascript and the other way around, as they have to be kept in sync. All while Javascript needs to remain backwards compatible, too.

"But what about React? Weren't you talking about that, Siderite?"

Yes, I was. I only looked deeper into Typescript because I did this project in React. Before, I had used it with Angular and frankly I didn't feel the friction that I felt now. Angular is designed with Typescript in mind, the development experience is smoother. Angular also uses two directional bindings to propagate changes and that means less Typescript code. The only code you actually need to write is network API code, for which you have out of the box HTTP services, and some limited interface logic. React doesn't do that.

First of all, React has been designed within a kind of declarative/functional mindset, as I explained in previous chapters of this series. It focuses a lot on immutability and functions that are passed around and declaring what your expectations are. Typescript is fundamentally an imperative language. After forcing it through the round hole, the square peg now has to go through a triangular hole, too. The immutability forces one to use a lot of code for changes coming from the UI towards the Typescript logic.

Then, React is a library. It was designed as such and has less levers to force the developer in a direction or another. Even when following a clear development strategy, there are many of which to choose from, all tried and tested and valid, but very different from one another. The jungle was tamed, but now you must consider a multiverse of jungles, each with a different shape.

Finally, React started out in Javascript. Many documentation pages are still just about Javascript. New innovations in the field of React are developed and tested out independently, by various people with various skills and motivations. The learning curve is not steep, but the paths are many.

So in the end, Typescript is an interesting experiment in programming languages, one that will probably surprise me in the near future with ideas that can only be implemented using it. However it is not perfect and its dependency on Javascript is unfortunate, even if its inspiration was not. The purpose of the language was to guide and help developers mired in Javascript confusion, but using it with React goes against that very purpose, as React is still something relatively new and evolving wildly in all directions, so React doesn't help Typescript. Does Typescript help React? I would say yes. However I don't feel that it is enough in its current form. The friction between the two concepts is palpable and I dare any of you to prove me wrong.

It seems I've talked a lot about the problems of React rather than its benefits. I blamed it on things ranging from confusing and obsolete documentation to inconsistent goals of the library and underlying programming language. That's the way I work, focusing on problems so I can find one I can fix. In the next chapter I want to discuss about React in the wild and what are the good things people are saying about it. The most interesting question however, the one that I want to answer with this entire series, is how can we improve our work by adapting lessons learned, either from React to whatever we do or the other way around. What concrete ideas should we adopt from React and which we should condemn to the pit of failed concepts?

  Learning from React series:

  • Part 1 - why examining React is useful even if you won't end up using it
  • Part 2 - what Facebook wanted to do with React and how to get a grasp on it
  • Part 3 - what is Reactive Programming all about?
  • Part 4 (this one) - is React functional programming?
  • Part 5 - Typescript, for better and for worse
  • Part 6 - Single Page Applications are not where they wanted to be

  React was designed just when classes and modules were making their way into Javascript, so it made sense to use them. Developers that are not coming from the Javascript or dynamic languages world are used to the type safety and hierarchical structure that classes provide. And it also made sense from the standpoint of the product. If you want to encapsulate state, logic and presentation why not used existing functioning models like classes, components and so on.

  However, at the same time ideas like functions being first class citizens of programming languages and functional programming were making a comeback, mostly because of big data. That meant that lambdas (arrow functions) were popping up everywhere. If you are a C# developer, you already are familiar with them. Something like Func<int,int> func = (int x)=> x*2; represents a lambda function, which is the same as something written like private int f2(int x) { return x*2; }, yet lambda functions can be declared inside code blocks, can be implicitly cast to Expressions and manipulated and they are brilliant as method parameters. Check out the lambda version in C# compared to the function version in VB:

// C#
var items = allItems.Where(i=>!i.deleted);
// C# function body
var items = allItems.Where(i=>{
                             return !i.deleted
                           });
// VB
Dim items = allItems.Where(Function(i) Not i.deleted)
// VB function body
Dim items = allItems.Where(Function(i) 
			      Return Not i.deleted
			   End Function)

 Similarly, Javascript had only function syntax, even if functions were designed to be first class citizens of the language since its inception. Enter arrow functions in Javascript:

// before
var self = this;
var items = allItems.filter(function(i) {
  return self.validate(i);
});

// after
var items = allItems.filter(i=>this.validate(i));

Note how arrow functions don't have an internal 'this' so you don't need to bind functions or create self variables.

So at this point, React changed and instead of classes, they implemented "functional syntax" in React Hooks. Behind the scenes a component is still generated as a class which React uses and the old syntax is still valid. For example at this time there is no way to create an error boundary component using functional syntax. The result is a very nice simplification of the code:

// React classic (pardon the pun)
export class ShowCount extends React.Component {
  constructor(props) {
    super(props);
    this.state = {
      count: 0
    };
  }
  componentDidMount() {
    this.setState({
      count: this.props.count
    })
  }

  render() {
    return (
      <div> 
        <h1> Count : {this.state.count} </h1>
      </div>
    );
  }
}

// React Hooks
export function ShowCount(props) {
  const [count, setCount] = useState();

  useEffect(() => {
    setCount(props.count);
  }, [props.count]);

  return (
    <div>
      <h1> Count : {count} </h1>
    </div>
  );
}

// courtesy of https://blog.bitsrc.io/6-reasons-to-use-react-hooks-instead-of-classes-7e3ee745fe04

  But this does not just provide a better syntax, it also changes the way development is done. Inheritance is basically eliminated in favor of composition and people are starting to use the word "functional" in sentences uttered in the real world. And while the overall design of React to use unidirectional binding and immutable variables was there since inception, I do feel like this is just one more step towards a functional programming approach and the reason for so many functional purists popping up lately.

  What is functional programming, though? Wikipedia defines it as "a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program." Sound familiar?

  I will have you know that I have friends that have rebelled and gone to the other side, making applications (including UI) with F# and refusing to submit to the Galactic Imperative. After playing with React I can say that I understand why this approach has appeal. One declares what they need, ignore flow and constrain their efforts inside components that are more or less independent. A program looks and feels like a big function that uses other functions and to which you just provide inputs and out comes UI ready to use. If the same input is provided, the same output results. You can test it to perfection, you can infer what happens with an entire tree of such functions and make optimizations in the transpiler without changing the code. You can even use a diff algorithm on the output tree and just update what changed in the UI.

  But it is time to call bullshit. We've used functions that receive pure data on one side and output user interface on the other side since forever. They are called views. One could even argue that an API is a data provider and the application is the function that uses the data to output UI. You don't ignore flow, you move it up! You will still have to model the interactions between every piece of data you have and all the events that come in. One might even say the unforgivable and assert that React is just another Model-View thingie with the extra constraint that it will forcibly re-render a component when its input state changes.

  That is my main takeaway from React: the idea that forcing re-rendering of components forces the developer to move the state up, closer to where it should be. No one can store stuff in browser variables, in element attributes and data, because all of it will be lost on next render. That is good news, but also very bad news. Let me get you through an example:

  We have data that we need shown in a grid. Every row has an expand/collapse button that will show another grid under it, with details related to that row. The React way of doing things would take us through these steps:

  • create a component that represents the grid and receives an array as input
  • it will contain code that maps the array to a list of row components which receive each row as the input
  • the row component will render a button that will dispatch an expand event for the row when clicked
  • on click the row expanded state will be changed and the data for the row detail grid retrieved

  It sounds great, right? OK, where do you store the state of row expansion? How do we push it to the row component? Let's use a map/dictionary of row id and boolean, why don't we? Does that mean that when you expand/collapse a row only the boolean changes or the entire structure? What will get re-rendered? The row component in question or all the row components?

  What happens when we go to the next page in the grid and then go back? Should we return to the same row expansion states? Where should the scrollbar in the grid be? Should we keep that in the state as well and how do we push it to the grid component? Do row details grids have scroll? Doesn't the size of each component affect the scroll size, so how do we store the scroll position? What is the user resizes the browser or zooms in or out?

  What happens when we resize a grid column? Doesn't that mean that all row components need to be re-rendered? If yes, why? If no, why? What if you resize the column of a detail grid? Should all detail grids have the same resizing applied? How do you control which does what?

  Many grids I've seen are trying to store the expansion, the details, everything in the object sent as a parameter to the row. This seems reasonable until you realize that adding anything to the object changes it, so it should trigger a re-render. And then there is Typescript, which expects an object to keep to its type or else you need to do strange casts from something you know to "unknown", something that could be anything. That's another story, though.

  Suddenly, the encapsulation of components doesn't sound so great anymore. You have to keep count of everything, everywhere, and this data cannot be stored inside the component, but outside. Oh, yes, the component does take care of its own state, but you lose it when you change the input data. In fact, you don't have encapsulation in components, but in pairs of data (what React traditionally calls props) and component. And the props must change otherwise you have a useless component, therefore the data is not really immutable and the façade of functional programming collapses.

  There are ways of controlling when a component should update, but this is not a React tutorial, only a lessons learned blog post. Every complexity of interaction that you have ever had in a previous programming model is still there, only pushed up, where one can only hope it is completely decoupled from UI, to which you add every quirk and complexity coming from React itself. And did we really decouple UI or did we break it into pieces, moving the simplest and less relevant one out and keeping the messy and complex one that gave us headaches in the first place in? It feels to me like React is actually abstracting the browser from you, rather than decoupling it and letting the developer keep control of it.

  After just a month working in this field I cannot tell you that I understood everything and have all the answers, but my impression as of now is that React brings very interesting ideas to the table, yet there is still a lot of work to be done to refine them and maybe turn them into something else.

  Next time I will write about Typescript and how it helps (and hinders) React and maybe even Angular development. See you there!

 Learning from React series:

  • Part 1 - why examining React is useful even if you won't end up using it
  • Part 2 - what Facebook wanted to do with React and how to get a grasp on it
  • Part 3 (this one) - what is Reactive Programming all about?
  • Part 4 - is React functional programming?
  • Part 5 - Typescript, for better and for worse
  • Part 6 - Single Page Applications are not where they wanted to be

The name React is already declaring that it is used in reactive programming, but what is that? Wikipedia is defining it as "a declarative programming paradigm concerned with data streams and the propagation of change". It expands on that to say that it declares the relationship between elements and updates them when either change. You can easily imagine a graph of elements magically updating as any of them changes. However, the implementation details of that magic matter.

  In 2011 Microsoft revealed a free .Net library called Reactive Extensions, or ReactiveX or RX. It was based on a very interesting observation that the observer/observable patterns are the mirror images of iterator/iterable. When the iterator moves through an iterable, the observer reacts to events in the observable; one is imperative, the other reactive. The library was so popular that it was immediately adopted for a bunch of programming languages, including Javascript. It also allowed for operations traditionally used for arrays and collections to work with a similar syntax on observables. This is a great example of reactive programming because instead of deciding when to perform a data access (and having to check if it is possible and everything is in range and so on), the code would just wait for something to happen, for an event that provided data, then act on the data.

  One might argue that Verilog, a hardware description language, is also reactive, as it is based on actions being performed on certain events and it even uses non-blocking assignments, which are like declarations of state change which happen at the same time. Reminds me of the way React is implementing state management.

  Of course, reactive programming is also modern UI and when I say modern, I mean everything in the last twenty years. Code gets executed when elements in the user interface change state: on click, on change, on mouse move, on key press etc. That is why, the developers at Facebook argue, browser based UI programming should be reactive at the core. This is not new, it's something you might even be already very familiar with in other contexts. Code that is triggered by events is also called event-driven programming.

  But at the same time, others also claim their software is reactive. Microservices are now very fashionable. The concept revolves around organizing your product into completely independent modules that only have one external responsibility, which then one wires together via some sort of orchestrator. The biggest advantage of this is obviously separation of concerns, a classic divide and conquer strategy generalized over all software, but also the fact that you can independently test and deploy each microservice. You don't even have to shut down running ones or you can start multiple instances, perhaps with multiple versions and in different locations. This is also distributed programming. The way the communication between microservices is done is usually via some sort of message queue, like Rabbit MQ, but I am working on a really old software, written like 15 years ago, which uses IBM MQ to communicate between different portions of the software - let's call them macroservices :) Well, this is supposed to be reactive programming, too, because the microservices are reacting to the messages arriving on the queue and/or sent from others.

  The observer pattern is old, it's one of the patterns in the original design patterns book Design Patterns: Elements of Reusable Object-Oriented Software, which started the software design pattern craze which rages on even now. Anybody who ever used it extensively in their software can (and many do) claim that they did reactive programming. Then there is something called the actor model (which will probably confuse your Google if you search for it), which is actually a mathematical concept and originated in 1973! Implementations of actors are eerily similar to the microservices concept from above.

  And speaking of events, there is another pattern that is focusing on declaring the flow of changes from a given state, given an event. It's called a state machine. It also boasts separation of concerns because you only care about what happens in any state in case of an event. You can visualize all the possible flows in a state machine, too, as names arrows from any state to another, given that such a transition is defined. The implementation of the state machine engine is irrelevant as long as it enables these state transitions as defined by the developer. 

  Everything above, and probably some other concepts that are named differently but kind of mean the same thing, is reactive programming. Let me give you another example: a method or a software function. Can one say it is reactive? After all, it only executes code when you call it! Couldn't we say that the method reacts to an event that contains the parameters the method needs? What about Javascript, which is designed to be single threaded and where each piece of code is executed based on a queue of operations? Isn't it a reactive programming language using an event bus to determine which actions to perform?

  And that's the rub. The concept of reactivity is subjective and generally irrelevant. The only thing that changes and matters is the implementation of the event transport mechanism and the handling of state.

  In a traditional imperative program we take for granted that the execution of methods will be at the moment of the call and that all methods on that thread will be executed one after the other and that setting a value in memory is atomic and can be read immediately after by any other piece of code and you can even lock that value so it is only read by one entity at a time. Now imagine that you are writing the same program, only we can't make the assumptions above. Calling methods can result in their code getting executed at an arbitrary time or maybe not at all. Whatever you change in a method is only available to that method and there is no way for another method to read the values from another. The result? Your code will take a lot of care to maintain state locally and will start to look more like a state machine, modelling transitions rather than synchronous flows. The order of operations will also be ensured by consuming and emitting the right sort of events. Permanent and/or shared storage will become the responsibility of some of the modules and the idea of "setting data" will become awkward. Keeping these modules in sync will become the greatest hurdle.

  That's all it is! By eliminating assumptions about how your code is executed, the result is something more robust, more generic, more compartmentalized. Is it the golden hammer that will solve all problems? Of course it isn't. We've seen how the concepts at the core of reactive programming have been there since forever. If that was the best way, everybody would already be working like that. The biggest problems of this kind of thinking are resource duplication, as everybody has to keep all the data they use locally, and synchronization, as one cannot assume there exists any source of absolute truth that can be accessed by all at the same time. Debugging the system also becomes a bit complicated.

  In 2014 a bunch of people created something called "The Reactive Manifesto" and anyone can sign it. In it, they define a reactive system as:

  • Responsive - responds quickly
  • Resilient - remains responsive in case of failure
  • Elastic - remains responsive regardless of workload
  • Message Driven - async message passing as the boundary between components

  As you can see, reactive mostly means responsive for them and the concept is more one of organization management than a specific set of programming languages and tools. Note that, in fact, you can create a system that communicates via async message passing between components on the client, on the server, between client and server and so on. There can be multiple types of technology and different stacks. As long as they can react to asynchronous messages, they can be integrated into such a system. 

  This post has reached already a big size and I haven't even touched on functional programming and how it tries to solve... well, everything. I will do that in the next chapter. I have to say that I find the concept of a programming language that eliminates global variable scope and public fields and introduces a delay and a random order of execution of methods or properties from other classes fascinating. Imagine testing and debugging that, then moving the working code to production, where the delay is removed. You will also see that a lot of the ideas above influence how React development is done and perhaps you will understand purists telling everybody how things are not correct until you implement this or that in a certain way. Till next time!

Learning from React series:

  • Part 1 - why examining React is useful even if you won't end up using it
  • Part 2 (this one) - what Facebook wanted to do with React and how to get a grasp on it
  • Part 3 - what is Reactive Programming all about?
  • Part 4 - is React functional programming?
  • Part 5 - Typescript, for better and for worse
  • Part 6 - Single Page Applications are not where they wanted to be

In order to understand React we must consider what were the advantages that Facebook found in the concept when they created the library. There a numerous presentations and articles that they pushed to explain it, but it kind of distills to this:

  • mutation of values complicates flows in complex applications
  • traditional Model-View patterns promote mutation through two directional data binding
  • solution for mutation:
    • use unidirectional data binding and re-render views as the data changes
    • send inputs as events that a higher level entity will interpret
  • solution for slow browser render overhead:
    • code is organized in smaller and smaller components that depend on a few values in the overall data
    • there is an intermediate layer of rendering between React and the actual browser DOM called a Virtual DOM, which only sends to the browser the changes that renders affected

But wait, you will say! How do the values change if they are immutable? The point here is that your components have an immutable internal state. The values that are passed to the components can still change, but on a higher level. They declare a label that have a text and for the label the text never changes. When the text changes in the logic of the application, a new label instance with a new text is rendered. The advantage of this is that the declaration of the render for a component defines the way the component will always render. It's declarative of your intent of how that component should look. They wanted to replace mutation with what they called reconciliation between the previous state and the current state. They almost treated web development like game development or remote access software.

So when you look at a JSX syntax and you see Javascript mixed with HTML and switching back and forth, that is different from the ASPX spaghetti code where server logic was mixed up with UI declarations. JSX is less an XML flavor and more a Javascript flavor. In fact, it is transpiled to Javascript code when executed and the changes to the UI are imperative commands that tell it how to change based on how the render code affected the component. But for the developer it's just a story of how the component should look given some data.

React was released in May 2013 and it went through some transformations on the way. Internally it probably started as an object oriented approach which was cumbersome and fought with the overall design of Javascript. They fixed that by the time of the release by using the JSX syntax, but still with code looking like

var component = React.createClass({ render: function() { return <something/>; } });

And further down the line they came up with React hooks, which move further from the class oriented approach and more towards a functional one with stuff like

const component = (props) => (<something />);

which of course is lighter still. Note that Javascript changed during that time, allowing for more stuff to happen. And now they added Typescript support, so you get something like

const Component = (props:{text:string})=>(<something />);

This evolution of the library is also one of the biggest obstacles against learning how to use it as in your Goggle search you might find solutions and problems from any time in the history of the library. You will find the perfect tool for what you wanted, but all the examples are in Javascript and in Typescript it works differently or the answers refer to previous versions of absolutely everything and you don't know which of them is the one that should apply to your task. Or even worse, some package that you use and it made your life so easy conflicts with the one you just want to use because they were never meant to work together.

As opposed to Angular, which was designed as a framework, to constrain and control what you do, React is a library. And it evolves not only through the efforts of Facebook, but also of a very active community. People would add React to their existing project rather than rewriting it from scratch. This means they will have used any of the various versions of React as well as any of the various versions of all the packages that they use in conjunction with it. And these packages are wild! Imagine a developer coming from the world of WPF or vanilla web design with HTML and CSS as the primary focus and Javascript an after thought or from Angular. They will work with React, but use it in a way they are more familiar with. That means that even if the library has a backing philosophy, there is little it can do to force you to use it in that way.

Let's take just one example: MobX, a package that takes existing data objects and wraps them in proxies that notify of changes. With React you use it as you would a ViewModel in WPF. Instead of sending an event called CLICK from a click event and handling it in some state machine looking code you can just modify the "immutable" property of the data you were sent and behind the scene a setter is executed which lets everybody know that it has changed and so it triggers a re-render which will take the value of the actual property from the behind-the-scene getter. You get your way, React kind of gets its way.

To recap:

  • components live in a tree, rather than a graph, and they all render based on the properties they are passed
  • components declare how they should look like
  • any input from the components is sent up as an event to which the system "reacts"
  • slow browser rendering is replaced with a virtual representation of the UI and updated with specific commands to reflect only the changes from one state to the other
  • React is an evolving beast with many heads and finding your own way of doing things lies at the root of any successful development effort

While exploring development with this library, developers have found various tools and concepts useful to integrate. As we usually do, they started to generalize these attempts and come up with some new principles or get to the root of the underlying flow. I want to spend the next part of the series discussing some of these more general or architectural ideas and try to see what they are worth.

Learning from React series:

  • Part 1 (this one) - why examining React is useful even if you won't end up using it
  • Part 2 - what Facebook wanted to do with React and how to get a grasp on it
  • Part 3 - what is Reactive Programming all about?
  • Part 4 - is React functional programming?
  • Part 5 - Typescript, for better and for worse
  • Part 6 - Single Page Applications are not where they wanted to be

  A billion years ago, Microsoft was trying to push a web development model that simulated Windows Forms development: ASP.Net Web Forms. It featured several ideas:

  • component based design (input fields were a component, you could bundle two together into another component, the page was a component, etc)
  • each component was rendering itself
  • components were defined using both HTML-like language, Javascript, CSS and server .Net code bundled together, sometimes in the same file
  • the rendering of the component was done on the server side and pushed to the client
  • data changes or queries came from the client to the server via event messages
  • partial rendering was possible using UpdatePanels, which were a wrapper over ajax calls that called for partial content
    • at the time many juniors were putting the entire page into an UpdatePanel and said they were doing AJAX while senior devs smugly told them how bad that was and that it shouldn't be done. I agreed with the senior devs, but I really disliked their uninformed condescending attitude, so I created a method of diffing the content sent previously and the new content and sending only the difference. This minimized the amount of data sent via the network about a hundred times. 

  Sound familiar? Because for me, learning React made me think of that almost immediately. React features:

  • component based design
  • each component renders itself
  • components are defined by bundling together HTML, Javascript/Typescript and CSS
  • the rendering of the component is done in the virtual DOM and pushed to the actual browser DOM
  • data changes or queries are sent from the browser to the React code via event messages
  • partial rendering is built in the system, by diffing the existing render tree with a newly generated one from data changes

  At first look, an old guy like me would say "been there, done that. It's bad design and it will soon go away". But I was also motivated enough at the time of ASP.Net Forms to look into it, even under the hood, and understand the thing. To say it was badly designed would be stupid. It worked for many years and powered (and still does) thousands of big applications. The reason why Forms failed is because better ideas came along, not because it was a bad idea when it was created.

  Let's look a little bit on what made Forms become obsolete: the MVC pattern, more specifically implemented by Ruby developers and taking the world by storm and ending up being adopted by Microsoft, too. But Model View Controller wasn't a new pattern, it has been used forever on desktop applications, so why was it such a blow to Forms? It was a lot of fashion elitism, but also that MVC molded itself better on web applications:

  • a clear separation of concerns: data, logic and display
  • the ability to push the display more towards the client, which was new but becoming increasingly easy in browsers
  • a clear separation of programming languages: HTML in html files, Javascript in .js files, server code in .cs files
  • full (and simple) control over how things were rendered, displayed and sent to the server

  Yet in the case of React, things are going in the opposite direction, going from MVC applications with clearly separated language files to these .jsx or .tsx files that contain javascript, html and even css in the same file, mixed together into components. Is that bad? Not really. It kind of looks bad, but the entire work is done on the client. There is no expensive interface between a server and a browser that would make the model, otherwise used successfully for decades in desktop applications, ineffective. It is basically Windows Forms in the browser, with some new ideas thrown in. As for the combination of languages in a single syntax, it can be abused, just like any technology, but it can also be done right: with state, logic and UI represented by different files and areas of the application. Yes, you need script to hide or show something based on data, but that script is part of the UI and different from the script used to represent logic. Just the language is the same. 

  "Is Siderite joining the React camp, then? Switching sides, going frontend and doing nasty magic with Javascript and designing web pages?" people will ask. A reasonable question, considering most of my close friends still think React is for people who can't code or too young to remember the .aspx hell. The answer is no! Yet just as with the UpdatePanel seniors that couldn't stop for a second to look a bit deeper into an idea and see how it could be done and then cruelly dismissing curious juniors, thinking that React can't teach you anything is just dumb.

  In this series I will explore not only React ideas, but also the underlying principles. React is just an example of reactive programming, which has been in use, even if less popular, for decades. It is now making a comeback because of microservices, another fashionable fad that has had implementations since 1990, but no one gave them the time of day. Ideas of data immutability are coming from functional programming, which is also making a comeback as it works great with big data. So why not try this thing out, iron out the kinks and learn what they did right?