  So I was happily minding my own business after a production release, only for everything to go BOOM! Apparently, maybe because of something we did, but maybe not, the production servers were running out of memory. The exception looked something like this:

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
 at System.Reflection.Emit.TypeBuilder.SetMethodIL(RuntimeModule module, Int32 tk, Boolean isInitLocals, Byte[] body, Int32 bodyLength, 
    Byte[] LocalSig, Int32 sigLength, Int32 maxStackSize, ExceptionHandler[] exceptions, Int32 numExceptions, Int32[] tokenFixups, Int32 numTokenFixups)
 at System.Reflection.Emit.TypeBuilder.CreateTypeNoLock()
 at System.Reflection.Emit.TypeBuilder.CreateType()
 at System.Xml.Serialization.XmlSerializationReaderILGen.GenerateEnd(String[] methods, XmlMapping[] xmlMappings, Type[] types)
 at System.Xml.Serialization.TempAssembly.GenerateRefEmitAssembly(XmlMapping[] xmlMappings, Type[] types, String defaultNamespace, Evidence evidence)
 at System.Xml.Serialization.TempAssembly..ctor(XmlMapping[] xmlMappings, Type[] types, String defaultNamespace, String location, Evidence evidence)
 at System.Xml.Serialization.XmlSerializer.GenerateTempAssembly(XmlMapping xmlMapping, Type type, String defaultNamespace, String location, Evidence evidence)
 at System.Xml.Serialization.XmlSerializer..ctor(Type type, XmlAttributeOverrides overrides, Type[] extraTypes, 
     XmlRootAttribute root, String defaultNamespace, String location, Evidence evidence)
 at System.Xml.Serialization.XmlSerializer..ctor(Type type, XmlAttributeOverrides overrides)

At first I thought there was something else eating away the memory, but the exception was repeatedly thrown at this specific point. And I did what every senior dev does: googled it! And I found this answer: "When an XmlSerializer is created, an assembly is dynamically generated and loaded into the AppDomain. These assemblies cannot be garbage collected until their AppDomain is unloaded, which in your case is never." It also referenced a Microsoft KB886385 from 2007 which, of course, didn't exist at that URL anymore, but I found it archived by some nice people.

What was going on? I would tell you, but Gergely Kalapos explains things much better in his article How the evil System.Xml.Serialization.XmlSerializer class can bring down a server with 32Gb ram. He also explains what commands he used to debug the issue, which is great!

But since we already know links tend to vanish over time (so much for stuff on the Internet living forever), here is the gist of it all:

  • XmlSerializer generates dynamic code (as dynamic assemblies) in its constructors
  • the most used constructors of the class have a caching mechanism in place:
    • XmlSerializer.XmlSerializer(Type)
    • XmlSerializer.XmlSerializer(Type, String)
  • but the others do not, so every time you use one of those you create, load and never unload another dynamic assembly
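A common way around this is to create each such serializer once and reuse it. Here is a minimal sketch, assuming the serializers differ only by type and root element name. XmlSerializerCache and its Get method are names I made up for the illustration, not something the framework provides:

```csharp
using System;
using System.Collections.Concurrent;
using System.Xml.Serialization;

// Hypothetical helper: caches serializers built with a constructor
// that does not cache internally (Type + XmlRootAttribute).
public static class XmlSerializerCache
{
    private static readonly ConcurrentDictionary<(Type Type, string Root), XmlSerializer> Cache = new();

    public static XmlSerializer Get(Type type, string rootName)
    {
        // GetOrAdd may run the factory more than once under contention,
        // but every caller receives the single instance that was stored.
        return Cache.GetOrAdd(
            (type, rootName),
            key => new XmlSerializer(key.Type, new XmlRootAttribute(key.Root)));
    }
}
```

Calling XmlSerializerCache.Get(typeof(int), "number") twice returns the same instance, so at most a handful of dynamic assemblies get generated instead of one per call. If you also use XmlAttributeOverrides or extra types, the cache key would have to include them somehow.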

I know this is an old class in an old framework, but some of us still work in companies that are firmly rooted in the middle ages. Also since I plan to maintain my blog online until I die, it will live on the Internet for the duration.

Hope it helps!

  I haven't been working on the Sift string distance algorithm for a while, but then I was reminded of it because someone wanted to use it to suggest corrections to user input. Something like Google's "Did you mean...?" or like an autocomplete application. And it got me thinking of ways to use Sift for bulk searching. I am still thinking about it, but in the meanwhile, this can be achieved using the Sift4 algorithm, with up to 40% improvement in speed compared to the naïve comparison with each item in the list.

  Testing this solution, I've realized that the maxDistance parameter did not work correctly. I apologize. The code is now fixed on the algorithm's blog post, so go and get it.

  So what is this solution for mass search? We can use two pieces of knowledge about the problem space:

  • the minimum possible distance between two strings of lengths l1 and l2 will always be abs(l1-l2)
    • it's very easy to understand the intuition behind it: one cannot generate a string of size 5 from a string of size 3 without at least adding two new letters, so the minimum distance would be 2
  • as we advance through the list of strings, we have a best distance value that we keep updating
    • this maps very well onto the maxDistance option of Sift4

  Thus armed, we can find the best matches for our string from a list using the following steps:

  1. set a bestDistance variable to a very large value
  2. set a matches variable to an empty list
  3. for each of the strings in the list:
    1. compare the minimum distance between the search string and the string in the list (abs(l1-l2)) to bestDistance
      1. if the minimum distance is larger than bestDistance, ignore the string and move to the next
    2. use Sift4 to get the distance between the search string and the string in the list, using bestDistance as the maxDistance parameter
      1. if the algorithm reaches a temporary distance that is larger than bestDistance, it will break early and report the temporary distance, which we will ignore
    3. if distance<bestDistance, then clear the matches list and add the string to it, updating bestDistance to distance
    4. if distance=bestDistance, then add the string to the list of matches

  When using the common Sift4 version, which doesn't compute transpositions, the list of matches is retrieved 40% faster on average than by simply searching through the list of strings and updating the distance (about 15% faster with transpositions). Considering that Sift4 is already a lot faster than Levenshtein, this method will allow searching through hundreds of thousands of strings really fast. The time gained can be used to further refine the matches list using a slower but more precise algorithm, like Levenshtein, on a much smaller set of possible matches.

  Here is a sample written in JavaScript, where we search a random string in the list of English words:

const search = getRandomString(); // this is the search string
let matches=[];             // the list of found matches
let bestDistance=1000000;   // the smallest distance to our search found so far
const maxOffset=5;          // a common value for searching similar strings
const l = search.length;    // the length of the search string
for (let word of english) {
    const minDist=Math.abs(l-word.length); // minimum possible distance
    if (minDist>bestDistance) continue;    // if too large, skip this word
    const dist=sift4(search,word,maxOffset,bestDistance);
    if (dist<bestDistance) {
        matches = [word];                  // new array with a single item
        bestDistance=dist;
        if (bestDistance==0) break;        // if an exact match, we can exit (optional)
    } else if (dist==bestDistance) {
        matches.push(word);                // add the match to the list
    }
}

  There are further optimizations that can be added, beyond the scope of this post:

  • words can be grouped by length and the minimum distance check can be done on entire buckets of strings of the same lengths
  • words can be sorted, and when a string is rejected as a match, reject all strings with the same prefix
    • this requires an update of the Sift algorithm to return the offset at which it stopped (to which the maxOffset must be added)
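  The first of these ideas can be sketched quickly. This is only an illustration, written in C# like most of the code on this blog rather than JavaScript, and it assumes things that are not in the post: BucketSearch and FindBest are made-up names, and the distance delegate is a stand-in for Sift4 (or any distance that is never smaller than the length difference), taking the two strings and a maxDistance value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BucketSearch
{
    // distance(a, b, maxDistance) is a stand-in for Sift4 or any other
    // string distance that is never smaller than the length difference.
    public static List<string> FindBest(
        string search,
        IEnumerable<string> words,
        Func<string, string, int, int> distance)
    {
        // Group words by length and visit the buckets closest in length first.
        var buckets = words
            .GroupBy(w => w.Length)
            .OrderBy(g => Math.Abs(g.Key - search.Length));

        var matches = new List<string>();
        var bestDistance = int.MaxValue;

        foreach (var bucket in buckets)
        {
            var minDist = Math.Abs(bucket.Key - search.Length);
            // Buckets are ordered by minDist, so all remaining ones are too far as well.
            if (minDist > bestDistance) break;

            foreach (var word in bucket)
            {
                var dist = distance(search, word, bestDistance);
                if (dist < bestDistance)
                {
                    matches.Clear();
                    matches.Add(word);
                    bestDistance = dist;
                }
                else if (dist == bestDistance)
                {
                    matches.Add(word);
                }
            }
        }
        return matches;
    }
}
```

  The length check now runs once per bucket instead of once per word, and the search stops as soon as a bucket is too far away in length. In a real application the grouping would be done once and reused between searches.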

  I am still thinking of performance improvements. The transposition table gives more control over the precision of the search, but it's rather inefficient and resource consuming, not to mention adding code complexity, making the algorithm harder to read. If I can't find a way to simplify and improve the speed of using transpositions I might give up entirely on the concept. Also, some sort of data structure could be created - regardless of how much time and space is required, assuming that the list of strings to search is large and constant and the number of searches will be very big.

  Let me know what you think in the comments!

  Tracing and logging always seem simple, an afterthought, something to do when you've finished your code. Only then you realize that you would want to have it while you are testing your code or when an unexpected issue occurs in production. And all you have to work with is an exception, something that tells you something went wrong, but without any context. Here is a post that attempts to create a simple method to enhance exceptions without actually needing to switch logging level to Trace or anything like that and without great performance losses.

  Note that this is a proof of concept, not production ready code.

  First of all, here is an example of usage:

public string Execute4(DateTime now, string str, double dbl)
{
    using var _ = TraceContext.TraceMethod(new { now, str, dbl });
    throw new InvalidOperationException("Invalid operation");
}

  Obviously, the exception is something that would occur in a different way in real life. The magic, though, happens in the first line. I am using (heh!) the C# 8.0 using declaration syntax, so that there is no extra indentation. I might say this is one of the few situations where I would want to use this syntax. In fact, this post started from me thinking of a good place to use it without confusing any reader of the code.

  Also, TraceContext is a static class. That might be OK, since it is a very special class and not part of the business logic. With the new Roslyn source generators, one could insert lines like this automatically, without having to write them by hand. That's another topic altogether, though.

  So, what is going on there? Since there is no way to get the names and values of the arguments of the currently executing method at run time (without huge performance issues), I am creating an anonymous object that has properties with the same names and values as the arguments of the method. This is the only thing that might differ from one place to another. Then, in TraceMethod I return an IDisposable which will be disposed at the end of the method. Thus, I am generating a context for the entire method run which will be cleared automatically at the end.

  Now for the TraceContext class:

/// <summary>
/// Enhances exceptions with information about their calling context
/// </summary>
public static class TraceContext
{
    static ConcurrentStack<MetaData> _stack = new();

    /// <summary>
    /// Bind to FirstChanceException, which occurs when an exception is thrown in managed code,
    /// before the runtime searches the call stack for an exception handler in the application domain.
    /// </summary>
    static TraceContext()
    {
        AppDomain.CurrentDomain.FirstChanceException += EnhanceException;
    }

    /// <summary>
    /// Add to the exception dictionary information about caller, arguments, source file and line number raising the exception
    /// </summary>
    /// <param name="sender"></param>
    /// <param name="e"></param>
    private static void EnhanceException(object? sender, FirstChanceExceptionEventArgs e)
    {
        if (!_stack.TryPeek(out var metadata)) return;
        var dict = e.Exception.Data;
        if (dict.IsReadOnly) return;
        dict[nameof(metadata.Arguments)] = Serialize(metadata.Arguments);
        dict[nameof(metadata.MemberName)] = metadata.MemberName;
        dict[nameof(metadata.SourceFilePath)] = metadata.SourceFilePath;
        dict[nameof(metadata.SourceLineNumber)] = metadata.SourceLineNumber;
    }

    /// <summary>
    /// Serialize the name and value of arguments received.
    /// </summary>
    /// <param name="arguments">It is assumed this is an anonymous object</param>
    /// <returns></returns>
    private static string? Serialize(object arguments)
    {
        if (arguments == null) return null;
        var fields = arguments.GetType().GetProperties();
        var result = new Dictionary<string, object>();
        foreach (var field in fields)
        {
            var name = field.Name;
            var value = field.GetValue(arguments);
            result[name] = SafeSerialize(value);
        }
        return JsonSerializer.Serialize(result);
    }

    /// <summary>
    /// This would require most effort, as one would like to serialize different types differently and skip some.
    /// </summary>
    /// <param name="value"></param>
    /// <returns></returns>
    private static string SafeSerialize(object? value)
    {
        // naive implementation
        try
        {
            return JsonSerializer.Serialize(value).Trim('\"');
        }
        catch (Exception ex1)
        {
            try
            {
                return value?.ToString() ?? "";
            }
            catch (Exception ex2)
            {
                return "Serialization error: " + ex1.Message + "/" + ex2.Message;
            }
        }
    }

    /// <summary>
    /// Prepare to enhance any thrown exception with the calling context information
    /// </summary>
    /// <param name="args"></param>
    /// <param name="memberName"></param>
    /// <param name="sourceFilePath"></param>
    /// <param name="sourceLineNumber"></param>
    /// <returns></returns>
    public static IDisposable TraceMethod(object args,
                                            [CallerMemberName] string memberName = "",
                                            [CallerFilePath] string sourceFilePath = "",
                                            [CallerLineNumber] int sourceLineNumber = 0)
    {
        _stack.Push(new MetaData(args, memberName, sourceFilePath, sourceLineNumber));
        return new DisposableWrapper(() =>
        {
            _stack.TryPop(out var _);
        });
    }

    /// <summary>
    /// Just a wrapper over a method which will be called on Dispose
    /// </summary>
    public class DisposableWrapper : IDisposable
    {
        private readonly Action _action;

        public DisposableWrapper(Action action)
        {
            _action = action;
        }

        public void Dispose()
        {
            _action();
        }
    }

    /// <summary>
    /// Holds information about the calling context
    /// </summary>
    public class MetaData
    {
        public object Arguments { get; }
        public string MemberName { get; }
        public string SourceFilePath { get; }
        public int SourceLineNumber { get; }

        public MetaData(object args, string memberName, string sourceFilePath, int sourceLineNumber)
        {
            Arguments = args;
            MemberName = memberName;
            SourceFilePath = sourceFilePath;
            SourceLineNumber = sourceLineNumber;
        }
    }
}

Every call to TraceMethod adds a new MetaData object to a stack and every time the method ends, the stack will pop an item. The static constructor of TraceContext will have subscribed to the FirstChanceException event of the current application domain and, whenever an exception is thrown (caught or otherwise), its Data dictionary is enhanced with:

  • name of the method called
  • source file name
  • source file line number of the TraceMethod call in the method where the exception was thrown
  • serialized arguments (remember Exceptions need to be serializable, including whatever you put in the Data dictionary, so that is why we serialize it all)

(I have written another post about how .NET uses code attributes to get the first three items of information during build time) 

This way, you get information which would normally be "traced" (detailed logging which is usually detrimental to performance) in any thrown exception, but without filling some trace log or having to change production configuration and reproduce the problem again. Assuming your application does not throw exceptions all over the place, this adds very little complexity to the executed code.
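To see the mechanism in isolation, here is a minimal sketch that only shows the FirstChanceException handler running before the catch block, so the catch block already finds the extra entry in Data. FirstChanceDemo, Run and the "Context" key are names invented for this illustration:

```csharp
using System;
using System.Runtime.ExceptionServices;

public static class FirstChanceDemo
{
    public static string? Run()
    {
        // The handler is raised as soon as the exception is thrown,
        // before the runtime looks for a matching catch block.
        EventHandler<FirstChanceExceptionEventArgs> handler = (sender, e) =>
        {
            if (!e.Exception.Data.IsReadOnly)
                e.Exception.Data["Context"] = "added before any catch block ran";
        };

        AppDomain.CurrentDomain.FirstChanceException += handler;
        try
        {
            throw new InvalidOperationException("Invalid operation");
        }
        catch (InvalidOperationException ex)
        {
            // The entry written by the handler is already here.
            return ex.Data["Context"] as string;
        }
        finally
        {
            AppDomain.CurrentDomain.FirstChanceException -= handler;
        }
    }
}
```

The same pattern is what makes the information available to whatever logs the exception further up the call chain.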

Moreover, this will enhance the exception with the source code file name and line number even in Release mode!

I am sure there are some issues with code that might fail and it is not caught in a try/catch and of course the serialization code is where people should put a lot of effort, since different types get to be serialized for inspection differently (think async methods and the like). And more methods should be added so that people trace whatever they like in thrown exceptions. Yet, as I said, this is a POC, so I hope it gets you inspired.

I got this exception at my work today, a System.ArgumentException with the message "Argument passed in is not serializable.", that I could not quite understand. Where does it come from, since the .NET source repository does not contain the string? How can I fix it?

The stack trace ended up at System.Collections.ListDictionaryInternal.set_Item(Object key, Object value) in a method where, indeed, I was setting a value in a dictionary. But this is not how dictionaries behave! The dictionary in question was the Exception.Data property. It makes sense, because Exception objects are supposed to be serializable, and I was adding a value of type HttpMethod which, even if extremely simple and almost always used like an enum, is actually a class of its own which is not serializable!

So, there you have it, always make sure you add serializable objects in an exception's Data dictionary.

But why is this happening? The implementation of the Data property looks like this:

public virtual IDictionary Data { 
  [System.Security.SecuritySafeCritical]
  get {
    if (_data == null)
      if (IsImmutableAgileException(this))
        _data = new EmptyReadOnlyDictionaryInternal();
      else
        _data = new ListDictionaryInternal();
    return _data;
  }
}

Now, EmptyReadOnlyDictionaryInternal is just a dictionary you can't add to. The interesting class is ListDictionaryInternal. Besides being an actual linked list implementation (who does that in anything but C++ classrooms?) it contains this code:

#if FEATURE_SERIALIZATION
  if (!key.GetType().IsSerializable)                 
    throw new ArgumentException(Environment.GetResourceString("Argument_NotSerializable"), "key");                    
  if( (value != null) && (!value.GetType().IsSerializable ) )
    throw new ArgumentException(Environment.GetResourceString("Argument_NotSerializable"), "value");                    
#endif

So both key and value of the Data dictionary property in an Exception instance need to be serializable.
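One way to stay on the safe side is to store only simple values and fall back to the string representation for everything else. The sketch below is just that, a sketch: ExceptionDataExtensions and WithData are hypothetical names, and the list of types kept as they are is my own assumption of what is reasonable.

```csharp
using System;

public static class ExceptionDataExtensions
{
    // Hypothetical helper: keeps common serializable values as they are
    // and stores the string representation of anything else.
    public static Exception WithData(this Exception exception, string key, object? value)
    {
        exception.Data[key] = ToSafeValue(value);
        return exception;
    }

    private static object? ToSafeValue(object? value) => value switch
    {
        null => null,
        string or bool or int or long or double or decimal or DateTime or Guid => value,
        _ => value.ToString()
    };
}
```

With this, something like new InvalidOperationException("Request failed").WithData("Method", HttpMethod.Post) stores the string "POST" rather than the HttpMethod instance, so the check above has nothing to complain about.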

But why didn't I find the string in the source reference? While the Microsoft reference website doesn't seem to support simple string search, it seems Google does NOT index the GitHub code pages either. You have to:

  • manually go to GitHub and search
  • get no results
  • notice that the "Code" section of the results has a question mark instead of a number next to it
  • click on it
  • then it asks you to log in
  • and only then you get results!

So bonus thing: if you are searching for some string in the .NET source code, first of all use the GitHub repo, then make sure you log in when you search.

The point of regular expression character classes is to simplify your expressions, but they can introduce subtle bugs or efficiency issues.

Let's check out this StackOverflow answer to the question "\d less efficient than [0-9]":

\d checks all Unicode digits, while [0-9] is limited to these 10 characters. For example, Persian digits, ۱۲۳۴۵۶۷۸۹, are an example of Unicode digits which are matched with \d, but not [0-9].

This makes sense, only it had never occurred to me until this very moment. Until now I would never have used the [0-9] notation and I would have replaced it with \d if I found it in code.

What does that mean?

One simple consequence of such a class would be performance: searching for a large list of characters is less efficient. Another would be introducing the possibility for bugs or even malicious attacks. Let's see the code for a calculator that adds two numbers. It's a silly piece of code, but imagine that a more complex one would take the user content and save it into a database, try to process it or display it.

static void Main(string[] args)
{
    Console.InputEncoding = Encoding.Unicode;
    var firstNumber = GetNumberString();
    var secondNumber = GetNumberString();
    Console.WriteLine("Sum = "+(int.Parse(firstNumber) + int.Parse(secondNumber)));
}

private static string GetNumberString()
{
    string result=null;
    var isNumber = false;
    while (!isNumber)
    {
        Console.Write("Enter a number: ");
        result = Console.ReadLine();
        isNumber = Regex.IsMatch(result, @"^\d+$");
        if (!isNumber)
        {
            Console.WriteLine($"{result} is not a number! Try again.");
        }
    }
    return result;
}

This will read a number as a string and test it using the regular expression ^\d+$, which means the string has to consist of one or more digits. Note that I had to set the console input encoding to Unicode in order to be able to paste Persian numbers. This code works fine until I use Arabic or Persian digits, where it breaks in the int.Parse method. Using ^[0-9]+$ as the regular expression pattern would solve this issue.

The same issue will occur with \w (warning: \w is letters AND digits, plus the underscore) versus [a-zA-Z] (or just [a-z] with RegexOptions.IgnoreCase).
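Here is a small sketch of the difference, using three Persian digits as input. Besides the explicit [0-9] range it also shows RegexOptions.ECMAScript, a standard .NET option not mentioned above, which restricts \d to the ASCII digits. DigitPatterns and its method names are made up for the example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitPatterns
{
    // \d matches any Unicode decimal digit by default.
    public static bool UnicodeDigits(string s) => Regex.IsMatch(s, @"^\d+$");

    // The explicit range only matches the ten ASCII digits.
    public static bool AsciiRange(string s) => Regex.IsMatch(s, @"^[0-9]+$");

    // With RegexOptions.ECMAScript, \d behaves like [0-9].
    public static bool EcmaScriptDigits(string s) =>
        Regex.IsMatch(s, @"^\d+$", RegexOptions.ECMAScript);
}
```

For the input "۱۲۳", UnicodeDigits returns true while AsciiRange and EcmaScriptDigits return false. For "123" all three return true.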

If one uses code to determine the number of matches for each regular expression pattern

var regexPattern = @"\d";
var nr = 0;
for (int i = 0; i < ushort.MaxValue; i++)
{
    string str = Convert.ToChar(i).ToString();
    if (Regex.IsMatch(str, regexPattern))
        nr++;
}
Console.WriteLine(nr);

we get this:

  • for \d : 370
    • ALL digits
  • for \w : 50320
    • ALL word characters (including digits) 
  • for [^\W\d] : 49950
    • ALL word characters, but not the digits 
  • for \p{L} : 48909
    • ALL letters
  • for [A-Za-z] : 52
    • letters from a to z
  • for [0-9] : 10
    • digits from 0 to 9

I hope this helps.

Let's start with a simple requirement and the obvious first draft of the code. The requirement is "build a method that receives a string and writes to the console a URL using that string as a parameter".

The obvious first attempt, which works in every version of C#, would be:

private static void ShowUrl(string something)
{
    var url = "https://siderite.dev/blog/search/formattablestring?param1="+something;
    Console.WriteLine(url);
}

But then you get some problems. You want to use a parameter that has strange characters in it, like an email, another URL, arithmetic symbols, etc. So you realize that you need to URL encode the value. Next version is:

private static void ShowUrl(string something)
{
    var url = "https://siderite.dev/blog/search/formattablestring?param1="+Uri.EscapeDataString(something);
    Console.WriteLine(url);
}

Now that works for a second. One might argue about the difference between System.Web.HttpUtility.UrlEncode, System.Net.WebUtility.UrlEncode and System.Uri.EscapeDataString. Yet, this is not the post to do that. Follow the above links for more information.

Then someone decides to use a null for the parameter and you get an ArgumentNullException and you rage in frustration "This is complicated and ugly! The first version looked much better and handled nulls, too. And now I have to add this long Uri.EscapeDataString to every URL parameter in every URL building code I write and also check for null!".

But... what if you could write the method just like the first version and get rid of every problem?

First of all, let's get rid of that plus sign and use interpolated strings. They've been around for a while:

private static void ShowUrl(string something)
{
    var url = $"https://siderite.dev/blog/search/formattablestring?param1={something}";
    Console.WriteLine(url);
}

Now, this has the same problem as the first version: no URL encoding. And you don't want to add an escape function to every parameter (imagine there would be 10 parameters in this URL). How about we use the fact that the URL is an interpolated string?

Check out this code:

private static void ShowUrl(string something)
{
    var url = UrlHelper.Url($"https://siderite.dev/blog/search/formattablestring?param1={something}");
    Console.WriteLine(url);
}

It uses a UrlHelper class to fix the problems with the URL in the string we generate. Or is it a string? A long time ago I wrote a post about FormattableString and how I didn't see a real use scenario for it. Well, this is it! Let's see what UrlHelper looks like:

public static class UrlHelper
{
    public static string Url(FormattableString url)
    {
        return string.Format(
            url.Format, 
            url.GetArguments()
                .Select(a => Uri.EscapeDataString(a?.ToString()??""))
                .ToArray());
    }
}

The Url method receives not a string, but a FormattableString. When an interpolated string is used as a FormattableString, it gives the developer access to a Format string and a list of arguments in GetArguments. In order to convert it to a string, one must only do String.Format(url.Format, url.GetArguments()). In our case, the Format property will have the value "https://siderite.dev/blog/search/formattablestring?param1={0}" and the arguments list will contain one value: the value of the something parameter.

Therefore, the Url method will be able to format the string, but first URL encode every single parameter in it.

I admit, this might not be the most readable code in the world, which is my pet peeve with FormattableString. Reading the ShowUrl method you have no clue that UrlHelper.Url receives a FormattableString as parameter. One might even look at that string and say "Hey, wait a minute! That's a bug waiting to happen!" and then proceed to fix it. Perhaps the name of the method could be made more intuitive like EncodeUrlParameters.

But wait! We can do better. I don't know how it might help in the future, but perhaps someone might want to process the parameters of the URL further. By returning a string we eliminate some of the information of the incoming parameter and we don't want to do that.

Also, careful people will notice that using String.Format introduces a subtle bug (or feature): we format the string without specifying a culture. This might be fine, using the current one by default, but it might also cause some issues. The FormattableString class does provide the static CurrentCulture and Invariant methods and, of course, the ToString method would work just fine as well, but then we lose the ability to process the parameters.
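To make the culture issue concrete, here is a minimal sketch that formats the same FormattableString twice. The culture with a comma as decimal separator is built by hand only so the example does not depend on which cultures are installed; CultureDemo and Format are names made up for the illustration:

```csharp
using System;
using System.Globalization;

public static class CultureDemo
{
    public static (string Invariant, string Custom) Format(double value)
    {
        FormattableString text = $"value={value}";

        // A culture built only for this demonstration: invariant, except
        // that it uses a comma as the decimal separator.
        var commaCulture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
        commaCulture.NumberFormat.NumberDecimalSeparator = ",";

        return (FormattableString.Invariant(text), text.ToString(commaCulture));
    }
}
```

CultureDemo.Format(1.5) returns "value=1.5" for the invariant version and "value=1,5" for the custom culture. In a URL that difference can matter, which is why it is worth deciding on purpose which culture is used.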

So here is the final version of the UrlHelper class:

public static class UrlHelper
{
    public static FormattableString EncodeUrlParameters(FormattableString url)
    {
        return FormattableStringFactory.Create(
            url.Format,
            url.GetArguments()
                .Select(a => Uri.EscapeDataString(a?.ToString() ?? ""))
                .ToArray());
    }
}

By using FormattableStringFactory from System.Runtime.CompilerServices we get a FormattableString as an argument and we return one, with parameters URL encoded. Now, if the result of the method is used as a string, it will be automatically handled by ToString and if it will be used as a FormattableString it will contain all the information provided by the original parameters.
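Here is a short usage sketch, with the helper repeated so that it compiles on its own. UrlHelperDemo and Build are names I made up for the example:

```csharp
using System;
using System.Linq;
using System.Runtime.CompilerServices;

public static class UrlHelper
{
    // Same method as above, repeated so the sketch is self-contained.
    public static FormattableString EncodeUrlParameters(FormattableString url)
    {
        return FormattableStringFactory.Create(
            url.Format,
            url.GetArguments()
                .Select(a => Uri.EscapeDataString(a?.ToString() ?? ""))
                .ToArray());
    }
}

public static class UrlHelperDemo
{
    public static string Build(string? something)
    {
        // The interpolated string is passed as a FormattableString, not as a string.
        FormattableString encoded = UrlHelper.EncodeUrlParameters(
            $"https://siderite.dev/blog/search/formattablestring?param1={something}");

        // ToString produces the final URL from the encoded arguments.
        return encoded.ToString();
    }
}
```

UrlHelperDemo.Build("a b&c") returns the URL ending in param1=a%20b%26c, while Build(null) returns it ending in param1= instead of throwing.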

Hope that helps!

Bonus time! There's more!

What if you could make this behavior built in by introducing a new type? Methods from, let's say, a REST client class could use FormattableStrings as URL parameters. But what if we could specify the type of the parameters and get the implicit (hint, hint!) result already URL encoded?

public class UrlString
{
    private FormattableString _formattableString;

    public UrlString(FormattableString formattableString)
    {
        _formattableString = formattableString;
    }

    public static implicit operator FormattableString(UrlString urlString)
    {
        if (urlString == null) return null;
        return UrlHelper.EncodeUrlParameters(urlString.ToFormattableString());
    }

    public static implicit operator string(UrlString urlString)
    {
        if (urlString == null) return null;
        return ((FormattableString)urlString).ToString();
    }

    public static implicit operator UrlString(FormattableString formattableString)
    {
        return new UrlString(formattableString);
    }

    private FormattableString ToFormattableString()
    {
        return _formattableString;
    }
}

This UrlString class will implicitly be converted into a FormattableString or a string and a FormattableString will be converted into a UrlString. So now your code might look like this:

// used like this
RestClient.Get($"https://test.com?x={something}");

// this method does not URL encode anything
public RestResponse Get(string url) {
  ...
}

// this method does URL encode everything, just by changing the type
public RestResponse Get(UrlString url) {
  ...
}

Intro

Dependency injection is baked into ASP.Net Core projects (yes, I still call it Core), but it's missing from the console app templates. And while it is easy to add, it's not that clear how to do it. I present here three ways to do it:

  1. The fast one: use the Worker Service template and tweak it to act like a console application
  2. The simple one: use the Console Application template and add dependency injection to it
  3. The hybrid: use the Console Application template and use the same system as in the Worker Service template

Tweak the Worker Service template

It makes sense that if you want a console application you would select the Console Application template when creating a new project, but as mentioned above, it's just the default template, as old as console apps are. Yet there is another default template, called Worker Service, which almost does the same thing, only it has all the dependency injection goodness baked in, just like an ASP.Net Core Web App template.

So start your Visual Studio, add a new project and choose Worker Service:

It will create a project containing a Program.cs, a Worker.cs and an appsettings.json file. Program.cs holds the setup and Worker.cs holds the code to be executed.

Worker.cs has an ExecuteAsync method that logs some stuff every second, but even if we remove the while loop and add our own code, the application doesn't stop. This might be a good thing, as sometimes we just want stuff to work until we press Ctrl-C, but it's not a console app per se.

In order to transform it into something that works just like a console application you need to follow these steps:

  1. inject an IHost instance into your worker
  2. specifically instruct the host to stop whenever your code has finished

So, you go from:

public class Worker : BackgroundService
{
    private readonly ILogger<Worker> _logger;

    public Worker(ILogger<Worker> logger)
    {
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            _logger.LogInformation("Worker running at: {time}", DateTimeOffset.Now);
            await Task.Delay(1000, stoppingToken);
        }
    }
}

to:

public class Worker : BackgroundService
{
    private readonly ILogger<Worker> _logger;
    private readonly IHost _host;

    public Worker(ILogger<Worker> logger, IHost host)
    {
        _logger = logger;
        _host = host;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        Console.WriteLine("Hello world!");
        _host.StopAsync();
    }
}

Note that I did not "await" the StopAsync method because I don't actually need to. You are telling the host to stop and it will do it whenever it will see fit.

If we look into the Program.cs code we will see this:

public class Program
{
    public static void Main(string[] args)
    {
        CreateHostBuilder(args).Build().Run();
    }

    public static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureServices((hostContext, services) =>
            {
                services.AddHostedService<Worker>();
            });
}

I don't know why they bothered with creating a new method and then writing it as an expression body, but that's the template. You see that there is a lambda adding dependencies (by default just the Worker class), but everything starts with Host.CreateDefaultBuilder. In .NET source code, this leads to HostingHostBuilderExtensions.ConfigureDefaults, which adds a lot of stuff:

  • environment variables to config
  • command line arguments to config (via CommandLineConfigurationSource)
  • support for appsettings.json configuration
  • logging based on operating system

That is why, if you want these things by default, your best bet is to tweak the Worker Service template.

Add Dependency Injection to an existing console application

Some people want to have total control over what their code is doing. Or maybe you already have a console application doing stuff and you just want to add Dependency Injection. In that case, these are the steps you want to follow:

  1. create a ServiceCollection instance (needs a dependency on Microsoft.Extensions.DependencyInjection)
  2. register with it all the dependencies you want to use
  3. create a starting class that will execute stuff (just like Worker above)
  4. register starting class in the service collection
  5. build a service provider from the collection
  6. get an instance of the starting class and execute its one method

Here is an example:

class Program
{
    static void Main(string[] args)
    {
        var services = new ServiceCollection();
        ConfigureServices(services);
        services
            .AddSingleton<Executor,Executor>()
            .BuildServiceProvider()
            .GetService<Executor>()
            .Execute();
    }

    private static void ConfigureServices(IServiceCollection services)
    {
        services
            .AddSingleton<ITest, Test>();
    }
}

public class Executor
{
    private readonly ITest _test;

    public Executor(ITest test)
    {
        _test = test;
    }

    public void Execute()
    {
        _test.DoSomething();
    }
}

The only reason we register the Executor class is in order to get an instance of it later, complete with constructor injection support. You can even make Execute an async method, so you can get full async/await support. Of course, for this example appsettings.json configuration or logging won't work until you add them.

Mix them up

Of course, one could try to get the best of both worlds. This would work kind of like this:

  1. use Host.CreateDefaultBuilder() anyway (needs a dependency on Microsoft.Extensions.Hosting), but in a normal console app
  2. use the resulting service provider to instantiate a starting class
  3. start it

Here is an example:

class Program
{
    static void Main(string[] args)
    {
        Host.CreateDefaultBuilder()
            .ConfigureServices(ConfigureServices)
            .ConfigureServices(services => services.AddSingleton<Executor>())
            .Build()
            .Services
            .GetService<Executor>()
            .Execute();
    }

    private static void ConfigureServices(HostBuilderContext hostContext, IServiceCollection services)
    {
        services.AddSingleton<ITest,Test>();
    }
}

The Executor class would be just like in the section above, but now you get all the default logging and configuration options from the Worker Service section.

Conclusion

What the quickest and best solution is depends on your situation. One could just as well start with a Worker Service template, then tweak it to never Run the builder and instead configure it as above. One can create a Startup class, complete with Configure and ConfigureServices as in an ASP.NET Core Web App template, or even start with such a template, then tweak it to work as a console app/web hybrid. In .NET Core everything is a console app anyway, it just depends on which packages you load and how you configure them.


  We have been told again and again that our code should be split into smaller code blocks that are easy to understand and read. The usual example is a linear large method that can easily be split into smaller methods that just execute one after the other. But in real life just selecting a logical bit of our method and applying an Extract Method refactoring doesn't often work because code rarely has a linear flow.

  Let's take a classic example of "exiting early", another software pattern that I personally enjoy. It says that you extract the functionality of retrieving some data into a method and, inside it, return from the method as soon as you fail a necessary condition for extracting that data. Something like this:

public Order OrderCakeForClient(int clientId) {
  var client = _clientRepo.GetClient(clientId);
  if (client == null) {
    _logger.Log("Client doesn't exist");
    return null;
  }
  var preferences = _clientRepo.GetPreferences(client);
  if (preferences == null) {
    _logger.Log("Client has not submitted preferences");
    return null;
  }
  var existingOrder = _orderRepo.GetOrder(client, preferences);
  if (existingOrder != null) {
    _logger.Log("Client has already ordered their cake");
    return existingOrder;
  }
  var cake = _inventoryRepo.GetCakeByPreferences(preferences);
  if (cake == null) {
    cake = MakeNewCake(preferences);
  }
  if (cake == null) {
    _logger.Log("Cannot make cake as requested by client");
    return null;
  }
  var order = _orderRepo.GenerateOrder(client, cake);
  if (order == null) {
    _logger.Log("Generating order failed");
    throw new OrderFailedException();
  }
  return order;
}

This is a long method that appears to be linear, but in fact it is not. At every step there is the opportunity to exit the method, so this is like a very thin tree, not a line. And it is a contrived example. A real life flow would have multiple decision points, branching and then merging back, with asynchronous logic and so on and so on.

If you want to split it into smaller methods you will have some issues:

  • most of the code is not actual logic code, but logging code - one could make a method called getPreferences(client), but it would just proxy the _clientRepo method and gain nothing
  • one could try to extract a method that gets the preferences, logs and decides to exit - one cannot do that directly, because you cannot make an exit decision in a child method

Here is a way that this could work. I personally like it, but as you will see later on, it's not sufficient. Let's encode the exit decision as a boolean and get the information we want as out parameters. Then the code would look like this:

public Order OrderCakeForClient(int clientId) {
  // declared up front: it is only assigned if the flow reaches orderDoesntExist
  Order existingOrder = null;
  if (clientExists(clientId, out Client client)
    && clientSubmittedPreferences(client, out Preferences preferences)
    && orderDoesntExist(client, preferences, out existingOrder)
    && getACake(preferences, out Cake cake)) {
    return generateOrder(client,cake);
  }
  return existingOrder;
}

private bool clientExists(int clientId, out Client client) {
  client = _clientRepo.GetClient(clientId);
  if (client == null) {
    _logger.Log("Client doesn't exist");
    return false;
  }
  return true; 
}

private bool clientSubmittedPreferences(Client client, out Preferences preferences) {
  preferences = _clientRepo.GetPreferences(client);
  if (preferences == null) {
    _logger.Log("Client has not submitted preferences");
    return false;
  }
  return true;
}

private bool orderDoesntExist(Client client, Preferences preferences,
                               out Order existingOrder) {
  existingOrder = _orderRepo.GetOrder(client, preferences);
  if (existingOrder != null) {
    _logger.Log("Client has already ordered their cake");
    return false;
  }
  return true;
}

private bool getACake(Preferences preferences, out Cake cake) {
  cake = _inventoryRepo.GetCakeByPreferences(preferences);
  if (cake == null) {
    cake = MakeNewCake(preferences);
  }
  if (cake == null) {
    _logger.Log("Cannot make cake as requested by client");
    return false;
  }
  return true;
}

private Order generateOrder(Client client, Cake cake) {
  var order = _orderRepo.GenerateOrder(client, cake);
  if (order == null) {
    _logger.Log("Generating order failed");
    throw new OrderFailedException();
  }
  return order;
}

Now reading OrderCakeForClient is a joy! Other than the kind of weird assignment of existingOrder in a condition method, which is valid in C# 7, but not that easy to read, one can grasp the logic of the method from a single glance as a process that only runs if four conditions are met.

However, there are some problems here. One of them is that the out syntax only works for non-async methods and it is generally frowned upon these days. The method assumes a cake can just be made instantly, when we know it takes time, care, effort and kitchen cleaning. Also, modern code rarely features synchronous database access.

So how do we fix it? The classic solution for getting rid of the out syntax is to create custom return types that contain everything the method returns. Let's try to rewrite the method with async in mind:

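public async Task<Order> OrderCakeForClient(int clientId) {
  var clientExistsResult = await clientExists(clientId);
  if (!clientExistsResult.Success) return null;
  var preferencesResult = await clientSubmittedPreferences(clientExistsResult.Client);
  if (!preferencesResult.Success) return null;
  var orderDoesntExistResult = await orderDoesntExist(clientExistsResult.Client, preferencesResult.Preferences);
  if (!orderDoesntExistResult.Success) return orderDoesntExistResult.Order;
  var getACakeResult = await getACake(preferencesResult.Preferences);
  if (!getACakeResult.Success) return null;
  return await generateOrder(clientExistsResult.Client, getACakeResult.Cake);
}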

private async Task<ClientExistsResult> clientExists(int clientId) {
  var client = await _clientRepo.GetClient(clientId);
  if (client == null) {
    _logger.Log("Client doesn't exist");
    return new ClientExistsResult { Success = false };
  }
  return new ClientExistsResult { Client = client, Success = true };
}

// the rest omitted for brevity

Wait... this doesn't look good at all! We wrote a ton of extra code and we got... something that is more similar to the original code than the easy to read one. What happened? Several things:

  • because we couldn't return a bool, we had to revert to early exiting - not bad in itself, but it adds code that seems to duplicate the logic in the condition methods
  • we added a return type for each condition method - extra code that provides no extra functionality, not to mention ugly
  • already returning bool values was awkward, returning these result objects is a lot more so
  • the only reason why we needed to return a result value (bool or something more complex) was to move the logging behavior in the smaller methods

"Step aside!" the software architect will say and quickly draw on the whiteboard a workflow framework that will allow us to rewrite any flow as a combination of classes that can be configured in XML. He is dragged away from the room, spitting and shouting.

The problem in the code is that we want to do three things at the same time:

  1. write a simple, easy to read, flow-like code
  2. make decisions based on partial results of that code
  3. store the partial results of that code

There is a common software pattern that is normally used for flow-like code: the builder pattern. However, it feels unintuitive to use it here. First of all, the builder pattern needs some masterful code writing to make it work with async/await. Plus, most people, when thinking about a builder, think of a separate reusable class that handles logic that is independent from the place where it is used. But it doesn't need to be so. Let's rewrite this code using builder-like logic:

public async Task<Order> OrderCakeForClient(int clientId)
{
    SetClientId(clientId);
    await IfClientExists();
    await IfClientSubmittedPreferences();
    await IfOrderDoesntExist();
    await IfGotACake();
    return await GenerateOrder();
}

private bool _continueFlow = true;
private int _clientId;
private Client _client;
private Preferences _preferences;
private Order _existingOrder;
private Cake _cake;

public void SetClientId(int clientId)
{
    _clientId = clientId;
    _existingOrder = null;
    _continueFlow = true;
}

public async Task IfClientExists()
{
    if (!_continueFlow) return;
    _client = await _clientRepo.GetClient(_clientId);
    test(_client != null, "Client doesn't exist");
}

public async Task IfClientSubmittedPreferences()
{
    if (!_continueFlow) return;
    _preferences = await _clientRepo.GetPreferences(_client);
    test(_preferences != null, "Client has not submitted preferences");
}

public async Task IfOrderDoesntExist()
{
    if (!_continueFlow) return;
    _existingOrder = await _orderRepo.GetOrder(_client, _preferences);
    test(_existingOrder == null, "Client has already ordered their cake");
}

public async Task IfGotACake()
{
    if (!_continueFlow) return;
    _cake = await _inventoryRepo.GetCakeByPreferences(_preferences)
        ?? await MakeNewCake(_preferences);
    test(_cake != null, "Cannot make cake as requested by client");
}

public async Task<Order> GenerateOrder()
{
    if (!_continueFlow) return _existingOrder;
    var order = await _orderRepo.GenerateOrder(_client, _cake);
    if (order == null)
    {
        _logger.Log("Generating order failed");
        throw new OrderFailedException();
    }
    return order;
}

private void test(bool condition, string logText)
{
    if (!condition)
    {
        _continueFlow = false;
        _logger.Log(logText);
    }
}

No need for a new builder class, just add builder-like methods to the existing class. No need to return the instance of the class, since it's only used inside it. No need for chaining because we don't return anything and chaining with async is a nightmare.

The ugly secret is, of course, using fields of the class instead of local variables. That may add other problems, like multithread safety issues, but those are not in the scope of this post.

I've shown a pretty stupid piece of code that anyway was difficult to properly split into smaller methods. I've demonstrated several ways to do that, the choice of which depends on the actual requirements of your code. Hope that helps!

  Every month or so I see another article posted by some dev, usually with a catchy title using words like "demystifying" or "understanding" or "N array methods you should be using" or "simplify your Javascript" or something similar. It has become so mundane and boring that it makes me mad someone is still trying to cash in on these tired ideas to try to appear smart. So stop doing it! There is no need to explain methods that were introduced in 2009!

  But it gets worse. These articles are partially misleading because Javascript has evolved past the need to receive or return data as arrays. Let me demystify the hell out of you.

  First of all, the methods we are discussing here are .filter and .map. There is of course .reduce, but that one doesn't necessarily return an array. Ironically, one can write both .filter and .map as a reduce function, so fix that one and you can get far. There is also .sort, which for performance reasons works a bit differently: it sorts the array in place and returns that same array rather than a new one, so chaining it silently changes the source data. All of these methods from the Array object have something in common: they receive functions as parameters that are then applied to all of the items in the array. Read that again: all of the items.
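To make the remark about .reduce concrete, here is a minimal sketch of .filter and .map written on top of it. The helper names filterWithReduce and mapWithReduce are made up for this example; they are not part of any library.

```javascript
// Hypothetical helpers: .filter and .map expressed through .reduce.
const filterWithReduce = (arr, predicate) =>
  arr.reduce((acc, item, index) => {
    if (predicate(item, index)) acc.push(item); // keep only matching items
    return acc;
  }, []);

const mapWithReduce = (arr, transform) =>
  arr.reduce((acc, item, index) => {
    acc.push(transform(item, index)); // store the transformed item
    return acc;
  }, []);

const numbers = [1, 2, 3, 4, 5, 6];
const evens = filterWithReduce(numbers, n => n % 2 === 0); // like numbers.filter(...)
const squares = mapWithReduce(evens, n => n * n);          // like evens.map(...)
console.log(evens, squares);
```

Note that both helpers still visit every item and still build a new array, which is exactly the behavior discussed in the rest of the post.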

  Having functions as first class citizens of the language has always been the case for Javascript, so that's not a great new thing to teach developers. And now, with arrow functions, these methods are even easier to use because there are no scope issues that caused so many hidden errors in the past.

  Let's take a common use example for these methods for data display. You have many data records that need to be displayed. You have to first filter them using some search parameters, then you have to order them so you can take just a maximum of n records to display on a page. Because what you display is not necessarily what you have as a data source, you also apply a transformation function before returning something. The code would look like this:

var colors = [
  {    name: 'red',    R: 255,    G: 0,    B: 0  },
  {    name: 'blue',   R: 0,      G: 0,    B: 255  },
  {    name: 'green',  R: 0,      G: 255,  B: 0  },
  {    name: 'pink',   R: 255,    G: 128,  B: 128  }
];

// it would be more efficient to get the reddish colors in an array
// and sort only those, but we want to discuss chaining array methods
colors.sort((c1, c2) => c1.name > c2.name ? 1 : (c1.name < c2.name ? -1 : 0));

const result = colors
  .filter(c => c.R > c.G && c.R > c.B)
  .slice(page * pageSize, (page + 1) * pageSize)
  .map(c => ({
      name: c.name,
      color: `#${hex(c.R)}${hex(c.G)}${hex(c.B)}`
  }));

This code takes a bunch of colors that have RGB values and a name and returns a page (defined by page and pageSize) of the colors that are "reddish" (more red than blue and green), ordered by name. The resulting objects have a name and an HTML color string.

This works for an array of four elements, it works fine for arrays of thousands of elements, too, but let's look at what it is doing:

  • we pushed the sort up, thus sorting all colors in order to get the nice syntax at the end, rather than sorting just the reddish colors
  • we filtered all colors, even if we needed just pageSize elements
  • we created an array at every step (three times), even if we only needed one with a max size of pageSize

Let's write this in a classical way, with loops, to see how it works:

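const result = [];
let i = 0;
for (const c of colors) {
  // skip colors that are not reddish (same test as in the .filter version)
  if (c.R <= c.G || c.R <= c.B) continue;
  // skip the reddish colors that belong to previous pages
  if (i++ < page * pageSize) continue;
  result.push({
    name: c.name,
    color: `#${hex(c.R)}${hex(c.G)}${hex(c.B)}`
  });
  if (result.length >= pageSize) break;
}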

And it does this:

  • it iterates through the colors array, but it has an exit condition
  • it ignores non-reddish colors
  • it ignores the colors of previous pages, but without storing them anywhere
  • it stores the reddish colors in the result as their transformed version directly
  • it exits the loop if the result is the size of a page, thus only iterating until it has found (page+1)*pageSize reddish colors

No extra arrays, no extra iterations, only some ugly ass code. But what if we could write this as nicely as in the first example and make it work as efficiently as the second? Because of ECMAScript 6 we actually can!

Take a look at this:

const result = Enumerable.from(colors)
  .where(c => c.R > c.G && c.R > c.B)
  //.orderBy(c => c.name)
  .skip(page * pageSize)
  .take(pageSize)
  .select(c => ({
      name: c.name,
      color: `#${hex(c.R)}${hex(c.G)}${hex(c.B)}`
  }))
  .toArray();

What is this Enumerable thing? It's a class I made to encapsulate the methods .where, .skip, .take and .select and we will examine it later. Why these names? Because they mirror similar method names in LINQ (Language Integrated Query from .NET) and because I wanted to clearly separate them from the array methods.

How does it all work? If you look at the "classical" version of the code you see the new for..of loop introduced in ES6. It uses the concept of "iterable" to go through all of the elements it contains. An array is an iterable, but so is the object returned by a generator function, also an ES6 construct. A generator function is a function that generates values as it is iterated, the advantage being that it doesn't need to hold all of the items in memory (like an array) and any operation that needs doing on the values is done only on the ones requested by code.
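As a small illustration of that laziness, here is a sketch of a generator function that describes an infinite sequence, yet only computes the values that are actually requested. The names naturalNumbers and firstItems are invented for this example.

```javascript
// A generator function: values are produced one at a time, on request.
function* naturalNumbers() {
  let n = 1;
  while (true) {
    yield n++; // execution pauses here until the next value is requested
  }
}

// Hypothetical helper: collects the first `count` values of any iterable.
function firstItems(iterable, count) {
  const result = [];
  if (count <= 0) return result;
  for (const item of iterable) {
    result.push(item);
    if (result.length >= count) break; // leaving the loop also stops the generator
  }
  return result;
}

const firstFive = firstItems(naturalNumbers(), 5);
console.log(firstFive);
```

An array could never hold "all natural numbers", but the generator has no problem with it, because nothing is computed until for..of asks for the next item.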

Here is what the code above does:

  • it creates an Enumerable wrapper over the array (performs no operation, just assignments)
  • it filters by defining a generator function that only returns reddish colors (but performs no operation) and returns an Enumerable wrapper over the function
  • it ignores the items from previous pages by defining a generator function that counts items and only returns items after the specified number (again, no operation) and returns an Enumerable wrapper over the function
  • it then takes a page full of items, stopping immediately after, by defining a generator function that does that (no operation) and returns an Enumerable wrapper over the function
  • it transforms the colors in output items by defining a generator function that iterates existing items and returns the transformed values (no operation) and returns an Enumerable wrapper over the function
  • it iterates the generator function in the current Enumerable and fills an array with the values (all the operations are performed here)

And here is the flow for each item:

  1. .toArray enumerates the generator function of .select
  2. .select enumerates the generator function of .take
  3. .take enumerates the generator function of .skip
  4. .skip enumerates the generator function of .where
  5. .where enumerates the generator function that iterates over the colors array
  6. the first color is red, which is reddish, so .where "yields" it, it passes as the next item in the iteration
  7. the page is 0, let's say, so .skip has nothing to skip, it yields the color
  8. .take still has pageSize items to take, let's assume 20, so it yields the color
  9. .select yields the color transformed for output
  10. .toArray pushes the color in the result
  11. go to 1.
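The per-item flow above can be observed by counting how many source items the filter actually inspects. The sketch below uses standalone generator functions rather than the Enumerable class, and made-up data (numbers where every third one counts as "reddish"), so treat it as an illustration under those assumptions.

```javascript
// Standalone generator versions of the operators (a sketch, not the Enumerable class).
function* where(source, condition) {
  for (const item of source) if (condition(item)) yield item;
}
function* skip(source, nr) {
  let left = nr;
  for (const item of source) {
    if (left > 0) left--; else yield item;
  }
}
function* take(source, nr) {
  if (nr <= 0) return;
  let left = nr;
  for (const item of source) {
    yield item;
    if (--left <= 0) return; // stop pulling from the source once the page is full
  }
}
function* select(source, transform) {
  for (const item of source) yield transform(item);
}

// Made-up data: 1000 numbers, every third one is "reddish".
const items = Array.from({ length: 1000 }, (_, i) => i);
const isReddish = n => n % 3 === 0;

let checkedLazy = 0;
const lazyPage = Array.from(
  select(
    take(
      skip(where(items, n => { checkedLazy++; return isReddish(n); }), 2),
      3),
    n => `item ${n}`));

let checkedEager = 0;
const eagerPage = items
  .filter(n => { checkedEager++; return isReddish(n); })
  .slice(2, 5)
  .map(n => `item ${n}`);

console.log(lazyPage, checkedLazy, checkedEager);
```

Both versions produce the same page, but the generator pipeline stops inspecting the source as soon as the page is full, while the array version checks every single item.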

If for some reason you would only need the first item, not the entire page (imagine using a .first method instead of .toArray) only the steps from 1. to 10. would be executed. No extra arrays, no extra filtering, mapping or assigning.

Am I trying too hard to seem smart? Well, imagine that there are three million colors, a third of them are reddish. The first code would create an array of a million items, by iterating and checking all three million colors, then take a page slice from that (another array, however small), then create another array of mapped objects. This code? It is the equivalent of the classical one, but with extreme readability and ease of use.

OK, what is that .orderBy thing that I commented out? It's a possible method that orders items online, as they come, at the moment of execution (so when .toArray is executed). It is too complex for this blog post, but there is a full implementation of Enumerable that I wrote containing everything you will ever need. In that case .orderBy would only order the minimal number of items required to extract the page ((page+1) * pageSize). The implementation can use custom sorting algorithms that take into account .take and .skip operators, just like in LiNQer.
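To give an idea of what "ordering only the minimal number of items" can mean, here is a deliberately simple sketch. It is not the algorithm from the full implementation mentioned above, just a bounded insertion sort with an invented name (smallestSorted) that never keeps more than (page+1) * pageSize items in order.

```javascript
// Hypothetical helper: returns the `count` smallest items of `source`, in order,
// without ever sorting or storing the whole input.
function smallestSorted(source, count, compare) {
  const buffer = []; // always sorted, never longer than `count`
  if (count <= 0) return buffer;
  for (const item of source) {
    // ignore items that cannot be among the first `count`
    if (buffer.length === count && compare(item, buffer[count - 1]) >= 0) continue;
    // find the insertion point, scanning from the end so equal items keep input order
    let pos = buffer.length;
    while (pos > 0 && compare(item, buffer[pos - 1]) < 0) pos--;
    buffer.splice(pos, 0, item);
    if (buffer.length > count) buffer.pop(); // drop the item that fell off the page
  }
  return buffer;
}

const names = ['red', 'blue', 'green', 'pink', 'amber', 'teal'];
const page = 1, pageSize = 2;
const byName = (a, b) => (a > b ? 1 : a < b ? -1 : 0);

// only the first (page + 1) * pageSize items are ever kept in order
const pageOfNames = smallestSorted(names, (page + 1) * pageSize, byName)
  .slice(page * pageSize);
console.log(pageOfNames);
```

A real implementation would likely use a heap or a partial quicksort instead of splice, but the principle is the same: knowing about the later .skip and .take lets the ordering step do far less work.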

The purpose of this post was to raise awareness on how Javascript evolved and on how we can write code that is both readable AND efficient.

One actually doesn't need an Enumerable wrapper, and can add the methods to the prototype of all generator functions, as well (see LINQ-like functions in JavaScript with deferred execution). As you can see, this was written 5 years ago, and still people "teach" others that .filter and .map are the Javascript equivalents of .Where and .Select from .NET. NO, they are NOT!
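One way to read the "no wrapper" idea is to add the operators to the object that every generator object inherits from. This is only a sketch: the operator names are invented here, and extending built-in prototypes is generally discouraged outside of experiments.

```javascript
// The shared prototype of all generator objects.
const generatorPrototype = Object.getPrototypeOf(function* () {}).prototype;

// Hypothetical operators, added for illustration only.
generatorPrototype.where = function* (condition) {
  for (const item of this) if (condition(item)) yield item;
};
generatorPrototype.select = function* (transform) {
  for (const item of this) yield transform(item);
};

function* oneToTen() {
  for (let i = 1; i <= 10; i++) yield i;
}

// the operators return generator objects too, so they can be chained
const evenSquares = Array.from(
  oneToTen().where(n => n % 2 === 0).select(n => n * n)
);
console.log(evenSquares);
```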

The immense advantage for using a dedicated object is that you can store information for each operator and use it in other operators to optimize things even further (like for orderBy). All code is in one place, it can be unit tested and refined to perfection, while the code using it remains the same.

Here is the code for the simplified Enumerable object used for this post:

class Enumerable {
  constructor(generator) {
    this.generator = generator || function* () { };
  }

  static from(arr) {
    return new Enumerable(arr[Symbol.iterator].bind(arr));
  }

  where(condition) {
    const generator = this.generator;
    const gen = function* () {
      let index = 0;
      // ask for a fresh iterator on every enumeration, so the result can be enumerated again
      for (const item of generator()) {
        if (condition(item, index)) {
          yield item;
        }
        index++;
      }
    };
    return new Enumerable(gen);
  }

  take(nr) {
    const generator = this.generator;
    const gen = function* () {
      let nrLeft = nr;
      // ask for a fresh iterator on every enumeration, so the result can be enumerated again
      for (const item of generator()) {
        if (nrLeft > 0) {
          yield item;
          nrLeft--;
        }
        if (nrLeft <= 0) {
          break;
        }
      }
    };
    return new Enumerable(gen);
  }

  skip(nr) {
    const generator = this.generator;
    const gen = function* () {
      let nrLeft = nr;
      // ask for a fresh iterator on every enumeration, so the result can be enumerated again
      for (const item of generator()) {
        if (nrLeft > 0) {
          nrLeft--;
        } else {
          yield item;
        }
      }
    };
    return new Enumerable(gen);
  }

  select(transform) {
    const generator = this.generator;
    const gen = function* () {
      // ask for a fresh iterator on every enumeration, so the result can be enumerated again
      for (const item of generator()) {
        yield transform(item);
      }
    };
    return new Enumerable(gen);
  }

  toArray() {
    return Array.from(this.generator());
  }
}

The post is filled with links and for whatever you don't understand from the post, I urge you to search and learn.


A few years ago I wrote an article about using RealProxy to intercept method and property calls in order to log them. It was only for .NET Framework and suggested you inherit all intercepted classes from MarshalByRefObject. This one is a companion piece that shows how interception can be done in a more general way and without the need for MarshalByRefObject.

To do that I am going to give you two versions of the same class, one for .NET Framework and one for .NET Core which can be used like this:

//Initial code:
IInterface obj = new Implementation();

//Interceptor added:
IInterface obj = new Implementation();
var interceptor = new MyInterceptor<IInterface>();
obj = interceptor.Decorate(obj);

//Interceptor class (every method there is optional):
public class MyInterceptor<T> : ClassInterceptor<T>
{
    protected override void OnInvoked(MethodInfo methodInfo, object[] args, object result)
    {
        // do something when the method or property call ended successfully
    }

    protected override void OnInvoking(MethodInfo methodInfo, object[] args)
    {
        // do something before the method or property call is invoked
    }

    protected override void OnException(MethodInfo methodInfo, object[] args, Exception exception)
    {
        // do something when a method or property call throws an exception
    }
}

This code would be the same for .NET Framework or Core. The difference is in the ClassInterceptor code and the only restriction is that your class has to implement an interface for the methods and properties intercepted.

Here is the .NET Framework code:

public abstract class ClassInterceptor<TInterface> : RealProxy
{
    private object _decorated;

    public ClassInterceptor()
        : base(typeof(TInterface))
    {
    }

    public TInterface Decorate<TImplementation>(TImplementation decorated)
        where TImplementation:TInterface
    {
        _decorated = decorated;
        return (TInterface)GetTransparentProxy();
    }

    public override IMessage Invoke(IMessage msg)
    {
        var methodCall = msg as IMethodCallMessage;
        var methodInfo = methodCall.MethodBase as MethodInfo;
        OnInvoking(methodInfo,methodCall.Args);
        object result;
        try
        {
            result = methodInfo.Invoke(_decorated, methodCall.InArgs);
        } catch(Exception ex)
        {
            OnException(methodInfo, methodCall.Args, ex);
            throw;
        }
        OnInvoked(methodInfo, methodCall.Args, result);
        return new ReturnMessage(result, null, 0, methodCall.LogicalCallContext, methodCall);
    }

    protected virtual void OnException(MethodInfo methodInfo, object[] args, Exception exception) { }
    protected virtual void OnInvoked(MethodInfo methodInfo, object[] args, object result) { }
    protected virtual void OnInvoking(MethodInfo methodInfo, object[] args) { }
}

In it, we use the power of RealProxy to create a transparent proxy. For Core we use DispatchProxy, which is the .NET Core replacement from Microsoft. Here is the code:

public abstract class ClassInterceptor<TInterface> : DispatchProxy
{
    private object _decorated;

    public ClassInterceptor() : base()
    {
    }

    public TInterface Decorate<TImplementation>(TImplementation decorated)
        where TImplementation : TInterface
    {
        var proxy = typeof(DispatchProxy)
                    .GetMethod("Create")
                    .MakeGenericMethod(typeof(TInterface),GetType())
                    .Invoke(null,Array.Empty<object>())
            as ClassInterceptor<TInterface>;

        proxy._decorated = decorated;

        return (TInterface)(object)proxy;
    }

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        OnInvoking(targetMethod,args);
        try
        {
            var result = targetMethod.Invoke(_decorated, args);
            OnInvoked(targetMethod, args,result);
            return result;
        }
        catch (TargetInvocationException exc)
        {
            OnException(targetMethod, args, exc);
            throw exc.InnerException;
        }
    }


    protected virtual void OnException(MethodInfo methodInfo, object[] args, Exception exception) { }
    protected virtual void OnInvoked(MethodInfo methodInfo, object[] args, object result) { }
    protected virtual void OnInvoking(MethodInfo methodInfo, object[] args) { }
}

DispatchProxy is a weird little class. Look how it generates an object which can be cast simultaneously to T or Class<T>!

There are many other things one can do to improve this class:

  • the base class could make the distinction between a method call and a property call. In the latter case the MethodInfo object will have IsSpecialName true and start with set_ or get_
  • in async/await scenarios (and not only there), the result of a method would be a Task<T>, and if you want to log the result you should check for that, await the task, get the result, then log it. So this class could make this functionality available out of the box
  • support for Dependency Injection scenarios could also be added as the perfect place to use interception is when you register an interface-implementation pair. An extension method like container.RegisterSingletonWithLogging could be used instead of container.RegisterSingleton, by registering a factory which replaces the implementation with a logging proxy

I hope this helps!

P.S. Here is an article helping to migrate from RealProxy to DispatchProxy: Migrating RealProxy Usage to DispatchProxy

Definition

So, the task at hand is the subject of a common interview question: Implement an algorithm to get all valid (opened and closed) combinations of n pairs of parentheses. This means that for n=1 there is only one solution: "()". "((" or "))" are not valid, for 2 you will have "(())" and "()()" and so on. The question is trying to test how the interviewee handles recursion and what is commonly called backtracking. But as usual, there's more than one way to skin a cat, although for the life of me I can't see why you would want to do that.

The solutions here will be in C# and the expected result is an enumeration of strings containing open and closed parentheses. The code can be easily translated into other languages, including Javascript (ECMAScript 2015 introduced iterators and generator functions), but that's left to the reader. Let's begin.

Analysis

Before we solve any problem we need to analyse it and see what the constraints and the expected results are. In this case there are several observations that can be made:

  • the resulting strings will be of length n*2 (n pairs)
  • they will contain n '(' characters and n ')' characters
  • they cannot start with a ')' or end in a '('
  • in order to generate such a string, we can start with a smaller string to which we add '(' or ')'
  • we cannot add a ')' if there isn't at least one corresponding unclosed '(' 
  • if we add a '(' we need to have enough characters left to close the parenthesis, so the number of unclosed parentheses cannot exceed the characters left to fill
  • we could count the open and closed parentheses, but we only care about the number of unclosed ones, so instead of "closed" and "open" values, we can use only "open" to represent unclosed parentheses

Let's go for some variables and calculations:

  • n = number of pairs
  • open = number of unclosed parentheses in a string
  • open cannot be negative
  • one cannot add ')' if open = 0
  • one cannot add '(' if open >= n*2 - substring.length
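
Before generating anything, the rules above can be turned into a small validity check, which is handy for testing any of the generators that follow. This is only a sketch; ParenthesesRules and IsValid are names I made up for it.

```csharp
// Hypothetical helper, for illustration only.
public static class ParenthesesRules
{
    // True if s contains only '(' and ')' and every parenthesis is properly closed.
    public static bool IsValid(string s)
    {
        var open = 0; // number of unclosed parentheses so far
        foreach (var c in s)
        {
            if (c == '(')
            {
                open++;
            }
            else if (c == ')')
            {
                // cannot close when nothing is open
                if (open == 0) return false;
                open--;
            }
            else
            {
                return false;
            }
        }
        // everything that was opened must have been closed
        return open == 0;
    }
}
```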

Recursive solution

A simple implementation of these requirements can be done with recursion:

public IEnumerable<string> GenerateRecursive(int n, string root = "", int open = 0)
{
    // substring is long enough, return it and exit
    if (root.Length == n * 2)
    {
        yield return root;
        yield break;
    }
    // if we can add '(' to existing substring, continue the process with the result
    if (open < n * 2 - root.Length)
    {
        // if only C# could just 'yield IteratorFunction()' this would look sleeker
        foreach (var s in GenerateRecursive(n, root + "(", open + 1))
        {
            yield return s;
        }
    }
    // if we can add ')' to existing substring, continue the process with the result
    if (open > 0)
    {
        foreach (var s in GenerateRecursive(n, root + ")", open - 1))
        {
            yield return s;
        }
    }
}

However, every time you see recursion you have to ask yourself: could n be large enough to cause a stack overflow? For example this fails for n=3000. The nice thing about this method, though, is that it can be limited to the number of items you want to see. For example var firstTen = GenerateRecursive(1000).Take(10) is very fast, as the generation is depth first and only computes the first ten values and exits.

So, can we replace the recursion with iteration?

Iterative solution

In order to do things iteratively, we need to store the results of the previous step and use them in the current step. This means breadth first generation, which has its own problems. Let's see some code:

public IEnumerable<string> GenerateIteration(int n)
{
    // using named tuples to store the unclosed parentheses count with the substring
    var results = new List<(string Value,int Open)>() { ("",0) };
    for (int i = 0; i < n*2; i++)
    {
        // each step we compute the list of new strings from the list in the previous step
        var newResults = new List<(string Value, int Open)>();
        foreach (var (Value, Open) in results)
        {
            if (Open < n * 2 - Value.Length)
            {
                newResults.Add((Value + "(", Open + 1));
            }
            if (Open > 0)
            {
                newResults.Add((Value + ")", Open - 1));
            }
        }
        results = newResults;
    }
    return results.Select(r=>r.Value);
}

It's pretty sleek, but if you try something like var firstTen = GenerateIteration(1000).Take(10) now it will take forever, since all combinations of 1000 pairs of parentheses need to be computed and stored before taking the first 10! BTW, we can write this much nicer with LINQ, but be careful of the gotcha in the comment:

public IEnumerable<string> GenerateLinq(int n)
{
    // looks much nicer with LINQ
    IEnumerable<(string Value, int Open)> results = new[] { ("", 0) };
    for (var i = 0; i < n * 2; i++)
    {
        results =
            results
                .Where(r => r.Open < n * 2 - r.Value.Length)
                .Select(r => (Value: r.Value + "(", Open: r.Open + 1))
            .Concat(results
                .Where(r => r.Open > 0)
                .Select(r => (Value: r.Value + ")", Open: r.Open - 1))
            );  // but if you do not end this with a .ToList()
                // it will generate a huge expression that then will be evaluated at runtime! Oops!
    }
    return results.Select(r => r.Value);
}
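
To get a feeling for the gotcha in that comment, here is a small counting experiment. It is a sketch with made-up names (DeferredDemo, CountSourceReads), not the generator itself: each step uses the previous query twice, just like GenerateLinq does, and the counter shows how many times the original source ends up being read.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo, for illustration only.
public static class DeferredDemo
{
    // Builds 'steps' layers of queries, each using the previous one twice,
    // and returns how many times the single source element was read.
    public static int CountSourceReads(int steps, bool materialize)
    {
        var reads = 0;
        IEnumerable<int> results = Enumerable.Range(0, 1).Select(x => { reads++; return x; });
        for (var i = 0; i < steps; i++)
        {
            var next = results.Select(x => x).Concat(results.Select(x => x));
            // without ToList() nothing runs yet, the query just keeps growing
            results = materialize ? next.ToList() : next;
        }
        // enumerating the final query is what triggers all the deferred work
        foreach (var _ in results) { }
        return reads;
    }
}
```

With ten steps the deferred version reads the source 1024 times, while the materialized one reads it only twice, both during the first step.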

But can't we do better? One is going to stack overflow, the other memory overflow and the last one kind of does both.
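
One middle ground worth sketching is keeping the depth first order of the recursive version, but with an explicit stack instead of the call stack. This is my own variation (GenerateWithStack is a made-up name), assuming the same rules as above. Each level leaves at most one pending sibling on the stack, so it holds about 2n partial strings at most.

```csharp
using System.Collections.Generic;

// Hypothetical variation, for illustration only.
public static class ParenthesesStack
{
    public static IEnumerable<string> GenerateWithStack(int n)
    {
        // each entry is a partial string and its number of unclosed parentheses
        var stack = new Stack<(string Value, int Open)>();
        stack.Push(("", 0));
        while (stack.Count > 0)
        {
            var (value, open) = stack.Pop();
            if (value.Length == n * 2)
            {
                yield return value;
                continue;
            }
            // push the ')' branch first, so that the '(' branch is popped and explored first
            if (open > 0)
            {
                stack.Push((value + ")", open - 1));
            }
            if (open < n * 2 - value.Length)
            {
                stack.Push((value + "(", open + 1));
            }
        }
    }
}
```

Since it is still lazy and depth first, something like GenerateWithStack(1000).Take(10) only builds the strings it needs.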

Incremental solution

They did say this requires an incremental solution, right? So why don't we take this literally? '(' and ')' are like 0 and 1, as ')' must always follow a '('. If you view a parenthesis string as a binary number, then all possible combinations can be encoded as numbers. This means that we could conceivably write a very fast function that would compute all possible combinations using bit operations, maybe even special processor instructions that count bits and so on. However, this would work only for n<=32 or 64 depending on the processor architecture and we don't want to get into that. But we can still use the concept!
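
Just to make the idea concrete, here is a brute force sketch of the "string as a binary number" view for small values of n. The names are mine and the limit of 15 pairs is an assumption that keeps everything inside a 32 bit unsigned integer. It walks through every number and keeps only the ones that decode to a valid string, which is exactly the wasteful part we will want to avoid.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical brute force version, for illustration only.
public static class ParenthesesBits
{
    // '(' is 0 and ')' is 1, most significant bit first.
    public static IEnumerable<string> GenerateBruteForce(int n)
    {
        if (n < 0 || n > 15) throw new ArgumentOutOfRangeException(nameof(n));
        var length = n * 2;
        var chars = new char[length];
        for (uint value = 0; value < (1u << length); value++)
        {
            var open = 0;
            var valid = true;
            for (var i = 0; i < length && valid; i++)
            {
                var isClose = ((value >> (length - 1 - i)) & 1) == 1;
                chars[i] = isClose ? ')' : '(';
                open += isClose ? -1 : 1;
                // a ')' with nothing to close makes the whole number invalid
                if (open < 0) valid = false;
            }
            if (valid && open == 0) yield return new string(chars);
        }
    }
}
```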

If a string represents a fictional number, then you can start with the smallest one, increment it and check for validity. If you combine the incremental operation with the validity check you don't need to go through all 2^(2n) possible values to get the result. It doesn't use any memory except the current string and it is depth first generation. The best of both worlds! Let's see some code:

public IEnumerable<string> GenerateIncrement(int n)
{
    // the starting point is n open parentheses and n closing ones
    // we use the same array of characters to generate the strings we display
    var arr = (new string('(', n) + new string(')', n)).ToCharArray();
    // iteration will stop when incrementation reaches the "maximum" valid combination
    var success = true;
    while (success)
    {
        yield return new string(arr);
        success = Increment(arr, n);
    }
}

private bool Increment(char[] arr, int n)
{
    // we begin with a valid string, which means there are no unclosed parentheses
    var open = 0;
    // we start from the end of the string
    for (var i = arr.Length - 1; i >= 0; i--)
    {
        // ')' is equivalent to a 1. To "increment" this string we need to go to the previous position
        // incrementing 01 in binary results in 10
        if (arr[i] == ')')
        {
            open++;
            continue;
        }

        // '(' is equivalent to a 0. We will turn it into a ')' to increment it,
        // but only if there are unclosed parentheses to close
        open--;
        if (open == 0) continue;

        // we 'increment' the value
        arr[i] = ')';
        // now we need to reset the rest of the array
        var k = n - (open + i) / 2;
        // as many opening parenthesis as possible
        for (var j = i + 1; j < i + 1 + k; j++)
        {
            arr[j] = '(';
        }
        // the rest are closing parentheses
        for (var j = i + 1 + k; j < n * 2; j++)
        {
            arr[j] = ')';
        }
        return true;
    }
    // if we reached this point it means we got to a maximum
    return false;
}

Now doing GenerateIncrement(1000000).Take(10) took more time to display the results than to actually compute them.

More solutions

As this is a classic interview question, there are a billion solutions to it at LeetCode. Yet the purpose of interview questions is to find out how one thinks, not what the solution of the problem actually is. I hope this helps.


Intro

  When talking Dependency Injection, if a class implementing Interface1 needs an implementation of Interface2 in its constructor and the implementation for Interface2 needs an implementation of Interface1 you get a circular dependency error. This could be fixed, though, by providing lazy proxy implementations, which would also fix issues with resources getting allocated too early and other similar issues.  Now, theoretically this is horrible. Yet in practice one meets this situation a lot. This post will attempt to clarify why this happens and how practice may be different from theory.

Problem definition

  Let's start with defining what an interface is. Wikipedia says it's a shared boundary between components. In the context of dependency injection you often hear about the Single Responsibility Principle, which stipulates that a class (and by extension an interface) should only do one thing. Yet even in this case, the implementation for any of the Facade, Bridge, Decorator, Proxy and Adapter software patterns would do only one thing: to proxy, merge or split the functionality of other components, regardless of how many and how complex they are. Going to the other extreme, one could create an interface for every conceivable method, thus eliminating the need for circular dependencies and also loading code that is not yet needed. And then there are the humans writing the code. When you need a service to provide the physical location of the application you would call it ILocationService and when you want to compute the distance between two places you would use the same, because it's about locations, right? Having an ILocationProviderService and an ILocationDistanceCalculator feels like overkill. Imagine trying to determine if a functionality regarding locations is already implemented and going through all the ILocation... interfaces to find out, then having to create a new interface when you write the code for it and spending sleepless nights wondering if you named things right (and if you need to invalidate their cache).

  In other words, depending on context, an interface can be anything, as arbitrarily complex as the components it separates. They could contain methods that are required by other components together with methods that require other components. If you have more such interfaces, you might end up with a circular dependency in the instantiation phase even if the execution flow would not have this problem. Let's take a silly example.

  We have a LocationService and a TimeService. One handles points in space the other moments in time. And let's say we have the entire history of someone's movements. You could get a location based on the time provided (GetLocation) or get the time based on a provided location (GetTime). Now, the input from the user is text, so we need the LocationService and the TimeService to translate that text into actual points in space and moments in time, so GetLocation would use an ITimeService, while GetTime would use an ILocationService. You start the program and you get the circular dependency error. I told you it would be silly. Anyway, you can split any of the services into ITimeParser and ITimeManager or whatever, you can create a new interface called ITextParser, there are a myriad refactoring solutions. But what if you don't have the luxury to refactor and why do you even need to do anything? Surely if you call GetLocation you only need to parse the time, you never call GetTime, and the other way around.

Solution?

  A possible solution is to only actually get the dependency implementation when you use it. Instead of providing the actual implementation for the interface you need, you provide a lazy proxy. Here is an example of a generic (and lazy one liner) LazyProxy implementation:

public class LazyProxy<TInterface>:Lazy<TInterface>
{
    public LazyProxy(IServiceProvider serviceProvider) : base(() => serviceProvider.GetService<TInterface>()) { }
}

  Problem solved, right? LocationService would ask for a LazyProxy<ITimeService> implementation, GetLocation would do _lazyTimeService.Value.ParseTime(input) which would instantiate a TimeService for the first time, which would ask for a LazyProxy<ILocationService> and in GetTime it would use _lazyLocationService.Value.ParseLocation(input) which would get the existing instance of LocationService (if it's registered as Singleton). Imagine either of these services would have needed a lot of other dependencies.
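
  Here is a minimal sketch of that flow, under a few assumptions: I am using plain Lazy<T> and wiring things by hand instead of a real container, so that it runs without any package, and the method bodies are invented just to have something to call.

```csharp
using System;

public interface ITimeService
{
    string ParseTime(string input);
    string GetTime(string location);
}

public interface ILocationService
{
    string ParseLocation(string input);
    string GetLocation(string time);
}

public class TimeService : ITimeService
{
    private readonly Lazy<ILocationService> _lazyLocationService;

    public TimeService(Lazy<ILocationService> lazyLocationService)
    {
        _lazyLocationService = lazyLocationService;
    }

    public string ParseTime(string input) => input.Trim().ToLowerInvariant();

    // the location service is only resolved here, when it is actually needed
    public string GetTime(string location) =>
        "time at " + _lazyLocationService.Value.ParseLocation(location);
}

public class LocationService : ILocationService
{
    private readonly Lazy<ITimeService> _lazyTimeService;

    public LocationService(Lazy<ITimeService> lazyTimeService)
    {
        _lazyTimeService = lazyTimeService;
    }

    public string ParseLocation(string input) => input.Trim().ToLowerInvariant();

    public string GetLocation(string time) =>
        "location at " + _lazyTimeService.Value.ParseTime(time);
}

// Hand wiring that stands in for the dependency injection container (an assumption of this sketch).
public static class LazyWiring
{
    public static (ITimeService Time, ILocationService Location) Build()
    {
        ITimeService time = null;
        ILocationService location = null;
        // the lambdas run only on first use, after both variables have been assigned
        time = new TimeService(new Lazy<ILocationService>(() => location));
        location = new LocationService(new Lazy<ITimeService>(() => time));
        return (time, location);
    }
}
```

  Both constructors complete without needing the other service, which is all the trick amounts to.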

  Now, that's what is called a "leaky abstraction". You are hiding the complexity of instantiating and caching a service (and all of its dependencies) until you actually use it. Then you might get an error, when the actual shit hits the actual fan. I do believe that the term "leaky" might have originated from the aforementioned idiom. Yuck, right? It's when the abstraction leaked the complexity that lies beneath.

  There are a number of reasons why you shouldn't do it. Let's get through them.

Criticism

  The most obvious one is that you could do better. The design in the simple and at the same time contrived example above is flawed because each of the services is doing two very separate things: providing a value based on a parameter and interpreting text input. If parsing is a necessary functionality of your application, then why not design an ITextParser interface that both services would use? And if your case is that sometimes you instantiate a class to use one set of functions and sometimes to use another set of functions, maybe you should split that up into two. However, in real life situations you might not have full control over the code, you might not have the resources to refactor the code. Hell, you might be neck deep in spaghetti code! Have you ever worked in one of those house of cards companies where you are not allowed to touch any piece of code for fear it would all crash?

  The next issue is that you would push the detection for a possible bug to a particular point of the execution of your code. You would generate a Heisenbug, a bug that gets reproduced inconsistently. How appropriate this would have been if an IMomentumService were used as well. Developers love Heisenbugs, as the time for their resolution can vary wildly and they would be forced to actually use what they code. Oh, the humanity! Yet, the only problem you would detect early is the cycle in the dependency graph, which is more of a design issue anyway. A bug in the implementation would still be detected when you try to use it. 

  One other issue that this pattern would solve should not be there in the first place: heavy resource use in constructors. Constructors should only construct, obviously, leaving other mechanisms to handle the use of external resources. But here is the snag: if you buy into this requirement for constructors you already use leaky abstractions. And again, you might not be able to change the constructors.

  Consider, though, the way this pattern works. It is based on the fact that no matter when you request the instantiation of a class, you would have a ready implementation of IServiceProvider. The fact that the service locator mechanism exists is already lazy instantiation on the GetService method. In fact, this lazy injection pattern is itself a constructor dependency injection abstraction of the service provider pattern. You could just as well do var timeService = _serviceProvider.GetService<ITimeService>() inside your GetLocation method and it would do the exact same thing. So this is another reason why you should not do it: mixing the metaphors. But hey! If you have read this far, you know that I love mixing those suckers up!

Conclusion

  In conclusion, I cannot recommend this solution if you have others like refactoring available. But in a pinch it might work. Let me know what you think!

  BTW, this issue has been also discussed on Stack Overflow, where there are some interesting answers. 

Update: due to popular demand, I've added Tyrion as a Github repo.

Update 2: due to more popular demand, I even made the repo public. :) Sorry about that.

Intro

  Discord is something I had only vaguely heard about and when a friend told me he used it to chat with friends, I installed it, too. I was pleasantly surprised to see it is a very usable and free chat application, which combines features of IRC, other messenger applications and a bit of Slack. You can create servers and add channels to them, for example, where you can determine the rights of people and so on. What sets Discord apart from anything, perhaps other than Slack, is the level of "integration", the ability to programmatically interact with it. So I created a "bot", a program which stays active, responds to user chat messages and can take action. This post is about how to do that.

  Before you implement a bot you obviously need:

  All of this has been done to death and you can follow the links above to learn how to do it. Before we continue, a little something that might not be obvious: you can edit a Discord chat invite so that it never expires, as is the one on this blog now.

Writing code

One can write a bot in a multitude of programming languages, but I am a .NET dev, so Discord.NET it is. Note that this is an "unofficial" library, so it may not be (and it is not) completely in sync with all the options that the Discord API provides. One such feature, for example, is multiple attachments to a message. But I digress.

Since my blog is also written in ASP.NET Core, it made sense to add the bot code to that. Also, in order to make it all clean code, I will use dependency injection as much as possible and use the built-in system for commands, even if it is quite rudimentary.

Step 1 - making dependencies available

We are going to need these dependencies:

  • DiscordSocketClient - the client to connect to Discord
  • CommandService - the service managing commands
  • BotSettings - a class used to hold settings and configuration
  • BotService - the bot itself, which we are going to make implement IHostedService so we can add it as a hosted service in ASP.Net

In order to keep things separated, I will not add all of this in Startup, instead encapsulating them into a Bootstrap class:

public static class Bootstrap
{
    public static IWebHostBuilder UseDiscordBot(this IWebHostBuilder builder)
    {
        return builder.ConfigureServices(services =>
        {
            services
                .AddSingleton<BotSettings>()
                .AddSingleton<DiscordSocketClient>()
                .AddSingleton<CommandService>()
                .AddHostedService<BotService>();
        });
    }
}

This allows me to add the bot simply in CreateWebHostBuilder as: 

WebHost.CreateDefaultBuilder(args)
   .UseStartup<Startup>()
   .UseKestrel(a => a.AddServerHeader = false)
   .UseDiscordBot();

Step 2 - the settings

The BotSettings class will be used not only to hold information, but also to communicate it between classes. Each Discord chat bot needs an access token to connect and we can add that as a configuration value in appsettings.json:

{
  ...
  "DiscordBot": {
    "Token":"<the token value>"
  },
  ...
}

public class BotSettings
{
    public BotSettings(IConfiguration config, IHostingEnvironment hostingEnvironment)
    {
        Token = config.GetValue<string>("DiscordBot:Token");
        RootPath = hostingEnvironment.WebRootPath;
        BotEnabled = true;
    }

    public string Token { get; }
    public string RootPath { get; }
    public bool BotEnabled { get; set; }
}

As you can see, no fancy class for getting the config, nor do we use IOptions or anything like that. We only need to get the token value once, let's keep it simple. I've added the RootPath because you might want to use it to access files on the local file system. The other property is a setting for enabling or disabling the functionality of the bot.

Step 3 - the bot skeleton

Here is the skeleton for a bot. It doesn't change much outside the MessageReceived and CommandExecuted code.

public class BotService : IHostedService, IDisposable
{
    private readonly DiscordSocketClient _client;
    private readonly CommandService _commandService;
    private readonly IServiceProvider _services;
    private readonly BotSettings _settings;

    public BotService(DiscordSocketClient client,
        CommandService commandService,
        IServiceProvider services,
        BotSettings settings)
    {
        _client = client;
        _commandService = commandService;
        _services = services;
        _settings = settings;
    }

    // The hosted service has started
    public async Task StartAsync(CancellationToken cancellationToken)
    {
        _client.Ready += Ready;
        _client.MessageReceived += MessageReceived;
        _commandService.CommandExecuted += CommandExecuted;
        _client.Log += Log;
        _commandService.Log += Log;
        // look for classes implementing ModuleBase to load commands from
        await _commandService.AddModulesAsync(Assembly.GetEntryAssembly(), _services);
        // log in to Discord, using the provided token
        await _client.LoginAsync(TokenType.Bot, _settings.Token);
        // start bot
        await _client.StartAsync();
    }

    // logging
    private async Task Log(LogMessage arg)
    {
        // do some logging
    }

    // bot has connected and it's ready to work
    private async Task Ready()
    {
        // some random stuff you can do once the bot is online: 

        // set status to online
        await _client.SetStatusAsync(UserStatus.Online);
        // Discord started as a game chat service, so it has the option to show what games you are playing
        // Here the bot will display "Playing dead" while listening
        await _client.SetGameAsync("dead", "https://siderite.dev", ActivityType.Listening);
    }
    private async Task MessageReceived(SocketMessage msg)
    {
        // message received
    }
    private async Task CommandExecuted(Optional<CommandInfo> command, ICommandContext context, IResult result)
    {
        // a command execution was attempted
    }

    // the hosted service is stopping
    public async Task StopAsync(CancellationToken cancellationToken)
    {
        await _client.SetGameAsync(null);
        await _client.SetStatusAsync(UserStatus.Offline);
        await _client.StopAsync();
        _client.Log -= Log;
        _client.Ready -= Ready;
        _client.MessageReceived -= MessageReceived;
        _commandService.Log -= Log;
        _commandService.CommandExecuted -= CommandExecuted;
    }


    public void Dispose()
    {
        _client?.Dispose();
    }
}

Step 4 - adding commands

In order to add commands to the bot, you must do the following:

  • create a class to inherit from ModuleBase
  • add public methods that are decorated with the CommandAttribute
  • don't forget to call commandService.AddModulesAsync like above

Here is an example of an enable/disable command class:

public class BotCommands:ModuleBase
{
    private readonly BotSettings _settings;

    public BotCommands(BotSettings settings)
    {
        _settings = settings;
    }

    [Command("bot")]
    public async Task Bot([Remainder]string rest)
    {
        if (string.Equals(rest, "enable",StringComparison.OrdinalIgnoreCase))
        {
            _settings.BotEnabled = true;
        }
        if (string.Equals(rest, "disable", StringComparison.OrdinalIgnoreCase))
        {
            _settings.BotEnabled = false;
        }
        await this.Context.Channel.SendMessageAsync("Bot is "
            + (_settings.BotEnabled ? "enabled" : "disabled"));
    }
}

When the bot command is issued, the state of the bot is sent as a message to the chat. If the parameter of the command is enable or disable, the state is changed accordingly before being reported.

Yet, in order for this command to work, we need to add code to the bot MessageReceived method: 

private async Task MessageReceived(SocketMessage msg)
{
    // do not process bot messages or system messages
    if (msg.Author.IsBot || msg.Source != MessageSource.User) return;
    // only process this type of message
    var message = msg as SocketUserMessage;
    if (message == null) return;
    // match the message if it starts with R2D2
    var match = Regex.Match(message.Content, @"^\s*R2D2\s+", RegexOptions.IgnoreCase);
    int? pos = null;
    if (match.Success)
    {
        // this is an R2D2 command, everything after the match is the command text
        pos = match.Length;
    }
    else if (message.Channel is IPrivateChannel)
    {
        // this is a command sent directly to the private channel of the bot, 
        // don't expect to start with R2D2 at all, just execute it
        pos = 0;
    }
    if (pos.HasValue)
    {
        // this is a command, execute it
        var context = new SocketCommandContext(_client, message);
        await _commandService.ExecuteAsync(context, message.Content.Substring(pos.Value), _services);
    }
    else
    {
        // processing of messages that are not commands
        if (_settings.BotEnabled)
        {
            // if the bot is enabled and people are talking about it, show an image and say "beep beep"
            if (message.Content.Contains("R2D2",StringComparison.OrdinalIgnoreCase))
            {
                await message.Channel.SendFileAsync(_settings.RootPath + "/img/R2D2.gif", "Beep beep!", true);
            }
        }
    }
}

This code forwards commands to the command service if the message starts with R2D2 or if it was sent in the bot's private channel. Otherwise, if the bot is enabled, it replies to messages that contain R2D2 with the R2D2 picture and the text "Beep beep!".
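
If you want to unit test the prefix logic without connecting to Discord, it can be pulled out into a pure function. This is just a sketch; CommandPrefix and ExtractCommand are names I invented, and the isPrivateChannel flag stands in for the message.Channel is IPrivateChannel check.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class CommandPrefix
{
    // Returns the command text, or null when the message is not a command.
    public static string ExtractCommand(string content, bool isPrivateChannel)
    {
        // the regex is anchored at the start, so the match length is also
        // the position where the command text begins
        var match = Regex.Match(content, @"^\s*R2D2\s+", RegexOptions.IgnoreCase);
        if (match.Success)
        {
            return content.Substring(match.Length);
        }
        // in a private channel the whole message is treated as the command
        return isPrivateChannel ? content : null;
    }
}
```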

Step 5 - handling command results

Command execution may end in one of three states:

  • command is not recognized
  • command has failed
  • command has succeeded

Here is a CommandExecuted event handler that takes these into account:

private async Task CommandExecuted(Optional<CommandInfo> command, ICommandContext context, IResult result)
{
    // if a command isn't found
    if (!command.IsSpecified)
    {
        await context.Message.AddReactionAsync(new Emoji("🤨")); // eyebrow raised emoji
        return;
    }

    // log failure to the console 
    if (!result.IsSuccess)
    {
        await Log(new LogMessage(LogSeverity.Error, nameof(CommandExecuted), $"Error: {result.ErrorReason}"));
        return;
    }
    // react to message
    await context.Message.AddReactionAsync(new Emoji("🤖")); // robot emoji
}

Note that the command info object does not expose a result value, other than success and failure.

Conclusion

This post has shown you how to create a Discord chat bot in .NET and add it to an ASP.Net Core web site as a hosted service. You may see the result by joining this blog's chat and giving commands to Tyr, the chat's bot:

  • play
  • fart
  • use metric or imperial units in messages
  • use Yahoo Messenger emoticons in messages
  • whatever else I will add in it when I get in the mood :)

  When I was looking at Javascript frameworks like Angular and ReactJS I kept running into these weird reducers that were used in state management mostly. It all felt so unnecessarily complicated, so I didn't look too closely into it. Today, reading some random post on dev.to, I found this simple and concise piece of code that explains it:

// simple to unit test this reducer
function maximum(max, num) { return Math.max(max, num); }

// read as: 'reduce to a maximum' 
let numbers = [5, 10, 7, -1, 2, -8, -12];
let max = numbers.reduce(maximum);

Kudos to David for the code sample.

The reducer, in this case, is a function that can be fed to the reduce function, which is familiar to developers in Javascript and a few other languages, but foreign to .NET developers. In LINQ, we have Aggregate!

// simple to unit test this Aggregator ( :) )
Func<int, int, int> maximum = (max, num) => Math.Max(max, num);

// read as: 'reduce to a maximum' 
var numbers = new[] { 5, 10, 7, -1, 2, -8, -12 };
var max = numbers.Aggregate(maximum);

Of course, in C# Math.Max is already a reducer/Aggregator and can be used directly as a parameter to Aggregate.

I found a lot of situations where people used .reduce instead of a normal loop, which is why I almost never use Aggregate, but there are situations where this kind of syntax is very useful. One would be in functional programming or LINQ expressions that then get translated or optimized to something else before execution, like SQL code. (I don't know if Entity Framework translates Aggregate, though). Another would be where you have a bunch of reducers that can be used interchangeably.
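
As a sketch of that last scenario, here are a few reducers stored by name and picked at runtime. The Reducers class and its members are made up for this example; Math.Max and Math.Min are used directly as reducers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example, for illustration only.
public static class Reducers
{
    // interchangeable reducers, all with the same (accumulator, item) => accumulator shape
    public static readonly IReadOnlyDictionary<string, Func<int, int, int>> ByName =
        new Dictionary<string, Func<int, int, int>>
        {
            ["max"] = Math.Max,
            ["min"] = Math.Min,
            ["sum"] = (total, num) => total + num
        };

    // Note: Aggregate without a seed throws on an empty sequence.
    public static int Reduce(string name, IEnumerable<int> numbers) =>
        numbers.Aggregate(ByName[name]);
}
```

For the numbers in the sample above, Reduce("max", numbers) gives 10, Reduce("min", numbers) gives -12 and Reduce("sum", numbers) gives 3.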


Intro

  If you are like me, you want to first establish a nice skeleton app that has everything just right before you start writing your actual code. However, as weird as it may sound, I couldn't find a way to use command line parameters with dependency injection, in the same simple way that one would use a configuration file with IOptions<T> for example. This post shows you how to use CommandLineParser, a nice library that handles everything regarding command line parsing, but in a dependency injection friendly way.

  In order to use command line arguments, we need to obtain them. For any .NET Core application or .NET Framework console application you get them from the parameters of the static Main method from Program. Alternately, you can use Environment.CommandLine, which is actually a string, not an array of strings, or Environment.GetCommandLineArgs(). But all of these nudge you towards some ugly code that either has a dependency on the static Environment, has code early in the application to handle command line arguments, or stores the arguments somehow. What we want is complete separation of modules in our application.
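
  One detail to keep in mind about these sources: unlike the args parameter of Main, the array returned by Environment.GetCommandLineArgs() starts with the program name. A small sketch of a helper that brings the two in line (ArgsHelper is a name I made up) could be:

```csharp
using System;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class ArgsHelper
{
    // Element 0 of Environment.GetCommandLineArgs() is the program name,
    // so we skip it to get the equivalent of Main's args.
    public static string[] GetArgsWithoutProgramName() =>
        Environment.GetCommandLineArgs().Skip(1).ToArray();
}
```

  If the raw array were passed to a parser as it is, the program name would likely be read as the first unnamed value, so it is worth checking which form your parsing code expects.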

Defining the command line parameters

  In order to use CommandLineParser, you write a class that contains the properties you expect from the command line, decorated with attributes that tell the parser what syntax is expected for each of them. In this post I will use this:

// the way we want to use the app is
// FileUtil <command> [--loglevel loglevel] [--quiet] --output <outputFile> file1 file2 .. file10
public class FileUtilOptions
{
    // use Value for parameters with no name
    [Value(0, Required = true, HelpText = "You have to enter a command")]
    public string Command { get; set; }

    // use Option for named parameters
    [Option('l',"loglevel",Required = false, HelpText ="Log level can be None, Normal, Verbose")]
    public string LogLevel { get; set; }

    // use bool for named parameters with no value
    [Option('q', "quiet", Default = false, Required = false, HelpText = "Quiet mode produces no console output")]
    public bool Quiet { get; set; }

    // Required for required values
    [Option('o', "output", Required = true, HelpText = "Output file is required")]
    public string OutputFile { get; set; }

    // use Min/Max for enumerables
    [Value(1, Min = 1, Max = 10, HelpText = "At least one file name and at most 10")]
    public IEnumerable<string> Files { get; set; }
}

  At this point the blog post will split into two parts. One is very short and easy to use, thanks to commenter Murali Karunakaran. The other one is what I wrote in 2020 when I didn't know better. This second part is just a reminder of how much people can write when they don't have to :)

The short and easy solution

All you have to do is add your command line parameters class as options, then define what will happen when you request one instance of it:

// in ConfigureServices or wherever you define dependencies for injection
services
  .AddOptions<FileUtilOptions>()
  .Configure(opt => 
    Parser.Default.ParseArguments(() => opt, Environment.GetCommandLineArgs())
  );

// when needing the parameters
public SomeConstructor(IOptions<FileUtilOptions> options)
{
    _options = options.Value;
}

When an instance of FileUtilOptions is requested, the lambda will be executed, setting the options based on ParseArguments. If there is any issue, the parser will display the help to the console.

This process, however, does not throw any exceptions. The instance of FileUtilOptions requested will be provided empty or partially/incorrectly filled. In order to handle the errors, some more complex code is needed, and here is a silly example:

using (var writer = new StringWriter())
{
	var parser = new Parser(configuration =>
	{
		configuration.AutoHelp = true;
		configuration.AutoVersion = false;
		configuration.CaseSensitive = false;
		configuration.IgnoreUnknownArguments = true;
		configuration.HelpWriter = writer;
	});
	var result = parser.ParseArguments<T>(_args);
	result.WithNotParsed(errors => HandleErrors(errors, writer));
	result.WithParsed(value => _value = value);
}

// a possible way to handle errors
private static void HandleErrors(IEnumerable<Error> errors, TextWriter writer)
{
	if (errors.Any(e => e.Tag != ErrorType.HelpRequestedError && e.Tag != ErrorType.VersionRequestedError))
	{
		string message = writer.ToString();
		throw new CommandLineParseException(message, errors, typeof(T));
	}
}
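
CommandLineParseException is not, as far as I know, a type that comes with the CommandLineParser library, so you have to bring your own. A minimal sketch, assuming all you need is the help text as the message plus the raw errors and the options type (the property names are my own choice):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using CommandLine;

// hypothetical exception type, matching the constructor call used in HandleErrors
public class CommandLineParseException : Exception
{
    public CommandLineParseException(string message, IEnumerable<Error> errors, Type optionsType)
        : base(message)
    {
        // copy the errors so the exception does not depend on a lazily evaluated sequence
        Errors = errors.ToList();
        OptionsType = optionsType;
    }

    // the errors reported by the parser
    public IReadOnlyList<Error> Errors { get; }

    // the options class that could not be filled from the command line
    public Type OptionsType { get; }
}
```

Whoever catches it can then print ex.Message to show the usage help, or inspect ex.Errors to decide what to do.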

Now, the original post follows:

Writing a lot more than necessary

  How can we get the arguments by injection? By creating a new type that encapsulates the simple string array.

// encapsulates the arguments
public class CommandLineArguments
{
    public CommandLineArguments(string[] args)
    {
        this.Args = args;
    }

    public string[] Args { get; }
}

// adds the type to dependency injection
services.AddSingleton<CommandLineArguments>(new CommandLineArguments(args));
// the generic type declaration is superfluous, but the code is easy to read

  With this, we can access the command line arguments anywhere by injecting a CommandLineArguments object and accessing the Args property. But this still implies writing command line parsing code wherever we need that data. We could add some parsing logic in the CommandLineArguments class so that instead of the command line arguments array it would provide us with a strongly typed value of the type we want. But then we would put business logic in a command line encapsulation class. Why would it know what type of options we need and why would we need only one type of options?

  What we would like is something like

public SomeClass(IOptions<MyCommandLineOptions> clOptions) {...}

  Now, we could use this system by writing more complicated code that adds a ConfigurationSource and then declaring that certain types are command line options. But I don't want that either for several reasons:

  • writing configuration providers is complex code, and at some point one has to ask how much code one is willing to write just to get some damn arguments from the command line
  • declaring the types at the beginning does provide some measure of centralized validation, but on the other hand it's declaring types that we need in business logic somewhere in service configuration, which personally I do not like

  What I propose is adding a new type of IOptions, one that is specific to command line arguments:

// declare the interface for generic command line options
public interface ICommandLineOptions<T> : IOptions<T>
    where T : class, new() { }

// add it to service configuration
services.AddSingleton(typeof(ICommandLineOptions<>), typeof(CommandLineOptions<>));

// put the parsing logic inside the implementation of the interface
public class CommandLineOptions<T> : ICommandLineOptions<T>
    where T : class, new()
{
    private T _value;
    private string[] _args;

    // get the arguments via injection
    public CommandLineOptions(CommandLineArguments arguments)
    {
        _args = arguments.Args;
    }

    public T Value
    {
        get
        {
            if (_value==null)
            {
                // set the value by parsing command line arguments
            }
            return _value;
        }
    }

}

  Now, in order to make it work, we will use CommandLineParser, which works in a very simple way:

  • declare a Parser
  • create a POCO class that has properties decorated with attributes that define what kind of command line parameter they are
  • parse the command line arguments string array into the type of class declared above
  • get the value or handle errors
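
  Outside of any dependency injection, the four steps above can be sketched in a few lines. This is only an illustration: DemoOptions and Demo are made-up names, and it uses the library's default parser and its MapResult helper rather than the custom setup used in the rest of the post.

```csharp
using System;
using CommandLine;

// the POCO class, with properties decorated with attributes (hypothetical example type)
public class DemoOptions
{
    [Option('o', "output", Required = true, HelpText = "Output file is required")]
    public string OutputFile { get; set; }
}

public static class Demo
{
    public static int Main(string[] args)
    {
        // parse the string array into DemoOptions using the default parser
        var result = Parser.Default.ParseArguments<DemoOptions>(args);

        // get the value or handle errors; the returned int becomes the exit code
        return result.MapResult(
            options =>
            {
                Console.WriteLine("Output file: " + options.OutputFile);
                return 0;
            },
            errors => 1);
    }
}
```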

  Also, to follow the now familiar Microsoft pattern, we will write an extension method to register both arguments and the mechanism for ICommandLineOptions. The end result is:

// extension class to add the system to services
public static class CommandLineExtensions
{
    public static IServiceCollection AddCommandLineOptions(this IServiceCollection services, string[] args)
    {
        return services
            .AddSingleton<CommandLineArguments>(new CommandLineArguments(args))
            .AddSingleton(typeof(ICommandLineOptions<>), typeof(CommandLineOptions<>));
    }
}

public class CommandLineArguments // defined above

public interface ICommandLineOptions<T> // defined above

// full class implementation for command line options
public class CommandLineOptions<T> : ICommandLineOptions<T>
    where T : class, new()
{
    private T _value;
    private string[] _args;

    public CommandLineOptions(CommandLineArguments arguments)
    {
        _args = arguments.Args;
    }

    public T Value
    {
        get
        {
            if (_value==null)
            {
                using (var writer = new StringWriter())
                {
                    var parser = new Parser(configuration =>
                    {
                        configuration.AutoHelp = true;
                        configuration.AutoVersion = false;
                        configuration.CaseSensitive = false;
                        configuration.IgnoreUnknownArguments = true;
                        configuration.HelpWriter = writer;
                    });
                    var result = parser.ParseArguments<T>(_args);
                    result.WithNotParsed(errors => HandleErrors(errors, writer));
                    result.WithParsed(value => _value = value);
                }
            }
            return _value;
        }
    }

    private static void HandleErrors(IEnumerable<Error> errors, TextWriter writer)
    {
        if (errors.Any(e => e.Tag != ErrorType.HelpRequestedError && e.Tag != ErrorType.VersionRequestedError))
        {
            string message = writer.ToString();
            throw new CommandLineParseException(message, errors, typeof(T));
        }
    }
}

// usage when configuring dependency injection
services.AddCommandLineOptions(args);

Enjoy!

Final notes

Now there are some quirks in the implementation above. One of them is that the parser class generates the usage help by writing it to a TextWriter (default being Console.Error), but since we want this to be encapsulated, we declare our own StringWriter and then keep the generated help text if there are any errors. In the case above, I am storing the help text as the exception message, but it's the principle that matters.

Also, with this system one can ask for multiple types of command line options classes, depending on the module, without the need to declare said types at the configuration of dependency injection. The downside is that if you want validation of the command line options at the very beginning, you have to write extra code. As implemented above, the application will only fail when it first asks for a command line options class that cannot be mapped from the command line arguments.
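
That extra code can be quite small, though: reading Value once for each options type, right after the service provider is built, forces the parsing to happen at startup. A sketch, assuming the ICommandLineOptions<T> and CommandLineParseException types used in this post; the ValidateCommandLineOptions name is hypothetical:

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

public static class CommandLineValidationExtensions
{
    // hypothetical helper: reading Value triggers the lazy parsing, so a
    // CommandLineParseException surfaces here instead of deep in business logic
    public static IServiceProvider ValidateCommandLineOptions<T>(this IServiceProvider provider)
        where T : class, new()
    {
        _ = provider.GetRequiredService<ICommandLineOptions<T>>().Value;
        return provider;
    }
}

// usage, right after building the provider:
// provider.ValidateCommandLineOptions<FileUtilOptions>();
```

Of course, this brings back the need to know the options types at configuration time, so it only makes sense for the ones you really want checked early.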

Note that the short style of a parameter needs to be used with a dash, the long one with two dashes:

  • -o outputFile.txt - correct (value outputFile.txt)
  • --output outputFile.txt - correct (value outputFile.txt)
  • -output outputFile.txt - incorrect (value output and outputFile.txt is considered an unnamed argument)