I just read a very cool article (Understanding Default Parameters in Javascript) and my takeaway is this smart piece of code to enforce that a parameter is specified:
const isRequired = () => { throw new Error('param is required'); };

function filterEvil(array, evil = isRequired()) {
  return array.filter(item => item !== evil);
}

So all you have to do is define the isRequired function in a shared library file and then use it in any function that you write.
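Here is what using it looks like (a quick sketch of my own, not from the article):
filterEvil([1, 2, 3]); // throws Error: param is required
filterEvil([1, 2, 3], 2); // [1, 3] - the default is never even evaluated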

Are you a bit put off by the fact that you can use function calls as default parameter values? Welcome to Javascript, a language that seems designed by Eurythmics.

I am sure I had tested this, but for some reason the icons on my blog disappeared in Internet Explorer. They are using Font Awesome SVG background images, declared something like this:
.fas-comment {
background-image: url("data:image/svg+xml;utf8,<svg height='511.6' version='1.1' viewBox='0 0 511.6 511.6' width='511.6' x='0' xml:space='preserve' xmlns='http://www.w3.org/2000/svg' y='0'><g fill='#2f5faa'><path d='M477.4 127.4c-22.8-28.1-53.9-50.2-93.1-66.5 -39.2-16.3-82-24.4-128.5-24.4 -34.6 0-67.8 4.8-99.4 14.4 -31.6 9.6-58.8 22.6-81.7 39 -22.8 16.4-41 35.8-54.5 58.4C6.8 170.8 0 194.5 0 219.2c0 28.5 8.6 55.3 25.8 80.2 17.2 24.9 40.8 45.9 70.7 62.8 -2.1 7.6-4.6 14.8-7.4 21.7 -2.9 6.9-5.4 12.5-7.7 16.9 -2.3 4.4-5.4 9.2-9.3 14.6 -3.9 5.3-6.8 9.1-8.8 11.3 -2 2.2-5.3 5.8-9.9 10.8 -4.6 5-7.5 8.3-8.8 9.9 -0.2 0.1-1 1-2.3 2.6 -1.3 1.6-2 2.4-2 2.4l-1.7 2.6c-1 1.4-1.4 2.3-1.3 2.7 0.1 0.4-0.1 1.3-0.6 2.9 -0.5 1.5-0.4 2.7 0.1 3.4v0.3c0.8 3.4 2.4 6.2 5 8.3 2.6 2.1 5.5 3 8.7 2.6 12.4-1.5 23.2-3.6 32.5-6.3 49.9-12.8 93.6-35.8 131.3-69.1 14.3 1.5 28.1 2.3 41.4 2.3 46.4 0 89.3-8.1 128.5-24.4 39.2-16.3 70.2-38.4 93.1-66.5 22.8-28.1 34.3-58.7 34.3-91.8C511.6 186.1 500.2 155.5 477.4 127.4z'/></g></svg>");
}

I had to try several things, but in the end, I found out that there are three steps in order to make this compatible with Internet Explorer (and still work in other browsers):
  1. The definition of the utf8 charset must be explicit: data:image/svg+xml;charset=utf8 instead of data:image/svg+xml;utf8
  2. The SVG code needs to be URL encoded: turn all double quotes into single quotes, then replace < and > with %3C and %3E, or use some URL encoder (see the small sketch after the result below)
  3. The colors need to be in rgb() format: so instead of fill='#2f5faa' use fill='rgb(47,95,170)' (same in style tags in the SVG, if any)


So now the result is:
.fas-comment {
background-image: url("data:image/svg+xml;charset=utf8,%3Csvg height='511.6' version='1.1' viewBox='0 0 511.6 511.6' width='511.6' x='0' xml:space='preserve' xmlns='http://www.w3.org/2000/svg' y='0'%3E%3Cg fill='rgb(47,95,170)'%3E%3Cpath d='M477.4 127.4c-22.8-28.1-53.9-50.2-93.1-66.5 -39.2-16.3-82-24.4-128.5-24.4 -34.6 0-67.8 4.8-99.4 14.4 -31.6 9.6-58.8 22.6-81.7 39 -22.8 16.4-41 35.8-54.5 58.4C6.8 170.8 0 194.5 0 219.2c0 28.5 8.6 55.3 25.8 80.2 17.2 24.9 40.8 45.9 70.7 62.8 -2.1 7.6-4.6 14.8-7.4 21.7 -2.9 6.9-5.4 12.5-7.7 16.9 -2.3 4.4-5.4 9.2-9.3 14.6 -3.9 5.3-6.8 9.1-8.8 11.3 -2 2.2-5.3 5.8-9.9 10.8 -4.6 5-7.5 8.3-8.8 9.9 -0.2 0.1-1 1-2.3 2.6 -1.3 1.6-2 2.4-2 2.4l-1.7 2.6c-1 1.4-1.4 2.3-1.3 2.7 0.1 0.4-0.1 1.3-0.6 2.9 -0.5 1.5-0.4 2.7 0.1 3.4v0.3c0.8 3.4 2.4 6.2 5 8.3 2.6 2.1 5.5 3 8.7 2.6 12.4-1.5 23.2-3.6 32.5-6.3 49.9-12.8 93.6-35.8 131.3-69.1 14.3 1.5 28.1 2.3 41.4 2.3 46.4 0 89.3-8.1 128.5-24.4 39.2-16.3 70.2-38.4 93.1-66.5 22.8-28.1 34.3-58.7 34.3-91.8C511.6 186.1 500.2 155.5 477.4 127.4z'/%3E%3C/g%3E%3C/svg%3E");
}
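
Here is a small Javascript sketch of steps 1 to 3 (a hypothetical helper of my own, not something the blog actually uses):
function svgToCssUrl(svg) {
  var encoded = svg
    .replace(/"/g, "'") // step 2: turn double quotes into single quotes
    .replace(/</g, '%3C') // step 2: URL encode the angle brackets
    .replace(/>/g, '%3E')
    .replace(/#([0-9a-fA-F]{6})/g, function(match, hex) { // step 3: #rrggbb to rgb()
      var n = parseInt(hex, 16);
      return 'rgb(' + (n >> 16) + ',' + ((n >> 8) & 255) + ',' + (n & 255) + ')';
    });
  return 'url("data:image/svg+xml;charset=utf8,' + encoded + '")'; // step 1: explicit charset
}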

I've learned something new today. It all starts with an innocuous question: given the following struct, tell me what its size is:
public struct MyStruct
{
    public int i1;
    public char c1;
    public long l1;
    public char c2;
    public short s1;
    public char c3;
}
Let's assume that this is in 32-bit C++ or C#.

The first answer is 4+1+8+1+2+1 = 17. Nope! It's 24.

Well, it is called memory alignment and it has to do with the way CPUs work. They have memory registers of fixed size, various caches with different sizes and speeds, etc. Basically, when you ask for a 4-byte int, it needs to be "aligned" so that you get 4 bytes from the correct position into a single register. Otherwise the CPU needs to take two registers (let's say 1 byte in one and 3 bytes in another), then mask and shift both and combine them into another register. That is unbelievably expensive at that level.

So, why 24? i1 is an int, it needs to be aligned on positions that are multiple of 4 bytes. 0 qualifies, so it takes 4 bytes. Then there is a char. Chars are one byte, can be put anywhere, so the size becomes 5 bytes. However, a long is 8 bytes, so it needs to be on a position that is a multiple of 8. That is why we add 3 bytes as padding, then we add the long in. Now the size is 16. One more char → 17. Shorts are 2 bytes, so we add one more padding byte to get to 18, then the short is added. The size is 20. And in the end you get the last char in, getting to 21. But now, the struct needs to be aligned with itself, meaning with the largest primitive used inside it, in our case the long with 8 bytes. That is why we add 3 more bytes so that the struct has a size that is a multiple of 8.
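
Here is the same layout, visualized (my own sketch, following the post's assumption of 1-byte chars):
// Sequential layout of MyStruct:
// offset  0- 3: i1 (int, aligned to 4)
// offset  4   : c1 (chars fit anywhere)
// offset  5- 7: padding (the long must start at a multiple of 8)
// offset  8-15: l1
// offset 16   : c2
// offset 17   : padding (the short must start at a multiple of 2)
// offset 18-19: s1
// offset 20   : c3
// offset 21-23: padding (the total size must be a multiple of 8, the largest member)
// total: 24 bytes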

Note that a struct containing a struct will align it to its largest primitive element, not the actual size of the child struct. It's a recursive process.

Can we do something about it? What if I want to trade speed for memory or disk space? We can use attributes such as StructLayout. It receives a LayoutKind - which defaults to Sequential, but can also be Auto or Explicit - and a numeric Pack parameter. Auto rearranges the order of the members of the class so that it takes the least amount of space. However, this has some side effects, like getting errors when you want to use Marshal.SizeOf. With Explicit, each field needs to be adorned with a FieldOffset attribute to determine the exact position in memory; that also means you can place several fields at the same position, like in:
[StructLayout(LayoutKind.Explicit)]
public struct MyStruct
{
    [FieldOffset(0)]
    public int i1;
    [FieldOffset(4)]
    public int i2;
    [FieldOffset(0)]
    public long l1;
}
The Pack parameter tells the system how to align the fields. 0 is the default, but 1 will make the size of the first struct above actually be 17.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct MyStruct
{
    public int i1;
    public char c1;
    public long l1;
    public char c2;
    public short s1;
    public char c3;
}
Other values can be 2, 4, 8, 16, 32, 64 or 128. You can test how performance is affected by this, as an exercise.

More information here: Advanced c# programming 6: Everything about memory allocation in .NET

Update: I've created a piece of code to actually test for this:
unsafe static void Main(string[] args)
{
    var st = new MyStruct();
    Console.WriteLine($"sizeof:{sizeof(MyStruct)} Marshal.sizeof:{Marshal.SizeOf(st)} custom sizeof:{MySizeof(st)}");
    Console.ReadKey();
}

private static long MySizeof(MyStruct st)
{
    // measure the GC heap before and after allocating a large array of structs,
    // then divide by the number of elements to approximate the per-element size
    long before = GC.GetTotalMemory(true);
    MyStruct[] array = new MyStruct[100000];
    long after = GC.GetTotalMemory(true);
    var size = (after - before) / array.Length;
    return size;
}

Considering the original MyStruct, the size reported by all three ways of computing it is 24. I also had to test the idea that the maximum alignment padding is 4 bytes, so I used this structure:
public struct MyStruct
{
    public long l;
    public byte b;
}
Since long is 8 bytes and byte is 1, I expected the size to be 16, not 12, and it was. However, I decided to also try with a decimal instead of the long. Decimal values have 16 bytes, so if my interpretation was correct, the 17 bytes should be aligned to the size of the biggest primitive field in the struct: a multiple of 16, so 32. The result was weirdly inconsistent: sizeof:20 Marshal.sizeof:24 custom sizeof:20, which suggests an alignment to 4 or 8 bytes, not 16. So I started playing with the StructLayoutAttribute:
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct MyStruct
{
    public decimal d;
    public byte b;
}

For Pack = 1, I got the consistent 17 bytes. For Pack = 4, I got consistent values of 20. For Pack = 8 or higher, I got the weird 20-24-20 result, which suggests packing works differently for decimals than for other values. I replaced the decimal with a struct containing two long values and the consistent result was back to 24, but then again, that's expected. The funny thing is that Guid is also a 16-byte value, although it is itself a struct, and the resulting size was 20. Guid is not a primitive type, though.

The only conclusion I can draw is that what I wrote in this post is true. Also, StructLayout's Pack does not work as I had expected; rather than setting the alignment outright, it caps it: the alignment of the type is the size of its largest element (1, 2, 4, 8, etc. bytes) or the specified packing size, whichever is smaller.
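
Applying that rule to the long plus byte struct from before, the expected sizes would be (my own back-of-the-envelope sketch, not measured):
// struct { long l; byte b; } under various Pack values:
// Pack = 1 -> alignment = min(1, 8) = 1, size 9
// Pack = 2 -> alignment = min(2, 8) = 2, size 10
// Pack = 4 -> alignment = min(4, 8) = 4, size 12
// Pack = 8 or more -> alignment = min(8, 8) = 8, size 16 (the default result from before)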

All this if you are not using decimals... then all bets are off! From my discussions with Filip B. Vondrášek in the comments of this post, I've reached the conclusion that decimals are internally structs that are aligned to their largest element, an int, so to 4 bytes. However, it seems Marshal.SizeOf misreports the size of structs containing decimals, for some reason.

In fact, all "simple" types are structs internally, as described by the C# language specification, but the Decimal struct also implements IDeserializationEventListener, but I don't see how this would influence things. Certainly the compilers have optimizations for working with primitive types. This is as deep as I want to go with this, anyway.

For anyone coming from the welcoming arms of Visual Studio 2015 and higher, Eclipse feels like an abomination. However, knowing some nice tips and tricks helps a lot. I want to give a shout out to this article: Again! – 10 Tips on Java Debugging with Eclipse, which is much more detailed than what I am going to write here and from which I got inspired.

There are three things I thought were most important, though, and these are what I am going to highlight:
  1. Show Logical Structure - who would have thought that a little setting on top of the Expressions view would be that important? Remember when you cursed how Maps are shown in the Eclipse debugger? With Show Logical Structure you can actually see items, keys and values!
  2. The Display View - just go to Window → Show View → Display and you get something that functions a bit like the Immediate Window in Visual Studio. In other words, just write your code there and execute it in the program's context. For a very useful example: write new java.util.Scanner(request.getEntity().getContent()).useDelimiter("\\A").next() in the Display window, select it, then click on Display Result of Evaluated Selected Text, and it will add to the Display window the string content of an HttpPost request.
  3. Watchpoints - you can set breakpoints that go into debug mode when a variable is accessed or changed!

For details and extra info, read the codecentric article I mentioned above.


As with all the programmer questions, I will update the post with the answer after people comment on this. Today's question is:
You have a list of regular expressions for strings to be matched against. You need to turn them into a single regular expression. How can you do it so that a string needs to match any of the initial regular expressions? How can you do it to match them all at the same time?


After my disappointment with Firefox for Android's lack of a proper bookmarks API implementation, I was at least happy that my Bookmark Explorer extension works well with Firefox for desktop. That quickly turned cold when I got a one-star review because the extension did not work. And the user was right, it didn't! One of the variables declared in one of the Javascript files was not found. But that only happened in the published version, not the unpacked one on my computer.

Basically the scenario was this:
  1. Load unpacked (from my computer) extension
  2. Test it
  3. Works great
  4. Make it a Zip file and publish it
  5. Shame! Shame! Shame!

Long story short, I was loading the Javascript file like this: <script src="ApiWrapper.js"></script> when the name of the file was apiWrapper.js (note the lowercase a). My computer's file system is Windows, which couldn't care less about filename casing, while the virtual Zip file system probably isn't like that, at least in Firefox's implementation. Note that this error only affected Firefox and not Chrome or (as far as I know - because it has been 2 months since I've submitted the extension and I got no reply other than "awaiting moderation") Opera.

I've found a small gem in Javascript ES6 that I wanted to share:
let arr = [3, 5, 2, 2, 5, 5];
let unique = [...new Set(arr)]; // [3, 5, 2]

Today I've discovered, to my dismay, that two Integer objects with the same value compared with the == operator may return false, because they are different objects. So you need to use .equals (after checking for null, of course). I was about to write a scathing blog entry on how much Java sucks, but then I discovered this amazing link: Java gotchas: Immutable Objects / Wrapper Class Caching, which explains that the Integer class keeps a cache of 256 values so that everything between -128 and 127 is actually equal as an instance as well.

Yes, folks, you've heard that right. I didn't believe it, either, so I wrote a little demo code:
Integer i1 = Integer.valueOf(1);
Integer i2 = Integer.valueOf(1);
boolean b1 = i1 == i2; // true: both come from the cache

i1 = Integer.valueOf(1000);
i2 = Integer.valueOf(1000);
boolean b2 = i1 == i2; // false: outside the cached range

i1 = 1;
i2 = 1;
boolean b3 = i1 == i2; // true: autoboxing goes through valueOf

i1 = 1000;
i2 = 1000;
boolean b4 = i1 == i2; // false

i1 = 126;
i2 = 126;
boolean b5 = i1 == i2; // true

i1++;
i2++;
boolean b6 = i1 == i2; // true: 127 is still cached

i1++;
i2++;
boolean b7 = i1 == i2; // false: 128 falls outside the cache

i1 = 2000;
i2 = i1;
boolean b8 = i1 == i2; // true: same instance

i1++;
i1--;
boolean b9 = i1 == i2; // false: i1 was unboxed and reboxed into new objects


Update: the same thing also applies to Strings. Two String objects with the same value are not necessarily ==, even though they are immutable; and since any modification creates a new object, even "the same" string won't be == to itself after changes. Fun!
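
Here is a quick demo of the String case (my own sketch, relying on compile-time literals being interned):
String s1 = "test";
String s2 = "test";
boolean b1 = s1 == s2; // true: both refer to the same interned literal
s2 = s2 + ""; // run-time concatenation creates a brand new String object
boolean b2 = s1 == s2; // false: same value, different instance
boolean b3 = s1.equals(s2); // true: compares values, not references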

I now submit to you that "sucks" applies to many things, but not to Java. A new term needs to be defined for it, so that it captures the horror above in a single word.

Tonight I went to an ADCES presentation about SQL table partitioning, a concept that allows for a lot of flexibility while preserving the same basic interface for a table one would use for a simpler and less scalable application. The talk was very professionally held by Bogdan Sahlean and you should have been there to see it :)

He talked about how one can create filegroups on which a table can be split into as many partitions as needed. He then demonstrated the concept of partition switching, which means swapping two tables without overhead, just via metadata, and, used in the context of partitions, the possibility to create a staging table, do stuff on it, then just swap it with a partition with no downtime. The SQL scripts used in the demo can be found on Sahlean's blog. This technology has existed since SQL Server 2005, so it's not something terribly new, and features with similar but more limited functionality have existed since SQL Server 2000. Basically the data in a table can be organized in separate buckets and one can even put each partition on a different drive for extra speed.

Things I've found interesting, in no particular order:

  • Best practice: create custom filegroups for databases and put objects in them, rather than in the primary (default) filegroup. Reason: each filegroup is restored separately, with the primary being the first and the one the restore waits for before reporting the database as online. That means one can quickly restore the important data and see the db online, while the less accessed or less important data, like archive info, is loaded afterwards.
  • Using constraints with CHECK on tables is useful in so many ways. For example, ever since SQL Server 2000, one could create tables on different databases, even different servers, and if they are marked with non-overlapping checks, one can not only create a view that combines all the data with UNION ALL, but also insert into the view - the server will know which tables, databases and servers to connect to (see the sketch after this list). This was also used in the partitioning presentation.
  • CREATE INDEX with the DROP_EXISTING option to quickly recreate or alter clustered indexes. With DROP_EXISTING, you save one complete cycle of dropping and recreating nonclustered indexes. Also, if specifying a different filegroup, you are effectively moving the data of a table from one filegroup to another.
  • Finally, SWITCH TO partition switching can be used to quickly swap two tables, since from SQL Server 2005 on, all tables are considered partitioned, with regular ones just having one partition. So one creates a table identical in structure to another, does whatever with it, then just uses something like this: ALTER TABLE Orders SWITCH PARTITION 1 TO OrdersHistory PARTITION 1; to swap them out, with minimal overhead.
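
Here is a minimal sketch of the partitioned view trick from the CHECK constraints bullet (all names invented for illustration):
-- two tables with non-overlapping CHECK constraints on the partitioning column
CREATE TABLE Orders2016 (OrderId INT NOT NULL, OrderYear INT NOT NULL CHECK (OrderYear = 2016),
    PRIMARY KEY (OrderId, OrderYear));
CREATE TABLE Orders2017 (OrderId INT NOT NULL, OrderYear INT NOT NULL CHECK (OrderYear = 2017),
    PRIMARY KEY (OrderId, OrderYear));
GO
CREATE VIEW AllOrders AS
    SELECT OrderId, OrderYear FROM Orders2016
    UNION ALL
    SELECT OrderId, OrderYear FROM Orders2017;
GO
-- because the checks do not overlap, the view is updatable and the engine
-- routes this insert to Orders2017 alone
INSERT INTO AllOrders (OrderId, OrderYear) VALUES (1, 2017);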

Just a heads up on a technology that I had no idea existed. To get the details, read this 2009 (!!! :( ) article.

Basically you create a MemoryMappedFile instance from a path or a file reader, then create one or more MemoryMappedViewAccessors, then read or write binary data. The data can even be structures, via the generic Read/Write<T> methods.

Drawbacks: the size of the file has to be fixed; it cannot be increased or decreased. Also, the path of the file needs to be on a local drive; it can't be a network path.
Advantages: fast access, built-in persistence and arguably the most efficient method to share data between processes.
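
Here is a minimal sketch of the API (the file name and the struct are my own inventions for illustration):
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

struct Point { public int X; public int Y; }

class Program
{
    static void Main()
    {
        // create a file-backed mapping with a fixed capacity of 1024 bytes
        using (var mmf = MemoryMappedFile.CreateFromFile("points.bin", FileMode.Create, null, 1024))
        using (var accessor = mmf.CreateViewAccessor())
        {
            var p = new Point { X = 10, Y = 20 };
            accessor.Write(0, ref p); // write the structure at offset 0
            accessor.Read(0, out Point read); // read it back
            Console.WriteLine($"{read.X},{read.Y}"); // 10,20
        }
    }
}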

Fuck Java! Just fuck it! I have been trying for half an hour to understand why a NullPointerException is thrown in Java code that I can't debug. It was a simple String object that was null inside a switch statement. According to this link: "The prohibition against using null as a switch label prevents one from writing code that can never be executed. If the switch expression is of a reference type, that is, String or a boxed primitive type or an enum type, then a run-time error will occur if the expression evaluates to null at run time."
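
In code, that means something like this (a minimal sketch of my own):
String value = null;
switch (value) { // throws NullPointerException before any label is even considered
    case "a":
        System.out.println("a");
        break;
    default:
        System.out.println("default"); // not even the default label saves you
}
// the null check has to happen before the switch:
if (value == null) {
    System.out.println("null");
}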

A reader asked me how to work with multiple projects in Visual Studio Code and after fumbling a little I realized I had no idea. So I started trying out things.

First I created a folder in which I created two other folders ingeniously named Proj1 and Proj2. I went in both and ran dotnet new console and dotnet new classlib respectively. I moved the Console.WriteLine("Hello World!"); code from the console Program.cs into a static method in the Class1 class, then called the method from Program.cs Main, then tried to find ways of referencing Proj2 from Proj1 in Visual Studio Code.

And here I got stuck. I tried the smart solutions VS Code recommended, but none of them included adding a reference. I right-clicked on everything, to no avail. I wrote using Proj2; by hand, hoping that Code would magically understand I needed a project reference. I googled, only to find old articles that discussed project.json, not the .csproj type of .NET projects.

In the end I resigned myself to writing the reference by hand. I opened Proj1.csproj and added
<ItemGroup>
  <ProjectReference Include="..\Proj2\Proj2.csproj" />
</ItemGroup>
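
As an aside, newer versions of the .NET CLI can make this edit for you, assuming yours supports the add reference command:
dotnet add Proj1/Proj1.csproj reference Proj2/Proj2.csproj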

After saving the file and going to the unresolved Class1 reference, I now got using Proj2; as an option to fix it. And now I got to the problem my reader was having. When trying to run Proj1, I got Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly 'Proj2, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'. The system cannot find the file specified. at Proj1.Program.Main(String[] args).

It's disgustingly easy to solve, you just need to know what to do. Either Ctrl-Shift-P and type restore, then select restoring Proj1, or do it manually by going to the Proj1 folder and running dotnet restore by hand. After that the project compiles and runs.

Summary:
  1. add project reference by hand to .csproj file
  2. resolve whatever compilation errors you have by specifying the correct usings or inlining namespaces
  3. dotnet restore the project you added references to

As a .NET developer I am very familiar with LInQ, or Language Integrated Query, which is a collection of fluent interface methods that deal with querying data from collections. However, many people outside the .NET ecosystem are not familiar with the concept, or they use it as disparate functions in their language of choice. What makes it even more confusing is that the same concept is implemented in other languages under different names. Let me give you an example:
var arr = new[] { 1, 2, 3, 4, 5, 6 };
var result = arr
    .Where(v => v % 2 == 0) // get only even values
    .Select(v => v * 10) // return their values multiplied by 10
    .Aggregate(15, (s, v) => v + s); // aggregate their values into a sum that starts with a seed of 15
// result should be 15+2*10+4*10+6*10=135

We see here the use of three of these methods:
  • Where - filters the values on a condition
  • Select - changes the values it returns
  • Aggregate - creates an aggregate value using an operation on all the values in the collection

Let me write you the same C# code without using these methods:
var arr = new[] { 1, 2, 3, 4, 5, 6 };
var result = 15;
foreach (var v in arr)
{
    if (v % 2 == 0)
    {
        result += v * 10;
    }
}

In this case, some people might prefer the second version, but it is only an example. LInQ is not a silver bullet that replaces all loops, only a tool amongst many in a large toolset. Some advantages of using such methods are concise code, better readability, a common API for iterating, filtering and querying collections, etc. For example, in the largely used Entity Framework or its previous incarnations such as LINQ to SQL, the queries look the same, but they are translated into SQL, sent to the database and executed just once. So it does not fetch a list of thousands of records to filter in memory; instead it translates the expression of the function sent to the query into SQL and executes it there. The same sort of operations can be used on streams of data, rather than fixed collections, as in the case of Reactive Extensions.

Some other methods in this set include:
  • First/Last - getting the first or last element in an enumerable that satisfies a boolean condition
  • Skip - ignoring a number of values in a collection
  • Take - returning a number of values in a collection
  • Any/All - returning true if at least one or all of the items satisfy a boolean condition
  • Average/Sum/Min/Max - specific aggregating methods for the elements in the collection
  • OrderBy/OrderByDescending - sorting
  • Count - counting

There are many others, you can look them up here.

Does this system of querying data seem familiar to you? To SQL developers it will feel second nature. In SQL the same result from above would be achieved by using something like:
SELECT 15 + SUM(v * 10) FROM table WHERE v % 2 = 0

Note that other than putting the source of the data in front, LInQ syntax is almost identical.

However, in other languages this sort of data querying is called map/reduce, and in fact there is a widely used programming model called MapReduce that applies to big data processing. In Java, the function that filters data is called filter, the one that alters the values is called map and the one that aggregates data is called reduce. It's similar in Javascript. Here is the same code in Javascript:
var arr=[1,2,3,4,5,6];
var result=arr
.filter(v=>v%2==0) //get only even values
.map(v=>v*10) //return their values multiplied with 10
.reduce((s,v)=>v+s,15); //aggregate their value into a sum that starts with a seed of 15
// result should be 15+2*10+4*10+6*10=135
Note that the lambda syntax for writing functions used here is new in ECMAScript 6. Before, you would have to use the function(x) { return [something with x]; } syntax.

In Haxe, the concept is achieved by using the Lambda library and the functions are again named differently: filter for filtering, map for altering and fold for aggregating.

There is another sort of people who would instantly recognize this model of data querying: functional programmers. Indeed, SQL is a functional programming language at its core, and the same standard for data querying is used very efficiently in functional programming languages, since they know whether a function is pure or not (i.e. has no side effects). When dealing only with pure functions, some optimizations can be made on the query by the compiler before anything is even executed. Haskell has the same naming as Haxe (filter, map, fold), for example.

So whenever I get to review other people's code, especially from people that have little experience with either SQL or C#, I cringe to see stuff like this:
var max = -1;
for (var i = 0; i < arr.length; i++) {
  if (max < arr[i]) max = arr[i];
}
In my head this should simply be arr.max(); And considering how easy it is to implement something like this in Javascript, for example, it's a crime not to use it:
Array.prototype.max = function() { return Math.max.apply(null, this); };

Yet there is more to this than my personal preference in reading code. Composition, for example. Because this works like a fluent API or a builder pattern, one can keep adding conditions to a query. Imagine you have to filter a list of strings based on a Google-like query string. At the very minimum you would need to split the query into words and filter repeatedly on each one. Something like this:
var arr=['this is my special query string','this is a string','my query string is this awesome','no query strings here, move along','these are not the strings you are looking for'];
var query="this is a query string";
var splits=query.split(/\s+/g);
var result=arr;
splits.forEach(s=>result=result.filter(a=>a.includes(s)));
console.log(result);

There is a lot of stuff I could be saying about this subject, but I need to summarize it. It's all about inverting loops. Instead of having to go through a collection, a stream or some other data source, then executing some code for each element, this method allows you to encapsulate the operations you want to execute on those elements, pass them around, compose them, translate them, then use them on any data source in the same way. A common API means reusability, better readability of code, less written code and a simpler declaration of intent. Because we get out of the loop system, we can expand the use for other paradigms, such as infinite data streams or event buses.

Learning ASP.Net MVC series:
  1. Setup
  2. MVC Concepts
  3. Authentication
  4. Entity Framework Fundamentals
  5. Upgrading project to .NET Core 1.1
  6. Dependency Injection and Services

Previously on Learning ASP.Net MVC...


I started with the idea of a project that would use user-configurable queries to do Google searches, store the information in the results and then perform various data analyses on them, displaying the results based on what the user wants. However, I first implemented Google authentication and then went on to write some theoretical posts. Lastly, I've upgraded the project from .NET Core 1.0 to version 1.1.

Well, it took me a while to get here because I was at a crossroads. I like the idea of dependency-injectable services to do data access. At the same time there is the entire Entity Framework tutorial path that kind of wants to strongly integrate EF with my projects. I mean, if I have a service that gives me the list of all items in the database and then I want to get only a few items, it would be bad design to filter the entire list. As such, I would have to write a different method that allows me to get the items based on some kind of filters. On the other hand, Entity Framework code looks just like that - "give me all you have, filtered by this" - which is then translated into an efficient query to the database. One possibility would be to have my service return IQueryable<T>, so I could also use the system to generate the database code on the fly.

The Design


I've decided on the service architecture, against an EF type IQueryable way, because I want to be able to replace that service with anything, including something that doesn't work with a database or something that doesn't know how to dynamically create queries. Also, the idea that the service methods will describe exactly what I want appeals to me more than avoiding a bit of duplicated code.

Another thing to define now is the method through which I will implement the dependency injection. Being the control freak that I am, I would go with installing my own library, something like SimpleInjector, and configure it myself and use it explicitly. However, ASP.Net Core has dependency injection included out of the box, so I will use that.

As defined, the project needs queries to pass on to Google and a storage service for the results. It needs data services to manage these entities, as well as a service to abstract Google itself. The data gathering operation itself cannot be a simple REST call, since it might take a while; it must be a background task. The data analysis as well. So we need a sort of job manager.

As per a good structured design, the data objects will be stored in a separate project, as well as the interfaces for the services we will be using.

Some code, please!


Well, start with the code of the project so far: GitHub and let's get coding.

Before finding a solution to actually run the background code in the context of ASP.Net, let's write it inside a class. I am going to add a folder called Jobs and add a class in it called QueryProcessor with a method ProcessQueries. The code will be self-explanatory, I hope.
public void ProcessQueries()
{
    var now = _timeService.Now;
    var queries = _queryDataService.GetUnprocessed(now);
    var contentItems = queries.AsParallel().WithDegreeOfParallelism(3)
        .SelectMany(q => _contentService.Query(q.Text));
    _contentDataService.Update(contentItems);
}

So we get the time - from a service, of course - and request the unprocessed queries for that time, then we extract the content items for each query, which are then updated in the database. The idea here is that the first time a query is defined, or whenever the interval from the last time the query was processed has elapsed, the query will be sent to the content service, from which content items will be received. These items will be stored in the database.

Now, I've kept the code as concise as possible: there is no indication yet of any implementation detail and I've written as little code as I needed to express my intention. Yet, what are all these services? What is a time service? What is a content service? Where are they defined? In order to enable dependency injection, we will populate all of these fields from the constructor of the query processor. Here is how the class looks in its entirety:
using ContentAggregator.Interfaces;
using System.Linq;

namespace ContentAggregator.Jobs
{
    public class QueryProcessor
    {
        private readonly IContentDataService _contentDataService;
        private readonly IContentService _contentService;
        private readonly IQueryDataService _queryDataService;
        private readonly ITimeService _timeService;

        public QueryProcessor(ITimeService timeService, IQueryDataService queryDataService, IContentDataService contentDataService, IContentService contentService)
        {
            _timeService = timeService;
            _queryDataService = queryDataService;
            _contentDataService = contentDataService;
            _contentService = contentService;
        }

        public void ProcessQueries()
        {
            var now = _timeService.Now;
            var queries = _queryDataService.GetUnprocessed(now);
            var contentItems = queries.AsParallel().WithDegreeOfParallelism(3)
                .SelectMany(q => _contentService.Query(q.Text));
            _contentDataService.Update(contentItems);
        }
    }
}

Note that the services are only defined as interfaces, which we declare in a separate project called ContentAggregator.Interfaces, referred to above in the usings block.

Let's ignore the job processor mechanism for a moment and just run ProcessQueries in a test method in the main controller. For this I will have to make dependency injection work and implement the interfaces. For brevity I will do so in the main project, although it would probably be a good idea to do it in a separate ContentAggregator.Implementations project. But let's not get ahead of ourselves. First make the code work, then arrange it all nice, in the refactoring phase.

Implementing the services


I will create mock services first, in order to test the code as it is, so the following implementations just do as little as possible while still following the interface signature.
public class ContentDataService : IContentDataService
{
    private static readonly StringBuilder _sb;

    static ContentDataService()
    {
        _sb = new StringBuilder();
    }

    public void Update(IEnumerable<ContentItem> contentItems)
    {
        foreach (var contentItem in contentItems)
        {
            _sb.AppendLine($"{contentItem.FinalUrl}:{contentItem.Title}");
        }
    }

    public static string Output
    {
        get { return _sb.ToString(); }
    }
}

public class ContentService : IContentService
{
    private readonly ITimeService _timeService;

    public ContentService(ITimeService timeService)
    {
        _timeService = timeService;
    }

    public IEnumerable<ContentItem> Query(string text)
    {
        yield return
            new ContentItem
            {
                OriginalUrl = "http://original.url",
                FinalUrl = "https://final.url",
                Title = "Mock Title",
                Description = "Mock Description",
                CreationTime = _timeService.Now,
                Time = new DateTime(2017, 03, 26),
                ContentType = "text/html",
                Error = null,
                Content = "Mock Content"
            };
    }
}

public class QueryDataService : IQueryDataService
{
    public IEnumerable<Query> GetUnprocessed(DateTime now)
    {
        yield return new Query
        {
            Text = "Some query"
        };
    }
}

public class TimeService : ITimeService
{
    public DateTime Now
    {
        get
        {
            return DateTime.UtcNow;
        }
    }
}

Now all I have to do is declare the binding between interface and implementation. The magic happens in ConfigureServices, in Startup.cs:
services.AddTransient<ITimeService, TimeService>();
services.AddTransient<IContentDataService, ContentDataService>();
services.AddTransient<IContentService, ContentService>();
services.AddTransient<IQueryDataService, QueryDataService>();

They are all transient, meaning that for each request of an implementation the system will just create a new instance. Another popular method is AddSingleton.
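
As a sketch, a singleton registration would look like this (the project registers everything as transient, though; the time service is just a plausible candidate):
services.AddSingleton<ITimeService, TimeService>(); // one shared instance for the application's lifetime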

Using dependency injection


So, now I have to instantiate my query processor and run ProcessQueries.

One way is to set QueryProcessor as a service. I extract an interface, I add a new binding and then I give an interface as a parameter of my controller constructor:
[Authorize]
public class HomeController : Controller
{
    private readonly IQueryProcessor _queryProcessor;

    public HomeController(IQueryProcessor queryProcessor)
    {
        _queryProcessor = queryProcessor;
    }

    public IActionResult Index()
    {
        return View();
    }

    [HttpGet("/test")]
    public string Test()
    {
        _queryProcessor.ProcessQueries();
        return ContentDataService.Output;
    }
}
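
The extracted interface and its binding would look something like this (a sketch mirroring the names above; QueryProcessor must also declare that it implements it):
public interface IQueryProcessor
{
    void ProcessQueries();
}

// in ConfigureServices:
services.AddTransient<IQueryProcessor, QueryProcessor>();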
In fact, I don't even have to declare an interface. I can just use services.AddTransient<QueryProcessor>(); in ConfigureServices and it works as a parameter to the controller.

But what if I want to use it directly, resolve it manually, without injecting it in the controller? One can use the injection of an IServiceProvider instead. Here is an example:
[Authorize]
public class HomeController : Controller
{
    private readonly IServiceProvider _serviceProvider;

    public HomeController(IServiceProvider serviceProvider)
    {
        _serviceProvider = serviceProvider;
    }

    public IActionResult Index()
    {
        return View();
    }

    [HttpGet("/test")]
    public string Test()
    {
        // GetService<T> is an extension method from Microsoft.Extensions.DependencyInjection
        var queryProcessor = _serviceProvider.GetService<QueryProcessor>();
        queryProcessor.ProcessQueries();
        return ContentDataService.Output;
    }
}
Yet you still need to use services.Add... in ConfigureServices and inject the service provider in the constructor of the controller.

There is a way of doing it completely separately like this:
var serviceProvider = new ServiceCollection()
    .AddTransient<ITimeService, TimeService>()
    .AddTransient<IContentDataService, ContentDataService>()
    .AddTransient<IContentService, ContentService>()
    .AddTransient<IQueryDataService, QueryDataService>()
    .AddTransient<QueryProcessor>()
    .BuildServiceProvider();
var queryProcessor = serviceProvider.GetService<QueryProcessor>();

This would be the way to encapsulate the ASP.Net Dependency Injection in another object, maybe in a console application, but clearly it would be pointless in our application.

The complete source code after these modifications can be found here. Test the functionality by going to /test on your local server after you start the app.