A funny feature that I've encountered recently. It's not something most people would find useful, but it helps tremendously with tracing and debugging what is going on. It's easy, just add .TagWith(someString) to your LINQ query and it will generate comments in SQL. More details here: Query tags.

I was trying to create an entity with several children entities and persist it to the database using Entity Framework. I was generating the entity, set its entry state to Added, saved the changes in the DbContext, everything right. However, in the database I had one parent entity and one child entity. I suspected it had something to do with the way I created the tables, the foreign keys, something related to the way I was declaring the connection between the entities in the model. It was none of that.

If you look at the way EF generates entities from database tables, it creates a two directional relationship from any foreign key: the parent entity gets an ICollection<Child> property to hold the children and the child entity a virtual property of type Parent to hold the parent. Moreover, the parent entity instantiates the collection in the constructor in the form of a HashSet<Child>. It doesn't have to be a HashSet, though. It works just as well if you overwrite it when you create the entity with something like a List. However, the HashSet approach tells something important of the way EF behaves when considering collections of child objects.

In my case, I was doing something like
var parent = new Parent { 
Children = Enumerable
.Repeat(new Child { SomeProperty = SomeValue }, 3)
.ToList()
};
Can you spot the problem? When I changed the code to
var parent = new Parent();
Enumerable
.Repeat(new Child { SomeProperty = SomeValue }, 3)
.ToList()
.ForEach(child => parent.Children.Add(child));
I was shocked to see that my Parent had only a list of Children with count 1. Why? Because Enumerable.Repeat takes the instance of the object you give it and repeats it:

Enumerable.Repeat(something,3).Distinct().Count() == 1 !

There was no problem that I was setting the Children collection to a List instead of a HashSet, but when saving the children, Entity Framework was considering the distinct instances of Child.

The solution here was to generate different instances of the same object, something like
var parent = new Parent {
Children = Enumerable.Range(0,3)
.Select(i => new Child { SomeProperty = SomeValue }, 3)
.ToList()
};
or, to make it more clear for this case:
var parent = new Parent {
Children = Enumerable
.Repeat(() => new Child { SomeProperty = SomeValue }, 3)
.Select(generator => generator())
.ToList()
};

Learning ASP.Net MVC series:
  1. Setup
  2. MVC Concepts
  3. Authentication
  4. Entity Framework Fundamentals
  5. Upgrading project to .NET Core 1.1
  6. Dependency Injection and Services

Previously on Learning ASP.Net MVC...


Started with the idea of a project that would use user configurable queries to do Google searches, store the information in the results and then perform various data analysis on them and display them based on what the user wants. However, I first implemented a Google authentication and then went to write some theoretical posts. Lastly, I've upgraded the project from .NET Core 1.0 to version 1.1.

Well, it took me a while to get here because I was at a crossroads. I like the idea of dependency injectable services to do data access. At the same time there is the entire Entity Framework tutorial path that kind of wants to strongly integrate EF with my projects. I mean, if I have a service that gives me the list of all items in the database and then I want to get only a few items, it would be bad design to filter the entire list. As such, I would have to write a different method that allows me to get the items based on some kind of filters. On the other hand, Entity Framework code looks just like that "give me all you have, filtered by this" which is then translated into an efficient query to the database. One possibility would be to have my service return IQueryable <T>, so I could also use the system to generate the database code on the fly.

The Design


I've decided on the service architecture, against an EF type IQueryable way, because I want to be able to replace that service with anything, including something that doesn't work with a database or something that doesn't know how to dynamically create queries. Also, the idea that the service methods will describe exactly what I want appeals to me more than avoiding a bit of duplicated code.

Another thing to define now is the method through which I will implement the dependency injection. Being the control freak that I am, I would go with installing my own library, something like SimpleInjector, and configure it myself and use it explicitly. However, ASP.Net Core has dependency injection included out of the box, so I will use that.

As defined, the project needs queries to pass on to Google and a storage service for the results. It needs data services to manage these entities, as well as a service to abstract Google itself. The data gathering operation itself cannot be a simple REST call, since it might take a while, it must be a background task. The data analysis as well. So we need a sort of job manager.

As per a good structured design, the data objects will be stored in a separate project, as well as the interfaces for the services we will be using.

Some code, please!


Well, start with the code of the project so far: GitHub and let's get coding.

Before finding a solution to actually run the background code in the context of ASP.Net, let's write it inside a class. I am going to add a folder called Jobs and add a class in it called QueryProcessor with a method ProcessQueries. The code will be self explanatory, I hope.
public void ProcessQueries()
{
var now = _timeService.Now;
var queries = _queryDataService.GetUnprocessed(now);
var contentItems = queries.AsParallel().WithDegreeOfParallelism(3)
.SelectMany(q => _contentService.Query(q.Text));
_contentDataService.Update(contentItems);
}

So we get the time - from a service, of course - and request the unprocessed queries for that time, then we extract the content items for each query, which then are updated in the database. The idea here is that, for the first time a query is defined or when the interval from the last time the query was processed, the query will be sent to the content service from which content items will be received. These items will be stored in the database.

Now, I've kept the code as concise as possible: there is no indication yet of any implementation detail and I've written as little code as I need to express my intention. Yet, what are all these services? What is a time service? what is a content service? Where are they defined? In order to enable dependency injection, we will populate all of these fields from the constructor of the query processor. Here is how the class would look in its entirety:
using ContentAggregator.Interfaces;
using System.Linq;

namespace ContentAggregator.Jobs
{
public class QueryProcessor
{
private readonly IContentDataService _contentDataService;
private readonly IContentService _contentService;
private readonly IQueryDataService _queryDataService;
private readonly ITimeService _timeService;

public QueryProcessor(ITimeService timeService, IQueryDataService queryDataService, IContentDataService contentDataService, IContentService contentService)
{
_timeService = timeService;
_queryDataService = queryDataService;
_contentDataService = contentDataService;
_contentService = contentService;
}

public void ProcessQueries()
{
var now = _timeService.Now;
var queries = _queryDataService.GetUnprocessed(now);
var contentItems = queries.AsParallel().WithDegreeOfParallelism(3)
.SelectMany(q => _contentService.Query(q.Text));
_contentDataService.Update(contentItems);
}
}
}

Note that the services are only defined as interfaces which we declare in a separate project called ContentAggregator.Interfaces, referred above in the usings block.

Let's ignore the job processor mechanism for a moment and just run ProcessQueries in a test method in the main controller. For this I will have to make dependency injection work and implement the interfaces. For brevity I will do so in the main project, although it would probably be a good idea to do it in a separate ContentAggregator.Implementations project. But let's not get ahead of ourselves. First make the code work, then arrange it all nice, in the refactoring phase.

Implementing the services


I will create mock services first, in order to test the code as it is, so the following implementations just do as little as possible while still following the interface signature.
public class ContentDataService : IContentDataService
{
private readonly static StringBuilder _sb;

static ContentDataService()
{
_sb = new StringBuilder();
}

public void Update(IEnumerable<ContentItem> contentItems)
{
foreach (var contentItem in contentItems)
{
_sb.AppendLine($"{contentItem.FinalUrl}:{contentItem.Title}");
}
}

public static string Output
{
get { return _sb.ToString(); }
}
}

public class ContentService : IContentService
{
private readonly ITimeService _timeService;

public ContentService(ITimeService timeService)
{
_timeService = timeService;
}

public IEnumerable<ContentItem> Query(string text)
{
yield return
new ContentItem
{
OriginalUrl = "http://original.url",
FinalUrl = "https://final.url",
Title = "Mock Title",
Description = "Mock Description",
CreationTime = _timeService.Now,
Time = new DateTime(2017, 03, 26),
ContentType = "text/html",
Error = null,
Content = "Mock Content"
};
}
}

public class QueryDataService : IQueryDataService
{
public IEnumerable<Query> GetUnprocessed(DateTime now)
{
yield return new Query
{
Text="Some query"
};
}
}

public class TimeService : ITimeService
{
public DateTime Now
{
get
{
return DateTime.UtcNow;
}
}
}

Now all I have to do is declare the binding between interface and implementation. The magic happens in ConfigureServices, in Startup.cs:
services.AddTransient<ITimeService, TimeService>();
services.AddTransient<IContentDataService, ContentDataService>();
services.AddTransient<IContentService, ContentService>();
services.AddTransient<IQueryDataService, QueryDataService>();

They are all transient, meaning that for each request of an implementation the system will just create a new instance. Another popular method is AddSingleton.

Using dependency injection


So, now I have to instantiate my query processor and run ProcessQueries.

One way is to set QueryProcessor as a service. I extract an interface, I add a new binding and then I give an interface as a parameter of my controller constructor:
[Authorize]
public class HomeController : Controller
{
private readonly IQueryProcessor _queryProcessor;

public HomeController(IQueryProcessor queryProcessor)
{
_queryProcessor = queryProcessor;
}

public IActionResult Index()
{
return View();
}

[HttpGet("/test")]
public string Test()
{
_queryProcessor.ProcessQueries();
return ContentDataService.Output;
}
}
In fact, I don't even have to declare an interface. I can just use services.AddTransient<QueryProcessor>(); in ConfigureServices and it works as a parameter to the controller.

But what if I want to use it directly, resolve it manually, without injecting it in the controller? One can use the injection of a IServiceProvider instead. Here is an example:
[Authorize]
public class HomeController : Controller
{
private readonly IServiceProvider _serviceProvider;

public HomeController(IServiceProvider serviceProvider)
{
_serviceProvider = serviceProvider;
}

public IActionResult Index()
{
return View();
}

[HttpGet("/test")]
public string Test()
{
var queryProcessor = _serviceProvider.GetService<QueryProcessor>();
queryProcessor.ProcessQueries();
return ContentDataService.Output;
}
}
Yet you still need to use services.Add... in ConfigureServices and inject the service provider in the constructor of the controller.

There is a way of doing it completely separately like this:
var serviceProvider = new ServiceCollection()
.AddTransient<ITimeService, TimeService>()
.AddTransient<IContentDataService, ContentDataService>()
.AddTransient<IContentService, ContentService>()
.AddTransient<IQueryDataService, QueryDataService>()
.AddTransient<QueryProcessor>()
.BuildServiceProvider();
var queryProcessor = serviceProvider.GetService<QueryProcessor>();

This would be the way to encapsulate the ASP.Net Dependency Injection in another object, maybe in a console application, but clearly it would be pointless in our application.

The complete source code after these modifications can be found here. Test the functionality by going to /test on your local server after you start the app.

This will be a short post to describe my own stupidity. I was testing the new Entity Framework Plain Old CLR Objects (POCO) support and so I made a small test to:
  • Clear the database
  • Insert new items in the database
  • Select the items from the database, with and without related items

Every time I got the entire object tree, with child collection and parent objects, thus making me think that in this implementation of EF, the need to use Include was gone, instead an Exclude method was needed in order to tell the framework to NOT LOAD related objects! Insane!
After looking everywhere to find the answer, I finally turned to profiling SQL only to see that the database was not accessed with related items, but only what I had asked for. Then I had my "I'm an idiot!" moment. I was using the same context and EF knew the entire hierarchy of objects because (duh!) I had just inserted it a few code lines above. Using different contexts solved the "problem" and only returned the requested objects, making Include a necessity again.

Well, you might laugh out loud when you read this, but it is an achievement for me. I wanted to update a table (table A) which had a connection to another table (table B) through a many to many table (table A2B). I had changed A2B to link A to a new table C. I wanted to reflect that change in my Entity Model.

So the first step was to right click on the model and do an "Update Model from Database". Then went through all the steps and clicked Finish. Now entity A has both a collection of Bs and a collection of Cs. I wanted to delete the B collection but, to my chagrin, no delete option was available.

The epiphany came when I managed to select the graphical link from entity A to entity B. The link has the delete option! Which, I have to admit, does make sense. So whenever you want to remove an association between entities... well, select the association and delete it, not the entity properties.