I am mentally preparing to give a talk about dependency injection and inversion of control and why they are important, so I intend to clarify my thoughts on the blog first. This has been spurred by seeing how many talented and even experienced programmers don't really understand the concepts or why they should use them. I also intend to briefly explore these concepts in the context of programming languages other than C#.

And yes, I know I've started an ASP.Net MVC exploration series and stopped midway, and I truly intend to continue it, it's just that this is more urgent.

Head-on intro


So, instead of starting with the definitions, let me give you some examples.
public class MyClass {
    public IEnumerable<string> GetData() {
        var provider = new StringDataProvider();
        var data = provider.GetStringsNewerThan(DateTime.Now - TimeSpan.FromHours(1));
        return data;
    }
}
In this piece of code I create a class that has a method that gets some text. That's why I use a StringDataProvider: I want to be provided with string data. I named my class so that it describes as best as possible what it intends to do, yet that descriptiveness gets lost up the chain, where my method is called just GetData. It is called so because it is the data that I need in the context of MyClass, which may not care, for example, that it is in string format. Maybe MyClass just displays enumerations of objects. Another issue is that it hides the date and time parameter that I pass to the method. I am getting string data, but not all of it, just that from the last hour. Functionally, this will work fine: task complete, you can move on to the next one. Yet it has some nagging issues.

Dependency Injection


Let me show you the same piece of code, written with dependency injection in mind:
public class MyClass {
    private IDataProvider _dataProvider;
    private IDateTimeProvider _dateTimeProvider;

    public MyClass(IDataProvider dataProvider, IDateTimeProvider dateTimeProvider) {
        this._dataProvider = dataProvider;
        this._dateTimeProvider = dateTimeProvider;
    }

    public IEnumerable<string> GetData() {
        var oneHourBefore = _dateTimeProvider.Now - TimeSpan.FromHours(1);
        var data = _dataProvider.GetDataNewerThan(oneHourBefore);
        return data;
    }
}
A lot more code, but it solves several issues while introducing so many benefits that I wonder why people don't code like this from the get-go.

Let's analyse this for a bit. First of all, I introduce a constructor to MyClass, one that accepts and caches two parameters. They are not class types, but interfaces, which declare the intention for any class implementing them. The method then does the same thing as in the original example, using the providers it cached. Now, when I write the code of the class I don't actually need to have any provider implementation. I just declare what I need and worry about it later. I also don't need to inject real providers; I can mock them so that I can test my class in isolation. Note that the previous implementation of the class would have returned different data based on the system time and I had no way to control that behavior. The best benefit, for me, is that now the class is really descriptive. It almost reads like English: "Hi, folks, I am a class that needs someone to give me some data and the time of day and I will give you some processed data in return!". The rule of thumb is that for each method, external factors that may influence its behavior must be abstracted away. In our case, if the date time provider provides the same time and the data provider the same data, the effect of the method is always the same.

Note that the interface I used was not IStringDataProvider, but IDataProvider. I don't really care, in my class, that the data is a bunch of strings. There is something called the Single Responsibility Principle, which says that a class or a method or some sort of unit of computation should try to only have one responsibility. If you change that code, it should only affect one area. Now, real life is a little different and classes do many things in many directions, yet they can implement any number of interfaces. The interfaces themselves can declare only one responsibility, which is why this is so nice. I don't actually have to have a class that is only a data provider, but in the context of my class, I only need that part and I am clearly declaring my intent in the code.

This here is called dependency injection, which is a fancy expression for saying "my code receives all third party instances as parameters". It is also in line with the Single Responsibility Principle, as now your class doesn't have to carry the responsibility of knowing how to instantiate the classes it needs. It makes the code more modular, easier to test, more legible and more maintainable.

But there is a problem. While before I was using something like new MyClass().GetData(), now I have to push the instantiation of the providers somewhere up the stream and do maybe something like this:
var dataProvider = new StringDataProvider();
var dateTimeProvider = new DateTimeProvider();
var myClass = new MyClass(dataProvider, dateTimeProvider);
myClass.GetData();
The apparent gains were all for naught! I just pushed the same ugly code somewhere else. But here is where Inversion of Control comes in. What if you never needed to instantiate anything again? What if you never actually had to write any new Something() code?

Inversion of Control


Inversion of Control actually takes over the responsibility of creating instances from you. With it, you might get this code instead:
public interface IMyClass {
    IEnumerable<string> GetData();
}

public class MyClass : IMyClass {
    private IDataProvider _dataProvider;
    private IDateTimeProvider _dateTimeProvider;

    public MyClass(IDataProvider dataProvider, IDateTimeProvider dateTimeProvider) {
        this._dataProvider = dataProvider;
        this._dateTimeProvider = dateTimeProvider;
    }

    public IEnumerable<string> GetData() {
        var oneHourBefore = _dateTimeProvider.Now - TimeSpan.FromHours(1);
        var data = _dataProvider.GetDataNewerThan(oneHourBefore);
        return data;
    }
}
Note that I created an interface for MyClass to implement, one that declares my GetData method. Now, to use it, I could write something like this:
var myClass = Dependency.Get<IMyClass>();
myClass.GetData();

Wow! What happened here? I just used a magical class called Dependency that gets me an instance of IMyClass. And I really don't care how it does it. It can discover implementations by itself or maybe I am manually binding interfaces to implementations when the application starts (for example Dependency.Bind<IMyClass,MyClass>();). When it needs to create a new MyClass it automatically sees that it needs two other interfaces as parameters, so it gets implementations for those first and continues up the chain. It is called a dependency chain and the container will go through it all to simply "Get" you what you need. There are many inversion of control frameworks out there, but the concept is so simple that one can make their own easily.
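The concept really is simple enough to sketch in a few lines. Here is a minimal toy container, written from scratch purely for illustration (no lifetime management, no error handling, no circular dependency detection, not thread safe - a real framework does much more):
using System;
using System.Collections.Generic;
using System.Linq;

public static class Dependency {
    private static readonly Dictionary<Type, Type> _bindings = new Dictionary<Type, Type>();

    public static void Bind<TInterface, TImplementation>() where TImplementation : TInterface {
        Bind(typeof(TInterface), typeof(TImplementation));
    }

    public static void Bind(Type interfaceType, Type implementationType) {
        _bindings[interfaceType] = implementationType;
    }

    public static T Get<T>() {
        return (T)Resolve(typeof(T));
    }

    private static object Resolve(Type type) {
        // follow the binding if one exists, otherwise assume the type is concrete
        var implementation = _bindings.ContainsKey(type) ? _bindings[type] : type;
        // take the most demanding constructor and resolve its parameters recursively,
        // walking down the dependency chain
        var constructor = implementation.GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .First();
        var arguments = constructor.GetParameters()
            .Select(p => Resolve(p.ParameterType))
            .ToArray();
        return constructor.Invoke(arguments);
    }
}
With Dependency.Bind<IMyClass,MyClass>() (and bindings for the two providers) called at application start, Dependency.Get<IMyClass>() builds MyClass and everything down its dependency chain.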

And I get another benefit: if I want to display some other type of data, all I have to do is instruct the dependency container that I want another implementation for the interface. I can even think about versioning: take a class that I know does the job and compare it with a new implementation of the same interface. I can tell it to use different versions based on the client used. And all of this in exactly one place: the dependency container bindings. You may want to plug different implementations provided by third parties and all they have to care about is respecting the contract in your interface.



Solution structure


This way of writing code forces some changes in the structure of your projects. If all you have is written in a single project, you don't care, but if you want to split your work in several libraries, you have to take into account that interfaces need to be referenced by almost everything, including third party modules that you want to plug. That means the interfaces need their own library. Yet in order to declare the interfaces, you need access to all the data objects that their members need, so your Interfaces project needs to reference all the projects with data objects in them. And that means that your logic will be separated from your data objects in order to avoid circular dependencies. The only project that will probably need to go deeper will be the unit and integration test project.

Bottom line: in order to implement this painlessly, you need an Entities library, containing data objects, then an Interfaces library, containing the interfaces you need and, maybe, the dependency container mechanism, if you don't put it in yet another library. All the logic needs to be in other projects. And that brings us to a nice side effect: the only connection between logic modules is done via abstractions like interfaces and simple data containers. You can now substitute one library with another without actually caring about the rest. The unit tests will work just the same, the application will function just the same and functionality can be both encapsulated and programmatically described.

There is a drawback to this. Whenever you need to see how some method is implemented and you navigate to definition, you will often reach the interface declaration, which tells you nothing. You then need to find classes that implement the interface or to search for uses of the interface method to find implementations. Even so, I would say that this is an IDE problem, not a dependency injection issue.

Other points of view


Now, the intro above describes what I understand by dependency injection and inversion of control. The official definition of Dependency Injection claims it is a subset of Inversion of Control, not a separate thing.

For example, Martin Fowler says that when he and his fellow software pattern creators thought of it, they called it Inversion of Control, but they decided that it was too broad a term, so they moved to calling it Dependency Injection. That seems strange to me, since I can describe situations where dependencies are injected, or at least passed around, but they are manually instantiated, or situations where the creation of instances is out of the control of the developer, but no dependencies are passed around. He seems to see both as one thing. On the other hand, the pattern where dependencies are injected by constructor, property setters or weird implementation of yet another set of interfaces (which he calls Dependency Injection) is different from Service Locator, where you specifically ask for a type of service.

Wikipedia says that Dependency Injection is a software pattern which implements Inversion of Control to resolve dependencies, while it calls Inversion of Control a design principle (so, not a pattern?) in which custom-written portions of a computer program receive the flow of control from a generic framework. It even goes so far as to say Dependency Injection is a specific type of Inversion of Control. Anyway, the pages there seem to follow the general definitions that Martin Fowler does, which pits Dependency Injection versus Service Locator.

On StackOverflow a very well viewed answer sees dependency injection as "giving an object its instance variables". I tend to agree. I also liked another answer below that said "DI is very much like the classic avoiding of hardcoded constants in the code." It makes one think of a variable as an abstraction for values of a certain type. Same page holds another interesting view: "Dependency Injection and dependency Injection Containers are different things: Dependency Injection is a method for writing better code, a DI Container is a tool to help injecting dependencies. You don't need a container to do dependency injection. However a container can help you."

Another StackOverflow question has tons of answers explaining how Dependency Injection is a particular case of Inversion of Control. They all seem to have read Fowler before answering, though.

A CodeProject article explains how Dependency Injection is just a flavor of Inversion of Control, others being Service Locator, Events, Delegates, etc.

Composition over inheritance, convention over configuration


An interesting side effect of this drastic decoupling of code is that it promotes composition over inheritance. Let's face it: inheritance was supposed to solve all of humanity's problems and it failed. You either have an endless chain of classes inheriting from each other from which you usually use only one or two or you get misguided attempts to allow inheritance from multiple sources which complicates understanding of what does what. Instead interfaces have become more widespread, as declarations of intent, while composition has provided more of what inheritance started off as promising. And what is dependency injection if not a sort of composition? In the intro example we compose a date time provider and a data provider into a time aware data provider, all the time while the actors in this composition need to know nothing else than the contracts each part must abide by. Do that same thing with other implementations and you get a different result. I will go as far as to say that inheritance defines what classes are, while composition defines what classes do, which is what matters in the end.

Another interesting effect is the wider adoption of convention over configuration. For example you can find the default implementation of an interface as the class that implements it and has the same name minus the preceding "I". Rather than explicitly tell the framework that we want to use the Manager class each time someone needs an IManager implementation, it can figure it out for itself by naming alone. This would never work if the responsibility of getting class instances resided with each method using them.
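As an illustration, such a convention could be implemented on top of the toy Dependency container sketched above with a simple assembly scan (again, just a sketch; real frameworks offer far more robust conventions):
using System;
using System.Linq;
using System.Reflection;

public static class ConventionBinder {
    // bind IManager -> Manager for every interface that has an implementing
    // class named like it, minus the leading "I", in the given assembly
    public static void BindByConvention(Assembly assembly) {
        var types = assembly.GetTypes();
        foreach (var iface in types.Where(t => t.IsInterface && t.Name.StartsWith("I"))) {
            var expectedName = iface.Name.Substring(1); // "IManager" -> "Manager"
            var implementation = types.FirstOrDefault(t =>
                t.IsClass && !t.IsAbstract && t.Name == expectedName && iface.IsAssignableFrom(t));
            if (implementation != null) {
                Dependency.Bind(iface, implementation);
            }
        }
    }
}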

Real life examples


Simple Injector


If you look on the Internet, one of the first dependency injection frameworks you find for .Net is Simple Injector, which works on every flavor of .Net including Mono and Core. It's as easy to use as installing the NuGet package and doing something like this:
// 1. Create a new Simple Injector container
var container = new Container();

// 2. Configure the container (register)
container.Register<IUserRepository, SqlUserRepository>(Lifestyle.Transient);
container.Register<ILogger, MailLogger>(Lifestyle.Singleton);

// 3. Optionally verify the container's configuration.
container.Verify();

// 4. Get the implementation by type
IUserService service = container.GetInstance<IUserService>();

ASP.Net Core


ASP.Net Core has dependency injection built in. You configure your bindings in ConfigureServices:
public void ConfigureServices(IServiceCollection svcs)
{
    svcs.AddSingleton(_config);

    if (_env.IsDevelopment())
    {
        svcs.AddTransient<IMailService, LoggingMailService>();
    }
    else
    {
        svcs.AddTransient<IMailService, MailService>();
    }

    svcs.AddDbContext<WilderContext>(ServiceLifetime.Scoped);

    // ...
}
then you use any of the registered classes and interfaces as constructor parameters for controllers, or even as method parameters (see FromServicesAttribute).
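For instance, a controller could receive the mail service registered above through its constructor, or a single action could ask for it with the attribute. A minimal sketch (the SendMessage method is invented for the example; use whatever your interface declares):
public class ContactController : Controller
{
    private readonly IMailService _mailService;

    // resolved automatically from the service collection
    public ContactController(IMailService mailService)
    {
        _mailService = mailService;
    }

    public IActionResult Send()
    {
        _mailService.SendMessage("admin@example.com", "Contact form used"); // hypothetical method
        return Ok();
    }

    // alternatively, inject the service only into the action that needs it
    public IActionResult Notify([FromServices] IMailService mailService)
    {
        mailService.SendMessage("admin@example.com", "Someone pinged us"); // hypothetical method
        return Ok();
    }
}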

Managed Extensibility Framework


MEF is a big beast of a framework, but it can simplify a lot of work you would have to do to glue things together, especially in extensibility scenarios. Typically one would use attributes to declare which interface something "exports" and then use other attributes to "import" implementations in properties and values. All you need to do is put them in the same place. Something like this:
[Export(typeof(ICalculator))]
class SimpleCalculator : ICalculator {
    //...
}

class Program {

    [Import(typeof(ICalculator))]
    public ICalculator calculator;

    // do something with calculator
}
Of course, in order for this to work seamlessly you need stuff like this, as well:
private Program()
{
    // An aggregate catalog that combines multiple catalogs
    var catalog = new AggregateCatalog();
    // Adds all the parts found in the same assembly as the Program class
    catalog.Catalogs.Add(new AssemblyCatalog(typeof(Program).Assembly));
    catalog.Catalogs.Add(new DirectoryCatalog("C:\\Users\\SomeUser\\Documents\\Visual Studio 2010\\Projects\\SimpleCalculator3\\SimpleCalculator3\\Extensions"));

    // Create the CompositionContainer with the parts in the catalog
    _container = new CompositionContainer(catalog);

    // Fill the imports of this object
    try
    {
        this._container.ComposeParts(this);
    }
    catch (CompositionException compositionException)
    {
        Console.WriteLine(compositionException.ToString());
    }
}

Dependency Injection in other languages


Admit it: C# is great, but it is not by far the most used computer language. That place is reserved, at least for now, for Javascript. Not only is it untyped and dynamic, but Javascript isn't even a class inheritance language. It uses so-called prototype inheritance, which uses an instance of an object attached to a type to provide default values for the instances of said type. I know, it sounds confusing and it is, but what is important is that it has no concept of interfaces or reflection. So while it is trivial to create a dictionary of instances (or of functions that create instances) of objects, which you could then use to get what you need by a string key (something like var manager = Dependency.Get('IManager');, for example), it is difficult to imagine how one could go through the entire chain of dependencies to create objects that need other objects.

And yet this is done, by AngularJs, RequireJs and any number of modern Javascript frameworks. The secret? Using regular expressions to determine the parameters needed for a constructor function after turning it into a string. It's complicated and beyond the scope of this blog post, but take a look at this StackOverflow question and its answers to understand how it's done.

Let me show you an example from AngularJs:
angular.module('myModule', [])
.directive('directiveName', ['depService', function(depService) {
// ...
}])
In this case the key/type of the service is explicit, using an array notation that says "this is the list of parameters that the dependency injector needs to give to the function", but this might as well have been written just as the function:
angular.module('myModule', [])
.directive('directiveName', function(depService) {
// ...
})
In this case Angular would use the regular expression approach on the function string.


What about other languages? Java is very much like C# and the concepts there are similar. C++ is very different, even though all three are flavors of C, yet dependency injection can still be achieved. I am not a C++ developer, so I can't tell you much about that, but take a look at this StackOverflow question and its answers; it is claimed that there is no one method, but many that can be used to do dependency injection in C++.

In fact, the only languages I can think of that can't do dependency injection are silly ones like SQL. Since you cannot (reasonably) define your own types or pass functions along, the concept makes no sense. Even so, one can imagine creating dummy stored procedures that other stored procedures would use in order to be tested. There is no reason why you wouldn't use dependency injection if the language allows for it.
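To make the stored procedure idea concrete, here is a hedged T-SQL sketch (all names invented): a report procedure depends on a data procedure only through its name and result set shape, so a test database can swap in a dummy implementation returning known data.
-- the "dependency": a procedure that provides data
CREATE PROCEDURE dbo.GetRecentSales
AS
    SELECT Amount FROM dbo.Sales WHERE SaleDate > DATEADD(HOUR, -1, GETDATE());
GO

-- the procedure under test only cares about the contract: a result set with an Amount column
CREATE PROCEDURE dbo.SalesTotal
AS
    DECLARE @t TABLE (Amount MONEY);
    INSERT INTO @t EXEC dbo.GetRecentSales;
    SELECT SUM(Amount) AS Total FROM @t;
GO

-- in a test database, replace the dependency with a dummy returning fixed data
ALTER PROCEDURE dbo.GetRecentSales
AS
    SELECT CAST(10 AS MONEY) AS Amount
    UNION ALL
    SELECT CAST(32 AS MONEY);
GO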

Testability


I briefly mentioned unit testing. Dependency Injection works hand in hand with automated testing. Given that the practice creates modules of software that give reproducible results for the same inputs and account for all the inputs, testing becomes a breeze. Let me give you some examples using Moq, a mocking library for .Net:
var dateTimeMock = new Mock<IDateTimeProvider>();
dateTimeMock
    .Setup(m => m.Now)
    .Returns(new DateTime(2016, 12, 03));

var dataMock = new Mock<IDataProvider>();
dataMock
    .Setup(m => m.GetDataNewerThan(It.IsAny<DateTime>()))
    .Returns(new[] { "test", "data" });

var testClass = new MyClass(dateTimeMock.Object, dataMock.Object);

var result = testClass.GetData();
AssertDeepEqual(result, new[] { "test", "data" });

First of all, I take care of all dependencies. I create a "mock" for each of them and I "set up" the methods or property setters/getters that interest me. I don't really need to set up the date time mock for Now, since the data from the data provider is always the same no matter the parameter, but it's there for you to see how it's done. Second, I instantiate the class I want to test using the Object property of my mocks, which returns an object implementing the type given as the generic parameter of the Mock class. Third, I assert that the side effects of my call are the ones I expect. The mocks need to be as dumb as possible. If you feel you need to write code to define your mocks, you are probably doing something wrong.

For people who are not familiar with the concept, the tests are usually structured as one fully positive case - that is, give completely valid data and expect the correct result - followed by many negative ones, where the correct data is made incorrect in all possible ways and the method is expected to fail. If there are many combinations of data that would be considered valid, you need a test for each of them.

Note that the test is instantiating the test class directly, using the constructor. We are not testing the injector here, but the actual class.

Conclusions


What I appreciate most with Dependency Injection is that it forces you to write code that has clear boundaries defined by interfaces. Once this is achieved, you can go write your own stuff and not care about what other people do with theirs. You can test your modules without even caring if the rest of the project exists. It allows you to refactor code in steps and with a lot more confidence, since you are covered by unit tests.

Some people work on fire-and-forget projects, like small games or utilities, and don't care about maintainability, one of the most touted reasons for using unit tests and dependency injection. Yet these practices bring so many other benefits that are almost impossible to get otherwise.

The entire point of this is reducing the complexity of dependencies, which include not only the modules in your application, but also the support frame for them, like people working on them. While some managers might not see the wisdom of reducing friction between software components, surely they can see the positive value of reducing friction between people.

There was one other topic that I wanted to touch on, but it is vast and I don't have enough experience with it yet, although it feels very attractive to me: refactoring old code in order to use dependency injection. Best practices, how to make it safe enough and fast enough to make managers approve it and so on. Perhaps another post later on. I was thinking of a combination of static analysis and automated methods, like replacing all usages of "new" with a single point of instantiation, warning about static methods and properties, automatically replacing known bad practices like DateTime.Now and so on. It might be interesting, right?

I hope I wasn't too confusing and I appreciate any feedback you have. I will be working on a presentation file with similar content, so any help will go into doing a better job explaining it to others.

A colleague of mine hit a strange bug today. It so happened that we use a bastardized dependency injection method that takes into account the WCF session before returning an implementation of an interface. In a piece of code the injection failed and we couldn't see why for a while. Let me give you a simplified version:
var someManager = Package.Get<IManager>();
var someDTOs = Cache.GetDatabaseObjects().Select(x => x.Pack());

public class DataObject {
    public string Data { get; set; }
    public DataObjectDTO Pack() {
        var anotherManager = Package.Get<IAnother>();
        return new DataObjectDTO {
            Data = anotherManager.Process(Data)
        };
    }
}

Package.Get will attempt to find a session object and, if there is none, it will use another mechanism; but if it finds one, it will only use it if it is not expired or invalid, otherwise throwing an exception. This code failed in the Pack method, when trying to get an instance of IAnother. Please take a few moments to reflect on why (and no, it's not that between calls the session expired).


I've stumbled upon a very funny exception today. Basically I was creating a constant string by adding some other constant strings to each other. And it worked. The moment I added an integer, though, I got "The expression being assigned to 'Program.x2' must be constant". The code that generated this error is simple:
const string x2 = "string" + 2;
Note that
const string x2 = "string" + "2";
is perfectly valid. Got the same result when using VS2010 and VS2015, so it's not a compiler bug, it's intended behavior.

So, what's going on? Well, my code effectively transforms behind the scenes into
const string x2 = "string" + 2.ToString();
which is not constant because of ToString!

The only way to solve it was to declare the numeric constant as string as well.
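In other words, something like this compiles (a minimal sketch of the workaround):
const string two = "2";           // the numeric value, declared as a string constant
const string x2 = "string" + two; // the whole expression is now constant at compile time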

This clause is so obscure that I couldn't even find the Microsoft reference page for it for a few minutes, so no wonder I didn't know about it. Introduced in SQL Server 2005, the TABLESAMPLE clause limits the number of rows returned from a table in the FROM clause to a sample number or PERCENT of rows.

Usage:
TABLESAMPLE (sample_number [ PERCENT | ROWS ] ) [ REPEATABLE (repeat_seed) ]

REPEATABLE is used to set the seed of the random number generator so one can get the same result if running the query again.
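For example, to get roughly ten percent of a table, with a fixed seed so that repeated runs return the same sample (table taken from the reference page, seed value arbitrary):
SELECT *
FROM Sales.SalesOrderDetail
TABLESAMPLE (10 PERCENT)
REPEATABLE (42);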

It sounds great at the beginning, until you start seeing the limitations:
  • it cannot be applied to derived tables, tables from linked servers, and tables derived from table-valued functions, rowset functions, or OPENXML
  • the number of rows returned is approximate. 10 ROWS doesn't necessarily return 10 records. In fact, the functionality underneath transforms 10 into a percentage, first
  • a join of two tables is likely to return a match for each row in both tables; however, if TABLESAMPLE is specified for either of the two tables, some rows returned from the unsampled table are unlikely to have a matching row in the sampled table.
  • it isn't even that random!

Funny enough, even the reference page recommends a different way of getting a random sample of rows from a table:
SELECT * FROM Sales.SalesOrderDetail
WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float) / CAST (0x7fffffff AS int)

Even if probably not really usable, at least I've learned something new about SQL.

Update:
More about getting random samples from a table here, where it explains why ORDER BY NEWID() is not the way to do it and gives hints of what really happens in the background when we invoke TABLESAMPLE.
Another interesting article on the subject, focused more on the statistical probability, can be found here, where it also shows how TABLESAMPLE's cluster sampling may fail in spectacular ways.

.Net Core Web API uses Newtonsoft's Json.NET to do JSON serialization. In other cases, when you wanted to control the Json.NET options, you would do something like this:
JsonConvert.DefaultSettings = () =>
{
    var settings = new JsonSerializerSettings();
    // do something with settings
    return settings;
};
but in this case it doesn't work. The way to do it is to use the fluent interface method and hook yourself into the ConfigureServices(IServiceCollection services) method, after the call to .AddMvc(), like this:
services
    .AddMvc()
    .AddJsonOptions(options =>
    {
        var settings = options.SerializerSettings;
        // do something with settings
    });

In my particular case I wanted to serialize enums as strings, not as integers. To do that, you need to use the StringEnumConverter class. For example if you wanted to serialize the Gender property of a person as a string you could have defined the entity like this:
public class Person
{
    public string Name { get; set; }
    [JsonConverter(typeof(StringEnumConverter))]
    public GenderEnum Gender { get; set; }
}

In order to do this globally, add the converter to the settings converter list:
services
    .AddMvc()
    .AddJsonOptions(options =>
    {
        options.SerializerSettings.Converters.Add(new StringEnumConverter {
            CamelCaseText = true
        });
    });

Note that in this case, I also instructed the converter to use camel case. The result of the serialization ends up as:
{"name":"James Carpenter","age":51,"gender":"male"}

I was doing this silly HackerRank algorithm challenge and I got the solution correctly, but it would always time out on test 7. I wracked my brain on all sorts of different ideas, but to no avail. I was ready to throw in the towel and check out other people's solutions, only they were all in C++ and seemed pretty similar to my own. And then I made a code change and the test passed. I had replaced LINQ's OrderBy with Array.Sort.

Intrigued, I started investigating. The idea was creating a sorted integer array from a space delimited string of values. I had used Console.ReadLine().Split(' ').Select(s=>int.Parse(s)).OrderBy(v=>v); and it consumed above 7% of the total CPU of the test. Now I was using var arr=Console.ReadLine().Split(' ').Select(s=>int.Parse(s)).ToArray(); Array.Sort(arr); and the CPU usage for that piece of the code was 1.5%. So the LINQ version was almost five times slower. How do the two implementations differ?

Array.Sort should be simple: an in place quicksort, the best general solution for this sort (heh heh heh) of problem. How about Enumerable.OrderBy? It returns an OrderedEnumerable which internally uses a Buffer<T> to get all the values in a container, then uses an EnumerableSorter to ... quicksort the values. Hmm...

Let's get back to Array.Sort. It's not as straightforward as it seems. First of all it "tries" a SZSort. If it works, fine, return that. This is an external native code implementation of QuickSort on several native value types. (More on that here) Then it goes to a SorterObjectArray that chooses, based on framework target, to use either an IntrospectiveSort or a DepthLimitedQuickSort. Even the implementation of this DepthLimitedQuickSort is much, much more complex than the quicksort used by OrderBy. IntrospectiveSort seems to be the one preferred for the future and is also heavily optimized, but less complex and easier to understand, perhaps. It uses quicksort, heapsort and insertionsort together.

Now, before you go all "OrderBy sucks!", read more about it. This StackOverflow list of answers seems to indicate that in case of strings, at least, the performance is similar. A lot of other interesting things there, as well. OrderBy uses a "stable" QuickSort, meaning that two items that are compared as equal will appear in their original order. Array.Sort does not guarantee that.

Anyway, the performance difference in my particular case seems to come from the native code implementation of the sort for integers, rather than algorithmic improvements, although I don't have the time right now to grab the various implementations and test them properly. However, just from the way the code reads, I would bet the IntrospectiveSort will compare favorably to the simple Quicksort implementation used in OrderBy.
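If you want to try it yourself, here is a minimal timing sketch of the two approaches (random data instead of the HackerRank input; absolute numbers will vary with machine, framework version and data):
using System;
using System.Diagnostics;
using System.Linq;

class SortBenchmark {
    static void Main() {
        var random = new Random(42);
        var source = Enumerable.Range(0, 1000000).Select(_ => random.Next()).ToArray();

        // LINQ: buffers the sequence, then sorts with a stable quicksort
        var sw = Stopwatch.StartNew();
        var ordered = source.OrderBy(v => v).ToArray();
        Console.WriteLine("OrderBy:    " + sw.ElapsedMilliseconds + " ms");

        // Array.Sort: in place, unstable, with native fast paths for primitive types
        var copy = (int[])source.Clone();
        sw.Restart();
        Array.Sort(copy);
        Console.WriteLine("Array.Sort: " + sw.ElapsedMilliseconds + " ms");
    }
}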

I've met an interesting case today when we needed to manipulate data from tens of thousands of people daily. Assuming we would use table rows for the information, then we get a table in which rows are constantly added, updated and deleted. The issue is with the space allocated in table pages.

SQL works like this: if it needs space, it allocates some as a "page", which can contain more records. When you delete records, the space is not reclaimed, it remains as is (this is called ghosting). The exception is when all records in a page are deleted, in which case the page is reused as an empty page. When you update a record with more data than it held before (like when you have a variable length column), the page is split, with the rest of the records on the page moved to a new page.

In a heap table (no clustered index) the space inside pages is reused for new records or for updated records that don't fit in their allocated space, however if you use a clustered index, like a primary key, the space is not reused, since there needs to be a correlation between the value of the column and its position in the page. And here lies the problem. You may end up with a lot of pages with very few records in them. A typical page is 8 kilobytes, so a table with a few integers in a record can hold hundreds of records on a single page.

Fragmentation can be within a page, as described above, also called internal, but also external, between pages, when the recycled pages are used for data that is out of order. To get a large swathe of records the disk might be worked hard in order to jump from page to page to get what is logically a continuous blob of data. It is disk input/output that kills a database.

OK, back to our case. A possible solution was to store all the data for a user in a "blob", a VARBINARY column. For reads or changes, only the disk space occupied by the blob would be changed, with C# code handling everything. It's what is called trading CPU for IO, which is generally good. However, this NoSql-like idea itself smelled bad to me. We are supposed to trust our databases, not work against them. The solution I chose is monitoring index fragmentation and occasionally issuing clustered index rebuilding or reorganizing. I am willing to bet that reading/writing the data equivalent to several pages of table is going to be more expensive than selecting the changes I want to make. Also, rebuilding the index will end up storing all the data per user in the same space anyway.
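A sketch of the kind of monitoring I mean, using the standard dynamic management function (the table name is made up; a common rule of thumb is to reorganize between roughly 5 and 30 percent fragmentation and rebuild above that):
-- check fragmentation for the indexes of a table
SELECT i.name AS IndexName,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.UserData'), NULL, NULL, 'LIMITED') ips
JOIN sys.indexes i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id;

-- lighter operation: compacts and reorders leaf pages, always online
ALTER INDEX ALL ON dbo.UserData REORGANIZE;

-- heavier operation: recreates the index and reclaims the ghosted space
ALTER INDEX ALL ON dbo.UserData REBUILD;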

However, this case made me think. Here is a situation in which the solution might have been (and it was in a similar case implemented by someone else) to micromanage the way the database works. It made me question using a clustered index/primary key on a table.

These articles helped me understand more:

I had this problem with Perforce where I accidentally Reconciled my offline work with all the files in /bin and /obj folders, resulting in a huge 6000+ file changelist. OK, simple one button mistake, surely there must be some one button undoing what I just did. It appears there is not.

In order to fix this I have to follow these steps:
  1. Change the settings of Perforce to show files even in changelists larger than 1000 items (the default value)
  2. Select by hand, in the changelist window, the files from the obj and bin folders and use Revert on them
  3. Revert the few other files that were unwanted in the changelist, like .suo and .user files - note that Revert on added files doesn't delete them, it just unadds them
  4. Create a file with paths to ignore and then use p4 set P4IGNORE=<filename> for future reconcile work

What didn't work was adding a filename or path filter when visualizing the changelist, since that is a changelist filter, not a files filter. It will show you changelists that have files that contain the pattern, but not filter the files inside the changelists themselves.

For reference, the p4ignore file I used looked like this:
p4ignore
bin/Debug
obj/Debug
*.suo
*.user
Note that I also added the p4ignore file itself, although the file was not in any Perforce repository (yet).

"But, Siderite, you should use Git (or whatever source control is the newest fad at the moment)!" Wish that could, my friend, wish that I could.


Update May 2020: I used this on a web site and the body was white. It may be that the bug in Chrome was solved in the meantime.

Update October 2019: a CSS media feature (prefers-color-scheme) can be used in conjunction with this. A recent development, it's a media query that allows a browser to activate CSS code based on the theme set in the operating system. You set your preference in Windows or MacOS or wherever and then sites that use prefers-color-scheme will take advantage of that. Something like this:

@media (prefers-color-scheme: dark) {
  html,img, video, object, [style*=url] {
    -webkit-filter:invert(100%) !important;
    filter:invert(100%) !important;
  }

  /* this was solving a bug in Chrome that seems to have been fixed
  body {
    background:black;
  }
  */
}

  I am a very light sensitive person. Shine a light in my eyes and you limit my productivity immensely. Not to mention it makes me irritable. Therefore I often have the desire to turn cheerful black on white sites to a dark theme, where the colors are reversed. I am sure other people have the same problem so I thought of building a browser extension to enable a switching button between the two.

  The first problem is that I need to interrogate all the elements in a page, including the ones that will be created later. The second is that even so I would have problems determining the dominant color of images. But there is something I can use which makes all of this unnecessary: using the invert CSS filter! Since I already use a browser extension that injects my own styles in any site - it's called Stylish and I highly recommend it - all I have to do is apply a filter on the entire site, right?

  Wrong! The problem is that when you invert an entire site, all images on the site get inverted, too. That also includes videos and Flash objects. The worst offenders here are the elements that sport a background image that is declared via CSS, since you can't create a CSS selector for them. I am going to present my partial solution and maybe you can help me find a more elegant or more complete one. Here is a general dark theme stylesheet, without the elements that have a background image declared via CSS (it does include those with a background image declared inline, though):

 html,img, video, object, [style*=url] {
    -webkit-filter:invert(100%) !important;
    filter:invert(100%) !important;
  }

 /* this was solving a bug in Chrome that seems to have been fixed
 body {
   background:black;
 }
 */

  What it does is invert the entire page (html), then reinvert video, img, and object elements, as well as those with "url" in the style attribute. In Chrome, at least, there seems to be a bug in the sense that the backgrounds of the direct child elements are not inverted, which means body, as the first child, needs to have the background set to black specifically. (this seems to have been solved by May 2020) The hack to invert elements with "url" in style is pretty ugly, too.

  What I think of a solution is this:

  • inject Javascript to enumerate all elements present and future using document.createTreeWalker and Mutation Observers, check if they have a background image and if so, add a class to them
  • inject the CSS above with an additional rule for the class for the elements with background image

  However, this doesn't completely solve the problem. One of the major issues is the inverted colors sometimes look dumb. For example a red background turns cyan, white text on light gray background turns black text on dark gray background, which makes it hard to read. I've tried various other filters, like hue-rotate or contrast, but it doesn't really help. Detecting individual color patterns doesn't really work, as the filter attribute affects an element and all of its children. The CSS above only works because the images are inverted again when the entire page has been inverted.

  The good news is that most of the time you may use the CSS above as a template, then add various rules (manually) to fix small issues with the colors of backgrounds. Even if I don't package this in an extension, you have the power to create your own themes for various sites. Never again will you be subject to the tyranny of the happy bright shiny people!

I was glad to attend the 565th SQLSaturday in Bucharest yesterday and, while all presentations were cool, I wanted to share with you some specific points that I found very revealing. Without further ado, here is the list:
  • SQL execution plans are read from right to left - such a simple thing, but I remember when I was trying to read them from left to right and I didn't get anything. In SQL Server Management Studio 2016 you also get a "live" version, which shows you an execution plan while it's executing. Really useful to see where the blocking operations are.
  • Manually control your statistics updates - execution plans are calculated based on statistics, but the default condition for updating the statistics is a number of changes equal to 20% of the rows plus 500, for any table. This default setting is completely arbitrary and may cause a lot of pain. Not only does updating the statistics block your table (which means more chances that the table will be locked when it is most used), but sometimes the statistics are not useful. One example is reports, which may receive a startdate/enddate range or a count or something similar that makes the number of rows affected vary immensely with different parameters. Use OPTION(RECOMPILE) for that (a small sketch follows this list).
  • Look for a difference between estimated and actual rows in a query plan, which leads to tempdb spills, which lead to unwanted IO operations - before a query runs, an execution plan is created or reused, based on statistics, as I was saying above. Once a plan has been chosen, though, it doesn't change during its execution. Basically what this means is that the structure of the plan remains unchanged between the estimated and actual plan. Also based on the plan, memory is requested and never changed. So if the plan asks for 10KB of memory and you need 1000KB, the remaining 990KB will be stored in and used from tempdb even if there is enough memory available, since the memory requirements don't change from estimated to actual. The reverse is not much better, since a plan may ask for a lot of memory when it only needs a little, thus leaving everything else on that machine with fewer available resources.
  • SQL default settings suck - there was an entire presentation about that, it is useful to think a little about it. So many settings are legacy things that make no sense, like the initial database size, the autogrow size, index fill factors, maxdop (degree of parallelism), parallelism threshold, used memory (ironically, using all of it may be hurtful as it takes it away from other processes which leads to using the swap file), etc.
  • Look for hard page faults - this counter is much more useful than the soft page faults counter, since soft faults are fixable. A hard page fault is indicative of unnecessary IO operations, which are orders of magnitude slower than memory use.
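A minimal sketch of that OPTION(RECOMPILE) hint on a made-up report procedure (all names are illustrative):
CREATE PROCEDURE dbo.SalesReport @startDate DATETIME, @endDate DATETIME
AS
    SELECT SalesPersonId, SUM(Amount) AS Total
    FROM dbo.Sales
    WHERE SaleDate BETWEEN @startDate AND @endDate
    GROUP BY SalesPersonId
    OPTION (RECOMPILE); -- build a fresh plan for each parameter combination
GO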

There are a lot more things that I want to explore now that I have participated in the event. You may find the files for the presentations in the same place as the full list of talks at SQLSaturday.

After a slightly misused sabbatical year, I went through a period of trying to get hired. That means interview after interview with people that were assessing my fit within their company. Man, that sucked! I mean, I am a white male professional in a business where everyone is looking for personnel and still it was frustrating, demeaning and painful. But I am not here to complain (maybe a little :) ), only to share my experience and my... constructive criticism.

The story


OK, so first off, my only other real experience in looking for work was more than ten years ago, and then I was an absolute beginner on the market. However, back then I knew I was a nobody, while now I know that my experience and passion put me way up there as far as usefulness and value go. I may have started off this campaign with an unhealthy level of smugness, but it wears off quickly, I assure you.

I am lucky that I had this year of experimenting before I started looking, which allowed me to treat it as any other experiment: I accepted almost all interviews and I went diligently through the entire process, no matter my personal opinion about the company. That helped a lot as a learning experience; while I know how to code, it quickly became apparent that I have no idea how to convince others of it. I set out to try everything and learn from it, while continuing to be principled. In my mind that meant completely honest. I didn't expect company people to be honest with me, but that was on them. I would be a perfect WYSIWYG candidate. Better to fail fast, rather than have a miserable experience in the relationship. BTW, that is also my strategy with girls, which explains why I am a virgin. I stuck to my guns, though.

The experience


I am not going to name names here. This is not about how awful some companies may have been. It is all about my perception of the hiring process. And it is that it sucks!

I don't know if in other fields it goes smoother, but imagine this: the only people that have any idea about how to hire people are the HR people, and they have no idea what software programming is like. I could be married to an HR manager and she still would not know anything about software development. The technical people may know how to code, but they have no idea how to determine if the other person is any good, and if you are a technical person you are definitely not a human resources person. I applaud people who can be both, mostly because I have not met one and I can't think of any scenario that would produce such an individual other than some radioactive alien arthropod biting a regular person.

Funny enough, compared with my past, HR mostly liked me while the technical people (the majority more than a decade younger) mostly dismissed me. At first I felt like a complete impostor (of course), my self-esteem plummeted (which didn't help any), and I was about to beg for a job (which is what most people do, I hear). However, I tried to see the situation from the point of view of the people trying to hire me (I know: shocking!) and I could understand their situation and empathize. Think about it. How would you test someone for a programming job? Who would you call in that meeting? What would be the salary that you would budget for that person (in EUR, after taxes)?

Did you really think about it? Come on, make an effort. I guarantee it's worth it.

OK, so the HR people looked at my resumé and saw that I have had a lot of stable jobs before in all kinds of environments. I was a pleasant enough person (I mean, for a techie, which means without obvious homicidal tendencies) with a very good understanding of the English language. No obvious conflicts, although I may have been too honest in my (err.. constructive!) criticism of past employment. I mean, come on, every dev can tell you that managers don't know what they're doing, right? If I think about it, the job of the HR department seems fairly simple to me: look for a candidate that fits the profile, lure them in, do the most simplistic psychological screening possible, then pass it along to the tech department. It's something that AIs will probably take over soon. I may fondly look back on these times, when there existed people that were actually biased towards me! The typical HR person is a girl. Now, if I am being insensitive here, I apologize, but if you want to seduce a tech to the point where he would do anything to come to you, you use a nice, sexy girl. It's only natural when devs are mostly male virgins. To be honest, these girls could have hated me or wanted me to have sex with them and I would have had no clue whatsoever. If they said it, I took it for granted. If they lied, again, it's on them. If they remained quiet, then I couldn't parse it into text and whose fault is that?

So then there was the tech interview. You have some guy who thinks he's God because he can code and maybe have some overview that is slightly larger than those of his juniors. He is young, probably coming from some technical university (yes, in Romania people actually do look for coding work after studying Computer Science). He has no idea how he needs to conduct an interview, but admitting to that, even to himself, is a bit too discordant with his view of his person. So he does what every tech would do in this situation: he Googles it. You might be amazed, but Google actually turns up some good advice, but you must be willing to admit that your expectations for how to do that may have been completely wrong. So he does the second thing anyone does when Googling: looks for links that validate their own beliefs (and also have some template for interviews that they can quickly print and use).

Am I being unfair here? Probably. But it is a good theory to explain the types of interviews that I had and how they all seemed carbon copies of each other. The template is basically this:
  1. an algorithmic question, such as: how do you refresh a sorted list from another complete sorted list, or how do you intersect two sorted lists, or how do you search into a sorted list or... wait a minute, are they all about bloody sorted lists?
  2. general algorithmic knowledge questions, such as: what is the difference between a list and a linked list, or an array and a list, what are the complexities of operations on lists. Pretty much there has to be a data container there.
  3. general language knowledge questions, such as: behavior of some implementation in a specific language, the results of SQL queries, characteristics of SQL indexes, some HTML stuff, the life cycle of ASP.Net if you are really lucky...
  4. tools and ways of working in tech from previous employers. Here they are actually interested, because while they appear to be judging everything you say, they actually want to hear of better ways of working themselves.
  5. questions about a project you really liked or had a lot of influence over. Yeah! And while you feel like an idiot because no one ever let you work on a project that you think was special, the interviewer learns from your experience and adapts it to their crappy project.
  6. asking you if you have questions for them and looking like they expect you to have some really sensible and relevant questions when all you want is to know if they want to have you or not

The existence of the first step is being fed by sites like HackerRank, Codility and CodingGame, which should never be considered as anything else than learning tools, if not just silly games. However, since these people went through grueling university lectures about algorithms and then inflated their own ego playing on the web sites above they assume you should know about them too. It doesn't matter that they rarely found use of any of it when working on their projects, they just push it under the vague concept of "wanting to see how you think". However, they are not logical problems (like they used to put 5 years ago, copying from Google and the like), they are very specific coding situations. You may know the exact solution - because you played around with algorithms when bored - or you might have no idea how to solve the problem.

And here you are now, facing a guy that looks critically at you, while trying to think of the problem, finding the best solution and doing it before the guy gets bored. It will take him about 60 seconds to get bored, too, as he already knows the answer to the question and he feels it's obvious and the only one possible. Hell, he knew how to solve this before he even left university! It doesn't matter that several scenarios fight for supremacy in your head, that for each there is another solution, that the very simple solution feels too simple and your brain is wracking itself to find another one - that would be probably either wrong, over engineered or both. And you want, you really want to implement a three step algorithm that you know always works (1. Google it 2. Think of something better 3. Use the best implementation found), but you believe it would be perceived as not knowing your stuff. I mean, what if you are at work and your Internet dies? Surely you need to solve the problem anyway, right?

In truth, the lucky scenario is if they send you to HackerRank or something similar to solve a technical test before you meet with anybody. That goes over fast and easy, while you hack comfortably in your underwear and you have no stress about who is thinking what. The unlucky scenario is that you get a guy who thinks you are not a true developer unless you are working on open source projects on GitHub in your spare time. Oh, and they need to be interesting to him.

Yet, after you go through the first two steps, the rest are a breeze: you know your stuff, you know your languages, you can even think of a project or two that had something remotely instructive in them. It feels like you went over a bump, but now you can go full speed. They ask you various things about your past experience, you gladly oblige, make a few jokes, get some laughter, start to feel good about yourself. Surely, you will pass the interview.

And then you get "the call", where you know you have failed from the tone of the HR girl who needs to tell you that they won't be going further with it. You still hope against hope while she goes on and on and on through her complex script of letting you go easy. She thinks she's being thoughtful, yet you are a tech and you want the answer first and the explanations after. And you despair. Obviously, you suck. You will never get a job. You are worthless, less than worthless, a complete buffoon. And here you were, thinking that years of successful work with people that appreciated your efforts meant anything. When was the last time you learned something new anyway? Last month? Three programming languages and five new frameworks launched since then, not to count the new versions of old frameworks that you never got around to master. Who were you kidding? There are ten year olds that can code better than you. And they are not married yet! You know getting a dog will ruin your career. You are a fossil, admit it! In five years you will be begging for food on the street with a sign that says "Will code for bread". And you know what? You are right: you are an idiot!

I cannot claim absolute truth here, because I don't have enough data to arrive at a clear conclusion. That is because when they flunk you, the sweet HR girl stops contacting you altogether. If you are lucky, the company didn't use their own human resources and instead you arrived through a dedicated HR firm (headhunters) who have the decency to not only tell you that you didn't pass, but also make the effort to tell you why. The people you maybe knew at that company and who were really supportive of you joining their team drop from the face of the Earth. Clearly, you were too stupid to work there, so they cut you off. At the very best you are an emotional mess and they don't want to have anything to do with that. So the next section is mostly speculation, but I will try to make it sound good.

The Explanation


There are a zillion reasons people don't want you in their team on the specific project they are working on and that have nothing to do with your value as a human being.

You might have asked for a sum that is too large for what they were prepared to offer. Even if you are that valuable, they are too cheap for it. It's like the girl who dumps you without telling you why (maybe mumbling something about you being insensitive) because you said you liked anal and she was afraid to try it. It may also be because you think you deserve more than people are actually willing to pay in general. That's on you. However, the correlation between your skills and your pay is not linear. It mostly depends on the market. After all, you are trying to sell yourself. You are already a whore, now you are just negotiating on the price. Today you may be a hero, tomorrow you will be the guy that made some money in that [enter fleeting fad] boom and lost it all in the subsequent crash.

People might put a lot of value on algorithms. It may be a good decision, because in their project they often meet situations where good algorithmic knowledge saves the day. If you are not good at it, or you couldn't prove it in the makeshift interview you flunked, they have every right to not go through with it. They might also not know any other way to test your knowledge and be too lazy to actually look for value in people. There a lot of other similar reasons that you may file under this scenario. They wanted something specific from you, didn't tell you and you don't have it.

They think you are old. And you may well be. Age discrimination aside, why should they hire someone like you when they perceive the same value from a guy 20 years younger - and way cheaper? If I had to choose, I would have no qualms whatsoever. This is also linked to expectations. Remember when you were counseled to try to move to a manager position before you got to [enter ridiculously low number here]? That translates to the expectation that after that age, being a simple techie means there is something wrong with you. This will never ever change. Look at the age average in all the big companies: it's about 30. Startups even less, around 22. That doesn't mean you need to become a manager; I am sure old managers feel just as threatened. Plus, you might really suck as a manager. I know I would. Unofficial sources say that even the places that usually hire people for experience (like government jobs) stop looking at resumes for people over 45. Age does matter, so plan for it.

And then there is the idea that if you are inexperienced you can learn the things "you really need" more quickly, like that framework that became famous while you were reading this blog post. You may be experienced, but will they need to fight with you on whether to use ASP.Net MVC over ASP.Net forms? (or is ASP.Net MVC obsolete already? I don't know, I was blogging). I don't know if that's true. I did learn faster when I was young, but that was mostly by failing miserably again and again. On the other hand, a job position where you are hired for your ability to fail your tasks sounds pretty good, doesn't it?

There is also the personal thing. You might have rubbed someone the wrong way. That doesn't mean you are an awful person, just that you didn't match with someone you would have had to work with had they gone forward and hired you. Again, do you want to be married to the girl that hates you, no matter how big her boobs are? You may be an asshole, but maybe the other person was, too. Some people might feel threatened by you, either because you threatened their life if they didn't hire you or because they think you are way sexier than them and you would cock block their attempts to woo the HR girl. You won't become a better person by trying to be liked by everyone. They might have hated your shirt, for example, the one you thought would look really professional, but they saw as threatening because they usually work in shorts and t-shirts. Point is, they had some expectations and you didn't meet them. Were they justified? You don't know and you shouldn't care.

The position you would be working in is equally important. You might be a brilliant web developer, but if they actually wanted a server guy, or vice versa, they will drop you. They will not admit that they were not specific in their job description, of course, and will instead just blame you for not being "a good fit". Imagine you are a cube crying because it didn't fit in a round hole. Ridiculous, right? I mean, would the tears be blocky? As a specific example: I went somewhere for several interviews. I was "a perfect match" for one position and not the right person for another. The JD document they sent for both positions was identical.

Solutions


Since I code and have an overview of life, I can definitely tell you how interviews should be conducted, but you will have to buy the paid version of this blog for that. See, I am learning fast: in a single phrase I was a tech, a business person and an asshole all at once.

The truth is that the only way I could think of that wasn't insulting to everyone's intelligence was to actually sit them in front of a computer with a real problem and let them fix it, with me watching and helping next to them. And even that wouldn't be sufficient. If the problem were "real" enough, it would take time to understand all of its aspects. The guy might be overly nervous with me next to him; I know people who can only work when no one is watching, and they do great work. Plus, they may have experience with that exact problem and suck at everything else.

Unfortunately, in all my interviews the only tools I had to work with were pen and paper. Putting aside the fact that I can't even read my own handwriting, the last time I actually had to write anything on paper was... oh yeah, the last time I was looking for a job. Does it make sense to conduct software development interviews with no computer? I would say no.

Conclusions


There is a schism between what they expect and what you expect, between what you think of yourself and what they think of you. That is the real reason behind every failed interview. It doesn't really matter if they had unrealistic expectations, but it matters a lot if you did. Treat it like any experiment: acquire data, reason about the data, propose a theory that explains it, test it against new data. The best way to achieve anything is to change your behavior towards the goal. The important thing here is to define the goal. Is it to get hired at any and all costs? Or is it to find the place where you will enjoy working, keep growing and be appreciated for your efforts?

In the end it so happens that not only did I get hired at the company I was aiming for, but in the position I feel best suited for, rather than some mediocre second best. Like with dating girls, it is worth waiting for the right one. And with software, you get to do side projects, too!

Update 12 August 2018: I've updated the project to .NET Core 2.1 and changed the Add method to handle OutOfMemoryException in case of memory fragmentation.

I have been looking for a job lately and so I've run into those obnoxious interviews where someone asks you to find the optimal algorithm for some task. And as much as I hate such questions in an interview, they got me thinking about all kinds of normal situations where the implementation is not optimal. I believe it would be hard to find something less optimal than the List<T> implementation in .Net.

Reference


For reference, look at the implementation code in its entirety.

Story


I will be discussing some of the glaring issues with the code, but before that, let's talk about the "business case". You have a C-like programming language, armed with all the default types for such, like bool, int, array, etc. and you need a dynamically sized container, one where you can add, insert and remove elements. The first idea is to use an array, then resize it accordingly. Only you can't know how large the array will need to be and you can't just allocate memory and then resize that allocation, as other variables have already occupied the next blocks of memory. The solution might be to allocate an initial array, then - when its size is no longer sufficient - create a larger one, copy what you need and make the changes in it.

A comment in the source tells us what the developers meant to do: "The list is initially empty and has a capacity of zero. Upon adding the first element to the list the capacity is increased to 16, and then increased in multiples of two as required." So an empty list is just a wrapper for an empty array. A list with one element occupies the size of a 16-element array, but once you add a 17th element, the internal array doubles, and it keeps doubling as needed.
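To make that policy concrete, here is a minimal sketch of the doubling mechanism - an illustrative, hypothetical GrowableList<T>, not the actual BCL source, which is considerably more involved:
using System;

// Illustrative sketch only: grow-by-doubling, the way List<T> is described above.
public class GrowableList<T> {
    private T[] _items = new T[0];
    private int _count;

    public void Add(T item) {
        if (_count == _items.Length) {
            // capacity goes 0 -> 16 -> 32 -> 64 -> ..., copying everything each time
            var newItems = new T[_items.Length == 0 ? 16 : _items.Length * 2];
            Array.Copy(_items, newItems, _count);
            _items = newItems;
        }
        _items[_count++] = item;
    }
}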

Implementation


Let's check the algorithmic complexity of the add operation. For the first 16 elements you just write into the internal array: n operations for n elements. At element 17 the complexity increases, as you first need to copy all 16 previous values: the total is now 16+16+1. The same thing happens at 33 elements (16+16+16+32+1) and again at 65 (16+16+16+32+32+64+1). The copies always add up to somewhere between n and 2n, so while we are adding element after element the work is on average two to three times that of filling a preallocated array. Meanwhile, the space occupied can be up to double what you actually need, right after a resize.
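If you want to verify that arithmetic, a quick back-of-the-envelope simulation - just counting operations under the same growth policy, without touching List<T> itself - gives the same numbers:
using System;

class CopyCounter {
    static void Main() {
        int n = 1_000_000, capacity = 0, copies = 0;
        for (int count = 1; count <= n; count++) {
            if (count > capacity) {
                copies += capacity; // a resize copies every element already stored
                capacity = capacity == 0 ? 16 : capacity * 2;
            }
        }
        // copies lands between n and 2n, so n adds cost between 2n and 3n operations in total
        Console.WriteLine($"{n} adds caused {copies} extra copies");
    }
}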

Insertion is similarly tricky, but even more inefficient. When you insert a value or a range, the first operation is EnsureCapacity, which may copy the entire array into a new one. Only afterwards does the insert algorithm run, which again copies the part of the array found after the insertion index.
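The pattern looks something like this - a simplified sketch reusing the hypothetical fields from the GrowableList<T> above, where Grow() stands for the doubling step; it is not the exact BCL code:
// Simplified sketch: a single insert may move data twice.
public void Insert(int index, T item) {
    if (_count == _items.Length)
        Grow(); // first copy: the whole array moves to a bigger one
    if (index < _count)
        // second copy: shift the tail one slot to the right
        Array.Copy(_items, index, _items, index + 1, _count - index);
    _items[index] = item;
    _count++;
}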

Removal works in the opposite direction, with the caveat that it never decreases the size of the internal array. If you added 10 million records to your list, then deleted them, your list still holds an internal array 10 million elements in size. There is a method called TrimExcess that solves this, but you must call it manually. At least RemoveAll uses an O(n) algorithm instead of calling Remove repeatedly, which would have been a disaster.
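You can see this behavior for yourself; Capacity and TrimExcess are part of the public List<T> API:
using System;
using System.Collections.Generic;

class TrimDemo {
    static void Main() {
        var list = new List<int>();
        for (int i = 0; i < 10_000_000; i++) list.Add(i);
        list.Clear();                     // the elements are gone...
        Console.WriteLine(list.Capacity); // ...but the internal array still has ~16M slots
        list.TrimExcess();                // only now is the memory actually released
        Console.WriteLine(list.Capacity); // 0
    }
}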

The piece of code that sets the internal dimension of the list is actually in the setter of the Capacity property, and it dumbly creates an array and copies the values from the current one to the new one.

A lot of the other operations are implemented by calling the static methods on the Array class: Array.IndexOf, Array.Sort, Array.BinarySearch, Array.Reverse, etc.

The last issue that List<T> has is that, as an array wrapper, it needs a contiguous memory block. There will be times when your code fails not because there is not enough free memory, but because the memory is fragmented and the runtime cannot find a free block large enough for your data.

Better solutions


Before I start spouting all kinds of stupid things, I will direct you to the venerable C5 collection library, which is very well designed, documented and tested. It contains all kinds of containers to optimize whichever scenario you might have been thinking about.

Let's think about solutions now. The major problem of this implementation is the need for a contiguous array. Why not solve it by adding more arrays instead of replacing the existing one with another twice as large? When the capacity is exceeded, we create a new array of similar size and link it to our list. And since we want index access, not linked-list access, we need to put this array into an array of arrays.

What would that mean? Memory doesn't need to be contiguous. Adding is roughly twice as fast, because existing elements are never copied to a bigger array. Inserting is also fast, as you only need to insert a new array between existing arrays and move data around within a single inner array. Accessing by index is a bit more complicated, but not by much. Removal is just as simple, with the added bonus that inner arrays that become empty can be removed directly.
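To give an idea of what index access over fragments looks like, here is a bare-bones sketch; the names are hypothetical and this is far simpler than the actual implementation linked below:
using System;
using System.Collections.Generic;

// Hypothetical sketch of the idea: data lives in fragments instead of one big array.
public class FragmentedList<T> {
    private readonly List<List<T>> _fragments = new List<List<T>>();
    private const int FragmentSize = 16;

    public T this[int index] {
        get {
            // walk the fragments until the index falls inside one of them
            foreach (var fragment in _fragments) {
                if (index < fragment.Count) return fragment[index];
                index -= fragment.Count;
            }
            throw new ArgumentOutOfRangeException(nameof(index));
        }
    }

    public void Add(T item) {
        if (_fragments.Count == 0 || _fragments[_fragments.Count - 1].Count >= FragmentSize)
            _fragments.Add(new List<T>(FragmentSize)); // new fragment, no copying of old data
        _fragments[_fragments.Count - 1].Add(item);
    }
}
Note that this naive indexer walks the fragments, so access is O(f) in the number of fragments; keeping a running array of fragment offsets would make it faster.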

This is by far not "the best" option, but just one level of optimization that tries to fix the biggest single problem with the current implementation. As one of my friends used to say: first collect data about your program, see where the bottlenecks are, then proceed to fix them in decreasing order of importance. The source can be found on GitHub.

Notes


A tester program of sorts shows the Count, Capacity and the time for random operations on a normal list and on the FragmentList<T>. The next line shows the individual lengths of the inner arrays. Note that I cheated by using Lists instead of arrays; it only makes sense, as the simplistic operations of List<T> now have a limited negative impact on individual fragments. Take note of AutoDefragmentThreshold, which is 0.8 by default: it replaces all the internal lists with a single contiguous one when more than 80% of the internal lists are smaller than a tenth of the total count. To disable the feature you need to set the value to more than 1, not to 0. I also implemented all the public methods of List<T>, although you might only need to implement IList<T> and the *Range methods.

Well, enjoy! Hope it helps.

Today I've learned something old. Yeah, it has been there since .NET 2.0 and probably at one time or another I knew about it, then I just forgot it. I am talking about the string.Split method overload that accepts a StringSplitOptions enumeration. In fact, it is an enum in name only, because it has just two possible values: None and RemoveEmptyEntries.

I had forgotten that I can eliminate empty entries like that, even if, truth be said, splitting with a regular expression and maybe filtering the results with LINQ feels much better, both in readability and in control over the result. So, what is your preferred method of splitting, say, a sentence into words?
text.Split(" \t".ToCharArray(),StringSplitOptions.RemoveEmptyEntries);
or
Regex.Split(text, @"[ \t]+").Where(s => !string.IsNullOrWhiteSpace(s));
? (I realize that in the case I am describing I don't need the LINQ at the end except in fringe cases, but it was an example.)
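For completeness, here is the comparison as a runnable snippet; the example text is mine and the regex version needs System.Text.RegularExpressions:
using System;
using System.Linq;
using System.Text.RegularExpressions;

class SplitDemo {
    static void Main() {
        var text = "the  quick\t\tbrown fox";
        // Split on spaces and tabs, dropping the empty strings between consecutive separators
        var words1 = text.Split(" \t".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
        // The regex version; the Where is redundant here, but helps in fringe cases
        var words2 = Regex.Split(text, @"[ \t]+").Where(s => !string.IsNullOrWhiteSpace(s));
        Console.WriteLine(string.Join("|", words1)); // the|quick|brown|fox
        Console.WriteLine(string.Join("|", words2)); // the|quick|brown|fox
    }
}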

No, it's not about mine, although this blog has had its ups and downs. What I want to talk about is the list of blogs I am following and how it (d)evolved.

When I was an enthusiastic beginner in software development I was hunting for interesting blogs that would give me valuable insights into the minds of good developers, the quirks of frameworks, the hidden tools and processes that would make my life better. I was adding blog after blog to my RSS list. Later on, I kind of stopped. I had things to do, work to be done, and unfortunately went through some jobs that were not conducive to learning. Perhaps seeing myself as an expert also hindered my enthusiasm for learning (note to self: don't do that!). The obsolescence of the tool I was using to read RSS and the death of Google Reader also did not help. So recently I went back to that list of blogs and started organizing it with a new tool. I use Feedly now, in case you were wondering.

Today I had an epiphany. I have over 150 blogs that I am "following", 100 of which are software related, yet only very few of them are actually spewing content anymore. In my three-year hiatus from blog reading, most of the technical blogs just... stopped. Some of them plain vanished, complete with content that I had linked to in my own articles. At the time I considered blogs as permanent as you can get. I mean, people just write stuff for the heck of it, so others can read and learn. There would be no reason for any of this to disappear - there are still pages from 1990 active on the Internet, for crying out loud! So what happened?

One theory is that blogs were created as representations of a person's evolution. For example you are a good WPF programmer and you create something like Dr. WPF's blog. When you stop doing WPF (because Microsoft dropped the ball with it!) you stop writing. Perhaps the author still blogs in other places, other blogs that are thematic, I don't know. Another theory is that people just blog at a certain stage in their life; it's like a quarter-life crisis. When they mature, people stop blogging (which says something...). Maybe the social media explosion pushed people away from personalized platforms and they do all their publishing on Twitter, Facebook, LinkedIn, Medium and so on. As the IT industry moves at an ever increasing pace, the blogs may turn into antiquated relics that are obsolete by the time several posts have been published.

I feel sad either way. When I started blogging, people would come to me for help. After all, I started the site years before StackOverflow arrived on the scene. I would write about programming, books, anime, life, personal ideas, jokes, space, science, rants, whatever. It happened several times that I was looking for a solution to a problem and found myself explaining it in an older post. People still praise some posts even though they refer to technology that is maybe a decade old; others, for getting the full picture of how I got to the end result. So for all these vanishing blogs, I feel a sense of loss for all the knowledge that was lost, for all the voices that turned silent.

I know that as one blog dies ten others appear, but there is no sense of origin anymore, no chronological timeline of the evolution of the person writing. I could even go down the "they don't make them like they used to" road. For me a blog functions as a sort of résumé of someone's work. If I liked an article, I would look at others, maybe subscribe. This way I would be connected not with a concept, but with a person: as they grew, I grew. And SEO be damned, I don't care that people don't discover my blog anymore because Google can't make up its mind on what I am actually writing about. When people do come, they see me, not just disparate, out-of-context solutions to their 5-minute problems.

So I wrote this article to express my sorrow. I guess that I miss my friends, even if they never knew me.

Sometimes you need to repeat a code block a number of times and the solution is often a for loop.
for (var i=0; i<n; i++)...
This is a noisy line to write and, more importantly, it obscures the intent of the code. Wouldn't it be better to have some kind of construct that says "repeat N times" and is intuitively easy to understand? Well, there is one:
while (n-->0) ...
No, it isn't some C# construct that you have never heard of before; it's a while loop that checks the value of n, then decrements it. But it looks great! It almost reads as "while n moves to 0". I liked it and I thought I should share.
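One caveat worth showing: the loop consumes n, so it ends up at -1 and you can't reuse it afterwards.
int n = 3;
while (n-- > 0)
    Console.WriteLine("Hello!"); // prints Hello! three times
Console.WriteLine(n); // n is now -1, so don't rely on it after the loop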