I have been working on a REST API lately and, while using Entity Framework or some other similar framework to abstract the database is certainly possible, I wanted to control every aspect of the implementation. I know, reinventing wheels, but this is how one learns. One of the most annoying bits was translating complex objects from JSON into the (two dimensional) relational database tables. This post will explore my attempts and the solutions I have found.

My first attempt was straightforward: just send all types as DataTables, with some extra property to define identity and parent entity. This relies on the Microsoft SQL Server mechanism that allows sending table variables to stored procedures. But this approach has several downsides. One of them is that in order to send a DataTable to SQL you need... DataTables. As I have pointed out in several blog posts, the DataTable object is slow, and sometimes downright buggy. Even if I didn't care about performance that much, in order for SQL to receive the content of the DataTable one must create corresponding User Defined Types on the database side. Working with UDTs is very difficult for several reasons: you cannot alter a UDT (unless employing some SQL voodoo that changes system tables); you can only drop it and recreate it, which does not work if the UDT is used anywhere, so a lot of renaming needs to be done. Even if you automate the process, it's still very annoying. Then the UDT definition has to be an exact duplicate of the DataTable definition; move some columns around and it fails. Debugging is also made difficult by the fact that the SQL profiler does not see the content of table variables when sending them to the server.
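
For reference, this is roughly what that approach looks like on the C# side. This is only a sketch; the dbo.ServerType UDT and the dbo.SaveServers procedure are hypothetical names:

using System.Data;
using System.Data.SqlClient;

class TvpExample
{
    // Assumes something like CREATE TYPE dbo.ServerType AS TABLE (Id INT, Url NVARCHAR(100))
    // and a procedure with a parameter declared as @Servers dbo.ServerType READONLY.
    static void SaveServers(string connectionString, DataTable servers)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.SaveServers", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            var param = cmd.Parameters.AddWithValue("@Servers", servers);
            param.SqlDbType = SqlDbType.Structured;
            // the DataTable layout must match the UDT exactly, which is the fragile part
            param.TypeName = "dbo.ServerType";
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}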

Long story short, I was looking for alternatives and I found XML. Now, you might think that this leads to a simple, and maybe even obvious, solution. But it's not that easy. Imagine that you send a list of objects in an XML. Each object is represented by an XML element and each property by a child element. In order to get the value of a property you need to iterate through all the nodes, for each node find the properties, for each property find the one element that defines it, then get the attribute value or content of the property, all while making sure you select everything into a table.

The solution I found, which simplifies the SQL code (and hopefully brings some well needed performance to the table), is to serialize the objects in a way that makes the selection simple enough. Here is an example: I have a Configuration object with an Id and a Name, which also has a property called Servers, containing Server objects, each having an Id and a Url. This is how the DataContractSerializer would serialize it:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration xmlns="http://schemas.datacontract.org/2004/07/SerializationTest" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Id>1</Id>
  <Name>Test Config</Name>
  <Servers>
    <Server>
      <Id>1</Id>
      <Url>http://some.url</Url>
    </Server>
    <Server>
      <Id>2</Id>
      <Url>http://some.other.url</Url>
    </Server>
  </Servers>
</Configuration>

The SQL code to get the information from an XML variable with this content would look like this:

DECLARE @Xml XML='<?xml version="1.0" encoding="UTF-8"?>
<Configuration xmlns="http://schemas.datacontract.org/2004/07/SerializationTest" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Id>1</Id>
  <Name>Test Config</Name>
  <Servers>
    <Server>
      <Id>1</Id>
      <Url>http://some.url</Url>
    </Server>
    <Server>
      <Id>2</Id>
      <Url>http://some.other.url</Url>
    </Server>
  </Servers>
</Configuration>'


;WITH XMLNAMESPACES(DEFAULT 'http://schemas.datacontract.org/2004/07/SerializationTest')
SELECT T.Item.value('(Id/text())[1]','INT') as Id,
T.Item.value('(Name/text())[1]','NVARCHAR(100)') as Name
FROM @Xml.nodes('//Configuration') as T(Item)

;WITH XMLNAMESPACES(DEFAULT 'http://schemas.datacontract.org/2004/07/SerializationTest')
SELECT T.Item.value('(Id/text())[1]','INT') as Id,
T.Item.value('(Url/text())[1]','NVARCHAR(100)') as Url
FROM @Xml.nodes('//Configuration/Servers/Server') as T(Item)
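
For completeness, the C# side then sends the whole serialized object as a single XML parameter. A minimal sketch, with dbo.SaveConfiguration standing in for a hypothetical stored procedure:

using System.Data;
using System.Data.SqlClient;

// xml is the string produced by the serializer (DataContractSerializer here)
static void SaveConfiguration(string connectionString, string xml)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.SaveConfiguration", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@Xml", SqlDbType.Xml).Value = xml;
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}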


This works, but look at that code. In my case the situation was even worse: the object I was using was a wrapper which implemented IDictionary<string,object> and, even though it implemented ISerializable, both XmlSerializer and DataContractSerializer used the dictionary as their data, so in the end I got ugly key and value elements that are even harder to get to and, I suppose, more inefficient to parse. I therefore found the solution in IXmlSerializable, (yet) another serialization interface, used exclusively by the XML serializer classes. If every simple value were saved as an attribute and every complex object as an element, this could be the SQL code:

DECLARE @Xml XML='<?xml version="1.0" encoding="UTF-8"?>
<Configuration xmlns="http://schemas.datacontract.org/2004/07/SerializationTest" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Config Id="1" Name="Test Config">
    <Servers>
      <List>
        <Server Id="1" Url="http://some.url" />
        <Server Id="2" Url="http://some.other.url" />
      </List>
    </Servers>
  </Config>
</Configuration>'


;WITH XMLNAMESPACES(DEFAULT 'http://schemas.datacontract.org/2004/07/SerializationTest')
SELECT T.Item.value('@Id','INT') as Id,
T.Item.value('@Name','NVARCHAR(100)') as Name
FROM @Xml.nodes('//Configuration/Config') as T(Item)

;WITH XMLNAMESPACES(DEFAULT 'http://schemas.datacontract.org/2004/07/SerializationTest')
SELECT T.Item.value('@Id','INT') as Id,
T.Item.value('@Url','NVARCHAR(100)') as Url
FROM @Xml.nodes('//Configuration/Config/Servers/List/Server') as T(Item)


Much easier to read and hopefully to parse.

I am not going to write here about the actual implementation of IXmlSerializable. There are plenty of tutorials on the Internet about that. It's not pretty, :) but not too difficult, either.
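
Still, to give a taste of the idea, here is a minimal sketch for the Server class above, writing the simple values as attributes (a real implementation would need more care with schemas and error handling):

using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Server : IXmlSerializable
{
    public int Id { get; set; }
    public string Url { get; set; }

    public XmlSchema GetSchema() { return null; }

    public void ReadXml(XmlReader reader)
    {
        Id = int.Parse(reader.GetAttribute("Id"));
        Url = reader.GetAttribute("Url");
        reader.Skip();
    }

    public void WriteXml(XmlWriter writer)
    {
        // the serializer has already written the element start tag;
        // simple values go into attributes, as in the XML above
        writer.WriteAttributeString("Id", Id.ToString());
        writer.WriteAttributeString("Url", Url);
    }
}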

What was the purpose of this exercise? Now I can send a complex object to SQL in a single query, making inserts and updates simple, without requiring a call for each instance of each type of complex object. Now, is it fast? I have no idea. Certainly, if performance is critical, the UDT/DataTable approach may be faster, but then you have to define a type for each type that you send as a DataTable to a stored procedure. An alternative could be a binary serializer and a CLR SQL function that translates it into tables. However, in my project I need to easily implement very custom API methods and to control every aspect, including tracing and profiling the various SQL calls. I believe the customized IXmlSerializable/XML in SQL approach is a reasonable one.

As always, I hope this helps someone.

I was debugging an application when I noticed an exception thrown in one of the legacy DAL modules. A method was using regular ADO.Net to run a stored procedure and fill a DataSet, then proceeded to get an IDataReader for the set, using the CreateDataReader method. On reader.Read(), the exception DataTableReader is invalid for current DataTable 'Table' was thrown.



I investigated the issue only to notice that the DataSet was correctly retrieved from the database; the only problem came when creating the reader, where reading any meaningful property threw this error. I found a thread that discussed this problem here, but the answer was not there. Instead, I read an obscure line somewhere saying that this exception is thrown when the DataSet has changes, and tried that solution: before running CreateDataReader, I ran DataSet.AcceptChanges. And it worked!
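
In code, the workaround amounts to this (GetDataSetFromStoredProcedure is a hypothetical stand-in for the legacy DAL call):

DataSet ds = GetDataSetFromStoredProcedure();
ds.AcceptChanges(); // clear pending row states before creating the reader
using (DataTableReader reader = ds.CreateDataReader())
{
    while (reader.Read())
    {
        // without AcceptChanges, Read() threw
        // "DataTableReader is invalid for current DataTable 'Table'"
    }
}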



The strange part comes now: I did the AcceptChanges bit in the Watch window during debug, not as a permanent change to the code. From then on, the code worked, no matter how many times I ran iisreset or restarted the browser. I've added the solution in the code for good measure, but I am still not sure if this is the solution or simply some fluke of the universe. One possible answer is the "race condition" described in this discussion, which also suggests this happens in debug mode only. Strange, innit?



Update: AcceptChanges did not solve the issue on the production server. I am still investigating, but if you know what this is about, please share :)

I have been working on a silly little project that involved importing files and then querying the data in a very specific way. And I wanted to do it with the latest technologies so I used The Entity Framework! (imagine a little glowing halo around that name and a choir in the background).

Well, how do I do an efficient import in Linq to Entities? I can't! At most I can instantiate a lot of classes, add them to the DataModel, then SaveChanges. In the background this translates to a lot of insert statements. So it occurred to me that I don't really need Entities here. All I needed was good old fashioned ADO.Net and a SqlBulkCopy object. So I used it like that. A bit of unfortunate translation of objects into a DataTable (because the SqlBulkCopy class knows how to import only a DataTable) and I was set.
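
The gist of the SqlBulkCopy part, as a minimal sketch (dbo.ImportedItems is a hypothetical destination table):

using System.Data;
using System.Data.SqlClient;

static void BulkImport(string connectionString, DataTable itemsAsDataTable)
{
    using (var bulkCopy = new SqlBulkCopy(connectionString))
    {
        bulkCopy.DestinationTableName = "dbo.ImportedItems";
        // one bulk insert instead of an INSERT per entity;
        // columns map by ordinal unless ColumnMappings are set
        bulkCopy.WriteToServer(itemsAsDataTable);
    }
}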

Ok, now back to querying the data. I could have used ADO.Net, of course, and in this project I would probably have been right, but I suspected the requirements would change, so I used Entities. It worked like a charm and yes, the requirements did get bigger and stranger as I went. But then I had to select the users that had amassed a number of rows in two related tables (or use a value in the user table), but only if the total number of amassed rows satisfied a formula based on a string column in the user table that mapped to a certain value stored in the web.config. A complicated query, in short. I did it (with some difficulty) in Linq, then I had to solve all kinds of weird issues, like not being able to compare a class variable with an enum value, because the number of types that can be used in a Linq to Entities query is pretty limited at the moment.

Well, was it the best way? I don't know. The generated SQL contains a lot of select from select from select, sometimes 6 or 7 levels deep. Even the joining is done with selects from tables. Shouldn't I have used a stored procedure instead?

To top it off, I listened to a podcast today about Object Databases. They do away with ORMs because there is nothing relational to begin with. The guy argued that if you need to persist objects, wouldn't an Object Database be more appropriate? And, as reporting would be a bitch when having to query large amounts of tabular data, shouldn't one use a Relational Database for that particular task in the project?

So this is it. It got me thinking. Is the database/data access layer the biggest golden hammer out there? Are we trying to use a single data access model and then build our applications around it like big twisting spaghetti golden nails?

I wanted to use this Accordion control on a page, so I specified the ContentTemplate and HeaderTemplate, gave it a DataTable as a DataSource and DataBound it. Nothing! No errors, no warnings, no display of any kind. After a few frustrating minutes of trying to solve it, I asked buddy Google about it and it turned out that the Accordion control MUST receive a DataView as a DataSource and nothing else. Using dataTable.DefaultView solved it.
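
In code, the fix is as simple as this (accordion and dataTable being the control and the data in question):

// binding the DataTable directly fails silently; its DefaultView works
accordion.DataSource = dataTable.DefaultView;
accordion.DataBind();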

Update 19 February 2016:
I've done the test again, using another computer and .Net 4.6.1. Filling the DataTableReplacement class given at the end of the article, then copying the data into a DataTable object, is 30% faster than using a DataTable directly with BeginLoadData/EndLoadData and 50% faster than using a DataTable without the LoadData methods.

Now for the original post:

It was about time I wrote a smashing IT entry. Here's to the obnoxious DataTable object, about which I have written before regarding bugs and difficulty of handling. Until now I hadn't really thought about what kind of performance issues I might face when using it. I mean, yeah, everybody says it is slow, but how slow can it be? Twice as slow? Computers are getting faster and faster; maybe I don't need to research this personally. I tried to make a DataTable replacement object once and it was not really compatible with anything that needed DataTables, so I gave up. But in this article I will show you how a simple piece of code became 7 times faster once I took some DataTable issues into account.

But let's get to the smashing part :) I was using C# to transform the values in a column from a DataTable into columns. Something like this:

Name     Column    Value
George   Money     100
George   Age       31
George   Children  1
Jack     Money     150
Jack     Age       26
Jack     Children  0
Jane     Money     300
Jane     Age       33
Jane     Children  2



and it must look like this:

Name     Money   Age   Children
George   100     31    1
Jack     150     26    0
Jane     300     33    2



I have no idea how to do this in SQL; if you have any advice, please leave a comment.
Update: Here are some links about how to do it in SQL and SSIS:
Give the New PIVOT and UNPIVOT Commands in SQL Server 2005 a Whirl
Using PIVOT and UNPIVOT
Transposing rows and columns in SQL Server Integration Services

Using PIVOT, the SQL query would look like this:

SELECT * 
FROM #input
PIVOT (
MAX([Value])
FOR [Column]
IN ([Money],[Age],[Children])
) as pivotTable


Anyway, the solution I had was to create the necessary table in the code behind, add a row for each Name and a column for each distinct value of Column, then cycle through the rows of the original table and just place the values in the new table. All the values are present and already ordered, so I only need to use row and column indexes that are easily computed.

The whole operation lasted 36 seconds. There were many rows and columns, you see. Anyway, I profiled the code using the great JetBrains dotTrace program and noticed that 30 of the 36 seconds were used by DataRow.set_Item(int, object)! I remembered then that the DataTable object has the BeginLoadData and EndLoadData methods that disable/enable the checks and constraints in the table. I did that and the operation went from 36 to 27 seconds.

Quite an improvement, but the bottleneck was still in the set_Item setter. So, I thought, what would happen if I didn't use a DataTable at all? After all, the end result was being bound to a GridView and it, luckily, knows about object collections. But I was too lazy for that, as there was quite a complicated binding code mess waiting for refactoring. So I just used a List of object arrays instead of the DataTable, then used DataTable.Rows.Add(object[]) to move the data from this intermediary list into the DataTable that I originally wanted to obtain. The time spent on the operation went from... no, wait

The time spent on the operation went from the 27 seconds I had obtained to 5! 5 seconds! Instead of 225,351 calls to DataRow.set_Item, I had 1,533 calls to DataRowCollection.Add: from 21 seconds to 175 milliseconds!
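
To make the pattern concrete, here is a minimal sketch of the approach. This is not the original code; it assumes the Name/Column/Value layout above, with rows already ordered by Name and values stored as strings:

using System.Collections.Generic;
using System.Data;

static DataTable Pivot(DataTable source)
{
    var result = new DataTable();
    result.Columns.Add("Name", typeof(string));

    // first pass: one result column per distinct value of "Column"
    var columnIndex = new Dictionary<string, int>();
    foreach (DataRow row in source.Rows)
    {
        var col = (string)row["Column"];
        if (!columnIndex.ContainsKey(col))
        {
            columnIndex[col] = result.Columns.Count;
            result.Columns.Add(col, typeof(string));
        }
    }

    // second pass: accumulate each output row as an object[]
    var rows = new List<object[]>();
    object[] current = null;
    string currentName = null;
    foreach (DataRow row in source.Rows)
    {
        var name = (string)row["Name"];
        if (name != currentName)
        {
            currentName = name;
            current = new object[result.Columns.Count];
            current[0] = name;
            rows.Add(current);
        }
        current[columnIndex[(string)row["Column"]]] = row["Value"];
    }

    // one Rows.Add(object[]) per row instead of one set_Item call per cell
    result.BeginLoadData();
    foreach (var values in rows)
    {
        result.Rows.Add(values);
    }
    result.EndLoadData();
    return result;
}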

Researching the reflected source of System.Data.dll I noticed that the DataRow indexer with an integer index was going through

DataColumn column = _columns[index];
return this[column];

How bad can it get?! I mean, really! There are sites that recommend you find the integer index of each table column and then use it as an integer variable. Apparently this is NOT the best practice. Best is to use the DataColumn directly!
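
In practice that means resolving the DataColumn once and indexing rows by it; a small sketch:

// table is an existing DataTable with a "Name" column
DataColumn nameColumn = table.Columns["Name"];
foreach (DataRow row in table.Rows)
{
    // indexing by the DataColumn itself skips the per-access lookup
    // that both row[0] and row["Name"] perform internally
    var name = (string)row[nameColumn];
}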

So avoid the DataRow setter.

Update July 18, 2013:

Someone requested code, so here is a console application with some inline classes to replace the DataTable in GridView situations:

class Program
{
    static void Main(string[] args)
    {
        fillDataTable(false);
        fillDataTable(true);
        fillDataTableWriter();
        Console.ReadKey();
    }

    private static void fillDataTable(bool loadData)
    {
        var dt = new DataTable();
        dt.Columns.Add("cInt", typeof(int));
        dt.Columns.Add("cString", typeof(string));
        dt.Columns.Add("cBool", typeof(bool));
        dt.Columns.Add("cDateTime", typeof(DateTime));
        if (loadData) dt.BeginLoadData();
        for (var i = 0; i < 100000; i++)
        {
            dt.Rows.Add(dt.NewRow());
        }
        var now = DateTime.Now;
        for (var i = 0; i < 100000; i++)
        {
            dt.Rows[i]["cInt"] = 1;
            dt.Rows[i]["cString"] = "Some string";
            dt.Rows[i]["cBool"] = true;
            dt.Rows[i]["cDateTime"] = now;
        }
        if (loadData) dt.EndLoadData();
        Console.WriteLine("Filling DataTable"+(loadData?" with BeginLoadData/EndLoadData":"")+": "+(DateTime.Now - now).TotalMilliseconds);
    }

    private static void fillDataTableWriter()
    {
        var dt = new DataTableReplacement();
        dt.Columns.Add("cInt", typeof(int));
        dt.Columns.Add("cString", typeof(string));
        dt.Columns.Add("cBool", typeof(bool));
        dt.Columns.Add("cDateTime", typeof(DateTime));
        for (var i = 0; i < 100000; i++)
        {
            dt.Rows.Add(dt.NewRow());
        }
        var now = DateTime.Now;
        for (var i = 0; i < 100000; i++)
        {
            dt.Rows[i]["cInt"] = 1;
            dt.Rows[i]["cString"] = "Some string";
            dt.Rows[i]["cBool"] = true;
            dt.Rows[i]["cDateTime"] = now;
        }
        var fillingTime = (DateTime.Now - now).TotalMilliseconds;
        Console.WriteLine("Filling DataTableReplacement: "+fillingTime);
        now = DateTime.Now;
        var newDataTable = dt.ToDataTable();
        var translatingTime = (DateTime.Now - now).TotalMilliseconds;
        Console.WriteLine("Transforming DataTableReplacement to DataTable: " + translatingTime);
        Console.WriteLine("Total filling and transforming: " + (fillingTime+translatingTime));
    }
}

public class DataTableReplacement : IEnumerable<IEnumerable<object>>
{
    public DataTableReplacement()
    {
        _columns = new DtrColumnCollection();
        _rows = new DtrRowCollection();
    }

    private readonly DtrColumnCollection _columns;
    private readonly DtrRowCollection _rows;

    public DtrColumnCollection Columns
    {
        get { return _columns; }
    }

    public DtrRowCollection Rows { get { return _rows; } }

    public DtrRow NewRow()
    {
        return new DtrRow(this);
    }

    public DataTable ToDataTable()
    {
        var dt = new DataTable();
        dt.BeginLoadData();
        _columns.CreateColumns(dt);
        _rows.CreateRows(dt);
        dt.EndLoadData();
        return dt;
    }

    #region Implementation of IEnumerable

    public IEnumerator<IEnumerable<object>> GetEnumerator()
    {
        foreach (var row in _rows)
        {
            yield return row.ToArray();
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    #endregion
}

public class DtrRowCollection : IEnumerable<DtrRow>
{
    private readonly List<DtrRow> _rows;

    public DtrRowCollection()
    {
        _rows = new List<DtrRow>();
    }

    public void Add(DtrRow newRow)
    {
        _rows.Add(newRow);
    }

    public DtrRow this[int i]
    {
        get { return _rows[i]; }
    }

    public void CreateRows(DataTable dt)
    {
        foreach (var dtrRow in _rows)
        {
            dt.Rows.Add(dtrRow.ToArray());
        }
    }

    #region Implementation of IEnumerable

    public IEnumerator<DtrRow> GetEnumerator()
    {
        return _rows.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    #endregion
}

public class DtrRow
{
    private readonly object[] _arr;
    private readonly DataTableReplacement _dtr;

    public DtrRow(DataTableReplacement dtr)
    {
        _dtr = dtr;
        var columnCount = _dtr.Columns.Count;
        _arr = new object[columnCount];
    }

    public object this[string columnName]
    {
        get
        {
            var index = _dtr.Columns.GetIndex(columnName);
            return _arr[index];
        }
        set
        {
            var index = _dtr.Columns.GetIndex(columnName);
            _arr[index] = value;
        }
    }

    public object this[int columnIndex]
    {
        get
        {
            return _arr[columnIndex];
        }
        set
        {
            _arr[columnIndex] = value;
        }
    }

    public object[] ToArray()
    {
        return _arr;
    }
}

public class DtrColumnCollection
{
    private readonly Dictionary<string, int> _columnIndexes;
    private readonly Dictionary<string, Type> _columnTypes;

    public DtrColumnCollection()
    {
        _columnIndexes = new Dictionary<string, int>();
        _columnTypes = new Dictionary<string, Type>();
    }

    public int Count { get { return _columnIndexes.Count; } }

    public void Add(string columnName, Type columnType)
    {
        var index = _columnIndexes.Count;
        _columnIndexes.Add(columnName, index);
        _columnTypes.Add(columnName, columnType);
    }

    public int GetIndex(string columnName)
    {
        return _columnIndexes[columnName];
    }

    public void CreateColumns(DataTable dt)
    {
        foreach (var pair in _columnTypes)
        {
            dt.Columns.Add(pair.Key, pair.Value);
        }
    }
}


As you can see, there is a DataTableReplacement class which uses three other classes instead of DataColumnCollection, DataRowCollection and DataRow. For this example alone, the DtrRowCollection could have easily been replaced with a List<DtrRow>, but I wanted to allow people to replace DataTable wherever they had used it, without any change to the calling code.

In the example above, on my computer, it takes 1300 milliseconds to populate the DataTable the old fashioned way, 1000 to populate it with BeginLoadData/EndLoadData, and 110 to populate the DataTableReplacement. It takes another 920 milliseconds to create a new DataTable with the same data (just in case you really need a DataTable), which brings the total to 1030 milliseconds. So this is the overhead the DataTable brings for simple scenarios such as these.

You are trying to use a WebMethod or a web service ScriptMethod in Javascript and you get an InvalidOperationException saying something about a circular reference. It happened to me when trying to read a DataTable in Javascript.

Why. The Javascript serialization of DataSets, DataTables and DataRows was once available in the ASP.Net Ajax web extensions. That's why you probably found a lot of Google results with people that could only serialize DataSets but not DataTables, or people that made it work as if by magic by adding some lines in the converters section of web.config, things that can't possibly work with your setup. Then the option was removed in the final version of ASP.Net Ajax, only to be re-added in the ASP.Net Futures, which is a test playground for future features of the platform.

What. There are several options, one being to reference an older version of ASP.Net Ajax and use the converters there. But why bother? It's unlikely you use the DataTable or some other object in Javascript with all the options of the C# object. You probably just want to iterate through rows and read properties. So build your own converter.

How. Create a new library project. Add a class named DataTableConverter that inherits from JavaScriptConverter, and implement: IEnumerable<Type> SupportedTypes, IDictionary<string, object> Serialize(object obj, JavaScriptSerializer serializer) and object Deserialize(IDictionary<string, object> dictionary, Type type, JavaScriptSerializer serializer).

You probably won't need to Deserialize anything, so you can leave that unimplemented. The list of convertible types is easy enough, so all you are left with is the Serialize code, which is actually very easy, too. Then all you need to do is add this to the web.config file:
<system.web.extensions>
  <scripting>
    <webServices>
      <jsonSerialization>
        <converters>
          <add name="DataTableAjaxFix" type="AjaxTypeConverters.DataTableConverter"/>
        </converters>
      </jsonSerialization>
    </webServices>
  </scripting>
</system.web.extensions>

And here is the complete C# code of my DataTableConverter, but you can easily adapt it to anything:
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Web.Script.Serialization;

namespace AjaxTypeConverters
{
    public class DataTableConverter : JavaScriptConverter
    {
        public override IEnumerable<Type> SupportedTypes
        {
            get { return new Type[] { typeof(DataTable) }; }
        }

        public override object Deserialize(IDictionary<string, object> dictionary, Type type, JavaScriptSerializer serializer)
        {
            throw new NotImplementedException();
        }

        public override IDictionary<string, object> Serialize(object obj, JavaScriptSerializer serializer)
        {
            DataTable listType = obj as DataTable;

            if (listType != null)
            {
                // Create the representation.
                Dictionary<string, object> result = new Dictionary<string, object>();
                ArrayList itemsList = new ArrayList();
                foreach (DataRow row in listType.Rows)
                {
                    // Add each entry to the dictionary.
                    Dictionary<string, object> listDict = new Dictionary<string, object>();
                    foreach (DataColumn dc in listType.Columns)
                    {
                        listDict.Add(dc.ColumnName, row[dc]);
                    }
                    itemsList.Add(listDict);
                }
                result["Rows"] = itemsList;

                return result;
            }
            return new Dictionary<string, object>();
        }
    }
}

You may have noticed that in debug mode, in Visual Studio, you have a little magnifying glass next to some of the variables in the Autos, Locals or Watch debug windows. Once you click on it, you get to visualize your data in a more comprehensive way. Good examples are the DataSet Visualizer and the DataTable Visualizer, which show you a DataSet or DataTable in a normal DataGridView.

The good news is that you can build your own visualizers, and in a very simple way. Here are the quick steps to achieve this, followed by some links to other people detailing the process:

  1. Create a new Visual Studio class library project
  2. Add a reference to the Microsoft.VisualStudio.DebuggerVisualizers library, which you can find directly in the .NET tab
  3. Go to Add New Item and choose Debugger Visualizer. That will create a small class for you, with ToDos and stuff like that. What is important is that you don't really need to declare the Type of your data object in the class, as suggested by the template.
  4. Remove everything from the class except the override of the Show method
  5. Change the Show method to use your own data type (a minimal sketch follows below)
  6. Add a reference to System.Windows.Forms
  7. Add a Windows Form to your library and make it display your data the way you like it
  8. Add the following lines to decorate the namespace of your visualizer class:

[assembly : DebuggerVisualizer(typeof (--YourVisualizer--),
Target = typeof (--Your Type--),
Description = "--Your Type-- Visualizer")]
namespace ...


Warning: use a description different from those of the default or already installed visualizers. Use something like "Siderite's DataTable Visualizer", not "DataTable Visualizer", since there is already an out-of-the-box visualizer with that name and you won't get to see yours.
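
For steps 4 and 5, a minimal sketch of what the override might look like, assuming the target type is a DataTable:

using System.Data;
using System.Windows.Forms;
using Microsoft.VisualStudio.DebuggerVisualizers;

public class MyDataTableVisualizer : DialogDebuggerVisualizer
{
    protected override void Show(IDialogVisualizerService windowService, IVisualizerObjectProvider objectProvider)
    {
        // GetObject deserializes whatever the debuggee side sent over
        DataTable table = (DataTable)objectProvider.GetObject();
        Form form = new Form();
        form.Text = "Siderite's DataTable Visualizer";
        DataGridView grid = new DataGridView();
        grid.Dock = DockStyle.Fill;
        grid.ReadOnly = true;
        grid.DataSource = table;
        form.Controls.Add(grid);
        windowService.ShowDialog(form);
    }
}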

Now compile. The resulting DLL can be copied either to My Documents\Visual Studio 2005\Visualizers (in which case any contained visualizers will be available only to that user) or to C:\Program Files\Microsoft Visual Studio 8\Common7\Packages\Debugger\Visualizers to make them available to all users.

That is it! Now links to others explaining in more detail:
Writing a Visualizer at MSDN
Post on Debugger Visualizers from 4GuysFromRolla
Julia Lerman on Debug Visualizers

Update:
A problem with this solution is that the debugger visualizer expects your target type to be ISerializable. But what if it is not? The solution is to add another parameter to the DebuggerVisualizerAttribute like this:
[assembly : DebuggerVisualizer(typeof (--YourVisualizer--),
typeof (--YourVisualizerObjectSource--),
Target = typeof (--Your Type--),
Description = "--Your Type-- Visualizer")]
namespace ...


You see, the debugging is done through communication between a debuggee and a debugger. The VisualizerObjectSource is on the debuggee side, and the default one tries to serialize the target object and send it to the debugger. What you have to do is create your own class, inheriting from VisualizerObjectSource and overriding public void GetData(object target, System.IO.Stream outgoingData). This method has access to the actual object, so you can transform it into any other object, one that can be serialized.

A simple example is a DataView or a DataRow: you take the DataRow, add it to a DataTable and return the DataTable, which is serializable.
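
A minimal sketch of that idea (the class name is mine):

using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using Microsoft.VisualStudio.DebuggerVisualizers;

public class DataRowObjectSource : VisualizerObjectSource
{
    public override void GetData(object target, Stream outgoingData)
    {
        // DataRow itself is not serializable, so wrap it in a DataTable
        DataRow row = (DataRow)target;
        DataTable table = row.Table.Clone(); // copies only the structure
        table.ImportRow(row);
        new BinaryFormatter().Serialize(outgoingData, table);
    }
}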

Another issue you might stumble upon is a weird Access Denied error for the DLL containing the visualizers, especially after adding a VisualizerObjectSource to the library. The solution is to set the trust level to Full for the site you are debugging. I am looking for a more elegant solution, but so far what you have to do is add this to the web.config of the site you are debugging:

<system.web>
<trust level="Full" originUrl="" />
</system.web>

More links, specific to this update:
RemoteObjectSourceException: Graphics is not marked as serializable
Visualizers For Web Debugging

Update (21st of March 2008): Very important, you need to add the CreateChildControls override in order for this to work. Otherwise, because of something I can only consider a GridView bug, this will happen: the last page in a mock paging grid will have, let's say, 2 items when the PageSize is 10; on a postback, the number of rows created by the GridView will be 10! even if only 2 have data. Thus, after a postback that doesn't rebind the data in the grid, GridView.Rows.Count will be PageSize, not the actual bound number.

Update: Recreated the code completely. Now it has both PageIndex and ItemCount.
Also: there is actually a way to get only the rows that you need from SQL Server 2005: a function called Row_Number that returns the index number of a row based on a certain ordering. Then you can easily filter by it, to take items from 20 to 30, for example. In this case, another interesting property of the PagedDataSource is CurrentPageIndex, which sets the displayed page number in the pager.

Now, for the actual blog entry.

Why would anyone want to change the PageCount, you ask? Well, assume you have a big, big table, like hundreds of thousands of rows, and you want to page it. First you must load it from the SQL server into a DataTable, which takes time, then the table is set as a DataSource of the GridView, which then implements the paging.

Wouldn't it be nicer to get only the data that you need from the SQL Server, then change the PageCount to show the page count the full data would have had? However, the PageCount property of the GridView is read-only. One quick solution is to get only the data you need, then fill the resulting DataTable with empty rows up to the real row count. However, adding empty rows to DataTables is excruciatingly slow, so you don't really gain anything, and the Grid works with a big table anyway.

So this is what you do:
First of all, determine how much of the data to gather.

Afterwards, you need to trick the GridView into creating a pager that shows the real row count (and possibly page index). Unfortunately, you can't do this from outside the GridView; you need to inherit from the GridView control and add your stuff inside. Having done that, override the InitializePager method, which is just about the only protected virtual member related to paging that you can find in the GridView.

Code:

using System.Web.UI.WebControls;

namespace Siderite.Web.WebControls
{
    public class MockPagerGrid : GridView
    {
        private int? _mockItemCount;
        private int? _mockPageIndex;

        /// <summary>
        /// Set it to fool the pager item count
        /// </summary>
        public int MockItemCount
        {
            get
            {
                if (_mockItemCount == null)
                {
                    if (ViewState["MockItemCount"] == null)
                        MockItemCount = Rows.Count;
                    else
                        MockItemCount = (int)ViewState["MockItemCount"];
                }
                return _mockItemCount.Value;
            }
            set
            {
                _mockItemCount = value;
                ViewState["MockItemCount"] = value;
            }
        }

        /// <summary>
        /// Set it to fool the pager page index
        /// </summary>
        public int MockPageIndex
        {
            get
            {
                if (_mockPageIndex == null)
                {
                    if (ViewState["MockPageIndex"] == null)
                        MockPageIndex = PageIndex;
                    else
                        MockPageIndex = (int)ViewState["MockPageIndex"];
                }
                return _mockPageIndex.Value;
            }
            set
            {
                _mockPageIndex = value;
                ViewState["MockPageIndex"] = value;
            }
        }

        /// <summary>
        /// Initializes the pager row displayed when the paging feature is enabled.
        /// </summary>
        /// <param name="row">A <see cref="T:System.Web.UI.WebControls.GridViewRow"></see> that represents the pager row to initialize.</param>
        /// <param name="columnSpan">The number of columns the pager row should span.</param>
        /// <param name="pagedDataSource">A <see cref="T:System.Web.UI.WebControls.PagedDataSource"></see> that represents the data source.</param>
        protected override void InitializePager(GridViewRow row, int columnSpan, PagedDataSource pagedDataSource)
        {
            if (pagedDataSource.IsPagingEnabled && (MockItemCount != pagedDataSource.VirtualCount))
            {
                pagedDataSource.AllowCustomPaging = true;
                pagedDataSource.VirtualCount = MockItemCount;
                pagedDataSource.CurrentPageIndex = MockPageIndex;
            }
            base.InitializePager(row, columnSpan, pagedDataSource);
        }

        protected override int CreateChildControls(System.Collections.IEnumerable dataSource, bool dataBinding)
        {
            PageIndex = MockPageIndex;
            return base.CreateChildControls(dataSource, dataBinding);
        }
    }
}


What, what, whaaat? What is a PagedDataSource? Inside the GridView, paging is done with a PagedDataSource, a wrapper around a normal DataSource which has some of the GridView paging properties, like PageSize, PageCount, etc. Even if PageCount is a read-only property here as well, you have the AllowCustomPaging property, and then the VirtualCount and CurrentPageIndex properties that you can set.

In other words: the pager is initialized at databinding. Set MockItemCount and MockPageIndex before MockPagerGrid.DataBind();

That's it.

Update: People keep asking me to provide a code sample. Let's try together. First, let's see a classic GridView use example:

gridView.DataSource=getDataSource();
gridView.PageIndex=getPageIndex();
gridView.DataBind();
As you can see, we provide a data source programmatically, then set the page index (let's assume we took it from the URL string) and then call DataBind(). In this situation we would load the entire data source (say, 10000 rows) and give it to the grid, which would only render something like 20 rows. Very inefficient.

Now, let's replace the original GridView control with the MockPagerGrid. The code would look like this:

mockPagerGrid.DataSource=getDataSource(2);
mockPagerGrid.MockPageIndex=getPageIndex();
mockPagerGrid.MockItemCount=10000;
mockPagerGrid.DataBind();
This gets the rows for the second page, sets the mock ItemCount and PageIndex to the total number of rows and the page we want, and then calls DataBind(). In this situation getDataSource would load only the 20 rows of page 2 and display them, and the pager would show that it is on page 2 out of 500.

This is a very simple example. It assumes you already know the total number of rows returned. A more complete example would look like this:

// starting with an arbitrary page index
var pageIndex=getPageIndex();
// do operations on the database that would return the rows for the page
// with that index, having the size of the page size of the grid
// and also get the total number of rows in the data source
CustomDataSource dataSource=getDataSource(pageIndex,mockPagerGrid.PageSize);
// set the returned rows as the data source
mockPagerGrid.DataSource=dataSource.Rows;
// set the page index
mockPagerGrid.MockPageIndex=pageIndex;
// set the total row count
mockPagerGrid.MockItemCount=dataSource.TotalRowCount;
// databind
mockPagerGrid.DataBind();

// CustomDataSource would only have two properties: Rows and TotalRowCount
// The sql for getDataSource(index,size) would be something like
// SELECT COUNT(*) FROM MyTable -- get the total count
// SELECT * FROM MyTable WHERE RowIndex>=@index*@size
// AND RowIndex<(@index+1)*@size

// for convenience, I assumed that there is a column called RowIndex in
// the table that is set to the row index


Hopefully, this will help people use this code.

A while ago I wrote a post about the bug in the DataTable.Select method regarding columns with commas in their names.

Today I discovered another bug: when using DataTable.Select with an empty sort string but a non-empty filter string, there is an implicit sorting by the first column. Short example: a DataTable with a single column called "words", containing the values c,b,a,d, will return a,b,c,d when Selected with a filter like "words is not null" and a null or empty sort string.

The only solution for this was to drop DataTable.Select entirely and use a DataView, with its .NET 2.0 method DataView.ToTable. So the code to take a DataTable and return the filtered and sorted table would look like this:
public static DataTable Select(DataTable table, string filter, string sort)
{
    if (table == null) return null;
    var dv = new DataView(table);
    dv.RowStateFilter = DataViewRowState.CurrentRows;
    dv.RowFilter = filter;
    dv.Sort = sort;
    return dv.ToTable();
}

But DataView has the same problem with columns with commas in their names. We solve it the same way we solved it in the previous post: we change the column names, the sort and filter strings, we select, then we change the column names back:
public static DataTable SelectSafe(this DataTable table, string filter, string sort)
{
    var originalColumnNames = new Dictionary<string, string>();

    foreach (DataColumn dc in table.Columns)
    {
        if (dc.ColumnName.IndexOf(',') > -1)
        {
            var columnName = dc.ColumnName;
            var safeColumnName = columnName.Replace(",", ";");
            var reg = new Regex(Regex.Escape("[" + columnName + "]"), RegexOptions.IgnoreCase);
            dc.ColumnName = safeColumnName;
            if (!String.IsNullOrEmpty(filter))
            {
                filter = reg.Replace(filter, "[" + safeColumnName + "]");
            }
            if (!String.IsNullOrEmpty(sort))
            {
                sort = reg.Replace(sort, "[" + safeColumnName + "]");
            }
            originalColumnNames[safeColumnName] = columnName;
        }
    }

    var newTable = Select(table, filter, sort);

    foreach (KeyValuePair<string, string> pair in originalColumnNames)
    {
        table.Columns[pair.Key].ColumnName = pair.Value;
        newTable.Columns[pair.Key].ColumnName = pair.Value;
    }
    return newTable;
}

Well, basically, you can't do it.
I am looking at the internal ParseSortString(string sortString) method in the DataTable object of .NET 1.1, where the first thing the method does is split the string by commas, then check for square brackets. This ensures that there is no way to sort DataTables by columns with commas in their names. The funny thing is that the filtering expression is treated like royalty by an ExpressionParser object, and allows column names with commas inside.
Now let's check the code for .NET 2.0. It's identical.

The solution is a little annoying code like this:
private DataRow[] SelectSafe(DataTable dt, string filter, string sort)
{
    var columns = new string[dt.Columns.Count];
    for (var c = 0; c < dt.Columns.Count; c++)
    {
        columns[c] = dt.Columns[c].ColumnName;
        if (dt.Columns[c].ColumnName.IndexOf(',') > -1)
        {
            dt.Columns[c].ColumnName = dt.Columns[c].ColumnName.Replace(',', ';');
            // assume that the column name was bracketed correctly in the sort string
            sort = sort.Replace(
                "[" + columns[c] + "]",
                "[" + dt.Columns[c].ColumnName + "]");
        }
    }
    var dr = dt.Select(filter, sort);
    for (var c = 0; c < dt.Columns.Count; c++)
    {
        dt.Columns[c].ColumnName = columns[c];
    }
    return dr;
}


I am sure there is more elegant code, but this seems to be the only solution so far except manually sorting a DataTable.

This is a nice link about how to convert a UserControl to a WebControl.
Convert a Usercontrol to a WebControl - The Code Project - ASP.NET

Of course, afterwards it is best to take the time to really think the WebControl through, but for quick conversions like "I want a web control that has a datatable and a graph and another that is a textbox and a validator" it is perfect. I haven't really tested the result on real-life user controls.