Escaping Regex replace patterns

Published May 5, 2007

Posted in: .NET, programming, C#

When you want to replace text in a string, say, case-insensitively, you use the .NET Regex class. To make sure the text to be replaced is not interpreted as a Regex pattern, you pass it through the Regex.Escape method. But what about the replacement string? What if you want to replace the text with "${0}", which in Regexian means "the entire matched string"? You need to somehow escape the replace pattern.

~~I have no idea where to find this information on MSDN, although I am sure it is hidden somewhere in all that Regex labyrinth.~~ Here is the link on MSDN: Substitutions

So here is the info: you only need to escape the dollar sign, by doubling it, so the code would look like this:

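// escape the search text so it is matched literally, ignoring case
Regex reg = new Regex(Regex.Escape("Text to replace"), RegexOptions.IgnoreCase);
string s = "here is the text to replace";
// "$$" is a literal dollar sign in a replace pattern
s = reg.Replace(s, "$${0}");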

Now the value of s is "here is the ${0}".
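
The doubling rule is not specific to .NET. Here is a minimal sketch of the same idea in JavaScript, where String.prototype.replace also reads "$$" as a literal dollar sign and "$&" as the whole match. The helper name escapeReplacement is made up for this illustration; it is not a built-in function.

```javascript
// Hypothetical helper: doubles every dollar sign so the string can be used
// as a literal replacement in String.prototype.replace.
function escapeReplacement(text) {
    // "$$$$" in a replacement pattern produces two literal dollar signs
    return text.replace(/\$/g, "$$$$");
}

var s = "here is the text to replace";
// case-insensitive replace with a literal "$&" instead of the matched text
var result = s.replace(/text to replace/i, escapeReplacement("$&"));
// result is now "here is the $&"
```

Without the helper, "$&" would have been expanded to the matched text and the string would have stayed unchanged.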

How to put a databound ComboBox in a Column of a Datagrid

Published Apr 30, 2007

I found this little page while searching for a DataGridDropDownListColumn for .NET 1.1 Windows Forms. It works and seems elegant enough.

To bind the column values, use column.myComboBox.DataSource, DisplayMember and ValueMember.

Saving Page ViewState out of the page!

Published Apr 26, 2007

Posted in: .NET, ASP.NET, programming, Ajax, C#

If you look for solutions to get rid of huge ViewStates from your pages you will get a lot of people telling you to override SavePageStateToPersistenceMedium and LoadPageStateFromPersistenceMedium in your pages and do complicated stuff like keeping the ViewState in the Cache or in a database, calculating strange keys, etc.

No more! .NET 2.0 has something called a PageStatePersister. It is an abstract class and every Page has one. Unless you override it, the default is a HiddenFieldPageStatePersister, but you can also use the provided SessionPageStatePersister like this:

protected override PageStatePersister PageStatePersister
{
    get
    {
        return new SessionPageStatePersister(this);
    }
}

And that's it! It works with Ajax and UpdatePanel, too.

However, this is no "Silver Bullet": the SessionPageStatePersister has issues when multiple windows are open in the same session (like popup windows), as shown in this nice article. Also check out this situation in which, during Ajax callbacks, a full ViewState is returned due to the ever troublesome ImageButtons.

There is no reason not to create your own PageStatePersister, though. The abstract class is public (not internal and sealed as Microsoft likes their most useful classes) and you can inherit it. You can even store the state in the Cache! :)

A very comprehensive article on ViewState is here.

Buggy DataTable Select strikes again: Safe Filtering without Sorting

Published Apr 25, 2007

Posted in: .NET, programming, C#

A while ago I wrote a post about the bug in the DataTable.Select method with columns with comma in their names.

Today I discovered another bug: when using DataTable.Select with an empty sort string but a non-empty filter string, there is an implicit sorting by the first column. Short example: a DataTable with a single column called "words" containing the values c, b, a, d, when Selected with a filter like "words is not null" and a null or empty sort string, will return a, b, c, d.

The only solution for this was to drop DataTable.Select entirely and use DataView, with its .NET 2.0 method DataView.ToTable. So the code to take a DataTable and return the filtered and sorted table would look like this:

public static DataTable Select(DataTable table, string filter, string sort)
{
    if (table == null) return null;
    var dv=new DataView(table);
    dv.RowStateFilter=DataViewRowState.CurrentRows;
    dv.RowFilter = filter;
    dv.Sort = sort;
    return dv.ToTable();
}

But DataView has the same problem with columns with comma in their names. We solve it in the same way we solved it in the previous post: we change the column names, the sort and filter strings, we select, then we change the column names back:

public static DataTable SelectSafe(this DataTable table, string filter, string sort)
{
    var originalColumnNames = new Dictionary<string, string>();
    
    foreach (DataColumn dc in table.Columns)
    {
        if (dc.ColumnName.IndexOf(',') > -1)
        {
            var columnName = dc.ColumnName;
            var safeColumnName = columnName.Replace(",", ";");
            var reg = new Regex(Regex.Escape("[" + columnName + "]"), RegexOptions.IgnoreCase);
            // double any dollar sign so it is not read as a substitution in the replace pattern
            var replacement = "[" + safeColumnName.Replace("$", "$$") + "]";
            dc.ColumnName = safeColumnName;
            if (!String.IsNullOrEmpty(filter)) {
                filter = reg.Replace(filter, replacement);
            }
            if (!String.IsNullOrEmpty(sort)) {
                sort = reg.Replace(sort, replacement);
            }
            originalColumnNames[safeColumnName] = columnName;
        }
    }

    var newTable = Select(table, filter, sort);

    foreach (KeyValuePair<string, string> pair in originalColumnNames)
    {
        table.Columns[pair.Key].ColumnName = pair.Value;
        newTable.Columns[pair.Key].ColumnName = pair.Value;
    }
    return newTable;
}

Visual Studio 2007 (Orcas) Beta 1 released

Published Apr 24, 2007

There is an article here that explains it better than I would. Basically, the new Visual Studio is finally released in beta and is supposed to be the IDE for .NET and Vista.
The .NET Framework 3.5 (that is 2.85 in normal versioning? :D ) is also to be released here.

Using Menu inside UpdatePanel

Published Apr 20, 2007

Posted in: .NET, ASP.NET, programming, Ajax, C#

A previous post of mine detailed the list of ASP.Net controls that cannot be used with UpdatePanel and ASP.Net Ajax. Since I provided a fix for the validators earlier on, I've decided to try to fix the Menu, as well. And I did! At least for my problem which involved using a two level dynamic menu inside an UpdatePanel.
Here is the code:

<script>
function FixMenu() {
    if (typeof(IsMenuFixed)!='undefined') return;
    if (!window.Menu_HideItems) return;
    window.OldMenu_HideItems=window.Menu_HideItems;
    window.Menu_HideItems=function(items) {
        try {
            OldMenu_HideItems(items);
        } catch(ex)
        {
            if (items && items.id) {
                PopOut_Hide(items.id);
            }
        }
    }
    IsMenuFixed=true;
}
</script>

Now all you have to do is call it on every page load:

ScriptManager.RegisterStartupScript(this,GetType(),"FixMenu","FixMenu();",true);

Explanation: the error I got was something like "0.cells is null or not an object", so I looked a little in the javascript code, where there was something like " for(i = 0; i < rows[0].cells.length; i++) {" and rows[0] was null. All this in a function called Menu_HideItems.

Solution 1: Copy the entire function (pretty big) and add an extra check for rows[0]==null.

Solution 2: Hijack the function, put it in a Try/Catch block and put the bit of the original function that appeared after the error in the catch. That I did.
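
The hijacking trick in Solution 2 can be generalized. Below is a minimal sketch, not the code I actually used: wrapWithFallback and the isWrapped flag are names invented for this example. It replaces a function on an object with a wrapper that calls the original and, if that throws, runs a fallback with the same arguments.

```javascript
// Hypothetical helper: replaces obj[name] with a wrapper that calls the
// original function and runs `fallback` with the same arguments if it throws.
function wrapWithFallback(obj, name, fallback) {
    var original = obj[name];
    // nothing to wrap, or already wrapped by a previous call
    if (typeof original !== 'function' || original.isWrapped) return;
    var wrapper = function () {
        try {
            return original.apply(this, arguments);
        } catch (ex) {
            return fallback.apply(this, arguments);
        }
    };
    wrapper.isWrapped = true;
    obj[name] = wrapper;
}

// In the browser, the menu fix above would then look something like this:
// wrapWithFallback(window, 'Menu_HideItems', function (items) {
//     if (items && items.id) PopOut_Hide(items.id);
// });
```

The isWrapped check plays the same role as IsMenuFixed in my code: calling the helper on every page load does not stack wrappers on top of each other.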

ASP.Net Ajax compatible controls

Published Apr 18, 2007

Did you get a "0.cells is null or not an object" Javascript error while trying to use UpdatePanel and something like Menu or TreeView? The reason is that some controls are incompatible with UpdatePanel! Most amazingly, the newest controls seem to be the least compatible.

The web application you are attempting to access on this web server is currently unavailable

Published Apr 17, 2007

Posted in: .NET, ASP.NET, programming, IIS

You just copied a directory with an ASP.Net web site in your wwwroot directory, you created an ASP.Net application from IIS/Web Sites, yet you get an error, no matter what page you try to access: The web application you are attempting to access on this web server is currently unavailable.
It's an access issue. Give the ASPNET user access to the directory.

Fixing validators in ASP.Net Ajax

Published Apr 12, 2007

Posted in: .NET, ASP.NET, programming, Ajax, C#

Update 2nd of July 2008:
The solution below doesn't always work. A validator fails this way when .NET 2.0 is installed without Service Pack 1 and the validator is added dynamically (so it is not there when the page is first loaded, but appears during an async postback).

The problem lies in the BaseValidator class itself: after the postback, the validator is not in the Page_Validators array. I've tried rehooking the validators, even enumerating all validators in the page and manually adding them to the Page_Validators array. It does not work, mainly because the html spans of the validators are not the same thing as the validators, and details like the id of the control to validate are never rendered.

So you absolutely need to install the .Net 2.0 SP1 in order to use UpdatePanels with validators.
=====

Today I've encountered a strange error, where the validation did work inside an UpdatePanel, but the validator message would not appear. This in a project where I used validation in GridViews inside their own UpdatePanel and they worked!

So I guess part of the reason the error occurs could be any one of these:

validated control and validator are both in a UserControl
validator is loaded at start of the page, it does not appear during Ajax calls, but it is replaced during Ajax calls
validated control is a ListBox, worse, an object inherited from a ListBox

Anyway, the true reason why the validators behaved in this fashion was that they were different from the html node of the validators. In other words, the Javascript array Page_Validators was filled with the validators correctly, did have the evaluation function, did return isvalid=false, but then, when the style.display attribute was changed to inline, the validator was not part of the html page DOM, as it was changed by Ajax.

Solution: a Javascript function that enumerates the validators, gets the validator DOM node by way of document.getElementById, stores the validator properties into this node and then replaces the element in the Page_Validators array.
Problem: where should one load it? Possible solutions include submit button onclick events, form onsubmit events and in the Page_Load method. I would not use onclick, since I should have added onclick events on all possible submit buttons. I would not use RegisterOnSubmitStatement, since it ran the code after checking if the validators are valid or not (even if the validation itself took place afterward; now that is weird). The only solution is to use Page_Load.

There are various ways of doing that, too, since you could use MasterPages, custom made Page objects and even HttpHandlers or HttpModules. You also could have custom controls or objects that don't have a reference to the Ajax ScriptManager object or even to the Ajax library itself. Yes, I know, I am very smart. In my project, all the pages were inherited from a custom Page object, also in the project. So it was relatively easy.

Here is the code:


// in Page_Load
string script=@"
if (typeof(Page_Validators)!='undefined') {
    for (var i=0; i<Page_Validators.length; i++) {
        // get DOM node
        var vld=document.getElementById(Page_Validators[i].id);
        if (vld) {
            // check if the Page_Validators element has extra attributes
            // and add them to the node
            for (var key in Page_Validators[i]) {
                if ((vld[key]==null)&&(Page_Validators[i][key]!==null)) {
                    vld[key]=Page_Validators[i][key];
                }
            }
            // replace the Page_Validators element with the reconstructed validator
            Page_Validators[i]=vld;
        }
    }
}
";
// get current ScriptManager for this page
ScriptManager sm=ScriptManager.GetCurrent(this);
// note the spelling: the property is IsInAsyncPostBack
if ((sm!=null)&&(sm.IsInAsyncPostBack)) {
    // if we did an Ajax postback, fix validators
    ScriptManager.RegisterStartupScript(Page, GetType(),"FixValidators",script,true);
}

How to put a custom object into the ViewState or how to serialize an object

Published Apr 10, 2007

Posted in: .NET, ASP.NET, programming, C#

In order to access an object even after postbacks, you need to put it either in the Session or the ViewState. The ViewState is preserved only between postbacks, not between different pages and it is a Page property, so it is more efficient to use it. The problem with this method is that every object you put in the ViewState must be serializable.
So, the quick and dirty path: if you don't have strange custom serializing to do, all you have to do is decorate the class with the [Serializable] attribute, like this:

[Serializable]
public class MyDictionary : Dictionary<int,bool> {
}

Had you not done this, you would probably have met with the "Class is not marked as Serializable" error. Duh! However, in the situation above I have inherited from an object that implements ISerializable, so I get the error "The constructor to deserialize an object of type ... was not found". What that means is that the object must have a constructor that accepts two parameters, a SerializationInfo and a StreamingContext object. So we must add it to the object, like this:

[Serializable]
public class MyDictionary : Dictionary<int,bool> {
    // constructor used when the object is deserialized
    public MyDictionary(SerializationInfo info, StreamingContext context) : base(info, context) { }

    public MyDictionary() {}
}

I added the second constructor because once you declare a parametrized constructor, the compiler no longer generates the default empty one. So no more new MyDictionary() unless one adds it.

That does it! Please do check out the entire ISerializable interface documentation, since it requires, besides the constructor, a GetObjectData method, with the same parameters as the constructor, which controls the custom serialization of the object.

Super Fast and Accurate string distance algorithm: Sift3

Published Apr 4, 2007

Update November 2014: Sift4 is here!! Check out the new improved version of Sift here: Super Fast and Accurate string distance algorithm: Sift4

Update October 6 2014: New stuff, compare Levenshtein vs Sift here:

Update June 25th 2013: I've decided to play a little with the suggestions in the comments and check them for validity. This was spurred by the realization that a lot of people use my algorithm. So, in order to celebrate this, here is the "3B" version of the Sift3 algorithm:
It is made in Javascript, this time, as it was easier to test, and has the following extra features:

a maxDistance value that tells the algorithm to stop if the strings are already too different.
two pointers c1 and c2, rather than a single pointer c and two offsets
instead of dividing the total length of the compared strings by 2, I now divide it by 1.5. Why? Because this way the value is closer to the Levenshtein distance computed on random strings

~~Happy usage!~~ The variant I posted was totally buggy. I removed it. Just use sift3Distance.

Update: the Sift algorithm is now on Github.

A while ago I wrote an entry here about Sift2, an improvement of Sift, the original and silly string distance algorithm. Now I am publishing Sift3, which is way more accurate and even simpler as an algorithm.

I found out that my algorithm is part of a class of algorithms that solve the Longest Common Substring problem, therefore I calculated the LCS, not the distance, then the distance from the LCS. The result is way more robust, easy to understand and closer to the Levenshtein algorithm both on random strings and user databases. Not to mention that there is no goto in this one.

BTW, if you are looking for an algorithm that detects switched words, this is not it :) This just looks for typos and small regional differences between the strings. I mean, you could normalize the strings, so that words are ordered by some mechanism, then it would work because the words wouldn't be switched :)
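
To give an idea of what such a normalization could look like, here is a rough sketch. The function name normalizeWords is made up, and the choices of lower-casing and plain alphabetical sorting are assumptions for this example, not part of Sift3:

```javascript
// Hypothetical helper: lower-cases the text, splits it into words and sorts
// them alphabetically, so that switched words end up in the same order.
function normalizeWords(text) {
    return text
        .toLowerCase()
        .split(/\s+/)
        // drop the empty entries produced by leading or trailing spaces
        .filter(function (word) { return word.length > 0; })
        .sort()
        .join(' ');
}
```

Both "John Doe" and "doe john" normalize to the same string here, so a distance function applied to the normalized values would no longer be thrown off by the word order.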

I promise to work on a word switching algorithm, but not in the near future.
Without further ado, here is the code:

The C# code is a method that receives maxOffset as a parameter. As in Sift2, maxOffset should be around 5 and it represents the range in which to try to find a missing character.

public float Distance(string s1, string s2, int maxOffset)
{
    if (String.IsNullOrEmpty(s1))
    {
        return String.IsNullOrEmpty(s2) ? 0 : s2.Length;
    }
    if (String.IsNullOrEmpty(s2))
    {
        return s1.Length;
    }
    int c = 0;
    int offset1 = 0;
    int offset2 = 0;
    int lcs = 0;
    while ((c + offset1 < s1.Length) && (c + offset2 < s2.Length))
    {
        if (s1[c + offset1] == s2[c + offset2])
        {
            lcs++;
        }
        else
        {
            offset1 = 0;
            offset2 = 0;
            // look for the missing character within maxOffset positions
            for (int i = 0; i < maxOffset; i++)
            {
                if ((c + i < s1.Length) && (s1[c + i] == s2[c]))
                {
                    offset1 = i;
                    break;
                }
                if ((c + i < s2.Length) && (s1[c] == s2[c + i]))
                {
                    offset2 = i;
                    break;
                }
            }
        }
        c++;
    }
    // divide by 2f so that integer division does not truncate the result
    return (s1.Length + s2.Length) / 2f - lcs;
}

And here is the T-Sql code. This version is actually an improvement of my original source, graciously provided by Todd Wolf:

CREATE FUNCTION [DBO].[Sift3distance2]
(
@s1 NVARCHAR(3999),@s2 NVARCHAR(3999),@maxOffset INT
)
RETURNS FLOAT
AS 
BEGIN
DECLARE @s1LEN INT,@s2LEN INT

SELECT @s1LEN=Len(Isnull(@s1,'')),@s2LEN=Len(Isnull(@s2,''))

IF @s1LEN=0 RETURN @s2LEN
ELSE
IF @s2LEN=0 RETURN @s1LEN

IF Isnull(@maxOffset,0)=0 SET @maxOffset=5

DECLARE @currPos INT,@matchCnt INT,@wrkPos INT,@s1Offset INT,@s1Char VARCHAR,@s1Pos INT,@s1Dist INT,@s2Offset INT,@s2Char VARCHAR,@s2Pos INT,@s2Dist INT

SELECT @s1Offset=0,@s2Offset=0,@matchCnt=0,@currPos=0

WHILE(@currPos+@s1Offset<@s1LEN AND @currPos+@s2Offset<@s2LEN)
BEGIN
SET @wrkPos=@currPos+1

IF(Substring(@s1,@wrkPos+@s1Offset,1)=Substring(@s2,@wrkPos+@s2Offset,1)) SET @matchCnt=@matchCnt+1
ELSE
BEGIN
SET @s1Offset=0

SET @s2Offset=0

SELECT @s1Char=Substring(@s1,@wrkPos,1),@s2Char=Substring(@s2,@wrkPos,1)

SELECT @s1Pos=Charindex(@s2Char,@s1,@wrkPos)-1,@s2Pos=Charindex(@s1Char,@s2,@wrkPos)-1

SELECT @s1Dist=@s1Pos-@currPos,@s2Dist=@s2Pos-@currPos

IF(@s1Pos>0 AND (@s1Dist<=@s2Dist OR @s2Pos<1) AND @s1Dist<@maxOffset) SET @s1Offset=(@s1Pos-@wrkPos)+1
ELSE
IF(@s2Pos>0 AND (@s2Dist<@s1Dist OR @s1Pos<1) AND @s2Dist<@maxOffset) SET @s2Offset=(@s2Pos-@wrkPos)+1
END

SET @currPos=@currPos+1
END

RETURN(@s1LEN+@s2LEN)/2.0-@matchCnt
END

It doesn't give the same exact results as my own code, yet the result is close enough and the speed is about 20% higher.

And thanks to Diogo Nechtan, the version in PHP:

function sift3Plus($s1, $s2, $maxOffset) {
    $s1Length = strlen($s1);
    $s2Length = strlen($s2);
    // compare lengths rather than using empty(), which treats the string "0" as empty
    if ($s1Length === 0) {
        return $s2Length;
    }
    if ($s2Length === 0) {
        return $s1Length;
    }
    $c1 = $c2 = $lcs = 0;

    while (($c1 < $s1Length) && ($c2 < $s2Length)) {
        // square brackets: the $s{...} offset syntax was removed in PHP 8
        if ($s1[$c1] == $s2[$c2]) {
            $lcs++;
        } else {
            for ($i = 1; $i < $maxOffset; $i++) {
                if (($c1 + $i < $s1Length) && ($s1[$c1 + $i] == $s2[$c2])) {
                    $c1 += $i;
                    break;
                }
                if (($c2 + $i < $s2Length) && ($s1[$c1] == $s2[$c2 + $i])) {
                    $c2 += $i;
                    break;
                }
            }
        }
        $c1++;
        $c2++;
    }
    return (($s1Length + $s2Length) / 2 - $lcs);
}

And thanks to Fernando Jorge Mota, the version in Python:

Also, here is the Javascript version, used in Mailcheck, by Derrick Ko and Wei Lu.

function sift3Distance(s1, s2) {
    if (s1 == null || s1.length === 0) {
        if (s2 == null || s2.length === 0) {
            return 0;
        } else {
            return s2.length;
        }
    }

    if (s2 == null || s2.length === 0) {
        return s1.length;
    }

    var c = 0;
    var offset1 = 0;
    var offset2 = 0;
    var lcs = 0;
    var maxOffset = 5;

    while ((c + offset1 < s1.length) && (c + offset2 < s2.length)) {
        if (s1.charAt(c + offset1) == s2.charAt(c + offset2)) {
            lcs++;
        } else {
            offset1 = 0;
            offset2 = 0;
            for (var i = 0; i < maxOffset; i++) {
                if ((c + i < s1.length) && (s1.charAt(c + i) == s2.charAt(c))) {
                    offset1 = i;
                    break;
                }
                if ((c + i < s2.length) && (s1.charAt(c) == s2.charAt(c + i))) {
                    offset2 = i;
                    break;
                }
            }
        }
        c++;
    }
    return (s1.length + s2.length) / 2 - lcs;
}
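
A typical use for a distance function like this is suggesting the closest known value for a possibly mistyped one. The sketch below is only an illustration of that idea and does not claim to be how any particular library does it; closestMatch and its threshold parameter are hypothetical names. It takes the distance function as an argument, so sift3Distance can be plugged in directly.

```javascript
// Hypothetical helper: returns the candidate with the smallest distance to
// `input`, or null if no candidate is within `threshold`.
// `distanceFn` is any function (s1, s2) -> number, for example sift3Distance.
function closestMatch(input, candidates, distanceFn, threshold) {
    var best = null;
    var bestDistance = Infinity;
    for (var i = 0; i < candidates.length; i++) {
        var distance = distanceFn(input, candidates[i]);
        if (distance < bestDistance) {
            bestDistance = distance;
            best = candidates[i];
        }
    }
    return bestDistance <= threshold ? best : null;
}

// Usage, assuming sift3Distance from above is in scope and a threshold of 2
// (an arbitrary value picked for this example):
// closestMatch('gmial.com', ['gmail.com', 'yahoo.com'], sift3Distance, 2);
```

The threshold matters: without it, every input would get a suggestion, no matter how different it is from all the candidates.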

~~Another implementation, this time in Java, by Eclesia:~~

You might also be interested in a customised version in AutoKey, by Toralf:

Thanks all for your comments and I look forward to more. Just tell me whether it worked or not and, most important, why. Good luck!

Copying a database from Sql2005 to Sql2000

Published Mar 29, 2007

Posted in: database, programming

You sometimes need to copy the exact structure of a database to an Sql2000 server, even if the source server is 2005.

Follow these steps:

open 2005 Sql Server Management Studio

right click on the offending database and go to Tasks -> Generate Scripts

do NOT check Script all objects in the selected database

click Next

set Include if NOT EXISTS to False

set Script for Server Version to SQL Server 2000

try to check only the objects and types of objects you actually need

create the script

delete all occurrences of "WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)" from the generated script

Now the script should work on an SQL 2000 Server.
For the copying of data, the Server Management Studio has an option called ~~Copy Database~~ Export Data, also in Tasks, that now accepts an Sql 2000 Server as destination.

Simplify Programmatic File Access - by John Cronan

Published Mar 28, 2007

Posted in: .NET, programming

I've found this interesting article by John Cronan about using the Abstract Factory pattern to access files, no matter if they are on FTP, HTTP or the local or networked file system. Basically he uses WebRequest.Create rather than any *Stream* class.

Interestingly enough, he seems to be the only one providing a solution to the problem of accessing local file system resources when the default access rights do not allow you to, even if the logged on credentials would normally give you the access, thus solving an issue of the FileWebRequest class. Unfortunately he uses P/Invoke, which kind of contradicts the whole "more flexible than thou" approach of the article.

Overall an interesting read which gives you flexibility of file support, while taking away some of the specific advantages like seeking or appending. It's definitely a better approach than StreamReader and the ugly "URI formats are not supported." error.

A bonus for using this method is that it is compatible with the Office 2007/Vista Open Packaging addressing model, by way of the PackWebRequest class.

ODBC Escape sequences!

Published Mar 21, 2007

Posted in: database, programming

In other words: those curly bracket things in SQL. What? curly brackets in SQL? Yes! Imagine that :)

The idea is that most database systems adhere to the ODBC standard, at least ODBC 1.0. That means that, when you communicate with a database, you can send so-called ODBC escape sequences that are translated into the SQL engine's native objects.

Quick example: SELECT {d '2007-03-15'} will work in all ODBC 1.0 compliant DBMSs, including Microsoft SQL Server, MySql, PostgreSQL, Oracle, etc. and select a date object for the 15th of March 2007, regardless of the country or language the server is configured for.

Interested yet? You can read the ODBC Programmer's Reference for more details. Short story shorter, here are the working and interesting parts (to me) of the ODBC escape sequences:
select {d '2007-02-13' }
select {t '22:20:30' }
select {ts '2007-02-13 22:20:30' }
select {fn curdate()}
select {fn curtime()}
select {fn User()}
select {fn Database()}
select {fn week(getdate())}
select {fn quarter(getdate())}
select {fn monthname(getdate())}
select {fn dayname(getdate())}
select {fn dayofweek(getdate())}
select {fn dayofyear(getdate())}
select {guid '12345678-1234-1234-1234-123456789012'}

Ajax enabling your controls

Published Mar 20, 2007

Posted in: .NET, ASP.NET, programming, Ajax, C#

Ok, so you have the greatest control library ever made and Microsoft releases Asp.Net Ajax and none of them work anymore. What is one to do?

Eilon Lipton to the rescue! He writes a very good article about Ajax enabling your controls without linking to the System.Web.Extensions dll.

However, the article is a bit outdated. Here is a piece of code that solves the problems (at least for the latest version of Asp.Net Ajax):

Type scriptManagerType = Type.GetType("System.Web.UI.ScriptManager, System.Web.Extensions, Version=1.0.61025.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35", false);

// only look up the methods if the Ajax library is actually available
if (scriptManagerType != null)
{
    RegisterClientScriptResourceMethod = scriptManagerType.GetMethod("RegisterClientScriptResource", new Type[] { typeof(Control), typeof(Type), typeof(string) });

    RegisterStartupScriptMethod = scriptManagerType.GetMethod("RegisterStartupScript", new Type[] { typeof(Control), typeof(Type), typeof(string), typeof(string), typeof(bool) });
}

This is because the namespace has changed since Eilon's article was written, from Microsoft.Web.UI to System.Web.UI, and there are two methods named RegisterClientScriptResource and two named RegisterStartupScript, so you have to ask for the right one. Otherwise you get the "Ambiguous match found" error.

There you have it!