Intro

  At the moment I am looking for a job, so in my research into coding test platforms, the problems that cannot be solved with just a little thinking and some loops are 80% covered by just 8 categories of problems:

  1. Optimization with overlapping subproblems (coin change, longest increasing subsequence, knapsack variants, matrix chain) - solved with Dynamic Programming (memoization/tabulation).
  2. Pathfinding, connectivity, or level-order processing in grids/graphs (number of islands, shortest path in maze, rotten oranges) - solved with Graph Traversal: BFS and DFS.
  3. Hierarchical data processing (tree diameter, in order traversal, BST validation, lowest common ancestor) - solved with Tree Traversals and Properties (pre/in/post-order, level-order, recursion/DFS on trees).
  4. Exhaustive search with pruning (permutations, subsets, N-Queens, sudoku solver, word search) - solved with Backtracking.
  5. Local optimal choices that work globally (jump game, task scheduler, fractional knapsack) - solved by Greedy Algorithms.
  6. Efficient "top-k" or priority processing (merge k sorted lists, kth largest element, task scheduler with cooldown) - solved with Heaps / Priority Queues.
  7. Efficient union of sets / cycle detection in graphs (redundant connection, number of provinces, Kruskal's MST) - solved with Union-Find (Disjoint Set Union) with path compression & union-by-rank.
  8. Searching in sorted/monotonic spaces (search in rotated sorted array, minimum in rotated array, binary search on answer for capacity problems) - solved by Binary Search (advanced variants).

Bonus:

Today I will cover Backtracking.

When to use Backtracking

Recognize a problem that can be solved with Backtracking by these signs:

  • The problem asks for all possible solutions / combinations / permutations / arrangements / configurations.
  • You need to find one (or all) valid solution(s) by making a sequence of choices/decisions.
  • There is a decision tree structure: at each step you choose one option from several candidates.
  • The number of possibilities is exponential (n!, 2ⁿ, etc.) but you can prune invalid branches early to avoid full enumeration.
  • You can build the solution incrementally and check validity as you go (partial solutions are meaningful).
  • Constraints allow early termination of a branch (e.g., sum exceeds target, board conflict, duplicate choice, invalid move).
  • The problem involves puzzles, combinatorial search, or constraint satisfaction (Sudoku, N-Queens, crosswords, partitioning).

Words/phrases in the problem: "all possible", "generate", "find all combinations", "all permutations", "solve", "place", "fill", "arrange", "valid configurations", "every way to..."

Backtracking is a class of algorithms for finding solutions to some computational problems, notably constraint satisfaction or enumeration problems, that incrementally builds candidates to the solutions, and abandons a candidate ("backtracks") as soon as it determines that the candidate cannot possibly be completed to a valid solution.

Classic backtracking problem patterns:

  • Generate all subsets / combinations / permutations (with or without duplicates)
  • N-Queens / chess piece placement puzzles
  • Sudoku solver
  • Word search / Boggle
  • Palindrome partitioning
  • Restore IP addresses
  • Combination sum (with target)
  • Letter combinations of a phone number
  • Generate parentheses
  • Rat in a maze / path finding with constraints

One-sentence rule: If the problem feels like "try every reasonable choice, go deeper, undo if it fails, and collect valid complete solutions" it is almost certainly backtracking.

Example - N-Queens

This is a problem often called "N-queens"

Problem statement: Place N queens on an N x N chessboard so that no two queens attack each other. Queens attack horizontally, vertically, or diagonally. Return one valid board configuration (or all, as a variation).

Represent the board as a list of strings where 'Q' = queen and '.' = empty.

Constraints: 1 <= N <= 9 (for N=1 to N=9 there is always at least one solution).

Example for N=4 (one possible solution):

. Q . .
. . . Q
Q . . .
. . Q .

Output as: [".Q..","...Q","Q...","..Q."]

Here is a naive implementation:

using System.Text;

var count = 0;

IList<IList<string>> SolveNQueens(int n)
{
    var result = new List<IList<string>>();
    char[,] board = new char[n, n];
    for (var i = 0; i < n; i++)
        for (var j = 0; j < n; j++)
            board[i, j] = '.';

    PlaceQueensNaive(board, 0, n, result);
    return result;
}

void PlaceQueensNaive(char[,] board, int row, int n, IList<IList<string>> result)
{
    count++;
    if (row == n)
    {
        // Check if valid (very late)
        if (IsValid(board, n))
        {
            var sol = new List<string>();
            for (var i = 0; i < n; i++)
            {
                var sb = new StringBuilder();
                for (var j = 0; j < n; j++)
                    sb.Append(board[i, j]);
                sol.Add(sb.ToString());
            }
            result.Add(sol);
        }
        return;
    }
    for (var col = 0; col < n; col++)
    {
        board[row, col] = 'Q';
        PlaceQueensNaive(board, row + 1, n, result);
        board[row, col] = '.';  // backtrack
    }
}

bool IsValid(char[,] board, int n)
{
    // Check columns and diagonals for each queen
    for (var r = 0; r < n; r++)
    {
        for (var c = 0; c < n; c++)
        {
            if (board[r, c] != 'Q') continue;

            // Check right, down-right, down-left from this queen
            // (we only need to check one direction because we fill row by row)

            // Same column below
            for (var i = r + 1; i < n; i++)
                if (board[i, c] == 'Q') return false;

            // Down-right diagonal
            for (var i = r + 1, j = c + 1; i < n && j < n; i++, j++)
                if (board[i, j] == 'Q') return false;

            // Down-left diagonal
            for (var i = r + 1, j = c - 1; i < n && j >= 0; i++, j--)
                if (board[i, j] == 'Q') return false;
        }
    }

    return true;
}

var result = SolveNQueens(4);
Console.WriteLine("Executions: " + count);
foreach (var solution in result)
{
    Console.WriteLine(string.Join("\r\n", solution));
    Console.WriteLine();
}

Now, this outputs the two possible solutions for N=4 and 341 executions.

The IsValid method is not the most efficient implementation, but it's readable (also not really the focus of the post).

There are several issues with PlaceQueensNaive, though:

  • Tries every cell -> nn possibilities (for n=9: ~387 million)
  • Validation only at the end -> wastes time exploring invalid partial boards
  • No early pruning -> extremely slow even for n=8

Let's see a good solution:

using System.Text;

var count = 0;

IList<IList<string>> SolveNQueens(int n)
{
    var result = new List<IList<string>>();
    var board = new char[n, n];
    for (var i = 0; i < n; i++)
        for (var j = 0; j < n; j++)
            board[i, j] = '.';

    var cols = new bool[n];
    var diag1 = new bool[2 * n - 1];  // row + col
    var diag2 = new bool[2 * n - 1];  // row - col + (n-1)

    Backtrack(0, board, cols, diag1, diag2, n, result);
    return result;
}

void Backtrack(int row, char[,] board, bool[] cols, bool[] diag1, bool[] diag2, int n, IList<IList<string>> result)
{
    count++;
    if (row == n)
    {
        var sol = new List<string>();
        for (var i = 0; i < n; i++)
        {
            var sb = new StringBuilder();
            for (var j = 0; j < n; j++) sb.Append(board[i, j]);
            sol.Add(sb.ToString());
        }
        result.Add(sol);
        return;
    }

    for (var col = 0; col < n; col++)
    {
        var d1 = row + col;
        var d2 = row - col + n - 1;

        if (cols[col] || diag1[d1] || diag2[d2]) continue;  // prune - attacked

        // Place
        board[row, col] = 'Q';
        cols[col] = true;
        diag1[d1] = true;
        diag2[d2] = true;

        Backtrack(row + 1, board, cols, diag1, diag2, n, result);

        // Backtrack
        board[row, col] = '.';
        cols[col] = false;
        diag1[d1] = false;
        diag2[d2] = false;
    }
}

var result = SolveNQueens(4);
Console.WriteLine("Executions: " + count);
foreach (var solution in result)
{
    Console.WriteLine(string.Join("\r\n", solution));
    Console.WriteLine();
}

Why this is optimal:

  • One queen per row -> n! instead of n2
  • Immediate pruning with column + diagonal tracking -> most branches die early
  • Time: O(n!) worst case but in practice much faster due to heavy pruning
  • For n=9 it runs almost instantly

The count of executions is 17.

Backtracking template

While the code above was simple and clear, it still carries a lot of the particular problem details. Can we imagine a generic backtracking solution? And we kind of can:

  • start with a mechanism of generating candidate steps or a list of them
  • then prepare a result which is a list of lists of steps
    • each list of steps will represent a solution and the result is the list of solutions
  • then call the backtracking method, by adding an empty list as a partial solution
  • In the method code:
    • first check if solution is valid, and if so add it to the result and return from the method
      • you may want to exit here permanently if the requirement is to find just a single solution
    • then check if you can prune it (invalidate partial solution as soon as possible), and if so return (abandon it)
    • then for all possible steps:
      • check if the step is valid for the partial solution (and if not, skip it)
      • add (place) the step to the solution - this represents the solution with an extra step
      • call the method on the new solution
      • remove (backtrack) the step from the solution - this prepares the solution for the next step

That's it. You note that we've used the recursive function idea here, but we can use a queue with the same purpose:

  • start with a mechanism of generating candidate steps or a list of them
  • then prepare a result which is a list of lists of steps
    • each list of steps will represent a solution and the result is the list of solutions
  • then prepare a queue of lists of steps
    • each list of steps is a partial solution, the queue is the work list
  • while there are items in the queue:
    • dequeue an item
    • first check if solution is valid, and if so add it to the result and continue to the next item in the queue
      • you may want to exit here permanently if the requirement is to find just a single solution
    • then check if you can prune the solution, and if so continue to the next item in the queue
    • then for all possible valid steps:
      • enqueue a new item from the solution with the step applied

So... where's the backtracking? These two forms are called recursive and iterative backtracking. The second gives you more control and might be needed when there are a lot of possible steps to escape the limitation of the call stack size. You may recognize it as the BFS/DFS method described in my previous post. Queue for BFS, stack for DFS.

You see, a lot of people refer to just the recursive version of backtracking as backtracking. It is:

  • shorter and more elegant to write
  • the "backtrack" action is implicit (you just return from the method)
  • the undo step is done automatically when the method exits
  • recursion uses the call stack

The iterative version is:

  • more verbose
  • you have to create a solution explicitly (generate/copy and add new list to queue)
  • uses the heap, so it doesn't do stack overflows
  • it doesn't backtrack as much as it doesn't follow up on invalid partial solutions 

Personally, I prefer the second, because it's more flexible and offers more control, but you be the judge.

Example - Generate parentheses

An almost as common interview question for backtracking is the Generate Parentheses problem.

Problem statement: Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses.

Example: n = 3
Output: ["((()))","(()())","(())()","()(())","()()()"]

And here is the solution:

IList<string> Backtrack(int n, string current="", int open=0, int close=0,  IList<string> result=null)
{
    result ??= [];
    // Base case: we used exactly n pairs and string is complete
    if (current.Length == n * 2)
    {
        result.Add(current);
        return result;
    }

    // Choice 1: add an opening parenthesis (if we can)
    if (open < n)
    {
        Backtrack(n, current + "(", open + 1, close, result);
    }

    // Choice 2: add a closing parenthesis (only if more opens than closes)
    if (close < open)
    {
        Backtrack(n, current + ")", open, close + 1, result);
    }
    return result;
}

var result = Backtrack(3);
Console.WriteLine(string.Join("\r\n",result));

I got a little fancy here:

  • I used the same method as the solve and the backtrack method, taking advantage of the default parameter value feature of C#, but that meant I had to return the result all the time
  • There is no loop over steps, because there are just two of them
  • The pruning is done before calling the backtracking method, by checking the open and close values
  • There is no backtracking per se, since we are using a new copy (concatenated string) in all calls
  • You want to get extra fancy? Use a StringBuilder instead of a string, Append the parentheses, then decrement the StringBuilder length when you exit, thus implementing literal backtracking.
    • there is a gotcha here, though, see below in the Generic backtracking section

The output is the five possible solutions.

This is optimal because:

  • Time complexity: O(4n / sqrt(n)) - this is the number of valid Catalan number sequences, roughly O(4n).
  • Backtracking prunes almost all invalid paths early.
  • Space complexity: O(n) recursion depth + O(4n) for output storage.
  • No extra arrays or complex state - just track count of open and close parentheses.
  • Very clean and efficient - runs instantly even for n=8.

Generic backtracking

I have tried to implement a generic class that does backtracking solving. And I made it, but it was really ugly. I tried simplifying it in several ways, but in the end I gave up. Why? Because backtracking in itself is such a general solution to a lot of problems that trying to create a class to encapsulate that results in a very thin thing that requires the solution type to implement everything.

If you think about it, the solver has to implement:

  • a type for the solution
  • a type for the current state
  • a way to generate valid steps based on current state (and possibly the list of existing solutions)
  • a way to apply the steps on the current solution and state (and unapply them, if you want standard backtracking)
  • a way to determine validity of a solution
  • a way to extract/clone the result from the solution

And you realize that you either have to provide a lot of functions as parameters to the solver, or you have to combine the solution type and state type and implement the methods there. And in the second case, the solver itself does nothing in particular.

My implementation, in order to solve the generate parentheses problem, assumed any solution is an IList<TItem> and the TItem was (char ch, int open, int closed, int pairCount), which was frankly ridiculous.

Sometimes it pays off, though, to try to achieve such things in order to understand why they would not work and what are the underlying assumptions in your code.

For example, you never considered that in both examples above we needed to clone the solution in order to add it to the result. In the first example we were putting in the result sb.ToString(), which generates a clone of the StringBuilder, and in the second example we were using string concatenation, which generates a new string every time. If you had added the StringBuilder itself to the result list, then it would have held the reference to the same object several times.

Conclusion

Backtracking is an important tool in any developer's toolkit and it is one of the first theoretical computer science concepts you are being taught in school. The magic is in three steps: choose an option, explore deeper by recursing, unchoose when you return. Everything else (pruning, skipping duplicates, validity checks) is just smart decoration around this loop.

But once you look "under the hood" you realize that in order to implement classic backtracking you are relying on programming language constructs that add syntactic sugar to an otherwise differently defined problem: growing a list of possible solutions and eliminating them early if they cannot yield a valid solution.

Comments

Be the first to post a comment

Post a comment