
  I will be frank (pun not intended) and say that this book shocked me with how good it is. It is not very accessible, as it is fairly philosophical and technical - and the technical side may be a lot of mumbo jumbo, but I think this book shows what Frank Herbert was capable of at the height of his prowess.

  In short, Destination: Void is about a crew of four people on a disabled ship who need to construct an artificial intelligence in order to save the ship and their lives. There is only one snag: no one has ever managed to build an AI without the attempt ending in disaster. Here you have to accept a concept without which the book will not work: that an ultimately conscious entity has full access to the universe, giving it godly powers. This is not only a book about building a computer system, but a philosophical dissection of what consciousness is, what intelligence is, how the human mind works and whether we should, when building mechanical intelligence, even follow that design as a model.

  This book features many of the trademark Herbert ideas: the deeply meaningful thoughts, conversations and actions between an isolated group of people, the inner thought voiced in the writing, the declared and hidden agendas of people, the oppressive society that uses immoral methods to get to its goals, the great potential of human beings that can only be unleashed by extreme circumstances, the religious and sexual components of human drive, the archetypal roles of the characters, etc. And the insane pacing puts those ideas even more into terrifying focus.

  Again, I was amazed by this book, all but the ending. I would have loved an entire series following the spirit of most of it. Unfortunately, the next three books go in a completely different direction: the nature of godhood. Perhaps that is why this is not considered the first book in the "sequence", but book 0.5, because if the next ones focus on a god, this one focuses on building one. Or perhaps because Pandora is not even part of the story here.

  In conclusion, I recommend reading this book as a standalone story. Kudos if you want to read and enjoy the entire Pandora series, but in my mind Destination: Void is quite different from the others.


  Frank Herbert's writing feels paradoxical to me, as he examines the minutiae of individual characters or particular scenes, yet his main focus always remains on the situation as a whole. His heroes are worlds entire, with people just instruments of inevitable evolution or death. The Eyes of Heisenberg might be Herbert's alternative to Zamyatin's We or Aldous Huxley's Brave New World. The same oppressive dystopia of clinical control of society, the rebels, the groups of people vying for control and/or survival, the epic sweeping finale. Yet, where a central protagonist was the focus of those books, this one refuses to hold any one person to a rank high enough to outshine all of the others.

  Imagine a world ruled by Optimen, immortal people living in their own bubble of beliefs and absolute power, served by the Folk, cloned and genetically engineered people destined for a centuries-long life of predetermined work, yet still mortal, rarely rewarded for their servitude with the permission to procreate. The world has become this after a terrible war between Optimen and cyborgs, in which the Optimen prevailed. A couple of young parents come to the clinic for the "cutting", where the embryo is examined, genetically manipulated against flaws, then put in a growing vat. But this embryo is special! A race between several groups of people is on to hide, preserve, destroy or use it as bait.

  You know that I don't usually describe the book plot in that much detail for fear of spoiling the story, but in this case I feel it is warranted, as The Eyes of Heisenberg is so full of technobabble it takes great effort to start reading it. Once the names and who is who are clear, the book is easy to read, but the beginning of the book... ugh! Especially since genetics wasn't really developed at the time, and all of the futuristic mumbo jumbo is obviously bull.  

  I really liked the idea of the story. Herbert always had great imaginative ideas that were not limited by his ability to express them. He will spend as much time or explanation on any detail or person as he needs, then sweep them aside like they never mattered just a bit later. The idea was always first! It took me some time to realize this, but Herbert always rushes the endings. He builds this incredible set of worlds and then, at the very end, he gets impatient and gets it over with. It's not as bad as Peter F. Hamilton, but it's there. I guess it takes a lot of determination and planning to keep a consistent pace throughout a book.

  I am sure you will be curious to know if this book, published in 1966 together with two other novels, just a year after Dune, is anything like the book that made Herbert famous. It is. People are cloned in axolotl tanks, organizations form around their approach to the solution of life: technical minded cyborgs, sterile immortals manipulating genes, couriers developing humanistic methods of communication and analysis. Some of the inner thoughts put on the page, the tool that made me fall in love with Dune in the first place, are there. There is also that pervasive general idea of the strong coupling between environment and life. Somehow I want Herbert to come back and write books in the Starcraft or Alien universes, I am sure he would have loved those worlds.

  Bottom line: not a perfect book and feeling a bit dated - note that I did compare it with work written three or four decades before - but still entertaining and evocative of Herbert's general ideas and style. Pandora is coming next, all four books.


  1966 was a prolific year for Frank Herbert. The year before he had published Dune, and now he won a Hugo for it and published three more novels: the first book of the Pandora series (Destination: Void), The Eyes of Heisenberg and the book I am reviewing now, The Green Brain. It features a lot of the recurrent ideas of ecology versus politics, how the environment defines and shapes life, including people, warnings about the human abuse of nature and the deeper interactions between people - complete with inner thoughts, Dune-style.

  However, the book feels rough. The plot is immediately revealed by both title and early scenes, the female character is pretty much a joke and, while the premise is great, the execution is rather bland: characters appear in some chapters and then are completely forgotten, and most of the book is a pointless trip through a jungle. I liked it, but I can't help but feel that it was something partially written in the past that got published only because Dune was a hit.

  I can only recommend it for Herbert fans, because judged on its own it's pretty average and has a lot of unfulfilled potential.


  The Dragon in the Sea could have been a story about real life submariners as, other than a few details really, the novel is barely science fantasy. The story is about a near future in which the West and the East are in an eternal Cold War where no one trusts anyone because of deeply embedded sleeper agents and where conflict is fought in the ocean between sophisticated nuclear submarines over underwater oil reserves. Places like the British Isles have been nuked into oblivion and the big prize is bringing home petrol syphoned from the other side.

  The entire action of the book happens in such a submarine, tasked to go through enemy lines and extract oil from a hidden reserve. There are no chapters, just one long and action filled story. Yet the focus is not so much on the world or the technology, although both are described pretty well, but on the characters, on why and how they function, on what such a prolonged and tense conflict can do to people's psyche. The main character is indeed a psychologist, while also an electronics specialist, in a crew of four - including the captain.

  The careful analysis of character motivation and inner thoughts is reminiscent of Dune, as is the idea of a global conflict over a finite resource affecting the entire ecology and sociology of the planet, and of extreme peril changing people to their core. Ten years before Frank Herbert published Dune, its seeds were clearly already planted.

  To me it was a fascinating read. It was one nonstop trip filled with danger, but the author was clearly interested in how the characters were functioning under extreme stress and how it translated at a very visceral and atavistic level. It was a combination of action and psychoanalysis, still a bit unpolished, but deep and insightful. I liked how Herbert hinted at what the world had come to by just placing a few crumbs of information in an otherwise uninterrupted sub adventure. Imagine Das Boot, but with a socioeconomic and psychological message in it. I liked it! 


  Unpublished Stories, published in 2016, is a collection of 13 short stories written by Frank Herbert and never published during his lifetime, only two of them sci-fi. One can see Herbert's focus on the characters, on their motivations and their inner thoughts, and on the way their actions affect the whole.

  The collection consists of:

  • The Cage - a soldier is sent to a psych ward after a head injury where he is tortured by a sadistic caretaker under the threat of pinning some mental illness on him
  • The Illegitimate Stage - a couple of play stage professionals are hired to materialize a play written by a wealthy sponsor, then start to form a bond with the hapless woman
  • A Lesson in History - a husband experiences the tension of remembering his war days and his mistress then, while having to hide all signs from his wife
  • Wilfred - a story about the total psychopathic transformation of a man and the bafflement of society around him
  • The Iron Maiden - a young soldier begs for advice from his more experienced friend on how to woo the girl he is in love with
  • The Wrong Cat - a woman is terrorized by a murderous madman
  • The Yellow Coat - a cowardly man becomes stronger from pushing through danger and trauma, but no one believes it
  • The Heat's On - a fireman investigates a strange series of deaths by fire
  • The Little Window - an unexpected event shakes the owner of a shoe shop and his nephew from their complacency
  • The Waters of Kan-E -  a story of survival in the Polynesian ocean
  • Paul's Friend - another story about survival at sea
  • Public Hearing - a scientist explains to helpless politicians that their armed power has become obsolete when everyone can build a world destroying weapon
  • The Daddy Box - an alien device starts fixing humanity by starting small

 Even without any actual connection to Dune, there is evidence of the seeds of the novel in many of the stories within. For example in A Lesson in History, there is the idea that a woman can discern the thoughts of a man from tiny disparate actions and gestures. In The Yellow Coat a man's psyche is transformed by adversity. In Public Hearing the weapon described is very similar to a Dune lasgun, while The Daddy Box features a way to change a society by tackling the basics of the family unit.

  The stories are short and the collection is not a big book. If you are interested in how Frank Herbert's mind worked, this is something that is worth reading, even though none of the stories inside is really that special. I enjoyed the book, but without my interest in the author I would probably not have recommended it to anyone.


  Oh, the disappointment! Considering I loved Frank Herbert's Dune books and I've read them repeatedly, I was expecting to at least like something from his son's books set in the same universe. I mean, how bad could it be? He even wrote it in collaboration with a seasoned writer. Well, bad! I hated everything: the writing, the world which is completely different from Frank's, but mostly that Brian Herbert seems to have missed the point of Dune completely!

  Gone are the superhuman abilities of people that had ten millennia to evolve, after escaping A.I. annihilation and brutally training themselves on hostile planets to become the best version of a human being. Gone are the thoughtful insights into people, the careful dialogues, the grand visions. What we get instead is formulaic trope after formulaic trope, the standard writing style taught by hacks in most "writing classes" in the U.S., dull characters, boring writing, dumb people, unneeded attention to technology and little to worldbuilding or character development, cramming all storylines and possible characters and references to the original books together. And then there are the things that readers did not learn about the Dune universe until the sixth book, casually blurted out in a prequel, just because Brian wanted to check all the boxes.

  I mean, there were moments when something was happening, like a full Reverend Mother assessing the situation in a dangerous context. And I was thinking "It's on now! She will come with brilliant insights, impossible strategies, use her..." and Brian started to describe the lighting in the room! Consider that this book has a lot going for it in terms of source material. I love the original Dune books, so each reference, each character, each world, each culture that existed in those books should have anchored me to this one. But even so I damn well couldn't finish it. After three weeks of forcing myself to read it I have barely reached the halfway point. No more!

  I am not unreasonable. I know that probably Brian Herbert was pushed to be a writer, even if he didn't have the skills or maybe even the drive. I know that people are not instantly good at what they are doing and that after a shitty book they have the opportunity to grow when writing the next ones. There are 25 books and comics in the Dune universe now! When the hell did he write all of those? Surely at least some of them would be good. But this first one is so bad, so incredibly bland, that I have no desire to read anything written by Brian Herbert ever again, except perhaps the biography of his father. I mean, at least he will have been invested in that one, right? He can't murder his dad's story like he did his legacy!

  I would rather (and I actually plan to) reread everything Frank Herbert ever wrote than try another butchery of Dune by Brian Herbert.


Ancient Enemy is one of those card games that abstract a journey of discovery and battle. You charge your magic by playing Solitaire "combos", then fire at the enemy. The choices you make on your journey don't matter at all, they are just levels to pass through that barely differ from each other. That's the entire game!

So why did I play it? Well, because the sound and the texts that my character was "saying" were intriguing. Ironically enough, the game had a "Skip story" button, when in fact the story was the only thing that interested me - I wanted a "Skip game" button. Alas, the end of the journey was a complete letdown, with a generic enemy that presented no challenge and a blunt and uninspired story ending.

Honestly, when I was playing it I thought: anybody can make games and sell them if this is a Steam game that people pay money for. Just look at the official site of this 2018 game: it looks like it was made in 2000.

Bottom line: fascinating how soundscape can make even the most boring games hold one's interest. Here is a gameplay video:

[youtube:dcglX1KP4XQ]


  So I was happily minding my own business after a production release, only for everything to go BOOM! Apparently, maybe because of something we did, but maybe not, the production servers were running out of memory. The exception looked something like this:

System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
 at System.Reflection.Emit.TypeBuilder.SetMethodIL(RuntimeModule module, Int32 tk, Boolean isInitLocals, Byte[] body, Int32 bodyLength, 
    Byte[] LocalSig, Int32 sigLength, Int32 maxStackSize, ExceptionHandler[] exceptions, Int32 numExceptions, Int32[] tokenFixups, Int32 numTokenFixups)
 at System.Reflection.Emit.TypeBuilder.CreateTypeNoLock()
 at System.Reflection.Emit.TypeBuilder.CreateType()
 at System.Xml.Serialization.XmlSerializationReaderILGen.GenerateEnd(String[] methods, XmlMapping[] xmlMappings, Type[] types)
 at System.Xml.Serialization.TempAssembly.GenerateRefEmitAssembly(XmlMapping[] xmlMappings, Type[] types, String defaultNamespace, Evidence evidence)
 at System.Xml.Serialization.TempAssembly..ctor(XmlMapping[] xmlMappings, Type[] types, String defaultNamespace, String location, Evidence evidence)
 at System.Xml.Serialization.XmlSerializer.GenerateTempAssembly(XmlMapping xmlMapping, Type type, String defaultNamespace, String location, Evidence evidence)
 at System.Xml.Serialization.XmlSerializer..ctor(Type type, XmlAttributeOverrides overrides, Type[] extraTypes, 
     XmlRootAttribute root, String defaultNamespace, String location, Evidence evidence)
 at System.Xml.Serialization.XmlSerializer..ctor(Type type, XmlAttributeOverrides overrides)

At first I thought there was something else eating away the memory, but the exception was repeatedly thrown at this specific point. And I did what every senior dev does: googled it! And I found this answer: "When an XmlSerializer is created, an assembly is dynamically generated and loaded into the AppDomain. These assemblies cannot be garbage collected until their AppDomain is unloaded, which in your case is never." It also referenced the Microsoft article KB886385 from 2007 which, of course, didn't exist at that URL anymore, but I found it archived by some nice people.

What was going on? I would tell you, but Gergely Kalapos explains things much better in his article How the evil System.Xml.Serialization.XmlSerializer class can bring down a server with 32Gb ram. He also explains what commands he used to debug the issue, which is great!

But since we already know links tend to vanish over time (so much for stuff on the Internet living forever), here is the gist of it all:

  • XmlSerializer generates dynamic code (as dynamic assemblies) in its constructors
  • the most used constructors of the class have a caching mechanism in place:
    • XmlSerializer.XmlSerializer(Type)
    • XmlSerializer.XmlSerializer(Type, String)
  • but the others do not, so every time you use one of those you create, load and never unload another dynamic assembly

I know this is an old class in an old framework, but some of us still work in companies that are firmly rooted in the middle ages. Also since I plan to maintain my blog online until I die, it will live on the Internet for the duration.

Hope it helps!


Update: apparently, the site is back on!

Now for the previous content:

  With sadness I found out that Z-Library, a web site filled with (pirated) free books and science papers, has been shut down by the FBI, in collaboration with Argentina, Amazon and Google. Strange how much effort can go into closing down sites with actual information on them.

  Not unexpectedly, the site operators were Russian, Anton Napolsky and Valeriia Ermakova, and were apprehended in Argentina by local forces, although it's not clear if they will be extradited to the U.S. or not. We know from numerous examples that the American government is terrified of technical people with access to information and the ability to disseminate it and banks on making examples of them as much as possible, so while I am rooting for them, I doubt the two will ever be truly free again.

  The investigation was of course helped by Google and Amazon, but it's pretty clear that the duo did not use any smart way of protecting their identity. Apparently, they just got too annoying and were crushed as soon as the will to stop them materialized.

  It is not clear exactly if all of the Z-Library team have been identified and stopped. Perhaps the web site will surface once again. For certain, some alternative will pop up sooner or later as books are small and easy to share.

  Z-Library started as a clone of Library Genesis, another similar site, but it clearly grew bigger and better. Too big for its own good.


  Recently I've been to Madeira, a Portuguese island colony with a powerful blend of nature, culture, feudal interests and overall corruption. The game Alba: A Wildlife Adventure made me feel like I was back!

  First of all, this is a game that was clearly done with a lot of heart. Every aspect of the gameplay is fun, positive and well crafted in all the things that matter. After playing the game, not only was I happier and more content, but I felt refreshed by imagining the pride the creators must have felt designing and finishing development. There is NO sense whatsoever of cutting corners, trying to money grab or following any political agenda (other than wildlife conservation, if you count that one). In spirit, it reminded me of the Sierra Entertainment Games.

  The plot is also very refreshing: you are playing a little girl whose grandparents are interested in nature and have inspired her to care about life as well. Returning to a fictional Spanish island of her early childhood, Alba will clean up the island, photograph all kinds of animals, help them survive and thrive and fight commercial interests that threaten the island's wildlife. The gameplay is casual - to the point that whenever you don't find something you are looking for, it means you just have to end the day and it will magically appear in the future for you - and has you exploring a lovely island filled with birds, beautiful flora and a tropical climate.

  What amazed me most was that there are 51 species of animals portrayed in the game and every one of them behaves like the actual live species. The walk, the flight patterns, the speed, the sounds, they are all very precise (within the confines of a pretty simplistic graphical interface). Then there are the little things, like when you have to make a yes/no decision you use the mouse to bob the head vertically or horizontally. There are no places where your 3D character gets hung up on some wall, caught between obstacles, or ends up seeing or passing through objects. Even the interaction with objects takes into account where the player is looking, it's not just a lazy area effect where you can just move around and press the action button aimlessly.

  Another nice thing is that I can imagine small children playing this game and feeling inspired and empowered, while at the same time adults understand the ironic undertones of some of the scenes. Stuff like the little girl gathering garbage from the ground and putting it in the bin, then talking to a native of the island sitting on a bench nearby and complaining about random stuff. Or everybody condescendingly commending the girl for her efforts, but mostly doing nothing to help. Or even phrases like "oh, yeah, it was a great idea to fix the stairs. Why didn't I do it before? Oh, well, it's good you did it".

  The only possible complaint I might have is that the game is pretty linear and the replay value is small to nonexistent. But for the few hours that it takes to finish (which was good for me since I am... married), it was refreshing and calming.

  I admit, I am not a gamer. This might be just one of a large category of similar games for all I know, but I doubt it. I worked in the video game industry and I am willing to bet this game is as special as I feel it is. I warmly recommend it.

  Here is the gameplay trailer for the game:

[youtube:a-Eu9WE3grA]

  There is a common task in Excel that seems like it should have a very simple solution. Alas, when googling for it you get all these inexplicably crappy "tutorial" sites that either show you something completely different or something that you cannot actually do because you don't have the latest version of Office. Well, enough of this!

  The task I am talking about is just selecting a range of values and concatenating them using a specified separator, what in a programming language like C# is string.Join or in JavaScript you get the array join function. I find it very useful when, for example, I copy a result from SQL and I want to generate an INSERT or UPDATE query. And the only out of the box solution is available for Office 365 alone: TEXTJOIN.

  You use it like =TEXTJOIN(", ", FALSE, A2:A8) or =TEXTJOIN(", ", FALSE, "The", "Lazy", "Fox"), where the parameters are:

  • a delimiter
  • a boolean to determine if empty cells are ignored
  • a series of text values or a range of cells
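
  Since I compared this to the JavaScript array join function, here is a rough sketch of what the three parameters do, written in JavaScript. The name textJoin and the choice to treat arrays as if they were cell ranges are my own assumptions for the illustration; this is not how Excel implements it.

```javascript
// Hypothetical helper mirroring the three TEXTJOIN parameters:
// a delimiter, an "ignore empty" flag and any number of values or arrays
// (arrays stand in for cell ranges here, which is an assumption of this sketch)
function textJoin(delimiter, ignoreEmpty, ...values) {
    // flatten so that separate values and nested arrays are handled the same way
    const flat = values.flat(Infinity);
    // optionally drop empty strings, null and undefined entries
    const kept = ignoreEmpty
        ? flat.filter(v => v !== '' && v !== null && v !== undefined)
        : flat;
    // Array.prototype.join renders null/undefined as empty strings
    return kept.join(delimiter);
}

console.log(textJoin(', ', false, 'The', 'Lazy', 'Fox'));
console.log(textJoin(', ', true, ['a', '', 'b']));
```

  With the flag set to true the empty entry simply disappears from the result, while with false it leaves an empty slot between two delimiters.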

  But, you can have this working in whatever version of Excel you want by just using a User Defined Function (UDF), one specified in this lovely and totally underrated Stack Overflow answer: MS Excel - Concat with a delimiter.

  Long story short:

  • open the Excel sheet that you want to work on 
  • press Alt-F11 which will open the VBA interface
  • insert a new module
  • paste the code from the SO answer (also copy pasted here, for good measure)
  • press Alt-Q to leave
  • if you want to save the Excel with the function in it, you need to save it as a format that supports macros, like .xlsm

And look at the code. I mean, it's ugly, but it's easy to understand. What other things could you implement that would just simplify your work and allow Excel files to be smarter, without having to code an entire Excel add-in? I mean, I could just create my own GenerateSqlInsert function that would handle column names, NULL values, etc. 

Here is the TEXTJOIN mimicking UDF to insert in a module:

Function TEXTJOIN(delim As String, skipblank As Boolean, arr)
    Dim d As Long
    Dim c As Long
    Dim arr2()
    Dim t As Long, y As Long
    t = -1
    y = -1
    If TypeName(arr) = "Range" Then
        arr2 = arr.Value
    Else
        arr2 = arr
    End If
    On Error Resume Next
    t = UBound(arr2, 2)
    y = UBound(arr2, 1)
    On Error GoTo 0

    If t >= 0 And y >= 0 Then
        For c = LBound(arr2, 1) To UBound(arr2, 1)
            For d = LBound(arr2, 2) To UBound(arr2, 2)
                If arr2(c, d) <> "" Or Not skipblank Then
                    TEXTJOIN = TEXTJOIN & arr2(c, d) & delim
                End If
            Next d
        Next c
    Else
        For c = LBound(arr2) To UBound(arr2)
            If arr2(c) <> "" Or Not skipblank Then
                TEXTJOIN = TEXTJOIN & arr2(c) & delim
            End If
        Next c
    End If
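    ' strip the trailing delimiter, but only if anything was joined at all
    If Len(TEXTJOIN) > 0 Then
        TEXTJOIN = Left(TEXTJOIN, Len(TEXTJOIN) - Len(delim))
    End If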
End Function

Hope it helps!

  I got lazy about the blog and it shows. I rarely write about anything technical and the state of the existing posts is pretty poor. I want to improve this, but I don't have a lot of time, so I would certainly appreciate feedback on what you find fixable (in general or for particular posts) or new features that you might like.

  I've updated the file structure of the blog, working mostly on fixing the comments. It should now be easier to add structured text in a comment, like HTML, code, links, etc. I also tried to fix the old comments (there were three different sources of comments until now, with some duplicated).

  Please let me know if you see issues with adding comments or reading existing comments. Or perhaps missing comments.

  I've added the link to a post you can comment in for general issues (see the top right section of the blog, the last icon is for comments).

  Thanks!

  I haven't been working on the Sift string distance algorithm for a while, but then I was reminded of it because someone wanted to use it to suggest corrections to user input. Something like Google's "Did you mean...?" or like an autocomplete application. And it got me thinking of ways to use Sift for bulk searching. I am still thinking about it, but in the meanwhile, this can be achieved using the Sift4 algorithm, with up to 40% improvement in speed over the naïve comparison with each item in the list.

  Testing this solution, I've realized that the maxDistance parameter did not work correctly. I apologize. The code is now fixed on the algorithm's blog post, so go and get it.

  So what is this solution for mass search? We can use two pieces of knowledge about the problem space:

  • the minimum possible distance between two strings of length l1 and l2 will always be abs(l1-l2)
    • it's very easy to understand the intuition behind it: one cannot generate a string of size 5 from a string of size 3 without at least adding two new letters, so the minimum distance would be 2
  • as we advance through the list of strings, we have a best distance value that we keep updating
    • this maps very well onto the maxDistance option of Sift4

  Thus armed, we can find the best matches for our string from a list using the following steps:

  1. set a bestDistance variable to a very large value
  2. set a matches variable to an empty list
  3. for each of the strings in the list:
    1. compare the minimum distance between the search string and the string in the list (abs(l1-l2)) to bestDistance
      1. if the minimum distance is larger than bestDistance, ignore the string and move to the next
    2. use Sift4 to get the distance between the search string and the string in the list, using bestDistance as the maxDistance parameter
      1. if the algorithm reaches a temporary distance that is larger than bestDistance, it will break early and report the temporary distance, which we will ignore
    3. if distance<bestDistance, then clear the matches list and add the string to it, updating bestDistance to distance
    4. if distance=bestDistance, then add the string to the list of matches

  When using the common Sift4 version, which doesn't compute transpositions, the list of matches is retrieved 40% faster on average than by simply searching through the list of strings and updating the distance (about 15% faster with transpositions). Considering that Sift4 is already a lot faster than Levenshtein, this method will allow searching through hundreds of thousands of strings really fast. The gained time can be used to further refine the matches list using a slower, but more precise algorithm, like Levenshtein, only on a much smaller set of possible matches.
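
  As an illustration of that last refinement step, here is a minimal sketch that re-ranks an already small list of matches with a plain Levenshtein distance. The function names levenshtein and refineMatches are made up for this example, and the Levenshtein version is the textbook two-row dynamic programming one, not something tuned for speed.

```javascript
// Textbook Levenshtein distance, keeping only two rows of the matrix
function levenshtein(a, b) {
    // distances from the empty prefix of a to every prefix of b
    let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
    for (let i = 1; i <= a.length; i++) {
        const curr = [i]; // deleting i characters turns a prefix into ''
        for (let j = 1; j <= b.length; j++) {
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;
            curr[j] = Math.min(
                prev[j] + 1,       // deletion
                curr[j - 1] + 1,   // insertion
                prev[j - 1] + cost // substitution or match
            );
        }
        prev = curr;
    }
    return prev[b.length];
}

// Hypothetical second pass: sort the short list of candidates
// by the slower, more precise distance
function refineMatches(search, matches) {
    return matches
        .map(word => ({ word, distance: levenshtein(search, word) }))
        .sort((x, y) => x.distance - y.distance);
}

console.log(refineMatches('cat', ['cart', 'cat', 'dog']));
```

  The point is only that the expensive function runs on a handful of candidates instead of the whole word list.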

  Here is a sample written in JavaScript, where we search a random string in the list of English words:

const search = getRandomString(); // this is the search string
let matches=[];             // the list of found matches
let bestDistance=1000000;   // the smallest distance to our search found so far
const maxOffset=5;          // a common value for searching similar strings
const l = search.length;    // the length of the search string
for (let word of english) {
    const minDist=Math.abs(l-word.length); // minimum possible distance
    if (minDist>bestDistance) continue;    // if too large, skip this word
    const dist=sift4(search,word,maxOffset,bestDistance);
    if (dist<bestDistance) {
        matches = [word];                  // new array with a single item
        bestDistance=dist;
        if (bestDistance==0) break;        // if an exact match, we can exit (optional)
    } else if (dist==bestDistance) {
        matches.push(word);                // add the match to the list
    }
}

  There are further optimizations that can be added, beyond the scope of this post:

  • words can be grouped by length and the minimum distance check can be done on entire buckets of strings of the same lengths
  • words can be sorted, and when a string is rejected as a match, reject all strings with the same prefix
    • this requires an update of the Sift algorithm to return the offset at which it stopped (to which the maxOffset must be added)
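
  To give an idea of what the first of these optimizations could look like, here is a rough sketch of grouping words by length. The names buildLengthBuckets and searchBuckets are invented for this example, and distanceFn is assumed to be any function taking the two strings and a maximum distance, for example a wrapper like (a, b, max) => sift4(a, b, 5, max). Visiting the buckets closest in length first is my own choice here, so that the best distance shrinks early and entire buckets can be skipped.

```javascript
// Group the words by length once, up front
function buildLengthBuckets(words) {
    const buckets = new Map();
    for (const word of words) {
        const bucket = buckets.get(word.length);
        if (bucket) bucket.push(word);
        else buckets.set(word.length, [word]);
    }
    return buckets;
}

// distanceFn(search, word, maxDistance) is an assumed signature,
// it stands in for Sift4 or any other string distance
function searchBuckets(search, buckets, distanceFn) {
    let matches = [];
    let bestDistance = Infinity;
    // order the bucket lengths by how far they are from the search length
    const lengths = [...buckets.keys()].sort(
        (a, b) => Math.abs(a - search.length) - Math.abs(b - search.length)
    );
    for (const len of lengths) {
        const minDist = Math.abs(search.length - len); // same for the whole bucket
        if (minDist > bestDistance) break;             // later buckets are even further
        for (const word of buckets.get(len)) {
            const dist = distanceFn(search, word, bestDistance);
            if (dist < bestDistance) {
                matches = [word];
                bestDistance = dist;
            } else if (dist === bestDistance) {
                matches.push(word);
            }
        }
    }
    return { matches, bestDistance };
}
```

  The minimum distance check now runs once per bucket instead of once per word, and the loop stops as soon as a bucket is too far away in length to matter.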

  I am still thinking of performance improvements. The transposition table gives more control over the precision of the search, but it's rather inefficient and resource consuming, not to mention adding code complexity, making the algorithm harder to read. If I can't find a way to simplify and improve the speed of using transpositions I might give up entirely on the concept. Also, some sort of data structure could be created - regardless of how much time and space is required, assuming that the list of strings to search is large and constant and the number of searches will be very big.

  Let me know what you think in the comments!


  Have you ever heard the saying "imitation is the sincerest form of flattery"? It implies that one copying another values something in the other person. But often enough people just imitate what they want, they pick and choose, they imitate poorly or bastardize that which they imitate. You may imitate the strategy a hated opponent uses against you or make a TV series after books that you have never actually read. I am here to argue that satire cannot be misused like that.

  Remember when Star Trek: Lower Decks first appeared? The fast-talking, meme-driven show, filled with self-deprecating jokes, with characters typical of coastal cities of the United States and, worst of all, something that made fun of Star Trek? After having idiots like J. J. Abrams completely muddle the spirit of Trek, now come these coffee-drinking, groomed-beard, bun-haired hipsters to destroy what little holiness is left! People were furious! In fact, I remember some being rather adamant that The Orville is an unacceptable heresy against Star Trek.

  Yet something happened. Not immediately, it took a few episodes, sometimes a season, for the obvious jokes to be made, the frustrations exhausted, for characters to grow. And then there it was: true Star Trek, with funny characters following the spirit of the original concept. No explosions, no angry greedy violent people imposing their culture over the entire universe, but rather explorers of the unknown, open to change and new experiences, navigating their own flaws as humans in a universe larger than comprehension. And also honest and funny!

  It was the unavoidable effect of examining something thoroughly for a longer period of time. One has to understand what they satirize in order to make it good. Not just skimming the base ideas, not just reading the summaries of others. Satire must go deep into the core of things, the causes not just the effects, the expressions, the patterns, the in-jokes. Even when you are trying to mock something you hate, like a different ideology or a political or religious belief, you can only do it for a really short time before you become really bad at what you are doing, a sad caricature applauded by people just as clueless as you are, all attempting to disguise anger by faking amusement. If you do it well and long enough, every satire makes you understand the other side.

  Understanding something does not imply accepting it, but either accepting or truly fighting something requires understanding. You want a tool to fight divisiveness, this artificial polarization grouping people into deaf crowds shouting at each other? That's satire! The thing that would appeal to all sides, for very different reasons, yet providing them with a common field on which to achieve communication. If jokes can defuse tension in an argument between two people, satire can do that between opposing groups.

  And it works best with fiction, where you have to create characters and then keep them alive as they develop in the environment you have created, but not necessarily. I've watched comedians making political fun of "the other side" for seasons on end. They lost me every time when they stopped paying attention and turned joke to insult, examination to judgement. But before that, they were like a magnifying glass, both revealing and concentrating heat. At times, it was comedians who brought into rational discussion the most serious of news, while the news media was wallowing in political messaging and blind allegiance to one side or the other. When there is no real journalism to be found, when truth is hidden, polluted or discouraged, it is in jokes that we continue the conversation.

  So keep it coming, the satire, the mocking, the ridicule. I want to see books like Harry Potter and the Methods of Rationality, shows like Big Mouth and The Orville and ST: Lower Decks, movies like Don't Look Up! Give me low budget parodies of Lovecraft and Tolkien and James Bond and Ghostbusters and Star Wars and I guarantee you that by the second season they will be either completely ignored by the audience and cancelled or better than the "official" shows, for humor requires a sharp wit and a clear view of what you're making fun of.

  Open your eyes and, if you don't like what you see, make fun of it! Replace shouting with laughter, outrage with humor, indifference with amusement. 

  Today I had a very interesting discussion with a colleague who optimized my work in Microsoft's SQL Server by replacing a table variable with a temporary table. Which is annoying, since I've done the opposite plenty of times, thinking that I was choosing the best solution. After all, temporary tables have the overhead of being stored in tempdb, on the disk. What could possibly be wrong with using table variables? I believe this table explains it all:

First of all, the storage is the same. How? Well, table variables start off in memory, but if they go above a limit they get saved to tempdb! Another interesting bit is the indexes. While you can create primary keys on table variables, you can't use other indexes - that's OK, though, because you would hardly need very complex table variables. But then there is the parallelism: none for table variables! As you will see, that's rather important. At least table variables don't cause recompilations. And last, but certainly not least, perhaps the most important difference: statistics! You don't have statistics on table variables.

Let's consider my scenario: I was executing a stored procedure and storing the selected values in a table variable. This SP had the single reason to filter the ids of records that I would then have to extract - joining them with a lot of other tables - and could return 200, 800 or several hundred thousand rows.

With a table variable this means:

  1. when inserting potentially hundreds of thousands of rows I would have no parallelism (slow!) and it would probably save it to tempdb anyway (slow!)
  2. when joining other tables with it, not having statistics, it would just treat it like a short list of values, which it potentially wasn't, and loop through it: Table Spool (slow!)
  3. various profiling tools would show the same or even less physical reads and the same SQL server execution time, but the CPU time would be larger than execution time (hidden slow!)

This situation has been improved considerably in SQL Server 2019, to the point that in most cases table variables and temporary tables show the same performance, but in earlier versions the difference is more pronounced.

And then there are hacks. For my example, there is a reason why parallelism DOES occur:

So are temporary tables always better? No. There are several advantages of table variables:

  1. they get cleared automatically at the end of their scope
  2. they result in fewer recompilations of stored procedures
  3. they need less locking and fewer resources, since they don't have transaction logs

For many simple situations, like where you want to generate some small quantity of data and then work with that, table variables are best. However, as soon as the data size or scenario complexity increases, temporary tables become better.

As always, don't believe me, test! In SQL everything "depends". You can't rely on fixed rules like "X is always better", so profile your particular scenarios and see which solution is better.

Hope it helps!