
  Winter Tide, the first book of the series, was a refreshing blend of Lovecraftian Mythos and a perspective focused on balance and peace rather than solving problems through power. A year and a half ago, when I reviewed it, I said I enjoyed reading it, but also that it was a bit slow in the beginning.

  Deep Roots has a few things going against it. For one, the novelty wore off. Then, the characters are not reintroduced to the reader and there are no flashbacks or summaries, so I no longer had any idea who everyone was. Finally, it has almost the same structure as the first book, but without introducing new lore elements, instead just popping up new characters, as if keeping track of the existing ones wasn't enough work. It starts slow and the pace only picks up towards the end. All this made it hard for me to finish the book and maintain interest in the story.

 This time, Aphra and her motley crew need to stop the Outer Ones, ancient creatures with immense power, from saving humanity from extinction. Yes, it is a worthy goal, but they want to do it by enslaving and controlling Earth, treating us like the impetuous children that we are. With such a cosmic threat I would have expected cosmic scenes, powerful emotions, explosive outcomes, but it was all very civilized and ultimately boring.

  Bottom line: Ruthanna Emrys' fresh perspective persists in this second installment of the Innsmouth Legacy series, but it isn't fresh anymore. I am sure the experience is much better if you read Winter Tide just before it. As it stands, Deep Roots reads like a slightly boring detective story with some mystical elements sprinkled in. I don't regret reading it, but I don't feel the need for more.


  Just wanted to give you a heads up that Siderite's blog now has a Discord server: you can easily talk to me and other readers who have connected there by clicking on the Discord icon on top of the search box. I have only just heard of this Discord app, but it seems to allow this kind of free chat server with an invite link. Let me know if anything goes wrong with it.

  Try it! 


  One can take a container in which there is water and keep pouring oil in and after a time there will be more oil than water. That's because oil is hydrophobic, it "fears water" in a direct translation of the word. You can then say that the percentage of oil is higher than the percentage of water, that there is more oil in the container. Skin color in a population doesn't work like that, no matter how phobic some people are. Instead of water and oil, it's more like paint. One can take a container in which there is white paint and keep pouring black, red, yellow and brown paint in, but from a very early stage, that paint is no longer white.

  I keep finding these statistics about which part of the world is going to have Whites in a minority after a while. Any statistic counting people by skin color is purist in nature and, as we should know by now, the quest for purity begets violence. The numbers are irrelevant if the basis of these statistics is conceptually wrong. In a truly open and diverse population, white skin color should disappear really quickly. The only chance for it to persist is if people with white skin never mingled with people of any other color.

  What is a White person? Someone who has white skin? Someone who has European ancestry? Someone who has no ancestry that is not European? Are Jews white? How about Coptic Egyptians? Some Asians are really white, too. There is no argument that uses the concept of White which is not directly dependent on the idea of racial purity. And then there is Non-White. A few days ago someone was noting that it feels weird to use the term Latino, considering how many different countries and interests are represented by the people labeled as such. So how can anyone meaningfully use a term like Non-White, which groups together Black people, Mexicans, Chinese, Indians, Eskimos and Native-Americans, among many others? Two "African-American" people of identical skin color may be as different as someone can imagine: one a many-generations American with slave ancestors, the other a middle-class African recently arrived in the US.

  What I am saying is that the most politically correct terms, used (and imposed) by proponents (and arbiters) of racial justice and equality, are as purist as they could be. The only argument that one can possibly bring here is that purism is somehow different and distinct from racism. This is absurd. One can be a purist and not be racist, but not the other way around. In fact, when people are trying to limit your freedom of expression because some of your words or concepts may be offensive, they are in fact fighting for a purity of ideas, one that is not marred by the specific idea of purity they are against. These are similar patterns, so similar in fact that I can barely see a difference. No wonder this kind of thinking has taken root most strongly in a country where a part of its founders were called Puritans!

  So how about we change the rhetoric to something that does not imply segregation or a quest for purity or a war on something or cancelling other people or creating safe spaces or hating something that is other? And the phrase above is not ironic, since I am not proposing we fight against these kinds of ideas, only that we acknowledge their roots and come up with new ones. Let us just grow in different directions, rather than apart.


   This is one of those books that sounds a lot better summarized than it reads. In The Sorcerer of the Wildeeps we discover a brutal medieval and magical world, where magic is related to people from the stars who are considered gods, but may just be very technologically advanced humans. Yet this is just a vague backdrop for a two-hundred-and-something-page novel. The main focus is on a guy who alternates between a very rough vernacular Black English and technobabble which no one else seems to understand, who has magical powers and is part of a gang of fighters that work as security for caravans. He is a demigod, the blood of the star people courses through his veins, but he hides that part of himself from the world. He is also in love with another guy who has the blood, and much of the text concerns this gay relationship, which he also hides from the world. There are also some brutal fight scenes, but they don't bring anything to the story other than making it a fantasy.

  I saw other people just as confused as I am. Maybe there was some subtlety that I missed and that is why so many people praise this first novel of Kai Ashante Wilson, a guy who started writing in 2013. Why are the reviews so wonderful? To me it felt like an above-average pulp story, akin to those about cowboys riding dinosaurs. The writing style is also difficult enough to make the book less entertaining than it could have been. It's like Wilson rubs our noses into some intellectual shit that I can't even smell. Or is it just another mediocre book that gets positive political reviews because it promotes Black culture and features gay people?

  So my conclusion is that I did not enjoy the book. It took me ages to finish it and I couldn't relate to any of the characters. I thought the world was very interesting though, which makes this even more frustrating, since it was barely explored.


  It has been a long time since I've finished a book. I just didn't feel like it, instead focusing on stupid things like the news. It's like global neurosis: people glued to their TV screens listening to what is essentially the same thing: "we have no control, we don't know enough and we feel better bitching about it instead of doing anything to change it". I hope that I will be able to change my behavior and instead focus on what really matters: complete fiction! :)

  Anyway, Unfettered is a very nice concept thought up by Shawn Speakman: a contribution-based anthology book. Writers provide short stories, complete with a short introduction, as charity. The original Unfettered book was a way for writers to help Speakman cover some of his medical expenses after a cancer diagnosis, and the idea continued, helping others with the same problem. This way of doing things, I believe, promotes a more liberating way of writing. There is no publisher pressure, no common theme; writers are just exploring their own worlds, trying things out.

  Unfettered III contains 28 short stories from authors like Brandon Sanderson, Lev Grossman, Mark Lawrence, Terry Brooks, Brian Herbert, Scott Sigler and more. Funnily enough, it was Sanderson's own addition to the Wheel of Time literature that I found most tedious to finish, mostly because I couldn't remember what the books were about anymore and who all the characters were. But the stories were good and, even if the book is twice as large as I think it should have been, it was entertaining. Try it out, you might enjoy this format.


  There is this feeling in the online community that no matter what governments and corporations do, we will find ways of avoiding restrictions and remain free. Nowhere is this feeling stronger than in the media and software piracy circles. Yet year after year people get more and more complacent, moving from having to find and download the content they enjoy to streaming services that end up asking for more money than TV and cinema combined, moving from desktop games to mobiles, switching from software you own to software you subscribe to and lease. And every year more and more "hydra heads" get cut and none grow back.

  Today we say goodbye to HorribleSubs.info, a web site that provided a free archive of torrent/magnet links for hundreds of anime shows subtitled in English. "You could technically say COVID killed HorribleSubs.", the notice on the web site now says. If you think about it, it was difficult to understand how the site survived for so long, when my blog was closed for showing a manga image taken from Google and a YouTube video, after a copyright request from Japan. How could these guys maintain a directory of almost every popular anime and get away with it? But they will be missed, regardless of the real reason for their disappearance.

  It's hard to say how this will affect people. TorrentFreak hasn't even written anything about it yet. Will this mean that less translated anime will be available? Or maybe even make it harder to find anime at all? It's a shocking development... Hail Hydra!


  I've read today this CNN article: 'Star Trek: Discovery' to introduce history-making non-binary and transgender characters. And it got me thinking about what this means for the Star Trek universe. It means absolutely nothing. Star Trek has had people turned into other species, duplicated, merged, their genetic code altered and fixed, made young and old. It has had species with no gender, multiple genders and various philosophies. It has interspecies relationships, including sexual ones.

  Star Trek has tackled intolerance many times, usually by showing the Federation crew coming into contact with an alien species that does the same things we do today, in caricature. It tackled racial intolerance, from Kirk's kiss with Uhura to the episode with the species that was black on one side and white on the other, discriminating against those whose colors were the other way around. It tackled gender discrimination in multiple situations. It tackled sex change and identity change with the Trill. It featured multi-sex civilisations. The happy tolerance train seems to stop with anything related to using inorganic technology with the human body, but no one is perfect and Janeway was awful with everybody.

  A person who is biologically a man yet desires to be treated as a woman would be normal for Star Trek. It would be inconsequential. If they go the way of the oppressed member of another culture that the crew meets, they will not solve anything; they will just have another weird alien around, which defeats the purpose. If they go with a non-binary crewmember, they should not acknowledge the fact except in passing. Yes, habituate the public to the concept, let them see it in a series and get used to it, but the people in Star Trek should already have passed that point. Hell, they could go with a person who changes their sex every once in a while, to spice things up.

  What I would do is have a character who is clearly of a different sex than the gender they identify with and someone badgering them to have a proper sex change while they refuse. Now that would be a Star Trek worthy dilemma. You want to make it spicy? Have them go to the doctor and change their race instead, behave like a black person while wearing the high tech equivalent of blackface. What? Are you refusing someone the ownership of their identity?

  I really doubt they will go that way, though. Instead they will find some way of bringing the subject up again and again and again and throw it in our faces like a conflict that has to be resolved. In the bright and hopeful future, there should be no conflict about it! This CBS announcement should not have existed. You want to put some transgender people in? OK, put them in. It's not a boasting point, is it? The announcement basically does the opposite of what it claims to do: "Oh, look, we put non binary people in our series! How quaint! Hurrah! Only we do it, come watch the freak show!".

  Please, writers, please please please, don't just write stories then change the gender or race of characters because it's en vogue. Stop it with the gender swapping, which is the creative equivalent of copy and paste. Write with the story in mind, with the context, with the characters as they would normally behave. Don't add characters after you've thought of the story just to make them diverse either. Just write stories with characters that make sense! You don't know people from that demographic? Find one, spend time with them, then adjust your characters accordingly. I am so tired of tiny female action heroes, flamboyant and loud gays and the wise old lesbian. How come no one finds those offensive? It's like someone said "OK, we will have shitty black and female and non-cis characters for now. When people get used to them, we will actually have them do something and be realistic and perhaps in 2245 we'll even have them be sympathetic".

  They tried the woke way from the very beginning in Discovery, with the Stamets/Culber gay couple. They kept showing them kissing and brushing their teeth together and other stuff like that, when it made little difference to the story. Most people on Star Trek are written as single, for some weird reason that makes no sense, unless their relationship furthers the story. Riker and Troi could be the exception, though, yet even they were not kissy-kissy on the bridge all the time. I never understood that couple. Dax and Worf made more sense, for crying out loud! And remember Starfleet is a military organization. You may put women and men and trans people and aliens and robots together in a crew, but their role is to do their job. Their sex, and even less their gender, makes no difference.

  Gene Roddenberry was a dreamer of better futures, where all of our idiotic problems have been left behind and reason prevailed, but even he imagined a third World War leading to humanity changing its ways as a start. Star Trek has always analysed the present from the viewpoint of an idyllic future, a way of looking back that is inherently rational: "Imagine the future you want, then wonder what would people from that time think of you". It's brilliant! Don't break that to bring stupid into the future. To tackle present social issues you have to first be a Trekkie, already there in the exalted future, before you consider the dark ages of the 21st century with a fresh perspective.

  I've just read a medical article that seems to be what we have been looking for since this whole Covid thing started: a detailed explanation of what the virus does in the body. And no, it didn't come from doctors in lab coats, it came from a supercomputer analysing statistical data. Take that, humans! Anyway... First of all, read the article: A Supercomputer Analyzed Covid-19 — and an Interesting New Theory Has Emerged. And before you go all "Oh, it's on Medium! I don't go to that crap, they use a paywall!", know that this is a free article. (Also, you can read anything on Medium if it seems to be coming from Twitter.)

  Long story short (you should really read the article, though): the virus binds to the ACE2 receptors and degrades them, then tricks the body into making even more ACE2 receptors (even in organs that normally don't express them as much) to get even more virus in. The virus also tweaks the renin–angiotensin system, which leads to a bradykinin storm, which causes multiple symptoms consistent with what is seen in hospitals and leaves many a doctor stumped: dry cough, blood pressure changes, leaky blood vessels, a gel filling one's lungs (making ventilators ineffective), tiredness, dizziness and even loss of smell and taste. Also, because of a genetic quirk of the X chromosome, women are less affected, which is also shown in the statistical data on severe cases.

  Quoting from the article: several drugs target aspects of the RAS and are already FDA approved to treat other conditions. They could arguably be applied to treating Covid-19 as well. Several, like danazol, stanozolol, and ecallantide, reduce bradykinin production and could potentially stop a deadly bradykinin storm. Others, like icatibant, reduce bradykinin signaling and could blunt its effects once it’s already in the body.

  Good stuff, people! Good stuff! The person responsible for this is Daniel A Jacobson and his research assistants should take all the credit! Just kidding.

  But how new is this? Bradykinin is not an unknown peptide and we have known from the very beginning what ACE does and that Covid binds to it. My limited googling shows doctors noticing this as soon as the middle of March. In fact, the original article that the Medium article is based on is from July 7! Here is a TheScientist take on it: Is a Bradykinin Storm Brewing in COVID-19?

  For more info, here is a long video talking about the paper: Bradykinin Storm Instead of Cytokine Storm?

[youtube:tDbRfur36sE]

  If you really are into medicine, check this very short but very technical video about Bradykinin, from where I also stole the image for this post: Bradykinin | Let the Drama begin!

[youtube:d39-IcoWHkY]

  I hope this provided you with some hope and a starting point for more research of your own.


  For a more in-depth exploration of the concept, read Towards generic high performance sorting algorithms

Sorting

  Consider QuickSort, an algorithm that uses a divide and conquer strategy to sort efficiently and is a favourite in computer implementations.

  It consists of three steps, applied recursively:

  1. find a pivot value
  2. reorder the input array so that all values smaller than the pivot are followed by values larger or equal to it (this is called Partitioning)
  3. apply the algorithm to each part of the array, before and after the pivot

  QuickSort is considered generic, meaning it can sort any type of item, assuming the user provides a comparison function between any two items. A comparison function has the same specific format: compare(item1,item2) returning -1, 0 or 1 depending on whether item1 is smaller, equal or larger than item2, respectively. This formalization of the function lends more credence to the idea that QuickSort is a generic sorting algorithm.
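
  To make the comparison-based interface concrete, here is a minimal JavaScript sketch of such a generic QuickSort (my own illustration, using the naive last-element pivot, not a reference implementation):

// minimal generic QuickSort sketch, driven by a compare(a,b) -> -1/0/1 function
function quickSort(arr, compare, lo = 0, hi = arr.length - 1) {
  if (lo >= hi) return arr;
  const pivot = arr[hi];                  // naive pivot choice: the last value
  let i = lo;
  for (let j = lo; j < hi; j++) {         // partition: smaller values go before the pivot
    if (compare(arr[j], pivot) < 0) {
      [arr[i], arr[j]] = [arr[j], arr[i]];
      i++;
    }
  }
  [arr[i], arr[hi]] = [arr[hi], arr[i]];  // place the pivot in its final position
  quickSort(arr, compare, lo, i - 1);     // recurse on both sides of the pivot
  quickSort(arr, compare, i + 1, hi);
  return arr;
}

// usage: quickSort([23,1,31,0,5,26,15], (a, b) => a < b ? -1 : a > b ? 1 : 0)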

  Multiple optimizations have been proposed for this algorithm, including using insertion sort for small enough array segments, different ways of choosing the pivot, etc., yet the biggest problem was always finding the optimal way to partition the data. The original algorithm chose the pivot as the last value in the input array; the average complexity was O(n log n), but the worst case scenario was O(n^2), when the array was already sorted and the pivot was the largest value. Without extra information you can never find the optimal partitioning schema (which would be to choose the median value of all items in the array segment you are sorting).

  But what if we turn QuickSort on its head? Instead of providing a formalized comparison function and fumbling to get the best partition, why not provide a partitioning function (from which a comparison function is trivial to obtain)? This would allow us to use the so-called distribution based sorting algorithms (as opposed to comparison based ones) like Radix, BurstSort, etc., which have a complexity of O(n), in a generic way!

  My proposal for a formal signature of a partitioning function is partitionKey(item,level) returning a byte (0-255) and the sorting algorithm would receive this function and a maximum level value as parameters.

  Let's see a trivial example: an array of values [23,1,31,0,5,26,15] using a partition function that returns the digits of the numbers. You would use it like sort(arr,partFunc,2) because the values are two-digit numbers. Let's explore a naive Radix sort (a code sketch follows this worked example):

  • assign 256 buckets, one for each possible value of the partition function result, and start at the maximum (least significant) level
  • put each item in its bucket for the current level
  • concatenate the buckets
  • decrease the level and repeat the process

Concretely:

  • level 1: 23 -> bucket 3, 1 -> 1, 31 -> 1, 0 -> 0, 5 -> 5, 26 -> 6, 15 -> 5 results in [0,1,31,23,5,15,26]
  • level 0: 0 -> 0, 1 -> 0, 31 -> 3, 23 -> 2, 5 -> 0, 15 -> 1, 26 -> 2 results in [0,1,5,15,23,26,31]

Array sorted. Complexity is O(n * k) where k is 2 in this case and depends on the type of values we have, not on the number of items to be sorted!
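
  Here is a JavaScript sketch of this naive, generic Radix sort, assuming the partitionKey(item,level) signature proposed above (the function and variable names are only illustrative):

// naive generic Radix sort sketch: partitionKey(item, level) must return a value in 0..255
// and maxLevel is the number of levels (level 0 being the most significant)
function radixSort(arr, partitionKey, maxLevel) {
  let result = arr;
  for (let level = maxLevel - 1; level >= 0; level--) {   // least significant level first
    const buckets = Array.from({ length: 256 }, () => []);
    for (const item of result) {
      buckets[partitionKey(item, level)].push(item);      // distribute items into buckets
    }
    result = [].concat(...buckets);                       // concatenate the buckets
  }
  return result;
}

// the two-digit example from above: the partition function returns the decimal digit at 'level'
const digitAt = (num, level) => Math.floor(num / Math.pow(10, 1 - level)) % 10;
radixSort([23, 1, 31, 0, 5, 26, 15], digitAt, 2);         // [0, 1, 5, 15, 23, 26, 31]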

  More complex distribution sorting algorithms, like BurstSort, optimize their performance by using a normal QuickSort inside small enough buckets. But QuickSort still requires an item comparison function. Well, it is easy to infer one: if partFunc(item1,0) is smaller or larger than partFunc(item2,0), then item1 is smaller or larger than item2. If the partition function values are equal, increase the level and compare partFunc(item1,1) to partFunc(item2,1), and so on.
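
  A small JavaScript sketch of that inferred comparison function (again assuming the same partitionKey signature):

// build a classic compare function out of a partition function and its maximum level
function compareFromPartition(partitionKey, maxLevel) {
  return function (item1, item2) {
    for (let level = 0; level < maxLevel; level++) {      // most significant level first
      const k1 = partitionKey(item1, level);
      const k2 = partitionKey(item2, level);
      if (k1 < k2) return -1;
      if (k1 > k2) return 1;
    }
    return 0;                                             // equal on all levels
  };
}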

  In short, any distribution sorting algorithm can be used in a generic way provided the user gives it a partitioning function with a formalized signature and a maximum level for its application.

  Let's see some example partitioning functions for various data types (a couple of JavaScript sketches follow the list):

  • integers from 0 to N - maximum level is log256(N) and the partition function will return the bytes in the integer from the most significant to the least
    • ex: 65534 (0xFFFE) would return 255 (0xFF) for level 0 and 254 (0xFE) for level 1. 26 would return 0 and 26 for the same levels.
  • integers from -N to N - similarly, one could return 0 or 1 for level 0 if the number is negative or positive or return the bytes of the equivalent positive numbers from 0 to 2N 
  • strings that have a maximum length of N - maximum level would be N and the partition function would return the value of the character at the same position as the level
    • ex: 'ABC' would return 65, 66 and 67 for levels 0,1,2.
  • decimal or floating point or real values - more math intensive functions can be found, but a naive one would be to use a string partitioning function on the values turned to text with a fixed number of digits before and after the decimal separator.
  • date and time - easy to turn these into integers, but also one could just return year, month, day, hour, minute, second, etc based on the level
  • tuples of any of the types above - return the partition values for the first item, then the second and so on and add their maximum levels
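
  As an illustration, here is how the non-negative integer and fixed-length string cases might look in JavaScript (the helper names are mine, and the string version assumes single-byte character codes):

// partition function for integers from 0 to N: return the bytes of the number,
// most significant first, over log256(N) levels
function intPartitionKey(maxLevel) {
  return (num, level) => Math.floor(num / Math.pow(256, maxLevel - 1 - level)) % 256;
}
// ex: with maxLevel 2, 65534 (0xFFFE) -> 255 at level 0 and 254 at level 1; 26 -> 0 and 26

// partition function for strings of maximum length N: return the character code
// at the position given by the level (0 past the end of shorter strings)
const stringPartitionKey = (str, level) => level < str.length ? str.charCodeAt(level) : 0;
// ex: 'ABC' -> 65, 66, 67 for levels 0, 1, 2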

  One does not have to invent these functions; they would be provided to the user, based on standard types, by code factories. Yet even these code factories would be able to encode more information about the data to be sorted than mere comparison functions. Things like the minimum and maximum value can be computed by going through all the values in the array to be sorted, but why do that if the user already has this information?

  Assuming one cannot find a fixed length for the values to be sorted on, as with real values or strings of arbitrary length, consider this type of sorting as a first step that orders the array as much as possible, then use something like insertion or bubble sort on the result.

Finding a value or computing distinct values

  As an additional positive side effect, there are other operations on lists of items that are considered generic because they take a function with a formalized signature as a parameter. Often found cases include finding the index of an item in a list equal to a given value (thus determining if the value exists in the list) and getting the distinct values from an array. They use an equality function as a parameter, formalized as returning true or false. Of course, a comparison function could be used, depending on whether its result is 0 or not, but a partitioning function can also be used to determine equality: two items are equal if all of the bytes returned on all of the levels are equal.

  But there is more. The format of the partition function can be used to create a hash set of the values, thus reducing the complexity of the search for a value from O(n) to O(log n) and that of getting distinct values from O(n^2) to O(n log n)!
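
  A rough JavaScript sketch of that last idea, deriving a lookup key from the partition function so that searching and getting distinct values become dictionary operations (illustrative only, not production code):

// use the partition function to derive a hash key, then use a Map for lookup and distinct
function partitionHashKey(item, partitionKey, maxLevel) {
  const bytes = [];
  for (let level = 0; level < maxLevel; level++) {
    bytes.push(partitionKey(item, level));
  }
  return bytes.join(',');                        // two items are equal if all levels are equal
}

function distinct(arr, partitionKey, maxLevel) {
  const seen = new Map();
  for (const item of arr) {
    const key = partitionHashKey(item, partitionKey, maxLevel);
    if (!seen.has(key)) seen.set(key, item);     // keep the first item seen for each key
  }
  return [...seen.values()];
}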

  In short, all operations on lists of items can be brought together and optimized by using the same format for the function that makes them "generic": that of a partitioning function.

Conclusion

  As you can see, I am rather proud of the concepts I've explained here. Preliminary tests in Javascript show a 20-fold improvement in performance for ten million items when using RadixSort over the default sort. I would really love feedback from someone who researches algorithms and can test these assumptions under proper benchmark settings. These ideas being as complex as they are, I will probably write multiple posts on the subject, trying to split it (partition it?) into easily digestible bits.

 The concept of using a generic partitioning function format for operations on collections is a theoretical one at the moment. I would love to collaborate with people to get this to production level code, perhaps taking into account advanced concepts like minimizing cache misses and parallelism, not only the theoretical complexity.

 More info and details at Towards generic high performance sorting algorithms


  There are a lot of fascinating ideas and anecdotes in this book, especially in areas I wouldn't have considered interesting before reading it. Rabid is the type of book that I love, both because the subject is fascinating and because of the effort the authors made to research and write the content in a digestible format.

  In this book Bill Wasik and Monica Murphy describe the history of the rabies virus and how it affected humankind culturally, historically and, of course, medically. We learn that there is a strong possibility that the myths of the vampire and the werewolf stem from the behaviour of people affected by rabies, the theme of a beast biting a person and turning them into one of its own proving irresistible even in times when no one understood how diseases work. Was Hector rabid when fighting Achilles? Were berserkers affected by rabies? Then we go into the actual zoonotic origin of the virus, a staggering 60% of infectious diseases affecting humans being of animal origin initially. An idea I found extremely interesting is that farmers took over from hunter-gatherers in so little time and so thoroughly because raising animals exposed them to new diseases against which they developed immunity, so any contact with non-farming populations proved fatal to the latter. Finally, there is a very nice perspective on Louis Pasteur, who is more popularly renowned for developing pasteurization, and thus providing us with better-tasting drinks, than for his final triumph: a vaccine for rabies and an institute dedicated to studying infectious diseases.

  Bottom line: it might sound like a weird subject to read about or at least one hard to digest. The authors' writing is very good, the research splendid, and the book short enough to not take too much of the reader's time. I recommend it!


  The Book of the Ancestor trilogy consists of Red Sister, Grey Sister and Holy Sister, books that tell the continuous story of Nona, a girl with magical powers who is trained as a warrior nun by the church on a feudal world called Abeth. It feels almost the same as the Harry Potter books: a school for children where they learn only exciting stuff like magic and fighting, and where the group of friends that coalesces around the main character has to solve more and more complex and dangerous problems. And it pretty much has the same issues, as any of the actors in the story could have easily handled a little child regardless of her powers because... she's a child! Also, the four "houses" are here replaced with genetic lines that provide their bearers with various characteristics.

  Anyway, I liked all three books, although I have to say that I liked them less and less as the ending approached. Tools used to solve some problems were not used for similar issues later on, the girls were learning more and more stuff and becoming more and more powerful while all of their opponents seemed unable to reach their level even with greater numbers and funding and, maybe worst of all, whenever it was inconvenient to detail the evolution of the characters and the story, Mark Lawrence just skips to some point in the future. Thus, each of the last two books is separated from the previous one by two years!

  Another qualm that I have with the series is that the author spent a lot of effort to create a magical world, with a dying sun and with a vague history that may or may not have involved spaceships and an alien race, with various magical tools that can be combined to various and epic effects, with several kingdoms, each different from another. Then the story ends, as if all we could or should ever care about is what happens to Nona.

  Bottom line: if you liked Harry Potter, you might want to read this series. It pushes the same buttons, while getting less and less consistent as more stuff is added, then leaving you wanting more of the world that was described, even if you didn't especially like the characters or their choices.


  The Broken Ladder is a sociology book that is concise and to the point. I highly recommend it. Keith Payne's thesis is that most of the negative issues we associate with poverty or low income are statistically proven to be more correlated with inequality and status. And this is not just a human thing, as animal studies show that it is a deeply rooted behavior of social animals like monkeys, with a genetic component that can be demonstrated in creatures as simple as fruit flies.

  There are nine chapters in the book, each focusing on a particular characteristic or effect of social inequality. We learn that just having a sum of money or a set of resources available is meaningless to the individual. What matters instead is how different those resources are from those of other people in the same group. Inequality leads to stress, which in turn leads to toxic behaviors, health problems and developmental issues. It leads to risk taking and to polarization in politics, it affects lifespan, it promotes conspiracy theories, religious extremism and racism.

  It is a short enough book that there is no reason for me to summarize it here. I believe it's a very important work to examine, as it touches on many problems that are very much present, even timeless. Written in 2017, it feels like an explanatory pamphlet for what gets all the media attention in 2020.


  I have to admit this is a strange time for me, in which I struggle with finishing any book. My mind may drift away at a moment's notice, thoughts captured by random things that inflame my interest. And with limited time resources, some things fall through the cracks, like the ability to dedicate myself to books that don't immediately grab my attention. Such a book is The Ten Thousand Doors of January.

  And you know the type. It's one of those where the way the words sound as you read them is more important than what they say, a sort of magical white poetry that attempts to evoke rather than tell, to make you feel rather than reason, while also maintaining a rigorous intellectual style. Alix E. Harrow is a good writer and it shows, however she is too caught up in her own writing. This story features a girl with an ability that manifests when she writes words of power. She is an avid reader and, in order to learn about her capabilities, she receives a book that tells the story of another girl who was similar to her. And the style of that book is, you guessed it, words crafted to evoke rather than tell.

  So at about 40% of the book nothing had happened other than a girl of color living in a house of plenty, but imprisoned by rules and starved of knowledge or power. Her captor and adoptive father is a white and powerful aristocrat, cold as ice and authoritative in every action or word, while she is a good girl caught between her desires and her upbringing. I've read books like this before and I liked some of them a lot. And this may yet evolve into a beautiful story, but as I was saying above, I am not in the mood for paying that much attention before something happens.

  In conclusion, while I get a feeling of being defeated and a desire to continue reading the book, I also have to accept I don't have the resources to struggle with it. I would rather find a more comfortable story for me at this time.

Intro

  There is a saying that the novice will write code that works, without thinking of anything else, the expert will come and rewrite that code according to good practices, and the master will rewrite it so that it works again, thinking of everything. It applies particularly well to SQL. Sometimes good and well-tried best practices fail in specific cases, and one must guide themselves either by precise measurements or by narrow rules that take decades to learn.

  If you ever wondered why some SQL queries are very slow or how to write complex SQL stored procedures without them reaching sentience and behaving unpredictably, this post might help. I am not a master myself, but I will share some quick and dirty ways of writing, then checking your SQL code.

Some master rules

  First of all, some debunking of best practices that make unreasonable assumptions at scale:

  1. If you have to extract data based on many parameters, then add them as WHERE or ON clauses and the SQL engine will know how to handle it.

    For small queries and for well designed databases, that is correct. The SQL Server engine attempts to create execution plans for these parameter combinations and reuse them in the future on other executions. However, when the number of parameters increases, the number of possible parameter combinations increases exponentially. The execution optimization should not take more than the execution itself, so the engine just chooses one of the existing plans which appears most similar to the given parameters. Sometimes this results in abysmal performance.

    There are two solutions:

    The quick and dirty one is to add OPTION (RECOMPILE) to the parameterized SELECT query. This tells the engine to always ignore existing execution plans. Since SQL Server 2016 there is a new feature called Query Store, plus a graphical interface that explores execution plans, so one can choose which ones are good and which ones are bad. If you have the option, you might manually force an execution plan on specific queries as well. But I don't recommend this, because it is a brittle and nonintuitive solution: you need a DBA to make sure the associations are correct and maintained properly.

    The better one, to my own surprise, is to use dynamic SQL. In other words, if you have 20 parameters to your stored procedure, with only some getting used at any time (think an Advanced Search page), create an SQL string only with the parameters that are set, then execute it.

    My assumption has always been that the SQL engine will do this for me if I use queries like WHERE (@param IS NULL OR <some condition with @param>). I was disappointed to learn that it does not always do that. Be warned, though, that most of the time multiple query parameters are optimized by running several operations in parallel, which is best!

  2. If you query on one column or another column, an OR clause will be optimal. 

    Think of something like this: You have a table with two account columns AccId and AccId2. You want to query a lot on an account parameter @accountId and you have added an index on each column.

    At this time the more readable option, and for small queries readability is always preferable to performance improvement, is WHERE AccId=@accountId OR AccId2=@accountId. But how would the indexes be used here, in this OR clause? First the engine will have to find all entries with the correct AccId, then again find entries with the correct AccId2, but only the entries that have not been found in the first search.

    First of all, SQL will not do this very well when the WHERE clause is very complex. Second of all, even if it did it perfectly, if you know there is no overlap, or you don't care, or you can use a DISTINCT further on to eliminate duplicates, then it is more effective to have two SELECT queries, one for AccId and the other for AccId2, which you then UNION ALL.

    My assumption has always been that the SQL engine will do this automatically. I was quite astounded to hear it was not true. Also, I may be wrong, because different SQL engines and their multitude of versions, compounded with the vast array of configuration options for both engine and any database, behave quite differently. Remember the parallelism optimization, as well.

  3. Temporary tables are slow, use table variables instead.

    Now that is just simple logic, right? A temporary table uses disk while a table variable uses memory. The second has to be faster, right? In the vast majority of cases this will be true. It all depends (a verb used a lot in SQL circles) on what you do with it.

    Using a temporary table might first of all be optimized by the engine to not use the disk at all. Second, temporary tables have statistics, while table variables do not. If you want the SQL engine to do its magic without your input, you might just have to use a temporary table.

  4. A large query that does everything is better than small queries that I combine later on.

    This is a more common misconception than the others. The optimizations the SQL engine does work best on smaller queries, as I've already discussed above, so if a large query can be split into two simpler ones, the engine will be more likely to find the best way of executing each. However, this only applies if the two queries are completely independent. If they are related, the engine might find the perfect way of getting the data in a query that combines them all.

    Again, it depends. One other scenario is when you try to DELETE or UPDATE a lot of rows. SQL is always "logging" the changes that it does on the off chance that the user cancels the query and whatever incomplete work has been done has to be undone. With large amounts of data, this results in large log files and slow performance. One common solution is to do it in batches, using UPDATE TOP (10000) or something similar inside a WHILE loop. Note that while this solves the log performance issue, it adds a little bit of overhead for each executed UPDATE.

  5. If I have an index on a DATETIME column and I want to check the records in a certain day, I can use CAST or CONVERT.

    That is just a bonus rule, but I've met the problem recently. The general rule is that you should never perform calculations on columns inside WHERE clauses. So instead of WHERE CAST(DateColumn as DATE)=@date use WHERE DateColumn>=@date AND DateColumn<DATEADD(DAY,1,@date). The calculation is done (once) on the parameters given to the query, not on every value of DateColumn. Also, indexes are now used.

Optimizing queries for dummies

So how does one determine if one of these rules applies to their case? "Complex query" might mean anything. Executing a query multiple times results in very different timings based on how the engine is caching the data or computing execution plans.

A lot of what I am going to say can be performed using SQL commands as well. Someone might want to use direct commands inside their own tool to monitor and manage the performance of SQL queries. But what I am going to show you uses SQL Management Studio and, better still, not that horrid Execution Plan chart that often crashes SSMS and is hard to visualize for anything but the simplest queries. Downside? You will need SQL Management Studio 2014 or higher.

There are two buttons in the SSMS menu. One is "Include Actual Execution Plan", which generates an ugly and sometimes broken chart of the execution. The other is "Include Live Query Statistics", which seems to do the same thing, only in real time. However, the magic happens when both are enabled. In the Results tab you get not only the query results, but also tabular data about the execution performance. It is amazingly useful, as you get a table for each intermediary query; for example, if you have a stored procedure that executes several queries in a row, you get a table for each.

Even more importantly, it seems that using these options will start the execution without any cached data or execution plans. Running it several times gives consistent execution times.

In the LiveQuery tables, the values we are interested in are, in order of importance, EstimateIO, EstimateCPU and Rows.

EstimateIO tells us how much the disk was used. The disk is the slowest part of a computer, especially when multiple processes are running queries at the same time. Your objective is to minimize that value. Luckily, on the same row, we get data about the substatement that generated it: which parameters were used, which index was used and so on. This post is not about how to fix every single scenario, but about how to determine where the biggest problems lie.

EstimateCPU says how much processing power was used. Most of the time this is very small, as complex calculations should not be performed in queries anyway, but sometimes a large value here shows a fault in the design of the query.

Finally, Rows. It is best to minimize the value here, too, but it is not always possible. For example a COUNT(*) will show a Clustered Index Scan with Rows equal to the row count in the table. That doesn't cause any performance problems. However, if your query is supposed to get 100 rows and somewhere in the Live Query table there is a value of several millions, you might have used a join without the correct ON clause parameters or something like that.

Demo

Let's see some examples of this. I have a Main table, with columns ID BIGINT, Random1 INT, Random2 NVARCHAR(100) and Random3 CHAR(10), holding one million rows, and an Ind table, with columns ID BIGINT, Qfr CHAR(4) and ValInd BIGINT, holding 10000 rows. The Ind ID column matches the Main table ID column and the Qfr column has only three possible values: AMT, QTY, Sum.

Here is a demo on how this would work:

DECLARE @r1 INT = 1300000
DECLARE @r2 NVARCHAR(100) = 'a'
DECLARE @r3 CHAR(10) = 'A'
DECLARE @qfr CHAR(4) = 'AMT'
DECLARE @val BIGINT = 500000

DECLARE @r1e INT = 1600000
DECLARE @r2e NVARCHAR(100) = 'z'
DECLARE @r3e CHAR(10)='Z'
DECLARE @vale BIGINT = 600000

SELECT *
FROM Main m
INNER JOIN Ind i
ON m.ID=i.ID
WHERE (@r1 IS NULL OR m.Random1>=@r1)
  AND (@r2 IS NULL OR m.Random2>=@r2)
  AND (@r3 IS NULL OR m.Random3>=@r3)
  AND (@val IS NULL OR i.ValInd>=@val)
  AND (@r1e IS NULL OR m.Random1<=@r1e)
  AND (@r2e IS NULL OR m.Random2<=@r2e)
  AND (@r3e IS NULL OR m.Random3<=@r3e)
  AND (@vale IS NULL OR i.ValInd<=@vale)
  AND (@qfr IS NULL OR i.Qfr=@qfr)

I have used 9 parameters, each with their own values, to limit the number of rows I get. The Live Query result is:

You can see that the EstimateIO values are non-zero only on the Clustered Index Scans, one for each table. Here is how the StmtText looks: "|--Clustered Index Scan(OBJECT:([Test].[dbo].[Ind].[PK__Ind__DEBF89006F996CA8] AS [i]),  WHERE:(([@val] IS NULL OR [Test].[dbo].[Ind].[ValInd] as [i].[ValInd]>=[@val]) AND ([@vale] IS NULL OR [Test].[dbo].[Ind].[ValInd] as [i].[ValInd]<=[@vale]) AND ([@qfr] IS NULL OR [Test].[dbo].[Ind].[Qfr] as [i].[Qfr]=[@qfr])) ORDERED FORWARD)".

This is a silly case, but you can see that the @parameter IS NULL type of query condition has not been removed, even when the parameter is clearly not null.

Let's change the values of the parameters:

DECLARE @r1 INT = 300000
DECLARE @r2 NVARCHAR(100) = NULL
DECLARE @r3 CHAR(10) = NULL
DECLARE @qfr CHAR(4) = NULL
DECLARE @val BIGINT = NULL

DECLARE @r1e INT = 600000
DECLARE @r2e NVARCHAR(100) = NULL
DECLARE @r3e CHAR(10)=NULL
DECLARE @vale BIGINT = NULL

Now the Live Query result is:

Same thing! 5.0 and 7.2

Now, let's do the same thing with dynamic SQL. It's a little more annoying, mostly because of the parameter syntax, but check it out:

DECLARE @sql NVARCHAR(Max)

DECLARE @r1 INT = 300000
DECLARE @r2 NVARCHAR(100) = NULL
DECLARE @r3 CHAR(10) = NULL
DECLARE @qfr CHAR(4) = NULL
DECLARE @val BIGINT = NULL

DECLARE @r1e INT = 600000
DECLARE @r2e NVARCHAR(100) = NULL
DECLARE @r3e CHAR(10)=NULL
DECLARE @vale BIGINT = NULL


SET @sql=N'
SELECT *
FROM Main m
INNER JOIN Ind i
ON m.ID=i.ID
WHERE 1=1 '
IF @r1 IS NOT NULL SET @sql+=' AND m.Random1>=@r1'
IF @r2 IS NOT NULL SET @sql+=' AND m.Random2>=@r2'
IF @r3 IS NOT NULL SET @sql+=' AND m.Random3>=@r3'
IF @val IS NOT NULL SET @sql+=' AND i.ValInd>=@val'
IF @r1e IS NOT NULL SET @sql+=' AND m.Random1<=@r1e'
IF @r2e IS NOT NULL SET @sql+=' AND m.Random2<=@r2e'
IF @r3e IS NOT NULL SET @sql+=' AND m.Random3<=@r3e'
IF @qfr IS NOT NULL SET @sql+=' AND i.Qfr=@qfr'
IF @vale IS NOT NULL SET @sql+=' AND i.ValInd<=@vale'

PRINT @sql

EXEC sp_executesql @sql,
  N'@r1 INT, @r2 NVARCHAR(100), @r3 CHAR(10), @qfr CHAR(4),@val BIGINT,@r1e INT, @r2e NVARCHAR(100), @r3e CHAR(10),@vale BIGINT',
  @r1,@r2,@r3,@qfr,@val,@r1e,@r2e,@r3e,@vale

Now the Live Query results are:

At first glance we have not changed much. IO is still 5.0 and 7.2. Yet there are 3 fewer execution steps, there is no parallelism and the query has been executed in 5 seconds, not 6. The StmtText for the same scan is now: "|--Clustered Index Scan(OBJECT:([Test].[dbo].[Ind].[PK__Ind__DEBF89006F996CA8] AS [i]), ORDERED FORWARD)". The printed SQL command is:

SELECT *
FROM Main m
INNER JOIN Ind i
ON m.ID=i.ID
WHERE 1=1  AND m.Random1>=@r1 AND m.Random1<=@r1e

Conclusion

Again, this is a silly example, but with some results anyway! In my work I have used this to make a stored procedure work three to four times faster!

One can optimize usage of IO, CPU and Rows by adding indexes, by narrowing join conditions, by reducing the complexity of executed queries, eliminating temporary tables, partitioning existing tables, adding or removing hints, removing computation from queried columns and so many other possible methods, but they amount to nothing if you cannot measure the results of your changes.

By using Actual Execution Plan together with Live Query Statistics you get:

  • consistent execution times and disk usage
  • a clear measure of what went on with each subquery

BTW, you get the same effect if you use SET STATISTICS PROFILE ON before the query. Yet I wrote this post with someone who doesn't want to go into extra SQL code in mind. Also, when calculating performance, it is recommended to add a DBCC FREEPROCCACHE line before execution OR to add OPTION (RECOMPILE) to your query (this doesn't work on a stored procedure execution; you would have to change the SP queries to include RECOMPILE).

I wish I had some more interesting examples for you, guys, but screenshots from the workplace are not something I want to do and I don't do any complex SQL work at home. I hope this helps. 

  When I was looking at Javascript frameworks like Angular and ReactJS I kept running into these weird reducers that were used mostly in state management. It all felt so unnecessarily complicated, so I didn't look too closely into it. Today, reading some random post on dev.to, I found this simple and concise piece of code that explains it:

// simple to unit test this reducer
function maximum(max, num) { return Math.max(max, num); }

// read as: 'reduce to a maximum' 
let numbers = [5, 10, 7, -1, 2, -8, -12];
let max = numbers.reduce(maximum);

Kudos to David for the code sample.

The reducer, in this case, is a function that can be fed to the reduce function, which is known to developers in Javascript and a few other languages, but is foreign to .NET developers. In LINQ, we have Aggregate!

// simple to unit test this Aggregator ( :) )
Func<int, int, int> maximum = (max, num) => Math.Max(max, num);

// read as: 'reduce to a maximum' 
var numbers = new[] { 5, 10, 7, -1, 2, -8, -12 };
var max = numbers.Aggregate(maximum);

Of course, in C# Math.Max is already a reducer/Aggregator and can be used directly as a parameter to Aggregate.

I found a lot of situations where people used .reduce instead of a normal loop, which is why I almost never use Aggregate, but there are situations where this kind of syntax is very useful. One would be in functional programming or LINQ expressions that then get translated or optimized into something else before execution, like SQL code (I don't know if Entity Framework translates Aggregate, though). Another would be where you have a bunch of reducers that can be used interchangeably, as in the sketch below.
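
Something like this trivial sketch, where the same aggregation code accepts whichever reducer it is given:

// reducers with the same (accumulator, value) shape can be swapped freely
const sum = (acc, num) => acc + num;
const maximum = (acc, num) => Math.max(acc, num);
const minimum = (acc, num) => Math.min(acc, num);

function aggregate(numbers, reducer) {
  return numbers.reduce(reducer);
}

const numbers = [5, 10, 7, -1, 2, -8, -12];
aggregate(numbers, sum);     // 3
aggregate(numbers, maximum); // 10
aggregate(numbers, minimum); // -12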