
  In a world where humans have solved the issues with biological-electronic interfacing you have people, electronically enhanced people, biologically enhanced robots and robots. One of these part biological robots is thinking for itself and... that's the story in All Systems Red. Some corporate shenanigans, some shooting, some world building, but in the end I wasn't charmed by the characters, the idea or the world itself. Probably it all becomes better in the next (at least five) books written by Martha Wells in the same series, but I don't think I am going to follow through.

  Don't get me wrong, I enjoyed reading the book. It was fun, it was pulp, it was short, but I didn't feel that need for more when it ended.

  One of the things that turn me off from AI stories is when they act and feel and think exactly like a human. In this book in particular, this makes sense somewhat, because the main character is a mix of electronics and biological tissue, but I felt no real difference between the bio-robot and the robo-human characters. System AIs were stupid and robotic while Murderbot is watching TV shows for fun because... it has a skin?

  I can only assume that further down the line they discover it's a Robocop-like situation, which might fix this obvious issue with the story, but frankly I don't care.

  Bottom line: a short fun read that led me nowhere, but was good while on vacation.


  I only remember about Ready Player One that it was fun and pleasant to read, with kids exploring a virtual universe of cultural references to reach the magical MacGuffin. Ready Player Two is almost none of that, instead being boring, by the numbers and most of it written as exposition. It's like Sorento tried to write a Ready Player One book. I really did not like it. What was Ernest Cline thinking?!

  The exposition writing style is the thing that annoyed me first. You know when you are reading a book and it has to explain something that happened in a previous book, so it takes some well placed paragraphs to talk in the past tense about that? Well, this book starts with a third of it written like this. A complete third of the book is just exposition! And maybe it would have been OK if it were fun exposition, but no. It basically says "remember the good fun we had in the other book and the glorious feeling of victory? Well, that all went to shit immediately".

  It then proceeds to explain (also in past tense) how two incredibly sci-fi things just... happened: first a complete machine to brain interface that is just there and you can put it on your head and then... an interstellar starship?! Which, BTW, does nothing for the entire book. It's an impossible to believe part of the story that then has no impact on it.

  Since the Oasis is basically Meta, with a working metaverse, the author does some lazy mental gymnastics to explain how it is still a good thing and how Wade is not Zuckerberg. Only it fails completely. I mean, we are meant to believe Wade temporarily joins the dark side only to recover later, while still remaining a positive character, but he comes off as a hypocrite who has no actual control over himself or what happens. After reading the first half of the book you hope Zuckerberg is going to take over, because Wade is so much worse. And then, the antagonist and a new quest are revealed by matter-of-factly presenting another impossible technological leap.

  No. This book is a total failure. Every character (including the wonderful do-gooder Samantha, voice of conscience and princess of awesome) is unlikeable, the writing style is amateurish and reads like an accountant explaining in a board meeting what has happened, while the plot is full of holes and deus ex machinas. But worst of all, by far, is that the book is not fun at all.


  Clive Thompson is a technology journalist and therefore perfectly positioned to write a book about how digital technology really affects us. Does it destroy the world? No! Instead, it makes it better. Most of the time and if used well. In Smarter Than You Think, we read about how computers take over some of our tasks, then enhance them when used cooperatively, how new ways of thinking, awareness and literacy are unlocked by technology and how education can be used to improve how we use tech which then in turn can be used to upgrade education. So this is one nonfiction book that paints technology in a rosy light and looks forward to the future. We need more of these.

  A few things popped up for me while reading this book. First a quote about teachers and medics. If you reach into the past and you pluck a doctor from 20 years ago and bring them in the present, they will not function well, as they did not keep up to date with the latest discoveries and techniques developed. However, a teacher from 200 years ago can still find a job teaching children. The job hasn't fundamentally changed in centuries... until now. Reading about how good teachers have evolved to make use of digital technology is inspiring.

  Then there was the concept of pluralistic ignorance, where people choose to behave in ways they do not adhere to because they are unaware of the position of the people around them. It was sobering. The book shows how the Internet can help dispel this problem by sharing awareness. That is not the same as "spreading awareness", the governmental and social warrior mindset which requires all people to think alike, but the increase in transparency of what people really think.

  Finally there was a small bit about how pessimistic or negative views are statistically interpreted as more serious, realistic and intelligent than positive ones. Which makes writing the book a bit braver and also explains why everyone is whining all the time.

  Of course, this book was written in 2013. Many things have happened since and the toxicity of public discourse combined with the insidious techniques corporations and groups in power use to manipulate everything can sour even the most optimistic of people. However, I found the book still relevant and bringing a fresh sense of hope, without feeling like someone tried to push their worldview down my throat or predict the future for me. Instead it studies the many and often unpredictable ways in which people use technology to make things better.

  I can't say it's a masterpiece, but I enjoyed reading a positive and realistic book like Smarter Than You Think. It was a welcome alternative to the gloom and doom we see directed towards us on a daily basis.


  There is a psychological theory that tries to categorize behavior and personality into three: the Child, the Parent and the Adult. I am not really a specialist (I feel that the word "psychological" is an oxymoron), but in short you get the Child, who feels things and acts on impulse and pleasure and is creative, the Parent, who respects and enforces rituals that hold society together and free individuals from trivial decisions, and the Adult, who tries to do the best to mediate between the other two states by striving towards an objective view of reality.

  The roots of Star Trek, from this point of view, are that of an Adult that sometimes leans towards Parent. The show examines our current beliefs by creating fictional situations where they are put to the test. Characters or even entire societies assume archetypal roles, child-like, parent-like, while the role of the heroic Federation crew is to mediate some sort of understanding between them. As any good sci-fi, it is meant to make people think for themselves.

  No other show makes this mission clearer than Star Trek Discovery, which failed miserably to be Star Trek because it pushed its agenda on the viewer, rather than letting them think for themselves and make their own choice. Star Trek has touched so many controversial subjects, usually without taking things too far, but occasionally doing a brilliant job to inspire introspection.

  For example the Borg, which were always "evil" in their attempt to circumvent individuality and absorb everything and everybody in their megaorganism. Yet, with characters such as Hugh and Seven of Nine, grey areas were explored, culminating, I believe, with the conflict between Seven and Janeway, when her individuality is returned to her, but then her choices to return to the Collective are rejected. I still believe that they could have done a deeper job here, but times being what they were and the show being American, they got pretty far as it is. Personally, I would make an entire show about humans and a Borg-like species only.

  Frustrated by rules and rituals (heh!), Seth MacFarlane, a huge Star Trek fan, decided to stop begging people to let him do a Star Trek show and created his own, borrowing what he could from the original show and improving or changing things to escape the confines of copyright. The Orville was born, a show that is a must see for any Star Trek fan. And I have to admit that when I decided to write this post, I was planning to talk about the differences between shows such as Star Trek Next Generation (and DS9 and especially Voyager), which lean a little too much toward the Parent role, and The Orville, which does a pretty good job being an Adult. But then I changed my mind.

  The reason why I've changed my mind is the story of Topa. If you have not watched The Orville yet, please do so because I am going to spoil it for you.

  OK, so Topa is the female child of a two-male Moclan couple in a society that considers females a genetic aberration. When a female infant is born, they immediately change their sex to male and never tell the children they were born different. How apropos this subject is, a society of homosexual males forcefully trans-forming any female baby, analyzed from our current socio-political point of view. And they did a fantastic job... at the beginning.

  You see, the first part of the story is about the disagreement between one parent and the other about whether they should obey the mandated custom of their home planet, even if they are on a Federation (sorry, Union) ship. You can guess which side the crew was leaning toward, yet they had to accept the decision of the people in the culture that child was born into... which was to proceed with the transformation. A disappointment for our American minded future union of planets, but what an episode finale! And before that, the revelation that the most revered poet of the Moclan culture is actually a female living in secrecy and willing to reveal herself to "fight for the cause".

  The second part is when the femaleness of Topa surfaces and makes her feel she lives in the wrong body. Again a lot of politics and scandal and opinions back and forth. This time, the episode is less ambiguous and I think the writers were actually afraid to do it any other way. Or they were lazy. Because at the end they skirt the law and the agreements between species and they reveal to Topa that she was born a female and immediately revert her to a female state in the same episode. A lot of effort went into making the supportive parent look good and the reticent parent look bad.

  Finally (maybe) the episode I saw today, where the female poet, now leader of a colony of all-female Moclans that are protected from their homeworld's wrath by a Union agreement, tries to co-opt Topa to be part of the "resistance" and she, hero-pressured, accepts, then almost loses her life at the hands of the evil all-male Moclan military. I applauded the way it exposed the hypocrisy of the female leader, using a child to further her agenda and also endangering the entire colony that she was responsible for. However, again I felt like the conflict was resolved too quickly and too neatly towards what we would accept as agreeable: Topa escapes with her life, the entire Union rejects the Moclan way of life and even the conservative parent makes a comeback complete with a full reversal of his opinions. How is the Union going to keep itself together if they can't accept the local idiosyncrasies of member states?

  And here is where the Parent, Adult, Child analysis feels appropriate. Topa, the child who wants to do what she feels is right and damn the consequences, Klyden the parent who won't renounce his custom and beliefs regardless of who that hurts and Bortus, the other parent - with an entire interstellar Union to support him, who has to find an adult way forward in which harm is minimized.

  I feel like the first episode about Topa lifted Orville above Star Trek shows. I know, blasphemy! How can I discount the eternal greatness of Star Trek? Well, because I compare the whole thing with the Seven of Nine storyline, where the show quickly dismissed her desire to return to the Collective as childish and went full Parent Janeway on her, even working towards a Mother/Daughter dynamic between them to justify it all. The Orville episode looked at individual opinions, cultural clashes, diplomatic discourse, the feelings of everyone involved and made the brave choice to not give the audience what it hoped for, thus making them think about the whole thing. Now with the other two episodes, I feel like the writers succumbed to societal pressure to resolve the conflict the only way the viewers would accept. And pronto! Before they #metoo MacFarlane! Or maybe that's just stupid and childish, I don't know. I just liked the first episode so much compared with the "classical" other two.

  I think the PAC (Parent-Adult-Child) model is pretty useful in dissecting these Star Trek-like situations. I find it inspiring that the Adult, which is something people supposedly should strive to achieve psychologically, cannot exist in a vacuum. Without Parents and Children, it has no direction, it's like an AI system without a value function, while the two other roles generate this direction from feeling and instinct (genetics) and experience and tradition (culture). Whenever the crew encounter an alien species and enter the inevitable conflict, they have to not only solve the problem, but also do it in a way that is objectively and morally better, while also catering to their often strong feelings about a subject. Fascinating!

  We must be aware of the attraction we people have for strong authoritative figures that "know what's best", just as we must be aware of how easy solutions that feel good in the moment may have disastrous consequences further down the line. In some way, accepting everything from Picard-like people is almost as dangerous as acting like Q all the time.

  Haven't you ever wondered what a show like Star Trek would be like if situations were actually dangerous, where tech solutions would not solve everything in minutes and the alternatives were to run, negotiate, intimidate or attack? When meeting some backwater one-planet civilization that sentences your people to death for stamping on a flower, instead of spending one hour saving them through some loophole in the local legal system, the crew would just arm photon torpedoes and say "Choose a city. Any city. Preferably one that you won't need anymore." Or if phasers would be set on "cut through stone" whenever firing at an alien lunging towards the crew? Or using any and all technology one finds to increase the tactical advantages of your ship and navy?

  But that's the whole point! Star Trek is not about levelling up, it's about finding yourself with just shitty options and still choosing the one that is most principled and logical for everyone involved. About examining one's preconceptions and reaching not a conclusion, but a point of decision where the viewer can spend some time and think. It's about good writing! Compare that with Kirk on a motorcycle and you realize what the roots of Star Trek are all about.

  I wanted to write a post about how Star Trek treats too many situations as a Parent, probably because it was created by people in the 60s and 70s, and is sometimes too eager to put characters in their place because family (yeah, The Fast and the Furious doesn't have a monopoly on that) and how The Orville is going above that. Then I realized that they are actually doing the same thing, most of the time, with Orville just freshening things up and having a little bit more courage when writing their stories. And I love it!

  Happy Trekking!

  This is a very basic tutorial on how to access Microsoft SQL Server data via SQL queries. Since these are generic concepts, they will be applicable in most other SQL variants out there. My hope is that it will provide the necessary tools to quickly "get into it" without having to read (or understand) too much. Where you go from there is on you.

  There are a lot of basic concepts in SQL, so this post will be pretty long.


Connecting to a database

  Let's start with tooling. To access a database you will need SQL Server Management Studio, in my case version 2022, but I will not do anything complicated with it here, therefore any version will do just fine. I will assume you have it installed already as installation is beyond the scope of the blog post. Starting it will prompt for a connection:

  To connect to the local computer, the server will be either . or (local) or the computer name. You can of course connect to any server and you can specify the "instance" and the port number as well. An instance is a specific named installation of SQL Server, which allows one to have multiple installations (and even versions) of SQL Server on the same machine. In fact, each instance has its own port, so when a port number is specified, the name of the instance is ignored. The default port is usually 1433.

  Example of connection server strings: Computer1\SQLEXPRESS, sql.corporate.com,1433, (local), .

  The image here is from a connection to the local machine using Windows Authentication (your windows user). You can connect using SQL Server Authentication, which means providing a username and a password, or using one of the more modern Azure Active Directory methods.

  I will also assume that the connection parameters are known to you, so let's go to the next step.

  Once connected, the Object Explorer window will display the connection you've opened.

  Expanding the Databases node will show the available databases.

  Expanding a database node we get the objects that are part of the database, the most important being:

  • Tables - where the actual data resides
  • Views - abstractions over more complex queries that behave like tables as much as possible, but with some restrictions
  • Stored Procedures - SQL code that can be executed with parameters and may return data results
  • Functions - SQL code that can be executed and returns a value (which can be scalar, like a number or string, or a table type, etc.)

  In essence they are the equivalent of data stores and code that is executed to use those stores. Views, SPs and functions will not be explained in this post, but feel free to read about them afterwards.

  If one expands a table node, the child nodes will contain various things, the most important of which are:

  • Columns - the names and types of each column in the table
  • Indexes - data structures designed to improve the performance of various ways of accessing the data in the table
  • Constraints and Keys - logical restrictions and relationships between tables
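
  The same information can also be read with a query instead of clicking through the tree, once you have a query window open. This is just a minimal sketch, assuming a table called MyTable in the dbo schema (a placeholder name, replace it with one of your own tables); INFORMATION_SCHEMA.COLUMNS and sys.indexes are standard SQL Server catalog views:

```sql
-- columns of the table, in the order they were defined
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA='dbo'
    AND TABLE_NAME='MyTable'                 -- placeholder table name
ORDER BY ORDINAL_POSITION

-- indexes defined on the same table
SELECT i.name, i.type_desc, i.is_unique, i.is_primary_key
FROM sys.indexes i
WHERE i.object_id=OBJECT_ID('dbo.MyTable')   -- placeholder table name
```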

  Tables are kind of like Excel sheets: they have rows (data records) and columns (record properties). The power of SQL is that it lets you declare what you want from tabular representations of data and get the results quickly and efficiently.

  Last thing I want to show from the graphical interface is right clicking on a table node, which shows multiple options, including generating simple operations on the table, the CRUD (Create, Read, Update, Delete) operations mostly, which in SQL are called INSERT, SELECT, UPDATE and DELETE respectively.

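  The keywords are traditionally written in all caps; I am not shouting at you. Depending on your preferences and of course the coding standards that apply to your project, you can capitalize SQL code however you like. SQL keywords are case insensitive, while the case sensitivity of object names and string comparisons depends on the collation, which is typically case insensitive in a default installation.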

Anyway, whatever you choose to "script", it is going to open a so-called query window and show you the text of the query. You then have the option of executing it. Normally no one uses the UI to generate scripts except for getting the column names in order for SELECT or INSERT operations. Most of the time you will just right click on a database and choose New Query or select a database and press Ctrl-N, with the same result.

Getting data from tables

Finally we get to doing something. The operation to read data from SQL is called SELECT. One can specify the columns to be returned or just use * to get them all. It is good practice to always specify the column names in production code, even if you intend to select all columns, as the output of the query will not change if we add more columns in the future. However, we will not be discussing software projects, just how to get or change the data using SQL Server, so let's get to it.

The simplest select query is: SELECT * FROM MyTable, which will return all columns of all records of the table. Note that MyTable is the name of a table and the least specific way of accessing that table. The same query can be written as: SELECT * FROM [MyDatabase].[dbo].[MyTable], specifying the database name, the schema name (default one is dbo, but your database can use multiple ones) and only then the table name.
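
Side by side, the levels of naming look like this. It is only a sketch, with MyDatabase and MyTable as placeholder names, and it assumes the table lives in the default dbo schema:

```sql
USE MyDatabase                          -- switch the query window to this database

SELECT * FROM MyTable                   -- table name only, schema is resolved for you
SELECT * FROM dbo.MyTable               -- schema and table
SELECT * FROM MyDatabase.dbo.MyTable    -- database, schema and table
```

The fully qualified form works no matter which database the query window is currently pointing at, while the shorter ones depend on it.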

The square bracket syntax is usually not required, but might be needed in special cases, like when a column has the same name as a keyword or if an object has spaces or commas in it (never a good idea, but a distinct possibility), for example: SELECT [Stupid,column] FROM [Stupid table name with spaces]. Here we are selecting a badly named column from a badly named table. Removing the square brackets would result in a syntax error.

In the example above we selected stuff from table CasesSince100 and we got tabular results for every record and the columns defined in the table. But that is not really useful. What we want to do when getting data is:

  • getting data from specific columns
  • formatting the data for our purposes
  • filtering the data on conditions
  • grouping the data
  • ordering the results

So here is a more complex query:

-- everything after two dashes in a line is a comment, ignored by the engine
/* there is also
   a multiline comment syntax */
SELECT TOP 10                            -- just the first 10 records
    c.Entity as Country,                 -- Entity will be returned with the name Country
    CAST(c.[Date] as Date) as [Date],    -- Unfortunate naming, as Date is also a type
    c.cases as Cases                     -- capitalized alias
FROM CasesSince100 c                     -- source for the data, aliased as 'c'
WHERE c.Code='ROU'                       -- conditions to filter by
    AND c.[Date]>'2020-03-01'
ORDER BY c.[Date] DESC                   -- ordering in descending order

  The query above will return at most 10 rows, only for Romania, for dates later than March 1st 2020, ordered from the newest to the oldest. Data returned will be the country name, the date (which was originally a DATETIME and now is cast to a timeless DATE type) and the number of cases.

  Note that I have aliased all columns, so the resulting table has columns named as the aliases. I've also aliased the table name as 'c', which helps in several ways. First of all, Intellisense works better and faster when specifying the table name. All you have to do is type c. and the list of columns will pop up and be filtered as you type. The second reason will become apparent when I am talking about updating and deleting. For the moment just remember that it's a good idea to alias your tables.

  You can alias a table by specifying a name to call it by next to its own name and optionally using 'as', like SELECT ltn.* FROM Schema.LongTableName as ltn. It helps differentiate between ambiguous names (like if two joined tables have columns with the same name), simplifies the code for long named tables and helps with code completion. Once a table is aliased, its columns must be prefixed with the alias rather than the original table name, and the prefix can be skipped altogether if the column names are unambiguous.

Of course these are trivial examples. The power of SQL is that you can get information from multiple sources, aggregate them and structure your database for quick access. More advanced concepts are JOINs and indexes, and I hope you will read until I get there, but for now let's just go through the very basics.

Here is another query that groups and aggregates data:

SELECT TOP 10                            -- top 10 results
    c.Entity as Country,                 -- country name
    SUM(CAST(c.cases as INT)) as Cases   -- cases is text, so we transform it to int
FROM CasesSince100 c
WHERE YEAR([Date])=2020                  -- condition applies a function to the date
GROUP BY c.Entity                        -- groups by country
HAVING SUM(CAST(c.cases as INT))<1000000 -- this is filtering on grouped values
ORDER BY SUM(CAST(c.cases as INT)) DESC  -- order on sum of cases

This query will show us the top 10 countries and the total sum of cases in year 2020, but only for countries where that total is less than a million. There is a lot to unpack here:

  • cases column is declared as NVARCHAR(150) meaning Unicode strings of varied length, but at most 150 characters, so we need to cast it to INT (integer) to be able to apply summing to it
  • there are two different ways of filtering: WHERE, which applies to the data before grouping, then HAVING, which applies to data after grouping
  • filtering and grouping work with unaliased columns, so even if Entity is returned as Country, I cannot do WHERE Country='Romania'; ordering is the exception, as ORDER BY also accepts the aliases defined in the SELECT list
  • grouping returns one row for each distinct combination of the columns in the GROUP BY clause and allows computing some sort of aggregation for it (in the case above, a sum of cases per country)

Here are the results:

Let me rewrite this in a way that is more readable using what is called a subquery, in other words a query from which I will query once again:

SELECT TOP 10
    Country,
    SUM(Cases) as Cases
FROM (
    SELECT
        c.Entity as Country,
        CAST(c.cases as INT) as Cases,
        YEAR([Date]) as [Year]
    FROM CasesSince100 c
) x
WHERE [Year]=2020
GROUP BY Country
HAVING SUM(Cases)<1000000
ORDER BY Cases DESC

Note that I still have to use SUM(Cases) in the HAVING clause. I could have grouped it in another subquery and selected again and so on. In order to select from a subquery, you need to name it (in our case, we named it x). Also I selected Country from x, which I could have also written as x.Country. As I said before, table names (aliased or not) are optional if the column name is unambiguous. Also you may notice that I've given a name to the summed column. I could have skipped that, but that would mean the resulting column would have had no name and the query itself would have been difficult to use in code (extracted column values would have had to be retrieved by index and not by name, which is never recommended).
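
For illustration, this is roughly what grouping in another subquery would look like. Treat it as a sketch: the second subquery name (y) is my own choice here, and the point is only that once the sum has a name in an inner query, the outer one can filter on it with a plain WHERE instead of HAVING:

```sql
SELECT TOP 10
    y.Country,
    y.Cases
FROM (
    SELECT
        x.Country,
        SUM(x.Cases) as Cases            -- the aggregation gets a name here
    FROM (
        SELECT
            c.Entity as Country,
            CAST(c.cases as INT) as Cases,
            YEAR(c.[Date]) as [Year]
        FROM CasesSince100 c
    ) x
    WHERE x.[Year]=2020
    GROUP BY x.Country
) y
WHERE y.Cases<1000000                    -- plain WHERE, no HAVING needed
ORDER BY y.Cases DESC
```

Whether this is more readable than the HAVING version is debatable, as every level of nesting adds noise.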

If you think about it, the order of the clauses in a SELECT operation has a major flaw: you are supposed to write SELECT, then specify what columns you want and only then specify where you want the columns to be read from. This makes code completion problematic, which is why the in-code query language for .NET (LINQ) puts the selection at the end. But even so there is a trick:

  • SELECT * and then complete the query
  • go back and replace the * with the column names you want to extract (you will now have Intellisense code completion)
  • the alias of the tables will now come in handy, but even without aliases one can press Ctrl-Space and get a list of possible values to select

Defining tables and inserting data

Before we start inserting information, let's create a table:

CREATE TABLE Food(
    Id INT IDENTITY(1,1) PRIMARY KEY,
    FoodName NVARCHAR(100),
    Quantity INT
)

One important concept in SQL is the primary key. It is a good idea in most cases that your tables have a primary key which identifies each record uniquely and also makes them easy to reference. Let me give you an example. Let's assume that we would put no Id column in our Food table and then we would accidentally add cheese twice. How would you reference the first record as opposed to the second? How would you delete the second one?
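
To make the problem concrete, here is a small sketch using a hypothetical FoodNoKey table, which is just Food without the Id column (the name is mine, made up for this example):

```sql
CREATE TABLE FoodNoKey(              -- hypothetical table, no primary key
    FoodName NVARCHAR(100),
    Quantity INT
)

INSERT INTO FoodNoKey(FoodName,Quantity)
VALUES('Cheese',1),('Cheese',1)      -- cheese added twice by mistake

DELETE FROM FoodNoKey                -- tries to remove "the second one"
WHERE FoodName='Cheese'
    AND Quantity=1                   -- but matches both rows, nothing tells them apart
```

With an Id column, the two rows would have had different values in it and a WHERE Id=2 condition would have been enough.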

A primary key is actually just a special case of a unique index, clustered by default. We will get to indexes later, so don't worry about that yet. It is enough to remember that the primary key is the way records are uniquely identified and that it is usually faster (more efficient) to find records by the primary key than by any other column combination.

The IDENTITY(1,1) notation tells SQL Server that we will not insert values in that column and instead let it put values starting with 1, then increasing with 1 each time. That functionality will become clear when we INSERT data in the table:

INSERT INTO Food(FoodName,Quantity)
VALUES('Bread',1),('Cheese',1),('Pork',2),('Chilly',10)

Selecting from our Food table now gets us these results:

As you can see, we've inserted four records, by only specifying two out of three columns - we skipped Id. Yet SQL has filled the column with values from 1 to 4, starting with 1 and incrementing each time with 1.
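
If you need to know what value was generated for a row you have just inserted, SQL Server has the SCOPE_IDENTITY() function for that. A minimal sketch, where the 'Butter' row is a made up example and not part of the data used in the rest of the post:

```sql
INSERT INTO Food(FoodName,Quantity)
VALUES('Butter',3)                   -- hypothetical extra row

SELECT SCOPE_IDENTITY() as NewId     -- the Id generated by the insert above
```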

The VALUES syntax is specifying inline data, but we could, in fact, insert into a table the results of a query, something like this:

INSERT INTO Food(FoodName,Quantity)
SELECT [Name],Quantity
FROM Store
WHERE [Type]='Food'

There is another syntax for insert that is useful with what are called temporary tables, tables created for the purpose of your session (lifetime of the query window) and that will automatically disappear once the session is over. It looks like this:

SELECT FoodName,Quantity
INTO #temp
FROM Food

This will create a table (temporary because of the # sign in front of it) that will have just FoodName and Quantity as columns, then proceed to save the data there. This table will not have a primary key or any type of index and it will work as a simple dump of the data selected. You can add indexes later or alter the table in any way you want, as it works just like a regular table. While it is a convenient syntax (you don't have to write a CREATE TABLE query or think of the type of columns), it has limited usefulness and I recommend not using it in application code.
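
As a quick sketch of what "just like a regular table" means, assuming the #temp table created above still exists in your session (the index name is an arbitrary one I picked):

```sql
CREATE INDEX IX_temp_FoodName
    ON #temp(FoodName)               -- indexes can be added after the fact

SELECT t.FoodName, t.Quantity
FROM #temp t                         -- queried like any other table
WHERE t.FoodName='Bread'

DROP TABLE #temp                     -- optional, it disappears when the session ends
```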

Just as one creates a table, there are DROP TABLE and ALTER TABLE statements that delete or change the structure of the table, but we won't go into that.

Changing existing data

So now we have some data in a table that we have defined. We will see how the alias syntax I discussed in the SELECT section will come in handy. In short, I propose you use just two basic syntax forms for all CRUD operations: one for INSERT and one for SELECT, UPDATE and DELETE.

But how can you use the same syntax for statements that are so different, I hear you ask? Let me give you some examples of similar code doing just that before I dive into what each operation does.

SELECT *
FROM Food f
WHERE f.Id=4

UPDATE f
SET f.Quantity=9
FROM Food f
WHERE f.Id=4

DELETE FROM f
FROM Food f
WHERE f.Id=4

The last two lines of all operations are exactly the same. These are simple queries, but imagine you have a complex one to craft. The first thing you want to see is that you are updating or deleting the right thing, therefore it makes sense to start with a SELECT query instead, then change it to a DELETE or UPDATE when satisfied. You see I UPDATE and DELETE using the alias I gave the table.
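
Here is a sketch of the same workflow on a slightly more complex query. It assumes the Store table from the INSERT example earlier (with Name, Quantity and Type columns) and it uses a JOIN, a concept not yet explained in this post; the only thing to notice is that everything from FROM downwards is identical in both statements:

```sql
-- step 1: check which rows would be affected
SELECT f.*
FROM Food f
INNER JOIN Store s
    ON s.[Name]=f.FoodName
WHERE s.[Type]='Food'

-- step 2: when satisfied, change only the first lines
UPDATE f
SET f.Quantity=s.Quantity
FROM Food f
INNER JOIN Store s
    ON s.[Name]=f.FoodName
WHERE s.[Type]='Food'
```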

When first learning UPDATE and DELETE statements, one usually gets to this syntax:

UPDATE Food     -- using the table name is cumbersome if in a complex query
SET Quantity=9  -- unless using Food.Quantity and Food.Id
WHERE Id=4      -- you don't get easy Intellisense

DELETE          -- this seems a lot easier to remember
FROM Food       -- but it only works with one table in a simple query
WHERE Id=4

I've outlined some of the reasons I don't use this syntax in the comments, but the most important reason why one shouldn't use it except for very simplistic cases is that you are trying to create a query to destructively change the data in the database and there is no foolproof way to duplicate the same logic in a SELECT query to verify what you are going to change. I've seen people (read that as: I was dumb enough to do it myself) who created an entirely different SELECT statement to verify what they would do, then realized to their horror that the statements were not equivalent and they had updated or deleted the wrong thing!

OK, let's look at UPDATE and DELETE a little closer.

One of the useful clauses for these statements is, just like with SELECT, the TOP clause, which instructs SQL to affect just a finite number of rows. However, because TOP was added later for write operations, you need to enclose the value (or variable) in parentheses. For SELECT you can skip the parentheses for constant values (you still need them for variables).

DELETE TOP (10) FROM MyTable

Another interesting clause, that frankly I have not used a lot, but is essential in some specific cases, is OUTPUT. One can delete or update some rows and at the same time get the rows that were changed. This matters because, first of all, in a DELETE statement the rows will be gone, so you won't be able to SELECT them again. But even in an UPDATE operation, the rows chosen to be updated by a query may not be the same if you execute the query again.

SQL does not guarantee the order of rows unless specifically using ORDER BY. So if you execute SELECT TOP 10 * FROM MyTable twice, you may get two different results. Moreover, between the time you UPDATE some rows and you SELECT them in another query, things may change because of other processes running at the same time on the same data.

So let's say we have some form of Invoices and Items tables that reference each other. You want to delete one invoice and all the items associated with it. There is no way of telling SQL to DELETE from multiple tables at the same time, so you DELETE the invoice, OUTPUT its Id, then delete the items for that Id.

CREATE TABLE #deleted(Id INT) -- temporary table, but explicitly created

DELETE FROM Invoice 
OUTPUT Deleted.Id    -- here Deleted is a keyword
INTO #deleted        -- the Id from the deleted rows will be stored here
WHERE Id=2           -- and can even be reused later

DELETE 
FROM Item
WHERE InvoiceId IN (   -- assuming Item points to its invoice through an InvoiceId column
  SELECT Id FROM #deleted
)  -- a subquery used in a DELETE statement

-- same thing can be written as:
DELETE FROM i
FROM Item i
INNER JOIN #deleted d  -- I will get to JOINs soon
ON i.InvoiceId=d.Id

I have been informed that the INTO syntax is confusing and indeed it is:

  • SELECTing INTO will create a new table with results and throw an exception if the table already exists. The table will have the names and types of the selected values, which may be what one wants for a quick data dump, but it may also cause issues. For example the following query would throw an exception:
    SELECT 'Blog' as [Name]
    INTO #temp
    
    INSERT INTO #temp([Name]) -- String or binary data would be truncated error
    VALUES('Siderite')

    because the Name column of the new temporary table would be VARCHAR(4), just long enough for 'Blog', so 'Siderite' would be too long

  • UPDATEing or DELETEing with OUTPUT INTO will require an existing table with the same number and types of columns as the columns specified in the OUTPUT clause and will throw an exception if it doesn't exist

One can use derived values in UPDATE statements, not just constants. One can reference the columns already existing or use any type of function that would be allowed in a similar SELECT statement. For example, here is a query to get the tax value of each row and the equivalent update to store it into a separate column:

SELECT
    i.Price, 
    i.TaxPercent, 
    i.Price*(i.TaxPercent/100.0) as Tax  -- best practice: SELECT first (100.0 avoids integer division if TaxPercent is an INT)
FROM Item i

UPDATE i
SET Tax = i.Price*(i.TaxPercent/100.0)   -- UPDATE next
FROM Item i

So here we first do a SELECT, to see if the values we have and calculate are correct and, if satisfied, we UPDATE using the same logic. Always SELECT before you change data, so you know you are changing the right thing.

There is another trick to help you work safely, one that works on small volumes of data, which involves transactions. Transactions are atomic operations (all or nothing) which are started with BEGIN TRANSACTION and finalized with either COMMIT TRANSACTION (save the changes to the database) or ROLLBACK TRANSACTION (revert the changes to the database). Transactions are an advanced concept too, so read up on them yourself, but remember one can do the following:

  • open a new query window
  • execute BEGIN TRANSACTION
  • do almost anything in the query window
  • if satisfied with the result execute COMMIT TRANSACTION
  • if any issue with what you've done execute ROLLBACK TRANSACTION to undo the changes

Note that this only applies for stuff you do in that query window. Also, all of these operations are being saved in the log of the database, so this works only with small amounts of data. Attempting to do this with large amounts of data will practically duplicate it on disk and take a long time to execute and revert.
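Here is a minimal sketch of that workflow. It assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server (only so the example can run anywhere); the Food rows are made up, and SQLite spells the statements simply BEGIN and ROLLBACK:

```python
import sqlite3

# isolation_level=None stops the driver from opening transactions on its own,
# so the BEGIN / ROLLBACK below are exactly what the database receives
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Food (Id INTEGER PRIMARY KEY, FoodName TEXT, Quantity INTEGER)")
conn.execute("INSERT INTO Food (Id, FoodName, Quantity) VALUES (4, 'Bread', 1), (5, 'Milk', 2)")

conn.execute("BEGIN")                       # BEGIN TRANSACTION in T-SQL
conn.execute("UPDATE Food SET Quantity=9")  # oops, forgot the WHERE clause
inside = conn.execute("SELECT Quantity FROM Food ORDER BY Id").fetchall()

conn.execute("ROLLBACK")                    # ROLLBACK TRANSACTION in T-SQL
after = conn.execute("SELECT Quantity FROM Food ORDER BY Id").fetchall()

print(inside)  # both rows were changed while the transaction was open
print(after)   # the original quantities are back
```

Had the UPDATE been what we wanted, a COMMIT instead of the ROLLBACK would have made it permanent.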

The NULL value

We need a quick primer on what NULL is. NULL is a placeholder for a value that was not set or is considered unknown. It's a non-value. It is similar to null in C# or JavaScript, but with some significant differences applicable to SQL only. For example, a NULL value (an oxymoron for sure) will never be equal to (or not equal to) or less than or greater than anything. One might expect to get all the rows in a table from these two queries combined: SELECT * FROM MyTable WHERE Value>5 and SELECT * FROM MyTable WHERE Value<=5. But if any rows have NULL for Value, then they will not appear in either of the query results. That applies to the negation operator NOT as well: SELECT * FROM MyTable WHERE NOT (Value>5) will not return them either.

This behavior can be changed by using SET ANSI_NULLS OFF, but I have yet to see a database that has ever been set up like this.

To check if a value is or is not NULL, one uses the IS and IS NOT syntax :)

SELECT *
FROM MyTable
WHERE MyValue IS NOT NULL

The NULL concept will be used a lot in the next chapter.
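If you want to see the NULL comparisons from this chapter for yourself, here is a small sketch. It assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server (NULL comparisons behave the same way there); the three sample values are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Value INTEGER)")
conn.execute("INSERT INTO MyTable (Value) VALUES (3), (7), (NULL)")

def count(sql):
    # run a COUNT(*) query and return the single number it produces
    return conn.execute(sql).fetchone()[0]

total = count("SELECT COUNT(*) FROM MyTable")
greater = count("SELECT COUNT(*) FROM MyTable WHERE Value>5")
not_greater = count("SELECT COUNT(*) FROM MyTable WHERE Value<=5")
negated = count("SELECT COUNT(*) FROM MyTable WHERE NOT (Value>5)")
is_null = count("SELECT COUNT(*) FROM MyTable WHERE Value IS NULL")

print(total, greater, not_greater, negated, is_null)  # prints 3 1 1 1 1
```

Three rows in the table, yet Value>5 and Value<=5 together only account for two of them. The third one is only reachable through IS NULL.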

Combining data from multiple sources

We finally get to JOIN operations. In most scenarios, you have a database containing multiple tables, with intricate connections between them. Invoices that have items, customers, the employee that processed them, dates, departments, store quantities, etc., all referencing something. Integrating data from multiple tables is a complex subject, but I will touch on just the most common and important parts:

  • INNER JOIN
  • OUTER JOIN
  • EXISTS
  • UNION / UNION ALL

Let's write a query that displays the name of employees and their department. I will show the CREATE TABLE statements, too, in order to see where we get the data from:

CREATE TABLE Employee (
  EmployeeId INT,          -- Best practice: descriptive column names
  FirstName NVARCHAR(100),
  LastName NVARCHAR(100),
  DepartmentId INT)        -- Best practice: use same name for the same thing

CREATE TABLE Department (
  DepartmentId INT,        -- same thing here
  DepartmentName NVARCHAR(100)
)

SELECT
    CONCAT(FirstName,' ',LastName) as Employee,
    DepartmentName
FROM Employee e
INNER JOIN Department d
ON e.DepartmentId=d.DepartmentId

Here it is: INNER JOIN, a clause that combines the data from two tables based ON a condition or series of conditions. For each row of Employee we are looking for the corresponding row of Department. In this example, one employee belongs to only one department, but a department can hold multiple employees. It's what we call a "one to many relationship". One can have "one to one" or "many to many" relationships as well. That is very important when trying to gauge performance (and number of returned rows).

Our query will only find at most one department for each employee, so for 10 employees we will get at most 10 rows of data. Why do I say "at most"? Because the DepartmentId for some employees might not have a corresponding department row in the Department table. INNER JOIN will not generate records if there is no match. But what if I want to see all employees, regardless of whether their department exists or not? Then we use an OUTER JOIN:

SELECT
    CONCAT(FirstName,' ',LastName) as Employee,
    DepartmentName
FROM Employee e
LEFT OUTER JOIN Department d
ON e.DepartmentId=d.DepartmentId

This will generate results for each Employee and their Department, but show a NULL (without value) result if the department does not exist. In this case LEFT is used to define that there will be rows for each record in the left table (Employee). We could have used RIGHT, in which case we would have rows for each department and NULL employee values for departments that have no employees. There is also the FULL OUTER JOIN option, in which case we will get both departments with NULL employees if none are attached and employees with NULL departments in case the department does not exist (or the employee is not assigned, meaning DepartmentId is NULL).

Note that the keywords INNER and OUTER are completely optional. JOIN is the same thing as INNER JOIN and LEFT JOIN is the same as LEFT OUTER JOIN. I find that specifying them makes the code more readable, but that's a personal choice.
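To make the difference between the two joins concrete, here is a small sketch with made-up rows. It assumes Python's built-in sqlite3 module and an in-memory SQLite database standing in for SQL Server, and it selects only FirstName to keep the output short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DepartmentId INTEGER, DepartmentName TEXT);
CREATE TABLE Employee (EmployeeId INTEGER, FirstName TEXT, LastName TEXT, DepartmentId INTEGER);
INSERT INTO Department VALUES (1, 'Sales'), (2, 'Support');
INSERT INTO Employee VALUES
  (1, 'Ann', 'Able', 1),      -- department exists
  (2, 'Bob', 'Baker', 99),    -- department 99 is not in the Department table
  (3, 'Cat', 'Clark', NULL);  -- never assigned to a department
""")

# the same query is run twice, only the type of join changes
query = """
SELECT e.FirstName, d.DepartmentName
FROM Employee e
{join} Department d
ON e.DepartmentId=d.DepartmentId
ORDER BY e.EmployeeId
"""
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall()

print(inner_rows)  # only the employee with a matching department
print(left_rows)   # every employee, with None (NULL) where there is no department
```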

The OUTER JOINs are sometimes used in a non intuitive way to find records that have no match in another table. Here is a query that shows employees that are not assigned to a department:

SELECT
    CONCAT(FirstName,' ',LastName) as Employee
FROM Employee e
LEFT OUTER JOIN Department d
ON e.DepartmentId=d.DepartmentId
WHERE d.DepartmentId IS NULL

Until now, we talked about the WHERE clause as a filter that is applied first (before grouping) so one might intuitively have assumed that the WHERE clauses are applied immediately on the tables we get the data from. If that were the case, then this query would never return anything, because every Department will have a DepartmentId. Instead, what happens here is the tables are LEFT JOINed, then the WHERE clause applies next. In the case of unassigned employees, the department id or name will be NULL, so that is what we are filtering on.

So what happens above is:

  • the Employee table is LEFT JOINed with the Department table
  • for each employee (left) there will be rows that contain the values of the Employee table rows and the values of any matched Department table rows
  • in the case there is no match, NULL values will be returned for the Department table for all columns
  • when we filter by Department.DepartmentId being NULL we don't mean any Department that doesn't have an Id (which is impossible) but any Employee row with no matching Department row, which will have a NULL value where the Department.DepartmentId value would have been in case of a match.
  • not matching can happen for two reasons: Employee.DepartmentId is NULL (meaning the employee has not been assigned to a department) or the value stored there has no associated Department (the department may have been removed for some reason)
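The steps above can be sketched on a few made-up rows covering both reasons for not matching. The example assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DepartmentId INTEGER, DepartmentName TEXT);
CREATE TABLE Employee (EmployeeId INTEGER, FirstName TEXT, LastName TEXT, DepartmentId INTEGER);
INSERT INTO Department VALUES (1, 'Sales');
INSERT INTO Employee VALUES
  (1, 'Ann', 'Able', 1),      -- matched
  (2, 'Bob', 'Baker', 99),    -- points to a department that does not exist
  (3, 'Cat', 'Clark', NULL);  -- not assigned at all
""")

# LEFT JOIN first, then the WHERE clause filters the joined rows
unmatched = conn.execute("""
SELECT e.FirstName, e.DepartmentId
FROM Employee e
LEFT OUTER JOIN Department d
ON e.DepartmentId=d.DepartmentId
WHERE d.DepartmentId IS NULL
ORDER BY e.EmployeeId
""").fetchall()

print(unmatched)  # Bob (dangling id 99) and Cat (NULL id), but not Ann
```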

Also, note that if we are joining tables on some condition we have to be extra careful with NULL values. Here is how one would join two tables on VARCHAR columns being equal even when NULL:

SELECT *
FROM Table1 t1
INNER JOIN Table2 t2
ON (t1.Value IS NULL AND t2.Value IS NULL) OR t1.Value=t2.Value

SELECT *
FROM Table1 t1
INNER JOIN Table2 t2
ON ISNULL(t1.Value,'')=ISNULL(t2.Value,'')

The second syntax seems promising, doesn't it? It is more readable for sure. Unfortunately, it introduces some assumptions and also decreases the performance of the query (we will talk about performance later on). The assumption is that if Value is an empty string, then it's the same as having no value (being NULL). One could use something like ISNULL(Value,'--NULL--') but now it starts looking worse.
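Here is a sketch of that hidden assumption at work, on made-up values. It assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server, and it uses COALESCE (which exists in both engines) in place of the T-SQL specific ISNULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (Value TEXT);
CREATE TABLE Table2 (Value TEXT);
INSERT INTO Table1 VALUES ('a'), (NULL), ('');
INSERT INTO Table2 VALUES ('a'), (NULL);
""")

select = "SELECT t1.Value, t2.Value FROM Table1 t1 INNER JOIN Table2 t2 ON "

# plain equality: NULL never equals NULL, so only 'a' matches
plain = conn.execute(select + "t1.Value=t2.Value").fetchall()

# explicit NULL handling: 'a' matches 'a' and NULL matches NULL
explicit = conn.execute(
    select + "(t1.Value IS NULL AND t2.Value IS NULL) OR t1.Value=t2.Value"
).fetchall()

# replacing NULL with '': the empty string in Table1 now also matches the NULL in Table2
coalesced = conn.execute(
    select + "COALESCE(t1.Value,'')=COALESCE(t2.Value,'')"
).fetchall()

print(len(plain), len(explicit), len(coalesced))  # prints 1 2 3
```

The extra row in the last result is the pair ('', NULL), which is exactly the kind of match you did not ask for.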

There are other ways of joining two tables (or queries, or table variables, or table functions, etc.), for example by using the IN or the EXISTS/NOT EXISTS clauses or subqueries. Here are some examples:

SELECT *
FROM Table1
WHERE MyValue IN (SELECT MyValue FROM Table2)

SELECT *
FROM Table1
WHERE MyValue = (SELECT TOP 1 MyValue FROM Table2 WHERE Table1.MyValue=Table2.MyValue)

SELECT *
FROM Table1
WHERE NOT EXISTS(SELECT * FROM Table2 WHERE Table1.MyValue=Table2.MyValue)

These are less readable, usually have terrible performance and may not return what you expect them to return.

When I was learning SQL, I thought using a JOIN would be optimal in all cases and subqueries in the WHERE clause were all bad, no exception. That is, in fact, false. There is a specific case where it is better to use a subquery in WHERE instead of JOIN, and that is when trying to find records that have at least one match. It is better to use EXISTS because it is short-circuiting logic which leads to better performance.

Here is an example with different syntax for achieving the same goal:

SELECT DISTINCT d.DepartmentId
FROM Department d
INNER JOIN Employee e
ON e.DepartmentId=d.DepartmentId

SELECT d.DepartmentId
FROM Department d
WHERE EXISTS(SELECT * FROM Employee e WHERE e.DepartmentId=d.DepartmentId)

Here, the search for departments with employees will return the same thing, but in the first situation it will get all employees for all departments, then list the department ids that had employees, while in the second query the department will be returned the moment just one employee that matches is found.
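A quick sketch to check that the two forms really return the same departments, on made-up rows. It assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server; it only compares results and says nothing about how either engine executes them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DepartmentId INTEGER, DepartmentName TEXT);
CREATE TABLE Employee (EmployeeId INTEGER, DepartmentId INTEGER);
INSERT INTO Department VALUES (1, 'Sales'), (2, 'Support'), (3, 'Empty');
INSERT INTO Employee VALUES (1, 1), (2, 1), (3, 2);
""")

# without DISTINCT the join returns one row per employee, so department 1 appears twice
joined = conn.execute("""
SELECT d.DepartmentId
FROM Department d
INNER JOIN Employee e
ON e.DepartmentId=d.DepartmentId
ORDER BY d.DepartmentId
""").fetchall()

with_distinct = conn.execute("""
SELECT DISTINCT d.DepartmentId
FROM Department d
INNER JOIN Employee e
ON e.DepartmentId=d.DepartmentId
ORDER BY d.DepartmentId
""").fetchall()

with_exists = conn.execute("""
SELECT d.DepartmentId
FROM Department d
WHERE EXISTS(SELECT * FROM Employee e WHERE e.DepartmentId=d.DepartmentId)
ORDER BY d.DepartmentId
""").fetchall()

print(joined)         # [(1,), (1,), (2,)]
print(with_distinct)  # [(1,), (2,)]
print(with_exists)    # [(1,), (2,)]
```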

There is another way of combining data from two sources and that is to UNION two or multiple result sets. It is the equivalent of taking rows from multiple sources of the same type and showing them together in the same result set.

Here is a dummy example:

SELECT 1 as Id
UNION
SELECT 2
UNION
SELECT 2

And we execute it and... we get only two rows: 1 and 2.

What happened? Shouldn't there have been three values? Somehow, when copy pasting the silly example, you added two identical values. UNION will add only distinct values to the result set. Using UNION ALL will show all three values.

SELECT 1 as Id
UNION ALL
SELECT 2
UNION ALL
SELECT 2

SELECT DISTINCT Id FROM (
  SELECT 1 as Id
  UNION ALL
  SELECT 2
  UNION ALL
  SELECT 2
) x

The first query will return 1,2,2 and the second will be the equivalent of the UNION one, returning 1 and 2. Note the DISTINCT keyword.

My recommendation is to never use UNION and instead use UNION ALL everywhere, unless it makes some kind of sense for a very specific scenario, because the operation to DISTINCT values is expensive, especially for many and/or large columns. When results are supposed to be different anyway, UNION and UNION ALL will return the same output, but UNION is going to perform one more pointless distinct operation.
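For those who want to run the dummy example without a SQL Server at hand, here is a sketch that assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in (an ORDER BY is added only to make the output order predictable):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

union_rows = conn.execute("""
SELECT 1 as Id
UNION
SELECT 2
UNION
SELECT 2
ORDER BY Id
""").fetchall()

union_all_rows = conn.execute("""
SELECT 1 as Id
UNION ALL
SELECT 2
UNION ALL
SELECT 2
ORDER BY Id
""").fetchall()

print(union_rows)      # [(1,), (2,)]
print(union_all_rows)  # [(1,), (2,), (2,)]
```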

After learning about JOIN, my request to start with SELECT queries and only then modify them to be UPDATE or DELETE begins to make more sense. Take a look at this query:

UPDATE d
SET ToFindManager=1
--SELECT *
FROM Department d
LEFT OUTER JOIN Employee e
ON d.DepartmentId=e.DepartmentId
AND e.[Role]='Manager'
WHERE e.EmployeeId IS NULL

This will set ToFindManager in departments that have no corresponding manager. But if you select the text from SELECT * on and then execute, you will get the rows that you are going to update. It's the same query: depending on which section of it you select and execute, it will either verify or perform the operation.

Indexes and relationships. Performance.

We have seen how to define tables, how to insert, select, update and delete records from them. We've also seen how to integrate data from multiple sources to get what we want. The SQL engine will take our queries, try to understand what we meant, optimize the execution, then give us the results. However, with large enough data, no amount of query optimization will help if the relationships between tables are not properly defined and tables are not prepared for the kind of queries we will execute.

This requires an introduction to indexes, which is a rather advanced subject, both in terms of how to create, use, debug and profile them and as a computer science concept. I will try to stick to the basics here, and you can go more in depth from there.

What is an index? It's a separate data structure that will allow quick access to specific parts of the original data. A table of contents in a blog post is an index. It allows you to quickly jump to the section of the post without having to read it all. There are many types of indexes and they are used in different ways.

We've talked about the primary key: (unless specified differently) it's a CLUSTERED, UNIQUE index. It can be on a single column or a combination of columns. Normally, the primary key will be the preferred way to find or join records on, as it physically arranges the table records in order and ensures only one record has a particular primary key.

The difference between CLUSTERED and NONCLUSTERED indexes is that a table can have only one clustered index, which will determine the physical order of record data on the disk. As an example, let's consider a simple table with a single integer column called X. If there is a clustered index on X, then when inserting new values, data will be moved around on the disk to account for this:

CREATE TABLE Test(X INT PRIMARY KEY)

INSERT INTO Test VALUES (10),(1),(20)

INSERT INTO Test VALUES (2),(3)

DELETE FROM Test WHERE X=1

After inserting 10,1 and 20, data on the disk will be in the order of X: a 1, followed by a 10, then a 20. When we insert values 2 and 3, 10 and 20 will have to be moved so that 2 and 3 are inserted. Then, after deleting 1, all data will be moved so that the final physical order of the data (the actual file on the disk holding the database data) will be 2,3,10,20. This will help optimize not only finding the rows, but also efficiently reading them from disk (disk access is the most expensive operation for a database). 

Note: deletion is working a little differently in reality, but in theory this is how it would work.

Nonclustered indexes, on the other hand, keep their own order and reference the records from the original data. For such a simple example as above, the result would be almost identical, but imagine you have the Employee table and you create a nonclustered index on LastName. This means that behind the scenes, a data structure that looks like a table is created, which is ordered by LastName and contains another column for EmployeeId (which is the primary key, the identifier of an employee). When you do SELECT * FROM Employee ORDER BY LastName, the index will be used to first get a list of ids, then select the values from them.

A UNIQUE index also ensures that no two records will have the same combination of values as defined therein. In the case of the primary key, there cannot be two records with the same id. But one can imagine something like:

CREATE UNIQUE INDEX IX_Employee_Name ON Employee(FirstName,LastName)

INSERT INTO Employee (FirstName,LastName)
VALUES('Siderite','Blog')

IX_Employee_Name is a nonclustered unique index on FirstName and LastName. If you execute the insert, it will work the first time, but fail the second time with a duplicate key error.
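To watch the second insert fail, here is a sketch that assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server. The wording of the error differs between engines, but the behavior is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT)")
conn.execute("CREATE UNIQUE INDEX IX_Employee_Name ON Employee(FirstName,LastName)")

insert = "INSERT INTO Employee (FirstName,LastName) VALUES('Siderite','Blog')"
conn.execute(insert)  # first time: works

try:
    conn.execute(insert)  # second time: same FirstName and LastName
    second_insert_failed = False
except sqlite3.IntegrityError as error:
    second_insert_failed = True
    print("rejected:", error)

row_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
print(second_insert_failed, row_count)  # prints True 1
```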

There is another type of index-like structure called a foreign key. It should be used to define logical relationships between tables. For the Department table, DepartmentId should be a primary key, but in the Employee table, DepartmentId should be defined as a foreign key connecting to the column in the Department table.

Important note: a foreign key defines the relationship, but doesn't index the column. A separate index should be added on the Employee.DepartmentId column for performance reasons.

I don't want to get into foreign keys here. Suffice to say that once this relationship is defined, some things can be achieved automatically, like deleting corresponding Item records by the engine when deleting Invoices. Also the performance of JOIN queries increases.

Indexes can be used not only on equality, but also other more complex cases: numerical ranges, prefixes, etc. It is important to understand how they are structured, so you know when to use them.

Let's consider the IX_Employee_Name index. The index is practically creating a tree structure on the concatenation of the first and last name of the employee and stores the primary key columns for the table for reference. It will work great for increasing performance of a query like SELECT * FROM Employee ORDER BY FirstName or SELECT * FROM Employee WHERE FirstName LIKE 'Sid%'. However it will not work for LastName queries or contains queries like SELECT * FROM Employee ORDER BY LastName or SELECT * FROM Employee WHERE FirstName LIKE '%derit%'.

That's important because sometimes simpler queries will take more resources than more complicated ones. Here is a dumb example:

CREATE INDEX IX_Employee_Dumb ON Employee(
    FirstName,
    DepartmentId,
    LastName
)

SELECT *
FROM Employee e
WHERE e.FirstName='Siderite'
  AND e.LastName='Blog'

SELECT *
FROM Employee e
WHERE e.FirstName='Siderite'
  AND e.LastName='Blog'
  AND e.DepartmentId=1

The index we create is called IX_Employee_Dumb and it creates a data structure to help find rows by FirstName, DepartmentId and LastName in that order. 

For some reason, in our employee table there are a lot of people called Siderite, but with different departments and last names. The first query will use the index to find all Siderite employees (fast), then look into each and check if LastName is 'Blog' (slow). The second query will directly find the Siderite Blog employee from department with id 1 (fast), because it uses all columns in the index. As you can see, the order of columns in the index is important, because without the DepartmentId in the WHERE clause, only the first part of the index, for FirstName, can be used. In the last query, because we specify all columns, the entire index can be used to efficiently locate the matching rows. 
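You can ask the engine which part of the index it intends to use. In SQL Server you would look at the execution plan; the sketch below assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in, and uses SQLite's EXPLAIN QUERY PLAN instead. The Notes column is made up so that the table has something the index doesn't cover, and the exact plan text may vary between SQLite versions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Employee (
  EmployeeId INTEGER PRIMARY KEY,
  FirstName TEXT,
  LastName TEXT,
  DepartmentId INTEGER,
  Notes TEXT)
""")
conn.execute("CREATE INDEX IX_Employee_Dumb ON Employee(FirstName, DepartmentId, LastName)")

def plan(sql):
    # the last column of every EXPLAIN QUERY PLAN row is a readable description
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_two_columns = plan(
    "SELECT * FROM Employee e WHERE e.FirstName='Siderite' AND e.LastName='Blog'"
)
plan_three_columns = plan(
    "SELECT * FROM Employee e WHERE e.FirstName='Siderite' AND e.LastName='Blog' AND e.DepartmentId=1"
)

print(plan_two_columns)    # the index is searched on FirstName only
print(plan_three_columns)  # the index is searched on all three columns
```

In the first plan LastName is not part of the index search, because DepartmentId sits between it and FirstName in the index definition.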

Note 2022-09-06: Partitioning a table (advanced concept) takes precedence over indexes. I had a situation where a table was partitioned on column RowDate into 63 partitions. The primary key was RowId, but when you SELECTed on RowId, there were 63 index seeks performed. If queried on RowId AND RowDate, it went to the containing partition and did only one index seek inside it. So be careful with partitioning. It only provides a benefit if you query on the columns you use to partition on.

One more way of optimizing queries is using the INCLUDE clause. Imagine that Employee is a table with a lot of columns. On the disk, each record is taking a lot of space. Now, we want to optimize the way we get just FirstName and LastName when searching in a department:

SELECT FirstName,LastName
FROM Employee
WHERE DepartmentId=@departmentId

That @ syntax is used for variables and parameters. As a general rule, any values you send to an SQL query should be parameterized. So don't do in C# var sql = "SELECT * FROM MyTable WHERE Id="+id, instead do var sql="SELECT * FROM MyTable WHERE Id=@id" and add an @id parameter when running the query.
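To show why this matters, here is a sketch of the same idea in Python instead of C#. It assumes the built-in sqlite3 module and an in-memory SQLite database as a stand-in for SQL Server; SQLite's driver writes the named parameter as :id where SQL Server uses @id, and the rows and the hostile input are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER, Name TEXT)")
conn.execute("INSERT INTO MyTable VALUES (1, 'first'), (2, 'second')")

user_input = "1 OR 1=1"  # a hostile value pretending to be an id

# DON'T: the input becomes part of the SQL text and changes the query itself
concatenated = conn.execute("SELECT * FROM MyTable WHERE Id=" + user_input).fetchall()

# DO: the input is sent separately and is only ever treated as a value
parameterized = conn.execute("SELECT * FROM MyTable WHERE Id=:id", {"id": user_input}).fetchall()
honest = conn.execute("SELECT * FROM MyTable WHERE Id=:id", {"id": 1}).fetchall()

print(concatenated)   # every row in the table
print(parameterized)  # nothing, no Id equals the text '1 OR 1=1'
print(honest)         # [(1, 'first')]
```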

So, in the query above SQL will do the following:

  • use an index for DepartmentId if any (fast)
  • find the EmployeeId
  • read the (large) records of each employee from the table (slow)
  • extract and return the first and last name for each

But add this index and there is no need to even go to the table:

CREATE INDEX IX_Employee_DepWithNames
  ON Employee(DepartmentId)
  INCLUDE(FirstName,LastName)

What this will do is add the values of FirstName and LastName to the data inside the index and, if only selecting values from the include list, return them from the index directly, without having to read records from the initial table.

Note that DepartmentId is used to locate rows (in WHERE and JOIN ON clauses) while FirstName and LastName are the columns one SELECTs.

Indexes are a very complex concept and I invite you to examine it at length. It might even be fun.

When indexes are bad

Before I close, let me tell you where indexes are NOT recommended.

One might think that adding an index for each type of query would be a good thing and in some scenarios it might, but as usual in database work, it depends. What performance you gain for finding records in SELECT, UPDATE and DELETE statements, you lose with INSERT, UPDATE and DELETE data changes.

As I explained before, indexes are basically hidden tables themselves. Slight differences, but the data they contain is similar, organized in columns. Whenever you change or add data, these indexes will have to be updated, too. It's like writing in multiple tables at the same time and it affects not only the execution time, but also the disk space.

In my opinion, the index and table structure of a database depends mostly on whether you intend to read a lot from it or write a lot to it. And of course, everybody will scowl and say: "I want both! High performance read and write". My recommendation is to separate the two cases as much as possible.

  • You want to insert a lot of data and often? Use large tables with many columns and no indexes, not even primary keys sometimes.
  • You want to update a lot of data and often? Use the same tables to insert the modifications you want to perform.
  • You want to read a lot of data and often? Use small read only tables, well defined, normalized data, clear relationships between tables, a lot of indexes
  • Have a background process to get inserts and updates and translate them into read only records

Writing data and reading data, from the SQL engine perspective, are very very different things. They might as well be different software and indeed some companies use one technology to insert data (like NoSQL databases) and another to read it.

Conclusion

I hope the post hasn't been too long and that it will help you when beginning with SQL. Please leave any feedback that you might have, the purpose of this blog is to help people and every perspective helps.

SQL is a very interesting idea and has changed the way people think of data access. However, it has become so complex that most people are still confused even after years of working with it. Every year new features are being added and new ideas are put forward. Yet there are a few concepts, a foundation if you will, that will get you most of the way there. This is what I have tried to distil here. Hope I succeeded.

  I was attempting to optimize an SQL process that was cleaning records from a big table. There are a multitude of ways of doing this, but the pattern that I had adopted for the last similar tasks was to delete rows in batches using the TOP (@rowCount) syntax. And it had all worked fine until then, but now my "optimization" increased the run time from 6 minutes to 2 hours! Humbled (or more like humiliated) I started to analyze what was going on.

  First thing I did was to SET STATISTICS IO ON. Then I ran the cleaning task again. And lo and behold, there was a row reporting access to an object that was not part of the query itself. What was going on? At first I thought that I was using a VIEW somewhere, one that I had thought was a table, but no, there was no reference to that object anywhere. But when I looked for that object, it was a view!

  The VIEW in question was a view with SCHEMABINDING, on which several indexes were then created. That explained it all. If you ever attempted to create an index on a view you probably got the error "Cannot create index on view, because the view is not schema bound" and then you investigated what that entailed (and probably gave up because of all the restrictions) but in that first moment when you thought "all I have to do is add WITH SCHEMABINDING and I can index my views!" it seemed like a good idea. It might even be a good idea for several scenarios, but what it also does is create a reverse dependency on the object you are using. Moreover, if you look more carefully at the Microsoft documentation it says: "The query optimizer may use indexed views to speed up the query execution. The view does not have to be referenced in the query for the optimizer to consider that view for a substitution." So you may find yourself querying a table while the engine queries a view instead!

  You see, what happens is that every time you delete 4900 rows from a table that is used by a view that has indexes on it, those indexes have to be updated, so not only your table is affected, but potentially everything that is being called in the view as well. If it's a complicated view that integrates data from multiple sources, it will be run after every batch delete and indexed. Again. And again. And again again. It also prohibits you from some operations, like TRUNCATE TABLE, where you get a funny message saying it's referenced by a view and that is why you can't truncate it. What?!

  Now, I deleted the VIEW and ran the same code. It was faster, but it still took ages because finding the records to delete was a much longer operation than the deletion itself. This post is about this reverse dependency that an indexed view introduces.

  So what is the solution? What if you have the view, you need the view and you also need it indexed? You can disable the indexes before your operation, then enable them again. I believe this will solve most issues, even if it's not a trivial operation. Just remember that in cleaning operations, you need some indexes to find the records to delete as well.

  That's it. I hope it helps. Get out of here!


  A Lush and Seething Hell is a collection of two novellas: The Sea Dreams It Is the Sky, where vast magical forces play with death and torture in a fictional Chile inspired South-American country, and My Heart Struck Sorrow, a story of dark magic working through verse and song.

  John Hornor Jacobs writes well, dragging the reader into the worlds of his mind; however, I found it difficult to stay there. Perhaps it's the alert lifestyle of today, full of interruptions and distractions, but it felt easy for me to stop reading and it took some effort to start again. It took me two weeks to read it all and even then it required a conscious decision to push through, though it's not a large book.

  Both stories have a common structure: people who are following the narrative of another and thus are drawn into the same world. Reading about reading, so to speak. They have elements of cosmic horror, although most of it is implied or not clearly explained (the traditional way of approaching the genre), intimating that even the tiniest brushes with these hidden realms are terrifyingly dangerous. What they both repeatedly reminded me of is House of Leaves, though not so convolutedly detailed, and only marginally of any Lovecraftian work.

  Bottom line: I liked both stories, the world building, the style, the slowly getting under the skin horror elements, but I did feel the writing dragged a little.


  Something that feels inspired heavily by Octavia Butler, Semiosis starts with a very interesting premise and continues through generations of human colonists on an alien planet. However, each chapter introduces a new generation, thus abandoning characters and attachments introduced before. In the end it simply feels too clinical, with characterization lacking luster, while still remaining a captivating read.

  The plot centers around a human colony on a distant alien planet. There are only a few dozen people and, with some equipment failures, they find themselves at the mercy of the world's inhabitants. Which are intelligent plants! It is a very interesting premise and both the generational span of the story and the cold calculations of different species that must coexist despite their massive differences reminded me a bit of Xenogenesis. However, Sue Burke didn't have the cruelty required to thoroughly violate her characters that Butler had, so in the end the mood was more positive, perhaps reminiscent of '60s sci-fi, with lots of deliberations and rational arguments as a major part of the story.

  Bottom line: I liked the book. It could have been better, but as a debut it's pretty good. I will probably read the second book sooner or later, because the world of Pax is so full of potential, but I do believe Semiosis can be taken as a standalone story without the need for a continuation.

  Edward O. Wilson was a biologist who died at the end of 2021, aged 92. Nicknamed "ant man" for his world-renowned expertise on ants, he championed concepts such as sociobiology and biodiversity. Reportedly, he was a very nice man, beloved by most of the people he interacted with. And yet, I didn't hear of him because of his scientific writings, but because of a vitriolic article published by Scientific American. In it, the author used Wilson's death and the renewed interest in his autobiography, Naturalist, to decry Wilson's views ("problematic beliefs"). He had tried to explain everything through biological lenses, for example that individual characteristics are caused by evolution and those characteristics cause the characteristics of a group or society or race in a particular environment. The article's author considered that as proof of "scientific racism", but was immediately shut down by scores of scientists who debunked her entire article and pretty much proved she hadn't even read the books she was supposedly basing her writing on.

  So even when I try to filter out the political idiocy that pollutes every aspect of modern life and try to keep up to date with science and technology, I still fall into these toxic holes. Ironically, one of the last chapters in Naturalist talks about how weird it was for one of his colleagues to try to explain biology ideologically (in that case Marxism). Anyway, so I decided to read the book. I usually love autobiographies, especially those of scientists and other driven people, because it makes me feel as they did. Even if prompted by an ugly example of human stupidity and malice, still something good could come of it.

  Alas, while the book is interesting and takes the reader through much of Wilson's life and work, it merely describes his passion for nature, rather than evoking it. Even as it starts with a personal history and childhood, it feels strangely impersonal. A small boy with hearing issues and partial vision in one eye (accidentally caused by him trying to handle a spiked fish), he was nevertheless taught by his father to never run away from a fight, was partially schooled in educational institutions that prepared children for military careers and held the overall belief that anything is possible once you put your mind to it.

  I have no doubt that his approach to life wasn't as analytical as it is portrayed in the book, but what exactly that was is hard to glimpse from this biography. Wilson published Naturalist when he was 65 and, while I am sure he worked some time on it, he treated it as any of his scientific books at the time: facts, history based on journals, actions, expectations, results. I liked the book and I liked Wilson, but I wouldn't particularly recommend Naturalist for anything other than a glimpse into Wilson's nature (pardon the pun).

  First of all, I am not a philosopher, nor have I read Nietzsche. The philosophical aspects that I am discussing are how a layman would interpret them. In this post I am going to discuss anime from the Baki Hanma and JoJo's Bizarre Adventure universes with a nod to Andromeda's race of genetically modified humans called Nietzscheans and also other media portrayals of similar concepts.

  Watching episodes from the Baki or JoJo anime I got a weird feeling. Both series, while having completely different plots, focus on humans with superior abilities fighting each other. Nothing new here: both American and Japanese cultures are inundated with this cliché. Yet these shows are strangely humanistic in nature. The characters have impossibly strong muscles, dress in their own special way and are proudly dedicated to particular philosophies that define their path in life. Compared to other people, they are intimidating, entirely dominating, and they are so strong that they defy the laws of medicine and even physics. They use their power in tactical and strategic ways, they hone their skills, they outthink their adversaries and use whatever the environment gives them in order to win. And all this in order to gain power only over themselves.

  In so many ways, they reminded me of the Nietzscheans, from Gene Roddenberry's TV series Andromeda (before the show went to shit, so first season only). They also reveled in their physical, mental and knowledge prowess. Violence, to them, was justified as a way to eliminate weakness. The characters in the two anime shows are the same: they risk their health, their lives, in order to test themselves to the limit. As a result, they cannot exist in human society. People can't abide such obvious difference, when these guys are stronger than guns, impossible to detain through cuffs, chains, walls or cages and at any time can just destroy a normal human being with little to no effort. It is this part that actually got me thinking and writing the blog post.

  Usually in media, people who care only about their own betterment to the point they eschew social norms are portrayed as villains. Human values are represented as communal values: caring about others, respecting their way of life, abiding by social constraints and obeying laws, forming bonds and families, then dedicating effort to maintain and preserve them. The hero will defend, not attack, will arrest, not destroy, will consider, not dismiss, will protect, not invade. In fact, a hero is a social construct and can only exist as society's protector.

  In regular situations, the ones that are considered normal in society, heroes are not needed. Performance is not needed. There are some boundaries in which one is allowed to strive for better output, but only as cogs in a social mechanism that needs them to perform within expected ranges. Only when things go awry, from the breaking of a component (be it a tool, a flow or a person) to some huge disaster, some people "step up" and take over the load. Those are heroes. And here is the dilemma, because someone who has not made the effort of being better than expected of them will not be able to step up, while someone who does make the effort is inevitably vilified during "peace times".

  This reminds me of Rambo, in the first movie and not the ridiculous propaganda sequels. Here is a man who, through circumstances that needed to be tragic and out of his control so as to enhance his heroic status, reached a level above his peers, at least in one particular domain: fighting and killing. He was perfect as a soldier, but as he returns home he has difficulties integrating himself back into society. It takes only a small town sheriff's bullying to bring the beast to the surface. The old adage still stands: the best heroes are all dead.

  Going back to the anime, I found myself in conflict. Here is the usual portrayal of society, a safe place for everybody to live in, defining what human life is and should be like, but functioning as a soulless mechanism. And here is the usual portrayal of the self absorbed villain, a monstrous being of immense power who threatens the existence of all, but functioning as a proud individual constantly bettering themselves. I feel like the latter option is more humanistic, therefore truly being human is in antithesis to human society.

  Can there be a balance between the two? Could we actually imagine a benign Nietzschean-like society? One that would truly embrace diversity, specialization and performance while despising mediocrity and also not eating itself from within? I find it hard, if not impossible. Still, I can't help but feel a sort of admiration for these larger than life characters and their dedication to a random thing that then defines them forever.

  What do you think?

  Gonna try something different today, by sharing the analysis of someone else, who I can only assume is much better than me at this. This is the Mengarini variation of the Sicilian, where a3 is the reply to Black's c5. Similar to a wing gambit, it serves the same purpose: to deflect Black's development towards a side pawn sacrifice in order to gain the center immediately.

  A very natural continuation is 2... Nc6 3. b4 cxb4 4. axb4 Nxb4 5. c3 Nc6 6. d4, after which almost every White piece is ready to attack, the king is temporarily safe and Black has only developed a knight which has moved three times.

  From what I can tell, Ariel Mengarini was an Italian from Rome who emigrated to the US and became a psychiatrist. He was also pretty good at chess, having started at 6.

Here is the embed of JayBayC's analysis on LiChess:

Also, let's see a PGN directly on the blog, from an amateur game, just to see how fast things can go wrong for Black:

1. e4 { [%eval 0.33] [%clk 0:05:00] } 1... c5 { [%eval 0.32] [%clk 0:05:00] }
2. a3 { [%eval -0.08] [%clk 0:05:01] } { B20 Sicilian Defense: Mengarini Variation } 2... Nc6 { [%eval -0.09] [%clk 0:04:59] }
3. b4 { [%eval -0.55] [%clk 0:05:04] } 3... cxb4 { [%eval -0.13] [%clk 0:05:00] }
4. axb4 { [%eval 0.0] [%clk 0:05:07] } 4... Nxb4 { [%eval -0.36] [%clk 0:05:02] }
5. c3 { [%eval -0.12] [%clk 0:05:09] } 5... Nc6 { [%eval 0.0] [%clk 0:05:04] }
6. d4 { [%eval -0.18] [%clk 0:05:12] } 6... d5 { [%eval 0.0] [%clk 0:05:06] }
7. exd5 { [%eval 0.0] [%clk 0:05:15] } 7... Qxd5 { [%eval 0.13] [%clk 0:05:07] }
8. Na3 { [%eval 0.14] [%clk 0:05:18] } 8... e5?? { (0.14 → 1.71) Blunder. Qa5 was best. } { [%eval 1.71] [%clk 0:05:03] } (8... Qa5 9. Nf3 e6 10. Bb5 Nf6 11. O-O Be7 12. Bxc6+ bxc6 13. c4 O-O 14. Re1 Qc7 15. c5)
9. Nb5 { [%eval 1.79] [%clk 0:05:20] } 9... Qd8?! { (1.79 → 2.76) Inaccuracy. Kd8 was best. } { [%eval 2.76] [%clk 0:05:01] } (9... Kd8 10. Bg5+ Be7 11. dxe5 Qxd1+ 12. Rxd1+ Bd7 13. Nf3 Bxg5 14. Nxg5 Nxe5 15. f4 h6 16. Ne4)
10. d5 { [%eval 3.32] [%clk 0:05:22] } 10... Na5?? { (3.32 → 7.92) Blunder. Nf6 was best. } { [%eval 7.92] [%clk 0:04:33] } (10... Nf6 11. Be2 Bf5 12. dxc6 bxc6 13. Qxd8+ Kxd8 14. Na3 Ne4 15. Bd3 Bc5 16. Be3 Bxe3 17. fxe3)
11. Qa4?! { (7.92 → 5.67) Inaccuracy. d6 was best. } { [%eval 5.67] [%clk 0:05:10] } (11. d6 Kd7 12. Nf3 Nc6 13. Ng5 Qf6 14. Bc4 Nh6 15. O-O a6 16. Ne4 Qg6 17. Qd5 Rb8) 11... b6?! { (5.67 → 10.14) Inaccuracy. Bd7 was best. } { [%eval 10.14] [%clk 0:04:29] } (11... Bd7 12. Qxa5)
12. Nc7+ { [%eval 10.12] [%clk 0:04:53] } 12... Ke7 { [%eval 8.36] [%clk 0:04:29] }
13. Ba3+ { [%eval 9.74] [%clk 0:04:55] } { Black resigns. } 1-0

Hope it helps!

P.S. Here is my own (and first) study on LiChess, based on human games and computer analysis: Sicilian Defense: Mengarini variation

  Strange the Dreamer was a book that, while not perfect, was well written and showed a lot of promise. If anything, I was surprised to see that Muse of Nightmares would be the second and final book of the series, because I couldn't understand how everything begun or hinted at in the first book could be wrapped up. And indeed it wasn't, which doesn't mean that I didn't enjoy reading the book. I feel the series left a lot of its potential unrealized, though. By focusing on the main characters, now starstruck lovers that would do anything for each other, Laini Taylor left all the others behind, without a growth arc or closure. Not only that, but she also brings in another antagonist, from the past of one of the slain gods, so she has even less space to work in.

  Don't get me wrong, I enjoyed the first book so much that I immediately started on the sequel and I've read this one really fast, too. It was entertaining, it was exciting and it was intense in places. But it wasn't better than the first book and the way it ended, with everything nice and cozy and people realizing their dreams and resolving their inner conflicts and helping each other and so on and so on, left me with a feeling of jarring fakeness.

  In short, it's a decent book to finish the story, but it went by too fast, paying no attention to characters dragged along from the beginning and left in the dust, focusing too much on love scenes and less on consequential events, using MacGuffins all over the place and making people not think of solutions that were employed just pages later by the antagonist using their own powers.

  Like Minya using her powers to force ghosts to do her bidding, regardless of their own desires, so did Taylor corral her characters through the narrow confines of her planned storyline.

  There is something really cool about Twitter, but it's not what you probably think. Elon Musk wants to buy it to promote free speech, but also criticizes the way it started at the same time as Facebook yet it made no money. That's what's cool about Twitter: it made no money!

  Having the freedom to express oneself with the only limitation of constraining length to 140 or 280 characters per message is the core of Twitter. I agree with Musk that political censorship of content is evil and that it is slowly strangling what Twitter was supposed to be, but I disagree with the idea that it should be monetized more. That's what Facebook does and for better or worse, it covers that niche. I have to say that if there is someone who can make Twitter both make money and keep its core values intact, that's probably Elon Musk, but I dread the possibility of failure in the attempt.

  Now, don't get me wrong: I almost never tweet. I have an automated system to announce new blog posts on Twitter and lately I've started writing short (Twitter core!!) reviews on the TV series I've watched. In my mind, TV series - should I still call them TV? - don't deserve the attention that I give movies in my IMDb reviews or separate blog posts like books and that is why I write there. Nor do I regularly read Twitter posts. I have a system that translates Twitter accounts into RSS feeds and I sometimes read the content on my feed reader - thank you, Nitter!

  The reason why I am writing this post is ideological, not technical or self interested. If Twitter disappeared tomorrow, I wouldn't really feel the loss. But I do believe its values are important for humanity as a whole. In fact, I don't fully agree with Musk that bots are a problem, because Twitter was designed with automation in mind. A public square, he calls it. I don't like public squares, or anything public, to be honest. What I value is personal and my belief is that this is what we should try to preserve.

  Strangely enough, the trigger for this post was a Netflix documentary about Richard Burton. Now, there is a man! He came from a poor, hard mining town in Wales. He started with a strong regional accent and trained every day to have an English one. From a family of 13 (or 11, if you discount infant mortality, as he did) he chose a drama teacher as a parent - and took his last name, with his family's blessing. Can you imagine being part of a family that is glad to give you up for adoption, because that's what's best for you? He was beautiful, hard, passionate and articulate, charming, violent, ruthless, living life to the fullest and hating himself for it. He became a Hollywood icon, yet admitted he had no idea why. That's what men were like half a century ago. I am old enough to have seen some of his movies and to appreciate him as an actor. And while I was watching the documentary, I imagined what Twitter would say about Burton now and what the people behind those tweets would be. The petty, ugly, pathetic crowd that can't stand someone so vastly different, so as not to say superior.

  But it's not Twitter that would be at fault. In fact, when Richard Burton chose to leave his faithful wife of many years for Elizabeth Taylor he was sued by a "subcommittee" for indecent behavior. They didn't have Twitter, but they reveled in outrage just as well. And it's not like he was any kind of saint. He was rigid and cruel and judgmental and lacking any shyness at saying what he thought or felt was wrong with you. The issue is not what you do, but why you do it.

  That's what I believe is where the line should be drawn. It's not about the words you use, but why you said them in the first place. It's not about preserving some social contract that everybody should be forced to obey, but about one's position about particular events. It's not even about "do no harm", because one's harm is another's pleasure. It is about intention.

  Coming back to Twitter and its most famous side effect, cancel culture: I think cancel culture is despicable, but I also partially agree with its defenders or the people who deny its existence. Because the reason why cancelling someone is toxic is not people disagreeing, but people fearing being on the wrong side. Once there is enough momentum and energy poured into destroying the life of one person, it becomes a snowball of fear, with people refusing to be "associated" with cancelled people. It's that fear that is the problem, the weak cowardly fear that prevents one from staying the course or ignoring the drama or even supporting someone for mostly economic reasons. Yes, that's what cancel culture is: people afraid to lose their money or some other currency because of other people hating each other. Cancel culture is not new, it has just become globalized. If people in Richard Burton's time disliked a person so much they couldn't stand their existence, all that person had to do was leave and start living in some other place. Nowadays, it's all public (heh!) and global. You can't escape being hated.

  Yet the problem is not globalization, it is people who somehow care what people they don't care about care about. Yes, you got a bad rep somewhere in the world, from people I don't know. I will be circumspect, but I will use my own judgement about you. Not doing that is lazy and stupid and, again, petty. As George Carlin once said "I never fucked a ten! But I once fucked five twos!". A crowd of stupid, petty, lazy people does not a great person make.

  Bottom line: congrats for making it this far into my rant. People are bound to be different and disagree with each other. Fearing to associate with someone because they are shunned by another group of people is just a feeling. Your choice is what matters. Twitter is a platform, a tool, and what matters is the ability to express oneself and to filter out people you don't want to hear from. That's what a person does and that's what the Internet should preserve. Not the mobs, not the fake outrage to get social points, but the personal: freedom of expression and freedom to ignore whatever you want.

  If Elon Musk asked my opinion (why would he?!) I would tell him that people need more filters. Enable people to ignore what they choose to ignore and they will be less hateful. That also applies to ads, by the way. Every time I see an angry person obliquely or directly attacking a group that I am even remotely part of I feel attacked and I get angry, too. I didn't want to read that person's opinion and I don't care for it, but it got shoved in my face. If I could just ignore them, I would be less angry and more positive, as would my tweets. And believe me, I use Twitter's word filtering already. It filters out stuff like -isms, politics, U.S. presidents and so on. You see? That's a personal choice, to move away from hatred and judgement. Do it, too, and life will feel so much better. Becoming an outraged activist for something is not an inevitability, it's a choice.

  When we connect to SQL we usually copy/paste some connection string and change the values we need and rarely consider what we could change in it. That is mostly because of the arcane looking syntax and the rarely read documentation for it. You want to connect to a database, give it the server, instance and credentials and be done with it. However, there are some parameters that, when set, can save us a lot of grief later on.

  Application Name is something that identifies the code executing SQL commands to SQL Server, which can then be seen in profilers and DMVs or used in SQL queries. It has a maximum length of 128 characters. Let's consider the often met situation when your application is large enough to be segregated into different domains, each having their own data access layer, business rules and user interface. In this case, each domain can have its own connection string and it makes sense to specify a different Application Name for each. Later on, one can use SQL Profiler, for example, and filter on the specific area of interest.

 

  The application name can also be seen in some of SQL Server's Dynamic Management Views, like sys.dm_exec_sessions. Inside your own queries you can also get the value of the application name by simply calling APP_NAME(). For example, running SELECT APP_NAME(); in SQL Server Management Studio returns a nice "Microsoft SQL Server Management Studio - Query" value. In SQL Server Profiler the column is ApplicationName, while in DMVs like sys.dm_exec_sessions the column is program_name.

  Example connection string: Server=localhost;Database=MyDatabase;User Id=Siderite;Password=P4ssword; Application Name=Greatest App Ever
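
  To illustrate the "one Application Name per domain" idea, here is a minimal Python sketch that just assembles the strings. The helper build_connection_string, the domain names and the "Greatest App Ever - <domain>" naming scheme are all made up for the example, and the code doesn't connect to anything. It also assumes none of the values contain a semicolon, since those would need quoting.

```python
# Limit for Application Name, as mentioned earlier in the post.
MAX_APP_NAME_LENGTH = 128


def build_connection_string(server, database, user_id, password, application_name):
    """Assemble a SQL Server style connection string (hypothetical helper).

    Assumes no value contains ';', which would require quoting.
    """
    if len(application_name) > MAX_APP_NAME_LENGTH:
        raise ValueError(
            f"Application Name is longer than {MAX_APP_NAME_LENGTH} characters"
        )
    parts = {
        "Server": server,
        "Database": database,
        "User Id": user_id,
        "Password": password,
        "Application Name": application_name,
    }
    # Dictionaries keep insertion order, so the keys come out as listed above.
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Hypothetical domains of a large application, each with its own data access layer.
DOMAINS = ["Billing", "Inventory", "Reporting"]

connection_strings = {
    domain: build_connection_string(
        "localhost",
        "MyDatabase",
        "Siderite",
        "P4ssword",
        f"Greatest App Ever - {domain}",
    )
    for domain in DOMAINS
}

for domain, value in connection_strings.items():
    print(domain, "->", value)
```

  With something like this in place, filtering on ApplicationName in SQL Profiler or on program_name in sys.dm_exec_sessions would tell you which domain a session belongs to.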

  Hope it helps!

  Strange the Dreamer is a fantasy book with good writing, characters and story. I will probably start reading the second book in the series immediately. The style reminded me of Brandon Sanderson a bit: a beautiful and imaginative world, empathetic and compelling characters who are mostly good at heart, even when they are villainous, and a bright spirit that celebrates love, curiosity and exploration.

  More than that, this is a normal book, one that is focused on the plot and characters and has no agenda other than telling a good story. I had feared the worst when I saw Laini Taylor is a writer of Young Adult fiction, has bright magenta hair and started with comics. Glad to see my fears so unfounded.

  The main characters are Lazlo, an orphan boy with a love for knowledge and myths, obsessed by the existence of a mythical city of the desert, and Sarai, a half goddess with blue skin and a rather sad existence. But there is more: libraries full of mystery, alchemy, magic, gods, desert warriors, young love, explosions, a sky fortress and more.

  What I felt was the biggest issue with the book is the introduction of so many characters that had an episodic effect on the story or even none at all. There is a part of the story where there are hints of rivalry and intrigue with another character, then it escalates and then... months pass, on the road, and those two characters don't interact at all. The desert trip itself is less than fulfilling, after reading so much about how cruel and difficult the desert is. And then there are characters like the warriors or the girl who climbs things for fun. I hope they will have more of a role in the second book, because otherwise why introduce them at all?

  Bottom line: I feel great promise from Laini Taylor. I liked this book a lot and it's her second, but I expect even greater things from her in the future.