The Cat Returns (Neko no ongaeshi) - Alice in Wonderland, Japanese cat style

Published Feb 13, 2014

Posted in
manga
misc
movies
picture

I can't say that Neko no ongaeshi had a great effect on me. The animation was OK, the story was like a fairy tale, but it lacked something, a special feeling that I was expecting to have.

The plot is that a young girl saves a cat from death and finds herself uncomfortably rewarded by the entire hidden nation of cats with a trip to their kingdom, a marriage to their prince and a free transformation into a feline. She doesn't want this, but helped by new friends, she manages to escape. I am not really spoiling anything here. At no moment did I feel that she might be in real danger, which I think was the biggest flaw of the story. Another anime from Studio Ghibli, Spirited Away, features a much more beautiful and scary foray into a magical world, and one of the novels of Clive Barker, The Thief of Always, brings the required tension and fear that is missing in this film.

Another issue I had with this is that, other than eat mice and fish, the cats behaved exactly like humans, missing entire opportunities to delight the viewer with so many catty things. They don't use their claws, they don't do acrobatics, they live in a feudal community and are loyal to each other. The whole concept of a feline kingdom passed right by the creators of the anime.

My conclusion is that this is a film for very little children or a lazily made one. It's not that I didn't enjoy watching it, but it was completely bland, devoid of any inspiration that would make it rise above average.

The Ocean Waves (Umi ga kikoeru), a beautiful high school romance story

Published Feb 10, 2014

Posted in
manga
misc
movies
picture


I always liked animes from Studio Ghibli, but until now I didn't quite get why. It is because they have calm. Everything today has to be over the top, flashy, fast. Ghibli stories take their time; they feature normal people with normal desires and rhythms, behaving normally.

The Ocean Waves is about a cute girl moving from Tokyo to a provincial high school in Kochi. Everybody is curious about her, but she is a loner and quite rude. Two friends both interact with her, but it's never clear what's in their hearts. Slowly, but surely, we start to understand each of the actors and the story comes full circle after graduation, at the first high school reunion.

I've learned so much about Japanese culture from animes, but the ones from Ghibli make me understand the people. The stories often have what is missing in not only animation, but real actor movies as well: people that you can empathise with, because they are like you (or rather, like you would like to be, but not in infantile fantasies, but in your hopeful dreams).

Really nice movie, it is certainly worth seeing.

My neighbours, the Yamadas (Hôhokekyo tonari no Yamada-kun), a cute minimalistic family anime

Published Feb 10, 2014

Posted in
manga
misc
movies
picture


When I first started watching the movie and I saw the way it was drawn - colored pencil style - I thought it was some sort of children's thing and I would not like it. But the minimalistic animation works very well for this film, which shows the everyday life of a Japanese family. They are not very smart or good looking, and there is nothing special about them. They are forgetful, self centered and lazy. But they have each other and they are happy. That's a beautiful message in a world dominated by heroes, celebrity and egotism.

One thing you might not like is that the story is merely descriptive. There is no "end" to it, just a funny enumeration of family moments. I enjoyed it, though. The speech at the beginning, from the woman advising the newlyweds what life is and how they should spend it together, is funny, mostly true and descriptive of the rest of the film. The part with "life is hard when you are alone, but even two losers can go through life if they are together" cracked me up, as well as the part with "have children, it will help you appreciate your parents; they will come and take care of them for you from time to time".

The bottom line is that this is a movie that families should watch together. It would relieve the pressure of never appearing to make mistakes, trying to be a perfect whatever and missing the joy of life. Now, it's too late for my family, but this film may be a way to screw up your children less.

So, while this would not be for everyone, The Yamadas is one of those Studio Ghibli animes that give you warm feelings.

Pom Poko, the Raccoon Wars (Heisei tanuki gassen ponpoko), a melancholic anime about the city encroaching and finally destroying the nature around it

Published Feb 10, 2014

Posted in
manga
misc
movies
picture


A lot of people nowadays are born in the city or a large town somewhere; nature and animal life are something you see on TV. A few older people, though, may remember what life used to be mere decades ago, when wild nature was what awaited you when you got out of your yard and there were three times fewer people.

Pom Poko is a movie about the changes urban development brings to the land, as seen from the perspective of a playful and intelligent race of raccoons, magically endowed with the ability to shapeshift into anything they choose. Worried, scared and finally enraged by the destruction of their home forests by the expansion of Tokyo, they decide to fight back. Alas, their efforts are in vain; there is no stopping the humans.

A beautiful anime, nicely drawn, very imaginative, it is almost impossible to dislike. The only problem I see is the rapid shift from the playfulness of the raccoons to their grief and despair and then back again. Sometimes I didn't know whether to feel sad or to laugh; sometimes I could not stop myself doing both at the same time. And that is saying much: I am city born and bred and can't stand nature much, so it was an inspiring movie.

Watch this, it is another animation gem from Studio Ghibli.

Earth Abides, by George R Stewart

Published Feb 10, 2014

Posted in
misc
picture
books


Published in 1949 by George R. Stewart, Earth Abides describes the end of civilisation by way of a deadly pandemic. The main character is an intellectual, used to observing rather than doing, and so he takes comfort in the idea of observing the end of the world. He is immune to the disease, as are a few others, and so he becomes not only the observer, but the patriarch of a whole new tribe of people.

The pace of the storytelling is rather slow and the story itself spans several decades, until Ish dies of old age. The book is clearly well written, and I would say well thought out, too, but I take issue with Ish's character. He is proud of being an intellectual, of reading books, he worries all the time about the fate of civilisation, but he really does nothing to share his knowledge or to act on what he is thinking. I know that's a trait I share, unfortunately, but his level of passivity is insane.

At the beginning of the book I was relishing the description of the single guy finding ways to survive, both physically and mentally, then liking the way the little group of people was growing into a tribe, but after that I kept waiting for something else to happen. Instead, they all become complacent, living in houses they didn't know how to maintain, using products from abandoned shops they did not care to learn how to make, forgetting how to read, and so on. The biggest calling of an intellectual is to continually learn and teach. Instead, in what I see as great hypocrisy, Ish is merely content to be slightly more learned than the people around him, thinking to himself as if he were reading from the Bible, even though he considered himself an atheist rationalist, then hoping that his child would grow up to be an intellectual and spread knowledge around, since he was doing none of that himself. He just complained endlessly about how stuff should be! That was infuriating.

Perhaps that is why it took me so long to finish the book, as the ending felt horrifying and even insulting to me: people living like the Native Americans of old and caring not one bit about the immense body of knowledge that came before them. Perhaps what was worse is that this scenario seems very plausible, too.

What was refreshing (if you can use this word for a book that is more than 60 years old) is that there were no depictions of warrior groups roaming the land, looking for slaves or whatever, or any other type of antagonistic situation that required a heroic violent response. It seems to me that this is almost a requirement in modern apocalyptic sci-fi, if not in most sci-fi.

The style of the writing and the thoughts of its main character are a bit dated, but not terribly so. Electricity is not really useful for much other than lighting and maybe listening to radio, so they don't feel they need to maintain it one bit. Women are not as learned or smart as are the men, but that's OK, because they are women. It is normal for some people to not know how to read. A man can decide for another what is best, just because he thinks he is smarter, and it is only civilised to let them choose for themselves and completely optional. Buildings are mostly wood, so a big fire would burn a town to nothing. And so on and so on.

I can't put my finger on it, but there is something 50ish about the mindset of the lead character that definitely feels alien to me now. Perhaps the idea that, even if he were to make the effort to teach the children to read, the only books that would be of use would be technical or scientific ones. That's an incredibly weird point of view to find in a fantasy literature book.

Anyway, as one D.D.Shade lamented in a 1998 review of this book: When you're talking to someone you just met and you discover they 'love' science fiction, and you ask with great anticipation if they have read Earth Abides, the answer is "No, should I?". I agree with the man. The book should be read and should be known, as a classic of the genre and a reminder of how "the first Americans" thought about these things. Don't expect to go all "Wow!" while reading it, but as it stands, there are few books that are as thorough about the end of civilisation as this one.

Tales from Earthsea, well animated Japanese cartoon, but the script fails

Published Feb 7, 2014

Posted in
manga
misc
movies
picture


I've read Legends of Earthsea and so I knew a little bit who the characters were and what the story was supposed to be. And still I got confused about what exactly Tales from Earthsea had in common with the books I remember. First of all it is a loose adaptation of the third book, so if you don't know who Sparrowhawk or Tenar are, you are out of luck. Also the Nipponification of the characters makes things a bit lame; for example Tenar is a kind and spirited woman, but completely helpless and always in need of a male to come rescue her. Even Sparrowhawk, the greatest mage in existence, is easily captured or fooled. Then there is the repetition of the same bullshit that without death there can be no life, the reason that Cob needs to be defeated. It's such a Japanese way of accepting fate that has nothing to do with the Le Guin books.

Basically Goro Miyazaki turned this beautiful fantasy into a moralizing piece of crap, where the biggest sin is that one wants to avoid death. The anime is missing the point of the books, it's completely unintelligible without reading those books, and finally does nothing for the viewer. Just read the books and enjoy the story. Or at least see the miniseries screen adaptation of the first three books, with Shawn Ashmore as Sparrowhawk and the beautiful Kristin Kreuk as Tenar. This anime, unfortunately, had nothing working for it except the excellent animation.

Just so you understand how bad this film is: I went to IMDb to rate it and noticed that I had already rated it before. So I had seen it already, but forgotten about it. That's how unmemorable it is.

Configuring the proxy settings via a script and testing it

Published Feb 4, 2014


If you go to the system Internet Settings (in Network Connections, or Internet Explorer or Chrome), and you advance to the tab "Connections", then click LAN Settings, then go to Advanced... I mean, why wouldn't you? ... there is a checkbox called "Use automatic configuration script". The script is supposed to dynamically return the correct proxy for a URL. The practice is called Proxy Auto Configuration for some reason. The script is JavaScript and it uses some predefined functions to return either "DIRECT" (don't use a proxy) or "PROXY address:port" (use the proxy at that address and port). You can chain the options by separating them with a semicolon like this: "PROXY 1.2.3.4:55; PROXY 10.20.30.40:50; DIRECT". And before you search like a madman for it, there is no way to specify the username/password for those proxy servers in your config file. You still have to type them when asked.
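To make the return format concrete, here is a minimal sketch. The host name "intranet.example" is a placeholder I made up; the proxy addresses are the ones from the chain above.

```javascript
// Minimal PAC sketch; "intranet.example" is a made-up placeholder host.
function FindProxyForURL(url, host) {
  // Internal host: bypass the proxies entirely.
  if (host === "intranet.example") return "DIRECT";
  // Everything else: try the first proxy, then the second, then connect directly.
  return "PROXY 1.2.3.4:55; PROXY 10.20.30.40:50; DIRECT";
}
```

The options in the returned string are tried from left to right, so "DIRECT" at the end acts as the last resort.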

Use this solution to fix problems with proxies that work well for outside sites, but not for internal networks. For reasons too weird to explain here (but explained here: Understanding Web Proxy Configuration) you cannot just put your script on the local drive and use it; instead you have to read it from an http URL. If you don't have the possibility (or it's too annoying) to install IIS or some other web server in order to serve the pac file, try using it from the local drive with a file:// URL (not just C:\...). However, that is a deprecated method and you may experience issues, with .NET software or Internet Explorer 11, for example.

Here is a sample file that connects directly to any URL that is part of a domain or is part of an IP class:

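function FindProxyForURL(url, host) {

  // the misbehaving or incomplete proxy, used for everything not matched below
  var defProxy = "10.20.30.40:50";

  // hosts in these domains are reached directly
  var domains = [
    ".mysite.com",
    ".xxx",
    "localhost"
  ];
  // networks that are reached directly; the trailing zeros mark the host part
  var ipClasses = [
    "11.22.33.0",
    "55.0.0.0",
    "127.0.0.0"
  ];

  // build a netmask from a network address: every non-zero octet becomes 255
  function getMask(ip) {
    var splits = ip.split('.');
    for (var i = 0; i < splits.length; i++) {
      if (splits[i] != '0') splits[i] = '255';
    }
    return splits.join('.');
  }

  for (var i = 0; i < domains.length; i++) {
    if (dnsDomainIs(host, domains[i])) return "DIRECT";
  }

  // the lookup may come back empty when the name cannot be resolved;
  // in that case skip the IP checks and fall through to the proxy
  var resolvedIp = dnsResolve(host);
  if (resolvedIp) {
    for (var j = 0; j < ipClasses.length; j++) {
      if (isInNet(resolvedIp, ipClasses[j], getMask(ipClasses[j]))) return "DIRECT";
    }
  }

  return "PROXY " + defProxy;
}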

Just add the domains or the IP classes to the arrays in order to connect directly to them. Do not forget to add the local IP classes as well for direct connection, including 127.0.0.0 to access your own localhost.

In order to test or debug your .pac files, use the PacParser open source utility. A reference to the functions you can use in your script can be found here: Using Automatic Configuration, Automatic Proxy, and Automatic Detection
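If installing PacParser is not an option, a rough alternative is to stub the PAC helper functions yourself and call FindProxyForURL from Node.js. This is only a sketch: the stubs below are my own simplified stand-ins, not the browser implementations, and dnsResolve reads from a hard-coded table (with made-up addresses) instead of doing real DNS lookups.

```javascript
// Simplified stand-ins for the helpers a browser gives to PAC scripts (assumptions).
var fakeDns = {
  "localhost": "127.0.0.1",
  "intranet.mysite.com": "11.22.33.44",
  "example.org": "93.184.216.34"
};

function dnsDomainIs(host, domain) {
  // True when the host name ends with the given domain suffix.
  return host.length >= domain.length &&
    host.substring(host.length - domain.length) === domain;
}

function dnsResolve(host) {
  // Look the host up in the fake table; null means "could not resolve".
  return Object.prototype.hasOwnProperty.call(fakeDns, host) ? fakeDns[host] : null;
}

function isInNet(ip, pattern, mask) {
  // Compare the masked address with the masked pattern, octet by octet.
  var a = ip.split("."), p = pattern.split("."), m = mask.split(".");
  for (var i = 0; i < 4; i++) {
    if ((Number(a[i]) & Number(m[i])) !== (Number(p[i]) & Number(m[i]))) return false;
  }
  return true;
}

// A cut-down PAC function to exercise; paste the contents of your own .pac file here.
function FindProxyForURL(url, host) {
  if (dnsDomainIs(host, ".mysite.com")) return "DIRECT";
  var ip = dnsResolve(host);
  if (ip && isInNet(ip, "127.0.0.0", "255.0.0.0")) return "DIRECT";
  return "PROXY 10.20.30.40:50";
}

console.log(FindProxyForURL("http://intranet.mysite.com/", "intranet.mysite.com")); // DIRECT
console.log(FindProxyForURL("http://localhost/", "localhost"));                     // DIRECT
console.log(FindProxyForURL("http://example.org/", "example.org"));                 // PROXY 10.20.30.40:50
```

This checks the logic of the script, not how a particular browser will treat it, so it complements a real tool rather than replacing it.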

On game Artificial Intelligences and specifically the MinMax algorithm

Published Feb 3, 2014

Posted in
programming
chess
essay


MinMax or Minimax, as some like to call it, is the basis of most Artificial Intelligence built for games like chess. The idea behind it is extremely easy to understand: a rational player will try to take the best option available to them, so whatever is good for me the adversary will take as the most likely outcome and he will find the best solution against that outcome. I, following the same pattern, will also look for his best counter move and plan against it. Therefore the thinking for a game of chess, let's say, is that I will take all possible moves, find the one that leaves me with the best position (evaluated by a function from the board position), then look for the similar best play for the adversary. I continue this way until I get to the end of the game or am out of computing resources.
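A minimal sketch of that idea, using Tic-Tac-Toe rather than chess because the whole tree can be searched to the end. The board representation and the function names are my own choices for illustration.

```javascript
// Board: array of 9 cells holding "X", "O" or null (an illustrative representation).
var LINES = [[0,1,2],[3,4,5],[6,7,8],[0,3,6],[1,4,7],[2,5,8],[0,4,8],[2,4,6]];

function winner(board) {
  for (var i = 0; i < LINES.length; i++) {
    var a = LINES[i][0], b = LINES[i][1], c = LINES[i][2];
    if (board[a] && board[a] === board[b] && board[a] === board[c]) return board[a];
  }
  return null;
}

// Value of the position from X's point of view: +1 X wins, -1 O wins, 0 draw.
function minimax(board, player) {
  var w = winner(board);
  if (w === "X") return 1;
  if (w === "O") return -1;
  if (board.indexOf(null) === -1) return 0;

  var best = player === "X" ? -Infinity : Infinity;
  for (var i = 0; i < 9; i++) {
    if (board[i] !== null) continue;
    board[i] = player;                                   // try the move
    var score = minimax(board, player === "X" ? "O" : "X");
    board[i] = null;                                     // undo it
    best = player === "X" ? Math.max(best, score) : Math.min(best, score);
  }
  return best;
}

// Pick the first move that leads to the best value for `player`.
function bestMove(board, player) {
  var bestIndex = -1;
  var bestScore = player === "X" ? -Infinity : Infinity;
  for (var i = 0; i < 9; i++) {
    if (board[i] !== null) continue;
    board[i] = player;
    var score = minimax(board, player === "X" ? "O" : "X");
    board[i] = null;
    if (player === "X" ? score > bestScore : score < bestScore) {
      bestScore = score;
      bestIndex = i;
    }
  }
  return bestIndex;
}
```

For chess the recursion would have to stop at some depth and call a board evaluation function instead of playing every game to the end.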

Now, that sounds logical and it's crazy easy to implement. The problem is that for all but the most childish of games, the tree of all possible moves increases exponentially. And chess isn't even one of the worst games to do that. Imagine Tic-Tac-Toe, a game played on a 3x3 board between two players. You have a total of 9 possible moves to choose from as the first player, then 8, then 7, etc. The entire game tree has at most 9! possible move sequences, or 362880. But generalize the game to a board of 10x10 and a winning rule of 5 in a line and you get up to 100! sequences, which is a little less than 1E+158, that is 1 followed by 158 zeros.
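The numbers are easy to check. A small sketch, using BigInt so that 100! is computed exactly (the function name is mine):

```javascript
// Factorial with BigInt, so that large values like 100! stay exact.
function factorial(n) {
  var result = 1n;
  for (var i = 2n; i <= BigInt(n); i++) result *= i;
  return result;
}

console.log(factorial(9).toString());          // 362880
console.log(factorial(100).toString().length); // 158 digits, so just under 1E+158
```

These are upper bounds on the number of move sequences, since real games end as soon as somebody wins.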

That's why so-called pruning was created, the most common form being Alpha-Beta, which stops exploring a branch as soon as it is clear that it cannot lead to a better outcome than one already found. Of course, all of this is the general gist. You might want to take into account a number N of best moves from the opponent, as well as try a more lenient pruning algorithm (after all, sacrificing a piece brings you to a worse position than when you started, but it might win the game). All of this increases, not decreases, the number of possible moves.
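Here is a small sketch of Alpha-Beta on a toy tree, to show the cut-off in action. The tree is a nested array with made-up leaf values, and the `visited` counter exists only to show how many leaves were actually evaluated.

```javascript
// A game tree as nested arrays; numbers are leaf evaluations (made-up values).
var visited = 0;

function alphaBeta(node, alpha, beta, maximizing) {
  if (typeof node === "number") { visited++; return node; }
  var best = maximizing ? -Infinity : Infinity;
  for (var i = 0; i < node.length; i++) {
    var value = alphaBeta(node[i], alpha, beta, !maximizing);
    if (maximizing) {
      best = Math.max(best, value);
      alpha = Math.max(alpha, best);
    } else {
      best = Math.min(best, value);
      beta = Math.min(beta, best);
    }
    // Cut-off: the other player already has a better alternative elsewhere,
    // so the remaining siblings cannot change the final decision.
    if (alpha >= beta) break;
  }
  return best;
}

var tree = [[3, 5], [2, 9], [0, 1]];
var result = alphaBeta(tree, -Infinity, Infinity, true);
console.log(result, visited); // 3 4
```

The result is the same as plain MinMax would give, but only 4 of the 6 leaves are looked at: once the second branch shows a 2, it can never beat the 3 already secured, so the 9 is skipped.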

And now comes my thought on this whole thing: how can I make a computer play like a human when the core edict of the algorithm is that all participating players are rational? Humans are rarely so. Mathematically I could take N, the number of best moves I would consider for my opponent, to be the total number of moves my opponent could make, but it would increase the exponential base of the tree of moves. Basically it would make the algorithm think of stupid things all the time.

The pruning algorithm seems to be the most important part of the equation. Indeed, I could consider the move choice algorithm to be completely random and as long as I have a perfect pruning algorithm it will remove all the stupid choices and leave me with the smart ones. A quote comes to mind: "you reach perfection not when you have nothing else to add, but when there is nothing left to remove". It's appropriate for this situation.

Now, before attacking an algorithm that has survived for so long in the AI industry (and making my own awesome one that will defeat all chess engines in the world - of course, that's realistic) I have to consider the alternative algorithm: the lowly human. How does a human player think in a game of chess? First he surveys the board for any easy wins. That means a broad one or two levels analysis based on a simple board evaluation function. Immediately we get something from this: there might be multiple evaluation functions, we don't need just one. The simple one is for looking for greedy wins, like "He moved his queen where I can capture it, yay!".

The same outcome for situations like this would be achieved by a MinMax algorithm, so we ignore this situation. It gets more interesting from now on, though. We look for the moves of the most active pieces. I know that this is the rookie system, but I am a rookie, I will make my computer algorithm be as stupid as I am, if I am to play it, so shut up! The rookie will always try to move his queen to attack something. It's the most powerful piece and it should get the most results for the least effort. We left Greed behind, remember? We are now doing Sloth. Still, with a good pruning algorithm we eliminate stupid Queen moves from the beginning, so considering the Queen first, then Rooks, then Bishops, then Knights, etc. is not a bad idea. The order of the pieces can be changed based on personal preferences as well as well established chess rules, like Knights being better than Bishops in closed games and so on.

This is a small optimization, one that probably most game engines have. And we haven't even touched pruning; boy, this is going to be a long article! Now, what does the human do? He does the depth first tree searches. Well, he doesn't think of them like that, he thinks of them as narrative, but it's basically a depth first search. This is the casual "What if...?" type of play. You move the Queen, let's say, bringing it right in the enemy territory. You don't capture anything important, but to bring a strong piece this uncomfortably near to the enemy king is scary. You don't play for game points, but for emotion points, for special effects, for kicks! You don't abandon the narrative, the linear evolution of your attack, until you find that it bears no fruit. It's the equivalent of the hero running toward the enemy firing his pistol. If the enemy is dumb enough to not take cover, aim carefully and shoot a burst from their SMGs, you might get away with it and it would be glorious. If not, you die idiotically.

It is important to note that in the "Hollywood" chess thinking you are prone to assume that the enemy will make mistakes in order to facilitate your brilliant plan. The evaluation goes as follows: "I will try something that looks cool if the chances for a horrible and immediate loss are small". When some hurdle foils your heroic plan, you make subplans that would, or so you hope, distract the adversary from your actual target. This, as far as I know, is a typical human reasoning type and I doubt many (if any) computer game engines have it. In computer terms, one would have to define a completely new game, a smaller one, and direct an AI designed specifically for it to tell you if it would work or not. Given the massively parallel architecture of the human brain, it is not hard to understand why we do something like this. But we can do the same with a computer, mind you. I am thinking of something like a customized MinMax algorithm working on few levels, one or two, as the human would. That would result in a choice of N possible moves to make. Then construct a narrative for each, a depth search that just tries to get as much as possible from each move without considering many of the implications. Then assign a risk to each level of this story. If the risk at a level exceeds a threshold, use the small range MinMax at those points and try to see if you can minimize the risk or if at that point the risk makes your narrative unlikely.

Let's recap the human thinking algorithm so far:

Try to greedily take what the opponent has stupidly made available
Try to lazily use the strongest piece to get the most result with the least effort
Try to pridefully find the most showy move, the one that would make the best drinking story afterwards
Try to delegate the solving of individual problems in your heroic narrative to a different routine

Wow! Doesn't it seem that the seven deadly sins are built-in features, rather than bugs? How come we enjoy playing with opponents that pretty much go through each of them in order to win more than we do with a rational emotionless algorithm that only does what is right?

Again, something relevant transpires: we take quite a long time imagining the best moves we can make, but we think less of the opponent's replies. In computer terms we would prune the enemy's possible moves a lot more than we would our own. In most rookie cases, one gets absorbed by their own attack and ignores moves that could counterattack. It's not intuitive to think that while you are punching somebody, they would choose to punch back rather than avoid the pain. In chess it's a little bit easier and more effective, since you can abandon a piece in order to achieve an overall gain in the game, but it can be, and is, done in physical combat as well.

Okay, we now have two alternatives. One is the logical one: take into account all the rules chess masters have taught us, shortcuts for achieving a better position on the board; choose moves based on those principles and then gauge the likely response from the opponent. Repeat. This is exactly like a MinMax algorithm! So we won't do that. The hell with it! If I can't enjoy the game, neither will my enemy!!

Human solution: don't do anything. Think of what your opponent would do, if you wouldn't move anything and foil their immediate plan. This way of thinking would be counterintuitive for a computer algorithm. Functioning on the basis of specific game rules, a computer would never be inclined to think "what would the enemy do if I didn't move anything, which is ILLEGAL in chess?". That makes us superior, obviously ;-)

Slowly, but surely, a third component of the algorithm becomes apparent: the move order choice. Let's imagine a naive MinMax implementation. In order to assess every possible move, it would have to enumerate them. If the list of moves is always the same in a certain board position, the game will always proceed the same way. The solution is to take the list of possible moves, but in a random order. In the case of the "human algorithm" the ordering becomes more complex (favouring powerful piece moves, for example). One could even consider the ordering mechanism responsible for choosing whether to do a careful breadth search for each level or a depth first one.
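A sketch of such an ordering function, under my own assumptions: moves are plain objects with a `piece` letter, the strength order follows the Queen, Rooks, Bishops, Knights sequence mentioned earlier (pawns and king placed last by me), and the random source can be passed in so results are reproducible.

```javascript
// Hypothetical move records look like { piece: "Q", from: "d1", to: "h5" }.
var PIECE_ORDER = { Q: 0, R: 1, B: 2, N: 3, P: 4, K: 5 };

// Fisher-Yates shuffle on a copy; `random` is optional and defaults to Math.random.
function shuffle(moves, random) {
  random = random || Math.random;
  var copy = moves.slice();
  for (var i = copy.length - 1; i > 0; i--) {
    var j = Math.floor(random() * (i + 1));
    var tmp = copy[i]; copy[i] = copy[j]; copy[j] = tmp;
  }
  return copy;
}

// "Human" ordering: randomize first, then put the stronger pieces in front.
function orderMoves(moves, random) {
  return shuffle(moves, random).sort(function (a, b) {
    return PIECE_ORDER[a.piece] - PIECE_ORDER[b.piece];
  });
}
```

In engines where Array.prototype.sort is stable (ES2019 and later), moves of the same piece type keep their shuffled order, so the game still varies from one run to the next while strong pieces get considered first.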

Here is a suggestion for an algorithm, one that takes into account the story of the game and less the objective gain or position strength:

For each of your power pieces - anything but the king and pawns - compute mobility, or the possibility to move and attack. Favour the stronger pieces first.
For each power piece with low mobility consider pawn moves that would maximize that mobility.
For each power piece with high mobility consider the moves that would increase the chance of attack or that would attack directly
For each strong move, consider the obstacles - enemy pieces, own pieces, possible enemy countermeasures
Make the move that enables the considered power move or that foils the enemy attempts of reply

The advantage of this approach is that it only takes into account the enemy when he can do something to stop you, the pawns only when they can enable your devious plan and focuses on ventures that yield the best attack for your heroes. For any obstruction, you delegate the resolution of the problem to a different routine. This makes the algorithm parallelizable as well as modular - something we devs love because we can test the individual parts separately.
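The first three steps of the suggestion above could be sketched like this. Everything here is an assumption made for illustration: the piece records, the strength values and the mobility numbers, which in a real program would come from a move generator.

```javascript
// Rough strength values for the power pieces; king and pawns are left out on purpose.
var STRENGTH = { Q: 9, R: 5, B: 3, N: 3 };

// pieces: hypothetical records like { type: "Q", square: "d1", mobility: 0 }
function planPieces(pieces, mobilityThreshold) {
  var power = pieces
    .filter(function (p) { return Object.prototype.hasOwnProperty.call(STRENGTH, p.type); })
    .sort(function (a, b) { return STRENGTH[b.type] - STRENGTH[a.type]; }); // strongest first
  return power.map(function (p) {
    return {
      piece: p.type + "@" + p.square,
      // Low mobility: look for pawn moves that free the piece. Otherwise: look for attacks.
      plan: p.mobility < mobilityThreshold ? "free it with pawn moves" : "look for attacks"
    };
  });
}

console.log(planPieces([
  { type: "P", square: "e2", mobility: 2 },
  { type: "N", square: "g1", mobility: 2 },
  { type: "Q", square: "d1", mobility: 0 },
  { type: "K", square: "e1", mobility: 0 }
], 2));
```

Each entry in the result is a small, separate problem, which is what makes it possible to hand them to different routines.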

This algorithm would still use a board estimation function, but being more focused on heroic attacks, it would prefer interesting move orders to static positions as well as the "fun factor", something that is essential to a human-like algorithm. If the end result of the attack is a check-mate, then it doesn't really matter what position estimate you get when you did half the moves. All one has to wonder is if the attack is going to be successful or not and if one can do something to improve the chances of success. And indeed this is one of the most difficult aspects for a chess playing human: to switch from a failing plan to a successful plan when it is not yet clear if the first plan is failing. We invest energy and thought into an idea and we want it to work. A lot of the chess playing strategy of human rookies relies on prayer, after all. A computer would just assess the situation anew at every move, even if it has a strategy cached somewhere. If the situation demands it, a new strategy will be created and the last one abandoned. It's like killing your child and making another!

But, you will say, all you did so far was to describe an inferior algorithm that can be approximated by MinMax with only custom choices for the pruning and move order functions! You are missing the point. What I am describing is not supposed to beat Grand Masters, but to play a fun game with you, the casual player. More than that, my point is that for different desired results, different algorithms must be employed. This would be akin to creating a different AI for each level of a chess game.

Then there is the issue of the generalized TicTacToe or other games, such as Arimaa, created specially to make it difficult for computer algorithms to play, where MinMax fails completely. To make a comparison to real life, it's as if you would consider the career steps you would take in life based on all possible jobs available, imagining what it would be like to be employed there, what the difficulties might be, finding solutions to those problems, repeating the procedure. You will get to the conclusion that it is a good idea to become a computer scientist after thoroughly examining and partially understanding what it would be like to be a garbage man, a quantum scientist, a politician and a gigolo, as well as all the jobs in between. Of course, that is not as far fetched as you think, since in order to be a success in software development you must be at least a politician and a garbage man, perhaps even a gigolo. Lucky for our profession, quantum computers are in the works, too.

The same incongruency can be found when thinking of other games humans enjoy, like races. The desired result can only be achieved at the end of the race, when you actually get somewhere. In order to get to that specific point in space, you could consider the individual value of each direction change, or even of each step. However, humans do it differently: they specify waypoints that must be reached in order to get to the finish and then focus on getting from waypoint to waypoint, rather than rethinking the entire course. In computer terms this is a divide-and-conquer stratagem, where one tries to solve a problem that has known start and end points by introducing a middle point and then solving the problem from the start to the middle. BTW, this also solves Zeno's paradox: "Why does the arrow reach its target if, at any point in its course, it has at least half the distance left to fly?" and the answer is "Because of the exit condition that prevents a stack overflow". Try to sell that one in a philosophy class, heh heh.
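A toy sketch of the waypoint idea on a straight line, with the exit condition from the Zeno joke made explicit. The function name and the `minStep` parameter are mine.

```javascript
// Recursively split the segment [start, end] at its midpoint until the pieces are
// no longer than `minStep`. That exit condition is what keeps the recursion finite.
function waypoints(start, end, minStep) {
  if (end - start <= minStep) return [end];
  var middle = (start + end) / 2;
  return waypoints(start, middle, minStep).concat(waypoints(middle, end, minStep));
}

console.log(waypoints(0, 8, 2)); // [ 2, 4, 6, 8 ]
```

Remove the first line of the function and the arrow really never arrives: the recursion keeps halving until the stack overflows.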

So why aren't chess AIs based on human thinking processes? Why don't they implement a divide and conquer solution for a game that always starts with a specific board position and ends in capturing a specific piece? Why do chess engines lower their "level" by sometimes randomly choosing a completely losing path instead of something that is plausible to choose, even if completely wrong objectively? How can MinMax be the best general algorithm for game AIs, when some of them have a branching factor that makes the use of the algorithm almost useless?

I obviously don't have the answers to these questions, but I may have an opportunity to explore them. Hopefully I will be less lazy than I usually am and invent something completely unscientific, but totally fun! Wish me luck!

The burden of choice

Published Feb 3, 2014

Posted in
misc
music
rant
picture
essay
video


A lot of the political discourse these days relates to the difference between democratic and non-democratic systems. Closer to home, it is about the amount of choice a government allows and - do not forget that part - demands from the individual. The usual path of such discourse is either "We let you do what you want!" or "We won't allow people to do what you don't want!". I am telling you that there is only a difference of nuance here; both systems are essentially doing the same thing, with top-to-bottom or bottom-to-top approaches. Like with the Borg in Star Trek, there is a point where both meet and make definition impossible.

My first argument is that the ideal democracy encourages personal freedom as long as it doesn't bother anyone else. That makes a lot of sense, like not allowing someone to kill you because they feel you're an asshole. Many people today live solely because of this side of democratic society. But it also means something else, something you are less prone to notice: you are required to know what everybody affected by your actions would feel about them. Forget the legal system, which in its annoying cumbersome way is only a shortcut to the principle described before. This is what it means, people: know your friends, know your enemies, join up! Otherwise you will just offend hard enough somebody who is important enough to make it illegal.

The non-democratic societies function like the almighty parent of all. Under such governorship, all individuals are children, incapable of making their own choices, unless supported by the whole of society or at least a large part of it. That's terribly oppressive, as it lets you do only what is communally permissible. But it also allows you the freedom of ignoring the personal choices of others. You don't need to know anything about anybody, just adhere to a set of rules that defines what you are allowed to do. It's that easy! That's why the system is so popular with uneducated people. Or maybe I should say lazy, to include also those super educated people who end up supporting one radical view or another because it is inconvenient to find a middle ground compromise.

I am a techie, as you may know, so I will reduce all this human complexity to computer systems. Yes, I can! The first computer systems, created by scientists and highly technical people, were almost impossible to use. Not because they didn't let you do stuff, but because they let you do anything you wanted, assuming you were smart enough to understand what you were playing with. Obviously, few of us are really that smart. Even fewer want to make the effort. This is an important point: it's not that you are stupid, that you didn't read the manual, or anything like that. It's a rather aristocratic reason: you don't want to, don't need to, you expect comfort from the people who give you a complicated piece of machinery to operate. I mean, if they are smart enough to build one, why can't they make it so easy to use that a child could do it? (child sold separately, of course)

The answer to these complex UNIX systems was DOS, then Windows, then iOS. Operating systems increasingly dumbed down for the average user. Now everybody has a computer, whether a desktop, a laptop, a tablet, a smartphone or a combination of these. Children have at their fingertips computers thousands of times more powerful than what I was using as a desktop in my childhood, and it is all because they have operating systems that allow them to quickly "get it" and do what they feel like. They are empowered by them to do... well... incredibly idiotic things, but that is what children do. That's how they learn.

You get where I am getting at, I guess. We are all children now, with tools that empower us to get all the information and disinformation we could possibly want. And here is where it gets fuzzy. The totalitarian systems of yesterday are failing to constrain people to conform to the rules because of the freedom technology brought. But at the same time the democratic systems are also failing, because the complicated legal systems that were created as a shortcut for human stupidity and lack of understanding of the needs of others completely break down in front of the onslaught of technology, empowering people to evolve, change, find solutions faster than antiquated laws can possibly advance. The "parents" are in shock, whether biological ones or just people who think they know better for some reason.

Forget parents, older brothers can hardly understand what the youth of today is talking about. Laws that applied to your grandparents are hardly applicable to you, but they are incomprehensible to your children. The world is slowly reaching an equilibrium, not that of democracy and not that of totalitarianism, but the one in between, where people are not doing what they are allowed to, but what they can get away with! And that includes (if not first and foremost) our governors.

This brings me to the burden of choice, the thing that really none of us wants. We want to be able to choose when we want to be able to choose. And before you attack my tautology, think about it. It's true. We want to have the choice in specific contexts, while most of the time we want that choice removed from us, or better said: we want to be protected from choice, when that choice is either obvious, difficult to make or requiring skills we don't have. That is why you pay an accountant to hold the financial reins of your company, even if it is your lifeblood, and you trust that that person will make the right choices for you. If he doesn't, your life is pretty much forfeit, but you want it like that. The alternative is you would understand and perform accounting. Death is preferable.

You know that there are still operating systems that allow a high level of choice, like Linux. They are preferable to the "childish" operating systems because they give you all the options you want (except user friendliness, but that bit has changed too in the last decade). The most used mobile operating system nowadays is probably Android and if it is not, it will be soon. It swept the market that Apple's iPhone was thought to master because it gave everybody (users and developers) The Choice. But the off the shelf Android phone doesn't allow that choice to the average user. You have to be technically adept first and emotionally certain second that you want to enable that option on your own phone! It's like a coming of age ritual, if you will, the first "jailbreak" or "root" of your smartphone.

How does that translate to real life? Right now, not much, but it's coming. It should be, I mean. Maybe I am overly optimistic. You get the accountants that find loopholes to pay less taxes, the lawyers that find the path to getting away with what normally would be illegal, the businessmen that eschew the rules that apply to everyone else. They are the hackers of the system, one that is so mindbogglingly complex that computer science seems a child's game in comparison. If you mess with them, though, they quickly give you the RTFM answer, like the Linuxers of old.

The answer: make the system user friendly. Technology can certainly help now. There will be hackers of the system no matter what you do, but if the system is easy to use, everyone will have the choice, when they want it, and will not be burdened by it, when they don't want it. People talking to find a solution to a problem? When did that ever work? We need government, law, business, social services, everyday life to work "on Android". We need the hurdles that stop us from enabling the "Pro" options, but they must not be impossible to get through. Bring back the guilds - without the monopoly - when people were helping each other to get through a problem together. Liberalize the banking and governmental systems. Forget about borders: just "subscribe" to a government, "like" a bank, "share" a life.

You think this is hard, but it is not. You can survive in an old fashioned system just as much and as well as you can survive in real life without using a computer. You can't! You can dream of a perfect house in the middle of nowhere with the white picket fence, where you will be happy with your spouse, children, dog, but really, that doesn't exist anymore. Maybe in a virtual world. Where the spouse will not nag, the children will actually love you instead of doing things you don't even begin to understand and the dog will never wake you up when you need to sleep. Use the tools you have to make your life simpler, better, depth first!

I assume some people would give me the attitude that is prevalent in some movies that try to explore this situation: "you want to escape reality!" - Yes! Who doesn't? Have you seen reality lately? "you want to play God!" - Yes! I like playing and I would like being God: win-win! And if I cannot, I will get real serious and not play, just be! Is that OK? "this is fantasy, this cannot be!" - Join the billions of dead people who thought the same about what you are doing daily without thinking about it. "You are an anarchist! The government as it is today knows what to do!" or "Allah/Jesus/Dawkins know best!" - no, they don't! And if they knew, they wouldn't tell you, so there.

It all comes to dynamical systems versus static ones. You don't go to the web to search for things and find what you were actually looking for because there is a law against sites hijacking your searches. It is because people want it enough so that a service like Google appeared. You can still find your porn and your torrents, though.

Consider every option you may possibly have as a service. You need the service to be discoverable, but not mandatory or oppressive in its design; it has to be easy to use. You want to be able to find and use it, but not for it to be imposed on you. A good example for this is copyright. A small community of producers and a significantly larger one of intermediaries trying to leech on them are attempting to force a huge community of consumers to abide by the (otherwise moral and reasonable) laws of paying for what you want and others worked for. The procedure is so annoying that people spontaneously organize to create the framework that democratizes theft. Someone is risking jail to film the movie in the cinema so you can download it free. Why is that? Because technology increases the dynamicity of the system by orders of magnitude. Another service is sex. Porn be damned, prostitutes don't stay on street corners anymore, they wait on the web for you to need them. Supply and demand. So the important point is what are you really demanding?

You know what you won't find on the web? Easy to use government sites. Services that would make it simple to interact with laws, lawmakers, local authorities, country officials. All similar attempts are notoriously bad, if at all present. Why is that? Because the system itself is obsolete, incapable of adapting. Built from centuries of posturing and politicking, it has as little connection to reality as a session of Angry Birds. And you may be enjoying the latter. They survived as long as they have because they were the best at one thing: limiting your choices. Even if you hated it, you enjoyed other people being as limited as you. But the dam is breaking, the water is seeping through, it will all vanish in a deluge of water and debris. It's already started, with peer to peer banks and online cryptographic currencies and what not. Why wait for it? Join the nation of your choice; if there isn't one you like, create one. Be God, be Adam, Eve, the serpent or any combination thereof - whatever you do, just don't be yourself, no one likes that.

I leave you with the beautiful words and music of A Perfect Circle: Pet. Something so awesome an entire corporation was created to offer the ability for people to share the song with you, for free, even if theoretically it's illegal.

FatalExecutionEngineError in the WebBrowser control

Published Feb 3, 2014

and has 0 comments

I created a little piece of software which was supposed to get as much of the content I was interested in, then display it in a browser. It was a WPF application, but I doubt it matters that much, since the WebBrowser control there is still based on the Internet Explorer COM control that is also used in Windows Forms.

Anyway, the idea was simple: connect to the URL of the item I was selecting, display the page in the browser and, if the title of the loaded document was that of an error page (hence, the browser could not load it - I found no decent way to determine the response error code), display the HTML content that I had gathered earlier. These two feats were, of course, realized using the Navigate and NavigateToString methods - more or less hacked to look like valid MVVM for WPF :-P. Everything worked, so I went to the place where the Internet is a faraway myth - my Italian residence - started the application and ...

FatalExecutionEngineError was detected
Message: The runtime has encountered a fatal error. The address of the error was at 0x64808539, on thread 0xf84. The error code is 0x80131623. This error may be a bug in the CLR or in the unsafe or non-verifiable portions of user code. Common sources of this bug include user marshaling errors for COM-interop or PInvoke, which may corrupt the stack.

Ignore the hexadecimal numbers; I strongly doubt they are of much help.

This horrible looking exception was thrown through a try/catch block and I could find no easy way to get to the source of the problem. The InnerException wasn't too helpful either: System.ExecutionEngineException was unhandled HResult=-2146233082 Message=Exception of type 'System.ExecutionEngineException' was thrown.

I did what every experienced professional does in these situations: googled for "WPF WebBrowser not working!!!". And I found this guy: Fatal Execution Error on browser ReadyState, who described a similar situation caused by interrogating the document readyState property, which I was also doing! Amazingly, removing that check worked. At first.

The second step of the operation was to go with this software to the place where the Internet exists, but is guarded by huge trolls that live under the gateway - my Italian/European Commission workplace. Kaboom! The dreaded exception appeared again, even though I had configured my software to work with a firewall and an http proxy. Meanwhile, the hard drive of my laptop failed and I had to reinstall the software on the computer, going from Windows 8 to Windows 7. The same error now consistently appeared at home.

At the end of this chain of events, the other tool of the software professional - blind trial and error - prevailed. All I had to do was to NavigateToString via the WebBrowser control Dispatcher, thus:

wbMain.Dispatcher.BeginInvoke(new Action(() =>
{
    // queue the call on the control's own Dispatcher instead of calling it directly
    wbMain.NavigateToString(content);
}));

... and everyone lived happily ever after.

Deserializing/Serializing XML that contains xsi:type attributes (and other XML adventures)

Published Jan 31, 2014

Posted in
programming
C#
XML

and has 12 comments

I wanted to take an arbitrary XML format and turn it into C# classes. For a while I even considered writing my own IXmlSerializable implementation of the classes, but quickly gave up because of their large number and heavy nesting. Before we proceed you should know that there are several ways to turn XML into C# classes. Here is a short list (google it to learn more):

In Visual Studio 2012, all you have to do is copy the XML, then go to Edit -> Paste Special -> Paste XML as classes. There is an option for pasting JSON there as well.
There is the xsd.exe option. This is usually shipped with the Windows SDK and you have to either add the folder to the PATH environment variable so that the utility works everywhere, or use the complete path (which depends on which version of SDK you have).
xsd2Code is an addon for Visual Studio which gives you an extra menu option when you right click an .xsd file in the Solution Explorer to transform it to classes
A zillion other custom-made tools that transform the XML into whatever

Anyway, the way to turn this XML into classes manually (since I didn't like the output of any of the tools above and some were even crashing) is this:

Create a class that is decorated with the XmlRoot attribute. If the root element has a namespace, don't forget to specify the namespace as well. Example:
[XmlRoot(ElementName = "RootElement", Namespace = "http://www.somesite.org/2005/someSchema", IsNullable = false)]
For each descendant element you create a class. You add a get/set property to the parent element class, then you decorate it with the XmlElement (or XmlAttribute, or XmlText, etc). Specify the ElementName as the exact name of the element in the source XML and the Namespace url if it is different from the namespace of the document root. Example:
[XmlElement(ElementName = "Integer", Namespace = "http://www.somesite.org/2005/differentSchemaThanRoot")]
If there are supposed to be more children elements of the same type, just set the type of the property to an array or a List of the class type representing one element
Create an instance of an XmlSerializer using the type of the root element class as a parameter. Example:
var serializer = new XmlSerializer(typeof(RootElementEntity));
Create an XmlSerializerNamespaces instance and add all the namespaces in the document to it. Example:
var ns = new XmlSerializerNamespaces();
ns.Add("ss", "http://www.somesite.org/2005/someSchema");
ns.Add("ds", "http://www.somesite.org/2005/differentSchemaThanRoot");
Use the namespaces instance to serialize the class. Example:
serializer.Serialize(stream, instance, ns);

The above technique serializes a RootElementEntity instance to something similar to:
<ss:RootElement xmlns:ss="http://www.somesite.org/2005/someSchema" xmlns:ds="http://www.somesite.org/2005/differentSchemaThanRoot">
<ds:Integer>10</ds:Integer>
</ss:RootElement>
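
The steps above can be put together in a minimal sketch. The class name RootElementEntity and the URLs come from the examples in the list; the int property and the SerializationSketch helper are assumptions made only for illustration, and a StringWriter stands in for whatever stream you actually write to.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Root class: element name and namespace match the root of the source XML.
[XmlRoot(ElementName = "RootElement", Namespace = "http://www.somesite.org/2005/someSchema", IsNullable = false)]
public class RootElementEntity
{
    // Child element living in a different namespace than the root.
    [XmlElement(ElementName = "Integer", Namespace = "http://www.somesite.org/2005/differentSchemaThanRoot")]
    public int Integer { get; set; }
}

// Hypothetical helper, not part of the original post.
public static class SerializationSketch
{
    public static string ToXml(RootElementEntity instance)
    {
        var serializer = new XmlSerializer(typeof(RootElementEntity));

        // Register the prefixes so the output uses ss: and ds: instead of generated ones.
        var ns = new XmlSerializerNamespaces();
        ns.Add("ss", "http://www.somesite.org/2005/someSchema");
        ns.Add("ds", "http://www.somesite.org/2005/differentSchemaThanRoot");

        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, instance, ns);
            return writer.ToString();
        }
    }
}
```

Calling SerializationSketch.ToXml(new RootElementEntity { Integer = 10 }) should give a document of the shape shown above, preceded by the XML declaration that the serializer writes.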

Now, everything is almost good. The only problem I met doing this was trying to deserialize an XML containing xsi:type attributes. An exception of type InvalidOperationException was thrown with the message "The specified type was not recognized: name='TheType', namespace='http://www.somesite.org/2005/someschema', at " and then the XML element that caused the exception. (Note that this is the inner exception of the first InvalidOperationException thrown, which just says there was an error in the XML.)

I finally found the solution, even if it is not the most intuitive. You need to create a type that inherits from the type you want associated to the element. Then you need to decorate it (and the original element) with an XmlRoot attribute specifying the namespace (even if the namespace is the same as the one of the document root element). And then you need to decorate the base type with the XmlInclude attribute. Here is an example.

The XML:

<ss:RootElement xmlns:ss="http://www.somesite.org/2005/someSchema" xmlns:ds="http://www.somesite.org/2005/differentSchemaThanRoot" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ds:Integer>10</ds:Integer>
<ss:MyType xsi:type="ss:TheType">10</ss:MyType>
</ss:RootElement>

You need to create the class for MyType then inherit TheType from it:

[XmlRoot(Namespace="http://www.somesite.org/2005/someSchema")]
[XmlInclude(typeof(TheType))]
public class MyTypeEntity {}

[XmlRoot(Namespace="http://www.somesite.org/2005/someSchema")]
public class TheType: MyTypeEntity {}

Removing any of these attributes makes the deserialization fail.
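
For completeness, here is a sketch of the whole round trip under a few assumptions: the root class TypedRootEntity, the XmlText property Value and the XsiTypeSketch helper are hypothetical names added for illustration, and the sample XML declares the xsi prefix explicitly. It only shows how the attributes described above fit together, not the only way to do it.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical root class for this example.
[XmlRoot(ElementName = "RootElement", Namespace = "http://www.somesite.org/2005/someSchema")]
public class TypedRootEntity
{
    [XmlElement(ElementName = "Integer", Namespace = "http://www.somesite.org/2005/differentSchemaThanRoot")]
    public int Integer { get; set; }

    // Declared as the base type; xsi:type selects the derived type at runtime.
    [XmlElement(ElementName = "MyType")]
    public MyTypeEntity MyType { get; set; }
}

[XmlRoot(Namespace = "http://www.somesite.org/2005/someSchema")]
[XmlInclude(typeof(TheType))]
public class MyTypeEntity
{
    // Assumption: the element text is captured as a string.
    [XmlText]
    public string Value { get; set; }
}

[XmlRoot(Namespace = "http://www.somesite.org/2005/someSchema")]
public class TheType : MyTypeEntity { }

public static class XsiTypeSketch
{
    public const string Xml =
        "<ss:RootElement xmlns:ss=\"http://www.somesite.org/2005/someSchema\" " +
        "xmlns:ds=\"http://www.somesite.org/2005/differentSchemaThanRoot\" " +
        "xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">" +
        "<ds:Integer>10</ds:Integer>" +
        "<ss:MyType xsi:type=\"ss:TheType\">10</ss:MyType>" +
        "</ss:RootElement>";

    public static TypedRootEntity Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(TypedRootEntity));
        using (var reader = new StringReader(xml))
        {
            return (TypedRootEntity)serializer.Deserialize(reader);
        }
    }
}
```

After XsiTypeSketch.Read(XsiTypeSketch.Xml), the MyType property is expected to hold an instance of TheType rather than of the base class.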

Hope this helps somebody.

T-SQL Convert and Cast turn empty string to default value, NOT null

Published Jan 24, 2014

Posted in
database
programming

and has 0 comments

It is a bit embarrassing not knowing this at my level of software development, but I was stunned to see that other people, even more experienced than I, had the same lack of knowledge. Apparently Microsoft SQL Server converts empty or whitespace strings to default values when using CONVERT or CAST. So CONVERT(INT,''), equivalent to CAST('' as INT), equals 0. DATETIME conversion leads to a value of 1900-01-01. And so on. That means that a good practice for data conversion when you don't know what data you may be getting is to always turn whitespace to null before using CONVERT or CAST. Also, in related news, newline is NOT whitespace in T-SQL, so LTRIM(CHAR(10)) and LTRIM(CHAR(13)) are not empty strings!

Bottom line: instead of CONVERT(<type>,<unknown string value>) use the cumbersome CONVERT(<type>,CASE WHEN LTRIM(RTRIM(<unknown string value>))!='' THEN <unknown string value> END). Same with CAST.

Here is a table of conversions for some values converted to FLOAT:

Value	Normal CONVERT	Cumbersome CONVERT	TRY_CONVERT
NULL	NULL	NULL	NULL
'' (empty string)	0	NULL	0
' ' (whitespace)	0	NULL	0
' ' (whitespace and newlines)	Conversion error	Conversion error	NULL
'123'	123	123	123

You might think this is not such a big deal, but in Microsoft SQL Server 2012 they introduced TRY_CONVERT and the similar TRY_CAST, which return null if there is a conversion error. This means that for an incorrect string value the function returns null in most cases, except for an empty or whitespace string, where it returns the default value of the chosen type, resulting in inconsistent behavior.

Comparing the content of two similar web pages

Published Jan 23, 2014

Posted in
.NET
programming

and has 0 comments

For a personal project of mine I needed to gather a lot of data and condense it into a newsletter. What I needed was to take information from selected blogs, google queries and various pages that I find and take only what was relevant into account. Great, I thought, I will write some software to help me do that. And now, proverbially, I have two problems.

The major issue is that after getting all the info I needed, I was stuck on reading thousands of web pages to get to the information I needed. I was practically spammed. The thing is that there aren't even so many stories, it's just the same content copied from news site to news site, changing only the basic structure of the text, maybe using other words or expanding and collapsing terms in and out of abbreviations and sometimes just pasting it exactly as it was in the source, but displayed in a different web page, with a different template.

So the challenge was to compare two or more web pages for the semantic similarity of the stories. While there is such theory as semantic text analysis, just google for semantic similarity and you will get mostly PDF academic white papers and software that is done in Python or some equally disgusting language used only in scientific circles. And while, true, I was intrigued and for a few days I entertained the idea of understanding all that and actually building a C# library up to the task, I did not have the time for it. Not to mention that the data file I was supposed to parse was growing day by day while I was dallying in arcane algorithms.

In conclusion I used a faster and more hackish way to the same end. Here is how I did it.

The first major hurdle was to clear the muck from the web page and get to the real information. A simple html node innerText would not do. I had to ignore not only HTML markup, but such lovely things as menus, ads, sidebars with blog information, etc. Luckily there is already a project that does that, called Boilerpipe. And before you jump at me for linking to a Java project, there is also a C# port, which I had no difficulty downloading and compiling.

At the time of writing, the project would not compile well because of its dependency on a Mono.Posix library. Fortunately the library was only needed by two methods that were never used, so I just removed the reference and the methods and all was well.

So now I would mostly have the meaningful text of both web pages. I needed an algorithm to quickly determine their similarity. I skipped the semantic bit of the problem altogether (trying to detect synonyms or doing lexical parsing) and I resorted to String Kernels. Don't worry if you don't understand a lot of the Wikipedia page, I will explain how it works right away. My hypothesis was that even if they change some words, the basic structure of the text remains the same, so while I am trying to find the pages with the same basic meaning, I could find them by looking for pages with the same text structure.

In order to do that I created for each page a dictionary with string keys and integer values. The keys would be text n-grams from the page (all combinations of three characters that are digits and letters) and the values the count of those kernels in the Boilerpipe text. At first I also allowed spaces in the character list of kernels, but it only complicated the analysis.
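
A minimal sketch of such a kernel dictionary follows. The names KernelSketch and BuildKernels are hypothetical, and two details are assumptions rather than something stated above: the text is lowercased, and a 3-character window is counted only if all of its characters are letters or digits.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the kernel dictionary described above.
public static class KernelSketch
{
    // Counts every n-character window made only of letters and digits.
    public static Dictionary<string, int> BuildKernels(string text, int n = 3)
    {
        var result = new Dictionary<string, int>();
        if (string.IsNullOrEmpty(text)) return result;

        var lower = text.ToLowerInvariant(); // assumption: case is ignored
        for (int i = 0; i + n <= lower.Length; i++)
        {
            bool valid = true;
            for (int j = i; j < i + n; j++)
            {
                if (!char.IsLetterOrDigit(lower[j])) { valid = false; break; }
            }
            if (!valid) continue; // skip windows containing spaces, punctuation etc.

            var key = lower.Substring(i, n);
            int count;
            result.TryGetValue(key, out count); // count stays 0 for a new key
            result[key] = count + 1;
        }
        return result;
    }
}
```

Because the keys are at most three alphanumeric characters long, the number of possible keys is bounded no matter how large the pages are.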

To compare a page to others, I would take the keys in the kernel dictionary for my page and look for them in the dictionaries of other pages, then compute a distance out of the counts. And it worked! It's not always perfect, but sometimes I even get pages that have a different text altogether, but reference the same topic.
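
The text does not say which distance was used, so the sketch below assumes a plain Euclidean distance over the kernel counts, treating a kernel missing from a page as a count of zero. KernelDistanceSketch and Distance are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: Euclidean distance between two kernel dictionaries.
public static class KernelDistanceSketch
{
    public static double Distance(IDictionary<string, int> a, IDictionary<string, int> b)
    {
        double sum = 0;

        // Kernels present in a (and possibly in b).
        foreach (var pair in a)
        {
            int other;
            b.TryGetValue(pair.Key, out other); // 0 when the kernel is missing from b
            double diff = pair.Value - other;
            sum += diff * diff;
        }

        // Kernels present only in b.
        foreach (var pair in b)
        {
            if (!a.ContainsKey(pair.Key))
            {
                sum += (double)pair.Value * pair.Value;
            }
        }

        return Math.Sqrt(sum);
    }
}
```

The smaller the returned value, the more similar the structure of the two texts; pages of very different lengths may need their counts normalized first, which this sketch does not attempt.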

You might want to know what made me use 3-grams and not words. The explanation comes mostly from what I read first when I started to look for a solution, but it also has some logic. If I had used words, then abbreviations would have changed the meaning of the text completely. Also, I did not know how many distinct words there would be in a few thousand web pages. Restricting the length to three characters gave me an upper limit for the memory used.

Conclusion: use the .Net port of Boilerpipe to extract text from the html, create a kernel dictionary for each page, then compute the vector distance between the dictionaries.

I also found a method to compare the dictionaries better. I make a general kernel dictionary (for all documents at once). The commonality of a bit of text can then be defined either as the number of times it appears divided by the total count of kernels, or as the number of documents in which it is found divided by the total number of documents. I chose commonality as the product of these two. Then, one computes the difference between kernel counts in two documents by dividing the squared difference for each kernel by its commonality and adding the results up. It works much better like this. Another side effect of this method is that one can compute how "interesting" a document is, by adding up the counts of all kernels divided by their commonality, then dividing that by the length of the text (or the total count of kernels). The higher the number, the less common its content would be.
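
The weighting can be sketched as follows. All names (CommonalitySketch and its methods) are hypothetical, the "interest" is divided by the total count of kernels rather than by the text length, and the documents being compared are assumed to be part of the collection used to compute the commonality.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the commonality weighting described above.
public static class CommonalitySketch
{
    // commonality = (occurrences / total kernel count) * (documents containing it / document count)
    public static Dictionary<string, double> ComputeCommonality(IList<Dictionary<string, int>> documents)
    {
        var totals = new Dictionary<string, int>();
        var docCounts = new Dictionary<string, int>();
        long totalKernels = 0;

        foreach (var doc in documents)
        {
            foreach (var pair in doc)
            {
                int t; totals.TryGetValue(pair.Key, out t); totals[pair.Key] = t + pair.Value;
                int d; docCounts.TryGetValue(pair.Key, out d); docCounts[pair.Key] = d + 1;
                totalKernels += pair.Value;
            }
        }

        var result = new Dictionary<string, double>();
        foreach (var pair in totals)
        {
            double frequency = (double)pair.Value / totalKernels;
            double spread = (double)docCounts[pair.Key] / documents.Count;
            result[pair.Key] = frequency * spread;
        }
        return result;
    }

    // Sum of squared count differences, each divided by the kernel's commonality.
    public static double WeightedDifference(Dictionary<string, int> a, Dictionary<string, int> b,
        Dictionary<string, double> commonality)
    {
        var keys = new HashSet<string>(a.Keys);
        keys.UnionWith(b.Keys);

        double sum = 0;
        foreach (var key in keys)
        {
            int ca; a.TryGetValue(key, out ca);
            int cb; b.TryGetValue(key, out cb);
            double diff = ca - cb;
            sum += diff * diff / commonality[key]; // throws if the kernel is not in the collection
        }
        return sum;
    }

    // Higher values mean the document is made of less common kernels.
    public static double Interest(Dictionary<string, int> doc, Dictionary<string, double> commonality)
    {
        double sum = 0;
        long count = 0;
        foreach (var pair in doc)
        {
            sum += pair.Value / commonality[pair.Key];
            count += pair.Value;
        }
        return count == 0 ? 0 : sum / count;
    }
}
```

Rare kernels get a small commonality, so a difference in their counts weighs much more than a difference in kernels that show up in every document.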

A compressed string class

Published Jan 13, 2014

Posted in
.NET
programming
C#

and has 0 comments

I admit this is not a very efficient class for my purposes, but it was a quick and dirty fix for a personal project, so it didn't matter. The class presented here stores a string in a compressed byte array if the length of the string exceeds a value. I used it to solve an annoying XmlSerializer OutOfMemoryException when deserializing a very large XML (400MB) into a list of objects. My objects had a Content property that stored the content of html pages and it went completely overboard when put in memory. The class uses the System.IO.Compression.GZipStream class that was introduced in .Net 2.0 (you have to add a reference to System.IO.Compression.dll). Enjoy!

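    // namespaces needed by the class below
    using System.IO;
    using System.IO.Compression;
    using System.Text;

    public class CompressedString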
    {
        private byte[] _content;
        private int _length;
        private bool _compressed;
        private int _maximumStringLength;

        public CompressedString():this(0)
        {
        }

        public CompressedString(int maximumStringLengthBeforeCompress)
        {
            _length = 0;
            _maximumStringLength = maximumStringLengthBeforeCompress;
        }

        public string Value
        {
            get
            {
                if (_content == null) return null;
                if (!_compressed) return Encoding.UTF8.GetString(_content);
                using (var ms = new MemoryStream(_content))
                {
                    using (var gz = new GZipStream(ms, CompressionMode.Decompress))
                    {
                        using (var ms2 = new MemoryStream())
                        {
                            gz.CopyTo(ms2);
                            return Encoding.UTF8.GetString(ms2.ToArray());
                        }
                    }
                }
            }
            set
            {
                if (value == null)
                {
                    _content = null;
                    _compressed = false;
                    _length = 0;
                    return;
                }
                _length = value.Length;
                var arr = Encoding.UTF8.GetBytes(value);
                if (_length <= _maximumStringLength)
                {
                    _compressed = false;
                    _content = arr;
                    return;
                }
                using (var ms = new MemoryStream())
                {
                    using (var gz = new GZipStream(ms, CompressionMode.Compress))
                    {
                        gz.Write(arr, 0, arr.Length);
                        gz.Close();
                        _compressed = true;
                        _content = ms.ToArray();
                    }
                }
            }
        }

        public int Length
        {
            get
            {
                return _length;
            }
        }
    }

Mysteries of the Microscopic World - a The Great Courses err.. course

Published Jan 13, 2014

Posted in
misc
movies
picture

and has 1 comment

I can't emphasize enough how cool the video courses from ~~The Teaching Company~~The Great Courses are. They are in the format of a university course, but no one is there to take notes so the pace of presentation is natural, it is all recorded on video. No black or white boards, either, as the visualizations of what the presenter is saying are added later via computer. Most courses have from 10 to 40 lectures, all in an easy to understand language, but no trace of the ridiculous tricks and populist stupidities in TV documentaries.

This course - Mysteries of the Microscopic World, presented by Bruce E. Fleury - in particular is very interesting, as it discusses microorganisms in relation to human culture. Especially interesting are lectures 11 to 13, discussing the hideous pandemic of 1918, which nobody seems to talk about, make heroic movies about or even remember, even if it killed from 50 to 100 million people. In comparison, the First World War killed a measly 8.5 million. Why is that? Is it, as Dr. Fleury suggests, that the pandemic was a horrible and completely unstoppable phenomenon from which no one felt they had escaped or in the face of which there were no heroes? I find this almost as disgusting as the disease itself, that people would only want to document their triumphs.

Anyway, for an old guy, Bruce is a funny man. He is very eloquent and not at all boring, despite his fears. The course goes from explaining what microorganisms are, how they evolved, the perpetual arms race against other organisms, including us, how they influenced history and even how they were used in biological warfare, AIDS and even allergies, all in 24 lectures. I think a lot of information in this course is something unlikely for you to have accidentally overheard or to have been exposed to, therefore of high quality.

As an additional bonus, you get to understand not only the evolution of medicine, but of all the quack snake oil ideas that are periodically emerging in "naive populations", truly epidemics in their own right, and even the source of some of the most common sayings and symbols. For example the symbol of medicine has little to do with the wisdom of snakes, but more with the procedure to remove nematode worms from someone's flesh by wrapping them slowly around a stick.

All in all a wonderful course, created and presented by a guy who is clearly averse to bullshit and who has read and worked quite a bit to make it. Give it a try!