Introduction to algorithmic complexity – 1/2

Note: this is the translation of a post that was originally published in French: Introduction à la complexité algorithmique – 1/2.

There, I warmed up by writing a couple of posts where I knew where I wanted to go (a general post about theoretical computer science, and a post to explain what a logarithm is, because it’s always useful). Then I took a small break and talked about intuition, because I needed to gather my thoughts. So now we’re going to get into things that are a little bit more complicated, and that are somewhat more difficult for me to explain too. So I’m going to write, and we’ll see what happens in the end. Add to that that I want to explain things while mostly avoiding the formality of maths that’s by now “natural” to me (but believe me, it required a strong hammer to place it in my head in the first place): I’m not entirely sure how this will turn out. I also decided to cut this post in two, because it’s already fairly long. The second one should be shorter.

I already defined an algorithm as a well-defined sequence of operations that can eventually give a result. I’m not going to go much further into the formal definition, because right now it’s not useful. And I’m also going to define algorithmic complexity, in a very informal way, as the quantity of resources that I need to execute my algorithm. By resources, I will mostly mean “time”, that is to say the amount of time I need to execute the algorithm; sometimes “space”, that is to say the amount of memory (think of it as the amount of RAM or disk space) that I need to execute my algorithm.

I’m going to take a very common example to illustrate my words: sorting. And, to give a concrete example of my sorting, suppose I have a bookshelf full of books (an utterly absurd proposition). And that it suddenly takes my fancy to want to sort them, say by alphabetical order of their author (and by title for two books by the same author). I say that a book A is “before” or “smaller than” a book B if it must be put before it in the bookshelf, and that it is “after” or “larger than” the book B if it must be sorted after it. With that definition, Asimov’s books are “before” or “smaller than” Clarke’s, which are “before” or “smaller than” Pratchett’s. I’m going to keep this example during the whole post, and I’ll draw parallels to the corresponding algorithmic notions.

Let me first define what I’m talking about. The algorithm I’m studying is the sorting algorithm: that’s the algorithm that allows me to go from a messy bookshelf to a bookshelf whose content is in alphabetical order. As the “input” of my algorithm (that is to say, the data that I give to my algorithm for processing), I have a messy bookshelf. As the “output” of my algorithm, I get the data after they have been processed by my algorithm, that is to say a tidy bookshelf.

I can first observe that, the more books I have, the longer it takes to sort them. There are two reasons for that. The first is that, if you consider an “elementary” operation of the sort (for instance, putting a book in the bookshelf), it takes longer to do that 100 times than 10 times. The second reason is that if you consider what you do for each book, the more books there are, the longer it takes. It takes longer to search for the right place to put a book in the midst of 100 books than in the midst of 10.

And that’s precisely what we’re interested in here: how the time that is needed to get a tidy bookshelf grows as a function of the number of books or, generally speaking, how the time necessary to get a sorted sequence of elements depends on the number of elements to sort.

This time depends on the sorting method that is used. For instance, you can choose a very, very long sorting method: while the bookshelf is not sorted, you put everything on the floor, and you put the books back in the bookshelf in a random order. Not sorted? Start again. At the other end of the spectrum, you have Mary Poppins: “Supercalifragilistic”, and bam, your bookshelf is tidy. The Mary Poppins method has a nice particularity: it doesn’t depend on the number of books you have. We say that Mary Poppins executes “in constant time”: whatever the number of books that need to be sorted, they will be sorted within the second. In practice, there’s a reason why Mary Poppins makes people dream: it’s magical, and quite hard to do in reality.

Let’s go back to reality, and to sorting algorithms that are not Mary Poppins. To analyze how the sorting works, I’m considering three elementary operations that I may need while I’m tidying:

  • comparing two books to see if one should be before or after the other,
  • adding a book to the bookshelf,
  • and, assuming that my books are set in some order on a table, moving a book from one place to another on the table.

I’m also going to suppose that these three operations take the same time, let’s say 1 second. It wouldn’t be very efficient for a computer, but it would be quite efficient for a human, and it gives some idea. I’m also going to suppose that my bookshelf is somewhat magical (do I have some Mary Poppins streak after all?), that is to say that its individual shelves are self-adapting, and that I have no problem placing a book there without going “urgh, I don’t have space on this shelf anymore, I need to move books to the one below, and that one is full as well, and now it’s a mess”. Similarly: my table is magical, and I have no problem placing a book where I want. Normally, I should ask myself that sort of question, including from an algorithmic point of view (what is the cost of doing that sort of thing, can I avoid it by being clever). But since I’m not writing a post about sorting algorithms, but about algorithmic complexity, let’s keep things simple there. (And for those who know what I’m talking about: yeah, I’m aware my model is debatable. It’s a model, it’s my model, I do what I want with it, and my explanations within that framework are valid even if the model itself is debatable.)

First sorting algorithm

Now here’s a first way to sort my books. Suppose I put the contents of my bookshelf on the table, and that I want to add the books back one by one. This is not that realistic for a human, who would probably remember roughly where to put each book, but let’s imagine the following situation.

  1. I pick a book, I put it in the bookshelf.
  2. I pick another book, I compare it with the first: if it must be put before, I put it before, otherwise after.
  3. I pick a third book. I compare it with the book in the first position on the shelf. If it must be put before, I put it before. If it must be put after, I compare with the book in the second position on the shelf. If it must be before, I put it between the two books that are already in the shelf. If it must be put after, I put it in the last position.
  4. And so on, until my bookshelf is sorted. For each book that I insert, I compare, in order, with the books that are already there, and I add it between the last book that is “before” it and the first book that is “after” it.
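By the way, for those who’d rather read code than watch me shelve books, here is a minimal C++ sketch of this procedure (which is essentially the classic “insertion sort”), with books reduced to integers and “before” simply meaning “smaller than”. It’s an illustration of the idea, nothing more.

    #include <iostream>
    #include <vector>

    // Insert each "book" (here, an int) one by one into the already-sorted
    // "shelf", scanning from the beginning to find its place.
    std::vector<int> insertionSort(const std::vector<int>& table) {
        std::vector<int> shelf;
        for (int book : table) {
            // Compare with the books already on the shelf, in order,
            // until we find the first one that is "after" the new book.
            auto pos = shelf.begin();
            while (pos != shelf.end() && *pos < book) {
                ++pos;
            }
            shelf.insert(pos, book);  // put the book at that position
        }
        return shelf;
    }

    int main() {
        std::vector<int> books = {42, 7, 23, 16, 4};
        for (int b : insertionSort(books)) std::cout << b << ' ';
        std::cout << '\n';  // prints: 4 7 16 23 42
    }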

And now I’m asking how much time it takes if I have, say, 100 books, or an arbitrary number of books. I’m going to give the answer for both cases: for 100 books and for n books. The time for n books will be a function of the number of books, and that’s really what interests me here – or, to be more precise, what will interest me in the second post of this introduction.

The answer is that it depends on the order in which the books were at the start when they were on my table. It can happen (why not) that they were already sorted. Maybe I should have checked before I put everything on the table, it would have been smart, but I didn’t think of it. It so happens that it’s the worst thing that can happen to this algorithm, because every time I want to place a book in the shelf, since it’s after/greater than all the books I placed before it, I need to compare it to all of the books that I placed before. Let’s count:

  1. I put the first book in the shelf. Number of operations: 1.
  2. I compare the second book with the first book. I put it in the shelf. Number of operations: 2.
  3. I compare the third book with the first book. I compare the third book with the second book. I put it in the shelf. Number of operations: 3.
  4. I compare the fourth book with the first book. I compare the fourth book with the second book. I compare the fourth book with the third book. I put it in the shelf. Number of operations: 4.
  5. And so on. Every time I insert a book, I compare it to all the books that were placed before it; when I insert the 50th book, I do 49 comparison operations, plus adding the book in the shelf, 50 operations.

So to insert 100 books, if they’re in order at the beginning, I need 1+2+3+4+5+…+99+100 operations. I’m going to need you to trust me on this if you don’t know it already (it’s easy to prove, but it’s not what I’m talking about right now) that 1+2+3+4+5+…+99+100 is exactly equal to (100 × 101)/2 = 5050 operations (so, a bit less than one hour and a half with 1 operation per second). And if I don’t have 100 books anymore, but n books in order, I’ll need \displaystyle \frac{n(n+1)}{2} = \frac{n^2 + n}{2} operations.
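(For the curious, the classic trick to see it: write the sum twice, once forwards and once backwards, and add the two copies term by term; each of the n pairs sums to n + 1, so

\displaystyle 2 \times (1 + 2 + \dots + n) = (1 + n) + (2 + (n-1)) + \dots + (n + 1) = n \times (n+1)

which gives the formula after dividing by 2.)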

Now suppose that my books were exactly in the opposite order of the order they were supposed to be sorted into. Well, this time, it’s the best thing that can happen with this algorithm, because each book that I add is smaller than all the ones I put in before it, so I just need a single comparison.

  1. I put the first book in the shelf. Number of operations: 1.
  2. I compare the second book with the first book. I put it in the shelf. Number of operations: 2.
  3. I compare the third book with the first book. I put it in the shelf. Number of operations: 2.
  4. And so on: I always compare with the first book, it’s always before, and I always have 2 operations.

So if my 100 books are exactly in reverse order, I do 1+2+2+…+2 = 1 + 99 × 2 = 199 operations (so 3 minutes and 19 seconds). And if I have n books in reverse order, I need 1 + 2(n-1) = 2n-1 operations.

Alright, we have the “best case” and the “worst case”. Now this is where it gets a bit complicated, because the situation that I’m going to describe is less well-defined, and I’m going to start making approximations everywhere. I’m trying to justify the approximations I’m making, and why they are valid; if I’m missing some steps, don’t hesitate to ask in the comments, I may be missing something myself.

Suppose now that my un-ordered books are in a state such that every time I add a book, it’s added roughly in the middle of what has already been sorted (I’m saying “roughly” because if I have sorted 5 books, I’m going to place the 6th after the book in position 2 or position 3 – positions are integers, I’m not going to place it after the book in position 2.5). Suppose I insert book number i: I’m going to estimate the number of comparisons that I make as \displaystyle \frac i 2 + 1, which is greater than or equal to the number of comparisons that I actually make. To see that, I distinguish on whether i is even or odd. You can show that it works for all numbers; I’m just going to give two examples to explain why there’s a fair chance it works.

If I insert the 6th book, I already have 5 books inserted. If I want to insert it after the book in position 3 (“almost in the middle”), I’m making 4 comparisons (because it’s after the books in positions 1, 2 and 3, but before the book in position 4): we have \displaystyle \frac i 2 + 1 = \frac 6 2 + 1 = 4.

If I insert the 7th book, I already have 6 books inserted, I want to insert it after the 3rd book as well (exactly in the middle); so I also make 4 comparisons (for the same reason), and I have \displaystyle \frac i 2 + 1 = \frac 7 2 + 1 = 4.5.

Now I’m going to estimate the number of operations I need to sort 100 books, overestimating a little bit, and allowing myself “half-operations”. The goal is not to count exactly, but to get an order of magnitude, which will happen to be greater than the exact number of operations.

  • Book 1: \frac 1 2 + 1 = 1.5, plus putting on the shelf, 2.5 operations (I actually don’t need to compare here; I’m just simplifying my end computation.)
  • Book 2: \frac 2 2 + 1 = 2, plus putting on the shelf, 3 operations (again, here, I have an extra comparison, because I have only one book in the shelf already, but I’m not trying to get an exact count).
  • Book 3: \frac 3 2 + 1 = 2.5, plus putting on the shelf, 3.5 operations.
  • Book 4: \frac 4 2 + 1 = 3, plus putting on the shelf, 4 operations.

If I continue like that and I re-order my computations a bit, I have, for 100 books:

\displaystyle \frac{1+2+3+...+99+100}{2} + 100 + 100 = \frac{100 \times 101}{4} + 200 = 2725 \text{ operations}

which yields roughly 45 minutes.

The first element of my sum comes from the \displaystyle \frac i 2 that I have in all my comparison estimates. The first 100 comes from the “1” that I add every time I count the comparisons (and I do that 100 times); the second “100” comes from the “1” that I count every time I put a book in the shelf, which I also do 100 times.

That 2725 is a bit overestimated, but “not that much”: for the first two books, I’m counting exactly 2.5 comparisons too many; for the others, I have at most 0.5 comparisons too many. Over 100 books, I have at most 98 \times 0.5 + 2.5 = 51.5 extra operations; the exact number of operations is between 2673 and 2725 (between 44 and 45 minutes). I could do things a little more precisely, but we’ll see in what follows (in the next post) why it’s not very interesting.

If I’m counting for n books, my estimation is

\displaystyle \frac{\frac{n(n+1)}{2}}{2} + 2n = \frac{n^2 + 9n}{4} \text{ operations}

It is possible to prove (but that would really be over the top here) that this behaviour is roughly the one that you get when the books start in a random order. The idea is that if my books are in a random order, I will insert some around the beginning, some around the end, and so “on average” roughly in the middle.

Another sorting algorithm

Now I’m going to explain another sorting method, which is perhaps less easy to understand, but which is probably the easiest way for me to continue my argument.

Let us suppose, this time, that I want to sort 128 books instead of 100, because it’s a power of 2, and it makes my life easier for my concrete example. And I didn’t think about it before, and I’m too lazy to go back to the previous example to run it for 128 books instead of 100.

Suppose that all my books start directly on the table, and that I’m going to make “groups” before putting my books in the bookshelf. And I’m going to make these groups in a somewhat weird, but efficient fashion.

First, I combine my books two by two. I take two books, I compare them, I put the smaller one (the one that is before in alphabetical order) on the left, and the larger one on the right. At the end of this operation, I have 64 groups of two books, and for each group, a small book on the left, and a large book on the right. To do this operation, I had to make 64 comparisons, and 128 moves of books (I suppose that I always move books, if only to have them in hand and read the authors/titles).

Then, I take my groups of two books, and I combine them again so that I have groups of 4 books, still in order. To do that, I compare the first books of the two groups; the smaller of the two becomes the first book of my group of 4. Then, I compare the remaining book of the group of 2 from which I picked that first book with the first book of the other group, and I put the smaller one in position 2 of my group of 4. There, I have two possibilities. Either I have one book left in each of my initial groups of 2: in that case, I compare them, and I put them in order in my group of 4. Or I still have a full group of two: then I just have to add it at the end of my new group, and I have an ordered group of 4. Here are two little drawings to distinguish both cases: each square represents a book whose author starts with the indicated letter; each rectangle represents my groups of books (the initial groups of two and the final group of 4), and the red elements are the ones that are compared at each step.

So, for each group of 4 that I create, I need to make 4 moves and 2 or 3 comparisons. I end up with 32 groups of 4 books; in the end, to combine everything into 32 groups of 4 books, I make 32 × 4 = 128 moves and between 32 × 2 = 64 and 32 × 3 = 96 comparisons.

Then, I create 16 groups of 8 books, still by comparing the first element of each group of books and by creating a common, sorted group. To combine two groups of 4 books, I need 8 moves and between 4 and 7 comparisons. I’m not going to get into how exactly to get these numbers: the easiest way to see it is to enumerate all the cases, and while it’s still feasible for groups of 4 books, it’s quite tedious. So to create 16 groups of 8 books, I need to do 16×8 = 128 moves and between 16×4 = 64 and 16×7 = 112 comparisons.

I continue like that until I have 2 groups of 64 books, which I combine (directly in the bookshelf to gain some time) to get a sorted group of books.
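Once more in code, for those who like that: here is a rough C++ sketch of this “combine the groups two by two” idea (which is the classic bottom-up merge sort), with books again reduced to integers. This sketch also handles a number of books that is not a power of 2 (an odd group out simply waits for the next round), something I’m conveniently avoiding with my 128 books.

    #include <iostream>
    #include <vector>

    // Merge two sorted groups into one sorted group, always taking the smaller
    // of the two "first books".
    std::vector<int> mergeGroups(const std::vector<int>& a, const std::vector<int>& b) {
        std::vector<int> out;
        size_t i = 0, j = 0;
        while (i < a.size() && j < b.size())
            out.push_back(a[i] < b[j] ? a[i++] : b[j++]);  // one comparison per book placed
        while (i < a.size()) out.push_back(a[i++]);        // one group is empty: no comparisons
        while (j < b.size()) out.push_back(b[j++]);
        return out;
    }

    std::vector<int> mergeSort(const std::vector<int>& books) {
        // Start with groups of one book each.
        std::vector<std::vector<int>> groups;
        for (int b : books) groups.push_back({b});
        // Each "overall" step halves the number of groups.
        while (groups.size() > 1) {
            std::vector<std::vector<int>> next;
            for (size_t i = 0; i + 1 < groups.size(); i += 2)
                next.push_back(mergeGroups(groups[i], groups[i + 1]));
            if (groups.size() % 2 == 1) next.push_back(groups.back());  // odd group waits for next round
            groups = next;
        }
        return groups.empty() ? std::vector<int>{} : groups[0];
    }

    int main() {
        std::vector<int> books = {42, 7, 23, 16, 4, 8, 15, 99};
        for (int b : mergeSort(books)) std::cout << b << ' ';
        std::cout << '\n';  // prints: 4 7 8 15 16 23 42 99
    }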

Now, how much time does that take me? Let me first give an estimation for 128 books, and then we’ll see what happens for n books. To start, we evaluate the number of comparisons when combining two groups of books. I claim that to combine two groups of k elements into a larger group of 2k elements, I need at most 2k comparisons. To see that: every time I place a book in the larger group, it’s either because I compared it to another one (and made a single comparison at that step), or because one of my groups is empty (and there I would make no comparison at all). Since I have a total of 2k books, I make at most 2k comparisons. I also move 2k books to combine my groups. Moreover, for each “overall” step (taking all the groups and combining them two by two), I do overall 128 moves – because I have 128 books, and each of them is in exactly one “small” group at the beginning and ends up in one “large” group at the end. So, for each “overall” step of merging, I’m doing at most 128 comparisons and 128 moves.

Now I need to count the number of overall steps. For 128 books, I do the following:

  1. Combine 128 groups of 1 book into 64 groups of 2 books
  2. Combine 64 groups of 2 books into 32 groups of 4 books
  3. Combine 32 groups of 4 books into 16 groups of 8 books
  4. Combine 16 groups of 8 books into 8 groups of 16 books
  5. Combine 8 groups of 16 books into 4 groups of 32 books
  6. Combine 4 groups of 32 books into 2 groups of 64 books
  7. Combine 2 groups of 64 books into 1 group of 128 books

So I have 7 “overall” steps. For each of these steps, I have 128 moves, and at most 128 comparisons, so at most 7×(128 + 128) = 1792 operations – that’s a bit less than half an hour. Note that I didn’t make any hypothesis here on the initial order of the books. Compare that to the 5050 operations for the “worst case” of the previous computation, or with the ~2700 operations of the “average” case (those numbers were counted for 100 books; for 128 books we’d have 8256 operations for the worst case and ~4300 for the average case).

Now what about the formula for n books? I think we can agree that for each overall step of group combining, we move n books, and that we do at most n comparisons (because each comparison is associated with putting a book in a group). So, for each overall step, I’m doing at most 2n operations. Now the question is: how many steps do we need? And that’s where my great post about logarithms (cough) gets useful. Can you see the link with the following figure?

What if I tell you that the leaves are the books in a random order before the first step? Is that any clearer? The leaves represent “groups of 1 book”. Then the second level represents “groups of 2 books”, the third represents “groups of 4 books”, and so on, until we get a single group that contains all the books. And the number of steps is exactly equal to the logarithm (in base 2) of the number of books, which corresponds to the “depth” (the number of levels) of the tree in question.

So to conclude, for n books, I have, in the model I defined, at most 2 \times n \times \log_2(n) operations.

There, I’m going to stop here for this first post. In the next post, I’ll explain why I didn’t bother too much with exactly exact computations, and why one of the sentences I used to say quite often was “bah, it’s a constant, I don’t care” (and also why sometimes we actually do care).

I hope this post was understandable so far; otherwise don’t hesitate to grumble, ask questions, and all that sort of thing. As for me, I found it very fun to write all this 🙂 (And, six years later, I also had fun translating it 🙂 )

Mathematical intuition

Note: this post is a translation/adaptation of a post I wrote in French a few years ago: Intuition mathématique.

I initially wrote that post as a “warning” to my French-reading readers, saying “I might not manage to avoid annoying language tics such as ‘intuitively’, ‘obviously’, ‘it’s easy to see that'”. I think I got slightly better at that since then (or at least slightly better at noticing it and correcting it), but it’s still probably something I do.

I do try to avoid the “intuitive argument” when I explain something, because it used to make me quite anxious when I was on the receiving end of it and had a hard time understanding why something was intuitive. But still, it does happen – it happened to me in an exam once, failing to explain “why it’s intuitive”: something that felt so blindingly obvious that I had forgotten why it was so obvious. It’s quite annoying when someone points it out to you… especially when the “someone” is the examiner.

One of the most interesting articles I read on the topic was from Terry Tao, There’s more to mathematics than rigour and proofs. He distinguishes between three “stages” in math education:

  • the “pre-rigorous” stage – you make a lot of approximations, analogies, and probably you spend more time computing than theorizing
  • the “rigorous” stage – you try to do things properly, in a precise and formal way, and you work with abstract objects without necessarily having a deep understanding of what they mean (but you do work with them properly)
  • the “post-rigorous stage” – you know enough of your domain to know which approximations and which analogies are indeed valid, you have a good idea fairly quickly about what something is going to yield when the proof/computation is done, and you actually understand the concepts you work with.

Six years ago, when I wrote the original version of this post, I considered myself a mix of the three. I did start to get some “intuition” (and I’ll explain later what I meant by that), but I was still pretty bad at finishing a complex computation properly. And, obviously, it did (and still does) depend on the domain: in “my” domain, I was approximately there; if you ask me right now to solve a differential equation or to work on complicated analysis, I’m probably going to feel very pre-rigorous. I’ve been out of university for a few years now, and there are definitely some things that have regressed, that used to be post-rigorous and are now barely rigorous anymore (or that require a lot more effort to do in a proper way, let’s say). One of the nice things that stayed, though, is that I believe I’m far less prone to confusing “pre-rigorous” with “post-rigorous” than I used to be (which got me qualified as “sloppy” on more than one occasion).

Anyway, I believe that the categories are more fluid than what Tao says, but I also believe he’s right. And that there’s no need to panic when the person in front of you says “it’s obvious/intuitive that”: she probably just has more experience than you do. And it’s sometimes hard to explain what became, with experience, obvious. If I say “it’s obvious that 2 + 3 = 5”, we’ll probably agree on it; if I ask “yeah, but why does 2 + 3 = 5?”, I’ll probably get an answer, but I may get some blank stares for a few seconds. It’s a bit of a caricature, but I think it’s roughly the idea.

In everyday language, intuition is somewhat magical, a bit of “I don’t know why, but I believe that things are this way or that way, and that this or that is going to happen”. I tend to be very wary of intuition in everyday life, because I tend to be wary of what I don’t understand. In maths, the definition is roughly the same: “I think the result is going to look like that, and I feel that if I finish the proof it’s going to work”. The main advantage of mathematical intuition is that you can check it, and understand why it’s correct or why it’s wrong. In my experience, (correct) intuition (or whatever you put behind that word) comes with practice, with having seen a lot of things, with the fact of linking things to one another. I believe that what people put behind “intuition” may be linking something new (a new problem) to something you’ve already seen, and doing this correctly. It’s also a matter of pattern matching. When it comes to algorithm complexity, which is the topic of the next two posts that I’ll translate, it’s “I believe this algorithm is going to take that amount of time, because it’s basically the same thing as this other super-well-known thing that takes that amount of time”. The thing is, the associations are not always fully conscious – you may end up seeing the result first and explaining it later.

It doesn’t mean that you can never be awfully wrong: thinking that a problem is going to take a lot of time to solve (algorithmically speaking), when there is a condition that makes it much easier; or thinking that a problem is not very hard, and realizing it’s going to take way more time than you thought. It still happens to me, in both directions. I believe that making mistakes is actually better than the previous step, which was “but how can you even have an idea on the topic?”

I don’t know how this intuition eventually develops. I know mine was much better after a few years at ETH than it was at the beginning of them. I also know it’s somewhat worse right now than when I had just graduated. It’s still in enough of a flux that I still get amazed by the fact that there are things now that are completely natural to me whereas they were a complete unknown a few years ago. Conversely, it’s sometimes pretty hard to realize that you once knew how to do things, and you forgot, for lack of practice (that’s a bit of my current state with my fractal generator).

But it’s also (still) new enough that I do remember the anxiety of not understanding how things can be intuitive for some people. So, I promise: I’m going to try to avoid saying things are intuitive or obvious. But I’d like you to tell me if I err in my ways, especially if it’s a problem for you. Also, there’s a fair chance that if I say “it’s intuitive”, it’s because I’m too lazy to get into explanations that seem obvious to me (but that may not be for everyone else). So, there: I’ll try to not be lazy 🙂

(Ironically: I did hesitate translating this blog post from French because it seemed like I was only saying completely obvious things in it and that it wasn’t very interesting. My guess is – it was less obvious to me 6 years ago when I wrote it, and so it’s probably not that obvious to people who spent less time than I did thinking about that sort of things 😉 )

52Frames – 2019 Week 01 – Self-portrait

The first assignment of the year for the 52Frames project was, as is traditional, “Self-Portrait”. I got a crystal ball for Christmas that I needed to take for a first spin, so that’s what I did.

I did want to have a fairly neutral background – which is pretty hard when you do have a crystal ball that gives you a huge field of view – so I ended up taking pictures on the floor, below my camera set on my beloved tripod that can rotate the central column 90°. Even there, I’m a bit annoyed at the reflection from the window – but oh well. (I tried to post-process them away and failed miserably :P)

I played quite a bit with different ways of getting a proper focus – increasing the f-stop to get more “probability of being in focus”, trying to let the camera do its thing with autofocus and praying for the best, setting the camera at “this seems like a pretty good distance” manually and moving the ball up and down, taking multiple pictures as I went… In the end, the one I eventually chose was one of the earliest in the set. I like the hand position, and I also like the curl of hair that happened to go behind it.

The conclusion of that whole setup was that focusing on a crystal ball is not necessarily trivial, and it’s so much worse when trying to do it blindly while controlling the shutter with a remote 😉

Also worth mentioning: my t-shirt is the t-shirt for Slingshot by Tobias Klausmann, and if you haven’t read it yet you should 😉

So… What’s a logarithm?

This is the translation and update of a blog post originally written in French: “Et donc, c’est quoi, un logarithme ?”.

It’s quite hard to write a series of articles about theoretical computer science without talking about logarithms. Why? Because it’s one of the “useful” functions when one talks about algorithm complexity. So, to make sure that everyone is on the same page, this is a remedial class about logarithms. People who have bad memories about high school maths probably also have bad memories of logarithms; however, logarithms are cool.

Speaking of logarithms, I don’t know how they were taught in your maths classes; in my case, I’m pretty sure they were defined from the integral of the reciprocal function. STAY HERE, THAT’S NOT WHAT I’M GOING TO DO. Well, not yet. First, let me try to give some intuition about that thing.

Let us first consider the figure I just successfully inserted in my blog post. I started from a point; from this point I made two branches, at the end of which I added a point; on each of these points, I added two branches, at the end of which I added a point, and so on. I could have continued like that for a while, conceptually – I’d have issues with space and fatigue (it’s going to be annoying really fast to draw points), but I think you can imagine a tree like that with as many levels as you want. Let’s also suppose, because we do computer science, that the “root” of the tree (the very first point I added on my picture at the bottom of it) is at level 0.

Now suppose that I want to count the number of “leaves” of my tree, that is to say the number of points that I have on the highest level of my tree.

It’s pretty clear that the number of leaves depends on the level at which I stop drawing my tree, and that it increases for each level. If I stop drawing at level 0, I have 1 leaf. If I stop at level 1, I multiply that by 2 (because I made two branches), so that’s 2. If I stop drawing at level 2, I multiply the number of leaves at level 1 by 2 again, so that’s 4. And for every level, I take the number of leaves from the previous level and I multiply it by two again. At level 3, I’ll have 2×2×2 = 8 leaves, and at level 4, 2×2×2×2 = 16 leaves. To know the number of leaves at level n, where n is the number of my level, I do n multiplications of 2 by itself, which can be written 2^n (read “two to the power of n“).

Now suppose that I don’t want to know the number of leaves corresponding to a level, but the number of levels corresponding to a given number of leaves. For instance, I have 2 leaves: that’s level 1. 16 leaves: that’s level 4. 32 leaves, level 5. And if I have 128 leaves, that’s level 7. It gets a bit more complicated if I have, say, 20 leaves. 20 leaves, that’s level “4 and a bit”: I started drawing level 5, and then I stopped because I got too tired to finish.

This operation (finding the level for a given number of leaves) is the inverse function of the previous “power” operation (finding the number of leaves for a given level), and that’s a logarithm. I say it’s the inverse function because it allows me to “undo” the previous operation. If I take a number n, I compute its power of 2, it yields 2^n, and if I take the logarithm of that, I get

\log(2^n) = n

Similarly, if I have a number n, that I take its logarithm, and that I compute the power of two of the result, I get

2^{\log n} = n
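By the way, this “inverse of the power” view is also a perfectly workable way of computing the thing. Here is a tiny C++ sketch that recovers the level from the number of leaves simply by undoing the multiplications by 2 (for the “4 and a bit” cases it rounds down, since it only counts whole levels):

    #include <iostream>

    // Count how many times we can halve n before reaching 1:
    // for n = 2^k leaves, this returns exactly k, the level of the tree.
    int levelForLeaves(unsigned int n) {
        int level = 0;
        while (n > 1) {
            n /= 2;     // undo one "multiply by 2"
            ++level;    // one more level of the tree
        }
        return level;
    }

    int main() {
        std::cout << levelForLeaves(2) << ' '     // 1
                  << levelForLeaves(16) << ' '    // 4
                  << levelForLeaves(128) << ' '   // 7
                  << levelForLeaves(20) << '\n';  // 4 (the "4 and a bit", rounded down)
    }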

Alright, everything’s nice and shiny, but what happens if, instead of making two branches at each step, I make 3? With the same reasoning as before, at level n, I have 3×3×…×3 leaves, that is 3^n. And, well, in the same way, I can define a logarithm that would be the inverse of this 3^n. But I do want to be able to tell one from the other, so I write the power to which they correspond as a subscript, like this:

\log_2, \log_3

with

3^{\log_3 n} = n

and

\log_3(3^n) = n

That subscript is the “base” of the logarithm. Allow me a small remark about the logarithm in base 10 (something similar is true for other integer bases, but let me avoid that here). It’s very easy to have a rough estimate of the logarithm in base 10 of a number: it’s the number of digits of said number, minus 1. We have \log_{10}(10) = 1, \log_{10}(100) = 2 (because 10^2 = 100); the logarithm base 10 of all the numbers between 10 and 100 is between 1 and 2. In the same way, you can say that the logarithm base 10 of 14578 is between 4 and 5, because 14578 is between 10000 = 10^4 and 100000 = 10^5, which allows us to bracket the value of the logarithm. (I’m hiding a number of things here, including the reasons that make that reasoning actually correct.)
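A quick numerical check of that rule of thumb, using std::log10 from the C++ standard library:

    #include <cmath>
    #include <iostream>

    int main() {
        // 14578 has 5 digits, so its base-10 logarithm should be between 4 and 5.
        std::cout << std::log10(14578.0) << '\n';  // prints about 4.1637
        // 100 has 3 digits, and its base-10 logarithm is exactly 2.
        std::cout << std::log10(100.0) << '\n';    // prints 2
    }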

As an aside, as you may see, one interesting property of the logarithm is that it can “compress” orders of magnitude, which is handy if you want to represent on a single sheet of paper quantities that have a very large amplitude. For example, in chemistry, you may want to represent concentrations that go from 10^{-10} to 0.1, and on a “normal” scale, you wouldn’t be able to distinguish between 10^{-10} and 10^{-9}. You can however use a logarithmic scale, so that you represent with the same amount of space “things that happen between 10^{-10} and 10^{-9}” and “things that happen between 0.01 and 0.1”. For large scales, xkcd made a very nice drawing with the observable universe seen at log scale: Height.

Back to the previous point – now I have defined the concept of “base” for my logarithm – that’s the number corresponding to the power function that I invert to get my logarithm. The question is – what prevents me from using “exotic” bases for my logarithms? The answer is “nothing”. I can define a logarithm in base 3.5 (corresponding to the power to which I raise 3.5 to get the number for which I’m computing the logarithm base 3.5), \displaystyle \frac 5 3 (corresponding to the power to which I raise \displaystyle \frac 5 3 to get the number for which I’m computing the logarithm base \displaystyle \frac 5 3), or even \pi (corresponding to… okay, you get the idea) if I want to. It’s less “intuitive” than the explanation with the tree and the number of levels (because it’s pretty hard to draw \pi branches), but if you see it as the inverse of the power of the same number, I hope you get the idea.

Now the next question you can ask is whether all these logarithms are somehow linked, or whether you can express them in some common way. The answer is yes. There exists the following relation between logarithms of any three bases a, b and c:

\displaystyle \log_a(x) = \frac{\log_c b}{\log_c a}\log_b(x)

(Yes, that’s typically the kind of thing that I had on my exam formula sheet because I always get confused, especially when I’m stressed out… like I am in an exam 😉 )

Also observe that the base of the logarithm c absolutely does not matter: the ratio between the logarithms of two numbers stays the same independently of the base.
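If you want to check that relation (and the remark about the ratio) numerically, here is a small C++ snippet, with a = 2, b = 3 and c = 10 picked arbitrarily:

    #include <cmath>
    #include <iostream>

    int main() {
        double x = 42.0;
        // Left-hand side: log_2(42), computed via the natural logarithm.
        double lhs = std::log(x) / std::log(2.0);
        // Right-hand side: (log_10(3) / log_10(2)) * log_3(42).
        double rhs = (std::log10(3.0) / std::log10(2.0)) * (std::log(x) / std::log(3.0));
        std::cout << lhs << ' ' << rhs << '\n';  // both print about 5.39232
    }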

The important thing to remember here is that all logarithms are equal “up to a constant”; they have the same “asymptotic behavior” (I’m giving the terms here, but I’ll write a specific post on the topic, because it’s a bit long to explain). For theoretical computer science, it’s interesting because we’re mostly interested in behaviors “up to a constant” when we’re talking about execution time or memory consumption of an algorithm. Again – I’ll come back to this later – take it as a “spoiler” of the next episodes 🙂

People from different backgrounds tend to prefer different bases for their logarithms; the three most common bases are 2, 10 and e \approx 2.71828. Here, I feel that it’s possible someone just read that and went “wait, what?”. As far as powers go, there is a “special” one: the powers of e. e is approximately equal to 2.71828, and the function x \mapsto e^x has its own name and its own notation: it’s the exponential function, and e^x = \exp(x). It’s special because of an interesting property: it is equal to its derivative. And because the exponential function is often used, its inverse, the “natural logarithm” (logarithm in base e) is also used a lot, and we write \ln(x) = \log_e(x). It’s also been brought to my attention that some conventions (and, I presume, authors) use \text{lg}(x) = \log_2(x) and use \log(x) = \log_{10}(x). Wikipedia has more opinions on the question.

It also turns out that the natural logarithm is related to the reciprocal function \displaystyle x \mapsto \frac 1 x. Formally, we write that

\displaystyle \ln(x) = \int_1^x \frac 1 t \:\text{d}t

And this is what it means, graphically:

The red curve represents the function \displaystyle x \mapsto \frac 1 x. The grey zone corresponds here to the integral from 1 to 6: the area of that zone is equal to \ln(6). And you can represent the natural logarithm of any value (greater than 1) by the area of the zone between the x-axis and the \displaystyle x \mapsto \frac 1 x curve from 1 to that value. Side remark: this area is equal to 1 when you take it from 1 to e (because \log_b(b) = 1 for all b, so \ln(e) = \log_e(e) = 1).
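If you’d rather convince yourself numerically than graphically, here is a small C++ sketch that approximates that grey area with many thin rectangles and compares it to std::log (the natural logarithm of the standard library):

    #include <cmath>
    #include <iostream>

    // Approximate the area under 1/t between 1 and x with n thin rectangles
    // (taking the height at the midpoint of each rectangle).
    double areaUnderReciprocal(double x, int n) {
        double width = (x - 1.0) / n;
        double area = 0.0;
        for (int i = 0; i < n; ++i) {
            double t = 1.0 + (i + 0.5) * width;  // midpoint of the i-th rectangle
            area += width / t;                   // width times height 1/t
        }
        return area;
    }

    int main() {
        std::cout << areaUnderReciprocal(6.0, 100000) << ' '
                  << std::log(6.0) << '\n';                             // both about 1.79176
        std::cout << areaUnderReciprocal(std::exp(1.0), 100000) << '\n';  // about 1, the area from 1 to e
    }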

And, to conclude, a nice property of the logarithm function: you can write (in any logarithm base):

\log(a \times b) = \log(a) + \log(b)
\displaystyle \log \left(\frac a b\right) = \log(a) - \log(b)

This is probably easiest to see via the power functions. Let us write things down. We have, on the one hand:

2^{\log_2(a\times b)} = a \times b

and, on the other hand,

2^{\log_2 a + \log_2 b} = 2^{\log_2 a} \times 2^{\log_2 b} = a \times b

Since, if x^m = x^n, then m = n, putting everything together yields the first result. The second equality (with the minuses) can be derived in exactly the same way (and left as an exercise to the reader 😉 ).

Note that it’s because of that kind of property that logarithms were used in the first place. If you have a large table with a lot of “number/logarithm” correspondences, and you want to multiply two large numbers easily, you look up the logarithms of both numbers in the table, you add them (which is much easier than multiplying them), and you look up the corresponding value (by reading the table in the other direction) to get the value of the multiplication. Historically, that kind of table appeared in the early 17th century (thank you Mr. Napier) to make astronomical computations easier; slide rules also work on that principle; and all of this only started to disappear when, quite recently, mechanical and later electronic calculators appeared…
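A small worked example of the principle (with values from a calculator rather than from an actual table): to compute 123 × 456, look up \log_{10}(123) \approx 2.0899 and \log_{10}(456) \approx 2.6590, add them to get 4.7489, and then look up which number has 4.7489 as its logarithm: approximately 56088, which is indeed 123 × 456.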

Anyway. Logarithms are cool. And now I hope you’re also convinced of that 🙂

What IS Theoretical Computer Science?

(Note: this is a “present-day version” of an article that I published in French around 5 years ago.)

As I mentioned when starting this blog, there’s a few blog posts from my French blog that I kind of want to translate to English. This is part of it: I wrote a series of articles about trying to explain a bit of math and theoretical computer science, and I believe these are pretty good candidates for that.

A few years ago, I was studying at ETH Zürich, and I got asked quite a lot “So, what are you doing these days?” – which was a question that was fairly hard to answer – typically, I said “Theoretical computer science. Basically maths.” It wasn’t very satisfying, but getting into the details of things when answering a “social” question wasn’t necessarily a good idea either. But maybe there are some people who ARE interested in a bit more detail (if you’re one of the people to whom I answered that at the time, and you’re still interested, now is your chance 😉 ), and maybe I can explain things to people who haven’t been swimming in that area for a few years. To give an idea, I’m aiming at the “I haven’t done maths since high school” level. I’m not sure I managed it – it’s been a while since I quit high school, and I did do a bit of maths since then 🙂

So this is first a very general post to give a few elements, and then I want to give a bit more detail. The idea is that I still love what I studied a few years ago, and I want to explain why I love it. It’s going to take a bit of work – the first thing I did in the French series was to write a post about “but, but, but… what IS a logarithm?”. At that point I was a bit afraid it’d be an “insult” to the scientific knowledge of my readers, but I still did it because I thought it was fun and because I thought that logarithms were still a very useful notion. It so happened that said logarithm blog post was, by far, my most popular post ever (it still gets hits from search engines these days), so. And then, there’s a few blog posts depending on the mood. A bit of complexity theory, a bit of graph theory (I like graph theory), a bit of probabilities (lol), and the “prerequisites” for all of that (my goal was to be able to explain P vs NP, how to define what NP-complete means, and I eventually got there, and it was a nice exercise).

But still, let’s get to the point. What is theoretical computer science? The first answer I have is that it’s the theory of everything you can do with a computer. At the center of all that, you have the notion of an algorithm. An algorithm is just a series of instructions. You can think of a cooking recipe as an algorithm:

  1. To make béchamel sauce, you need butter, flour and milk.
  2. Measure out identical quantities of butter and flour.
  3. In a pan over medium heat, melt the butter and add the flour; mix until you get a thick paste.
  4. While the béchamel sauce does not have the desired texture, add milk and mix.
  5. If you wish, add salt, pepper, nutmeg or cheese.

You have instructions (which are numbered) that can be repeated (while the sauce doesn’t have the desired texture) or that you only execute under certain conditions (you only add nutmeg if you want to).
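Just to make the parallel with actual code explicit, here is the same recipe written as a small, purely illustrative C++ sketch; the béchamel and all its functions are of course made up for the example, but the while and the if are exactly the “repeated” and “conditional” instructions above:

    #include <iostream>

    // A "béchamel" reduced, for the example, to a counter of how much milk went in.
    struct Bechamel {
        int milkAdded = 0;
        bool hasDesiredTexture() const { return milkAdded >= 3; }
        void addMilkAndMix() { ++milkAdded; std::cout << "add milk and mix\n"; }
        void addSeasoning() { std::cout << "add salt, pepper, nutmeg or cheese\n"; }
    };

    int main() {
        Bechamel sauce;
        bool wantSeasoning = true;
        // "While the béchamel sauce does not have the desired texture, add milk and mix."
        while (!sauce.hasDesiredTexture()) {
            sauce.addMilkAndMix();
        }
        // "If you wish, add salt, pepper, nutmeg or cheese."
        if (wantSeasoning) {
            sauce.addSeasoning();
        }
    }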

For another process that is more commonly described as an algorithm, you can think of the long division algorithm you learnt at school.

I’m a bit biased on the categories that you can fit into theoretical computer science, because there are domains that I know better than others, but here’s a strictly non-exhaustive list (if I forgot your favorite domain… that’s because I don’t know what to say about it, don’t hesitate to explain in the comments). For a more complete list, I’ll point you to the Wikipedia article about theoretical computer science, which itself is based on the list from SIGACT (the ACM special interest group for algorithms and computation theory). As for the fact that “I do maths” – theoretical computer science is often considered as part of discrete mathematics and, in general, proving that something works often implies writing pretty proofs full of pretty maths (or full of ugly maths, but that’s far less satisfying).

I’m going to allow myself a fair amount of approximations here (in particular, I’m confusing problems and problem instances) – I’m trying to write things simply without writing overlong paragraphs. Those who know will be able to correct on their own; for the others: it’s a first approximation, it’s on some level wrong by definition, but I do intend to make it a bit less wrong in the following posts.

So, to me, theoretical computer science contains among others:

  • Algorithmics. Given a problem, how do you solve it with an algorithm, how do you prove that your solving method (the algorithm) is correct, and how much time do you need to get a result?
  • Complexity theory. It includes, but does not restrict itself to, the “how much time do you need to get a result?” from the previous point. We’re also interested in the question “given a problem, what are the minimum, average and worst case time and memory consumption that will allow me to solve it?”.
  • Graph theory. A graph is a mathematical structure made of points (the “vertices” of the graph) and lines (the “edges” of the graph). One example could be a subway map: the stops are the vertices, the edges are the connections from one stop to another with a subway line. In graph theory, we look at graph properties (for example: can I go from one point to another; what is the shortest path from one point to another; can I still go from point A to point B if I remove this vertex – or, in my previous example, if I shut down this subway station), and at algorithms to decide these properties (how do I find the shortest path from one point to another in the graph).
  • Data structures. How do I store the data of the problem I’m interested in in the most efficient way possible (from a time and space perspective) for the problem I want to solve? Or, to go back to the previous example, how do I store my subway map so that I can compute stuff about it? Data structures and algorithms go hand in hand – you need data on which to run your algorithm, and your data without algorithms is quite often just storage.
  • Computational geometry. How to represent and solve with a computer problems that are related to geometry? A common example is the post office problem: given the location of all post offices in a city, how do I efficiently find the one closest to my home? If I’m only interested in the one that’s closest to my home, I can compute the distance between my home and all post offices, but how can I improve on that if I’m interested in that information for all houses in the city? And what do I do if I shut down or open a new post office?
  • Randomness. What is the impact of randomness for all this? For some algorithms, it’s interesting to add some randomness in the computation (I flip a coin, if I get heads the algorithm does operation X, if I get tails the algorithm does operation Y). What impact does it have on algorithms – in particular their correctness, and how to assess the time that a random algorithm will need to (correctly) solve a problem? Random data structures can also be interesting. For instance, you can define random graphs: you take a set of points, and you add edges between them with some probability. Depending on how you define your random graph, it can be a fairly good model of social networks (who’s friend with whom)… or of the human brain.
  • Cryptography. If Alice wants to send a message to Bob without Eve being able to read it, and if Bob wants to be sure that the message he just read does come from Alice, how can they do that? And how to prove that Eve cannot read the message, and that Bob can be sure of where the message comes from.
  • Computability. It turns out you can’t solve everything with an algorithm, and that you can even prove that some things cannot be computed with an algorithm. The most famous example is the “halting problem”, which is a bit meta, but I hope you’ll forgive me. You can prove that there is no program that can decide whether a given program eventually stops and gives a result or whether it just loops for ever. The proof is very pretty, but it kind of hurts the brain the first time you see it; maybe I’ll eventually write a blog post about it. (Note: I said that “maybe” five years ago, maybe I should do that for real 😉 )

There, those are all things for which I have at least some notions. Theoretical computer science includes some more stuff for which my knowledge is mostly zero: information theory (I’m somewhat ashamed of that), machine learning (or how do you make a computer recognize pictures of cats), formal methods (or how do I prove that my code does not have bugs), quantum computing (would a quantum computer be better at recognizing pictures of cats than a normal computer), computational biology (how do I check whether two DNA sequences are close to one another or not), and I probably still forget a few.

Also note that different people may have different opinions about what is or is not theoretical computer science. I’m not necessarily convinced of the fact that most machine learning as it is applied today is theoretical computer science, because it still looks very experimental, as in “we tweaked that parameter and consequently the computer is 2.2% better at recognizing cats”. (Note: I’d probably be less “critical” now than I was five years ago on that topic. Maybe. Probably.) Wikipedia also gives distributed computing as theoretical computer science. It most definitely intersects, but I wouldn’t say that distributed computing is a subset of theoretical computer science. Same for computational biology: some elements are part of it (the example of the DNA sequences), but I’m pretty sure that not all computational biologists would define themselves as theoretical computer scientists 🙂

I thought this would be a pretty short blog post: I think that’s a failure. I hope it was still somewhat interesting; if there are things you think I can talk about in a future blog post in this category… well, don’t hesitate. As I mentioned, I actually have some amount of content that I aim to translate from my archives; I am open to suggestions still, my goal is to Blog More 🙂

Smooth coloring of Mandelbrot

I got half nerd-sniped (by my dad, as quite often happens) after my blog post on my fractal renderer – “but… your continuous coloring, there… how does it work?”. I had implemented it 8 months ago without really asking myself the question, so after answering “eh, some magic”, I finally got curious for real, and I started digging a bit more into it.

Starting from the Wikipedia article footnotes, I ended up on a couple of articles: Renormalizing the Mandelbrot Escape and Smooth Escape Iteration Counts, as well as smooth iteration count for generalized Mandelbrot sets, that mostly allowed me to piece together some understanding of it. I will admit that my own understanding is still a bit messy/sloppy – my aim here is to try to give some understanding somewhere between “apply this function and you get smooth coloring” and “spectral analysis of eigenvalues in Hilbert spaces” (it’s APPARENTLY related; I’m pretty sure I’ve known at SOME point what this group of words meant, but right now it’s barely a fuzzy concept. Math: if you don’t use it you lose it 😦 ).

Mandelbrot set and discrete coloring

Before talking about continuous coloring, let me set a few concepts and notations so that we’re on the same page.

For each point c = x + y\mathrm{i} (where (x,y) are coordinates on the plane), we iterate the sequence z = z^2 + c, where z starts at some constant, typically 0, and we look at whether this sequence diverges to infinity or not. The points for which the sequence does not diverge to infinity (i.e., the sequence is bounded) are part of the Mandelbrot set.

Counting to infinity is, however, a bit long. The first thing we can do is – if we know early that the sequence goes to infinity, we can stop counting: the sequence diverges. We happen to know that, for z = z^2 + c, the sequence diverges as soon as its magnitude |z| > R (or equivalently, x^2 + y^2 > R^2, where x and y are now the real and imaginary parts of z). (Note that the minimum value for this to work is R = 2.) So we can iterate on every point and, as soon as its magnitude reaches 2, conclude that the point is not in the Mandelbrot set. This process gives another piece of information: the iteration at which this condition is met – which is an indication of how fast the sequence diverges.

That only solves a part of the problem – because by definition, the points for which the sequence is bounded will never stop the process in that model. So the second thing we do is to consider that if a sequence has not diverged after a given, fixed number of iterations, then probably it won’t.

Finally, for every possible number of iterations, we define a color (in a way that eventually makes the result pretty/interesting); we also define a color for points whose sequence does not diverge (and that are considered in the set); and we color each point according to that.
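In code, the heart of this “escape time” computation looks roughly like the following C++ sketch (not the actual code of my renderer, which does more around it, starting with looping over the pixels and mapping the result to a palette):

    #include <complex>

    // Return the iteration at which the sequence z <- z^2 + c escapes the radius
    // R = 2, or maxIterations if it has not escaped by then (in which case the
    // point is considered to be in the set).
    int escapeTime(std::complex<double> c, int maxIterations) {
        std::complex<double> z = 0.0;
        for (int i = 0; i < maxIterations; ++i) {
            z = z * z + c;
            if (std::norm(z) > 4.0) {  // std::norm is |z|^2, so this tests |z| > 2
                return i;
            }
        }
        return maxIterations;  // "probably bounded": treated as inside the set
    }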

Now the thing with this approach is that the iteration counts tend to gather in “bands” – as in the following picture.

Discrete coloring bands

Smooth coloring

The bands come from the fact that two neighboring pixels “tend to” have the same number of iterations, and hence the same color. If the range of possible values is smooth, and if two neighboring points end up not being associated to the exact same value, and if the palette maps them to different colors, then the coloring should be smooth – even with a fairly reduced palette.

The idea is that we want to represent whether a sequence “barely” reached the escape radius or whether it overshot it by so much that it could almost have escaped on the previous iteration – in other words, how far past the threshold it ended up. Ideally, we represent that by a function that goes from 0 to 1, we get a nice smooth range, and that yields a nice smooth coloring.

Now many sources go from this statement to “and now, like a rabbit out of the hat, take \log(\log(z)) and we’re done”. Some of them mumble something about “deriving it from the Douady-Hubbard potential”, but that’s not really satisfying either. I took a quick look at that, and I may revisit the topic eventually, but right now the maths go wayyyy over my head 😦

Still, the main question I had was “well, if we’re only interested in mapping ‘the space between the bands’, why not just drop a linear operator on that and be done?”. My reasoning was as follows: we know that at iteration i we’re above the escape radius R, and that we were not at the previous iteration, which means we can bound things. If z_{i-1} and z_i are the values of z just before and at the escape iteration i, we have, by definition of i, |z_i| > R and |z_{i-1}| \leq R. To bound z_i from above, we write |z_i| = |z_{i-1}^2 + c| (by definition of z), which is less than |z_{i-1}|^2 + |c| (by the triangle inequality), which is in turn less than R^2 + |c|.

Hence, our “escape margin” |z| - R is between 0 and R^2 + |c|, and we want to map that to something between 0 and 1, so let’s just divide those two things, define my fractional iteration count as \displaystyle i + 1 - \frac{|z| - R}{R^2 + |c|} and we’re done, right?

Hmmm. It DOES look smooth-er. It’s not… smooth. I have two guesses here. First: the triangle inequality is a bit too rough for my estimate. Second: it may be that I’m missing something somewhere, generally speaking. Maybe a bit of both.

Now that’s where the “magic” happens. I’ve read enough on the topic to have some inkling that there’s actual, proper math behind that makes it work, but I’m not able to understand, let alone explain it in simple terms 😉

The fractional iteration count that makes things work is \displaystyle i + 1 - \frac{\ln \ln |z|}{\ln 2}. The best intuition/simplified explanation I’ve seen for this comes from Renormalizing the Mandelbrot Escape, where they have the following quote:

The following simplified and very insightful explanation is provided by Earl L. Hinrichs:
“Ignoring the +C in the usual formula, the orbit point grows by Z := Z^2. So we have the progression Z, Z^2, Z^4, Z^8, Z^16, etc. Iteration counting amounts to assigning an integer to these values. Ignoring a multiplier, the log of this sequence is: 1, 2, 4, 8, 16. The log-log is 0, 1, 2, 3, 4 which matches the integer value from the iteration counting. So the appropriate generalization of the discrete iteration counting to a continuous function would be the double log.”

Smooth coloring

So, there. I’m going to leave the topic there for now – that was quite fun to dig into, but I do need more advanced math if I want to dig deeper. It might happen – who knows – but not now 🙂

New year Marzipan

Around a year ago, I started playing with a small fractal generator that I called Marzipan (because, well, Mandelbrot – basically marzipan, right). Okay, in all fairness, I re-used that name from a previous small fractal generator that I had coded in Javascript several years ago 🙂

It’s primarily (for now) a renderer of colored Mandelbrot sets. The Mandelbrot set is fairly well-known:

Mandelbrot set

This image is drawn on a piece of the 2D plane whose upper-left coordinate is (-2, -1) and lower-right coordinate is (1, 1). For each point (x, y) of this plane, we consider the complex number z = x + yi, we do some operations (called iterations) repeatedly on z, and we see what the result is. If z grows larger and larger (if the sequence of operations diverges), the point z is not in the Mandelbrot set, and we color it grey. If it does not, or if we give up before seeing that it diverges, the point z is in the Mandelbrot set, and we color it black. The longer we wait before deciding that a point is in the set, the more precise the boundary is (because we give a point that is going to diverge “more chances” to actually do so).

And it’s technically possible to zoom as much as you want on the border of the set and still get “interesting” results at any level of zoom – every little part of the border has an infinite amount of detail.

Now, one of the reasons why Mandelbrot renderings (and renderings of similar computations) are quite popular is that they’re colorful and pretty. In particular, a fair amount of my screen wallpapers come from Blatte’s Backgrounds, which contains images of this kind. The connection between my set above and the colorful images I linked is not a priori obvious.

Enter: coloring algorithms. The idea is to try to represent graphically how the points outside of the set diverge. The first algorithm I implemented was an escape time algorithm: if the point diverges after 5 iterations, color it in a given color; after 6, in another color; after 50, in yet another color; and so on. And the fastest way to generate a color palette is to just generate it randomly (and to assign one color to each possible number of iterations), which can yield… fairly ugly images 😉

Random coloring of the escape time
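Expressed in code, the random palette is very little work (a sketch with made-up names, sitting on top of an escape-time loop like the one above):

```cpp
#include <cstdint>
#include <random>
#include <vector>

struct Color { std::uint8_t r, g, b; };

// One random color per possible escape iteration; points that never escape stay black.
std::vector<Color> randomPalette(int maxIterations, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> channel(0, 255);
    std::vector<Color> palette(maxIterations + 1);
    for (int i = 0; i < maxIterations; ++i) {
        palette[i] = {static_cast<std::uint8_t>(channel(gen)),
                      static_cast<std::uint8_t>(channel(gen)),
                      static_cast<std::uint8_t>(channel(gen))};
    }
    palette[maxIterations] = {0, 0, 0};  // "in the set" points
    return palette;
}
```

Each pixel then simply gets palette[escapeTime(c, maxIterations)], with escapeTime as in the sketch above.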

A variation of that approach is to assign to each number of iterations a color that is “proportional” (for instance, more or less saturated) to the number of points that actually reach that number of iterations.

Histogram coloring of the escape time
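One common way to implement this kind of “proportional” coloring – which may or may not be exactly what Marzipan does – is to build a histogram of escape iterations and drive the color (its saturation, for instance) with the normalized cumulative count:

```cpp
#include <vector>

// For each possible escape iteration, compute the fraction (in [0, 1]) of escaped
// pixels that escape at or before that iteration; that fraction then drives the color.
std::vector<double> histogramWeights(const std::vector<int>& escapeIterations,
                                     int maxIterations) {
    std::vector<long long> histogram(maxIterations, 0);
    long long escapedTotal = 0;
    for (int it : escapeIterations) {
        if (it < maxIterations) {       // skip points that never escaped
            ++histogram[it];
            ++escapedTotal;
        }
    }
    std::vector<double> weights(maxIterations, 0.0);
    long long runningTotal = 0;
    for (int i = 0; i < maxIterations; ++i) {
        runningTotal += histogram[i];
        if (escapedTotal > 0) {
            weights[i] = static_cast<double>(runningTotal) / escapedTotal;
        }
    }
    return weights;
}
```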

Then the next idea is to go from “discrete coloring” to “continuous coloring” – right now, we have bands of colors that may have some aesthetic quality, but that may not work for what one has in mind in the end. To achieve that, we add a “continuous” component to our iteration computation (so that it’s not an integer anymore) and we map it to the color palette.

Continuous coloring of the escape time
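Here is a sketch of what “mapping a fractional iteration count to the palette” can look like – blending the two neighbouring palette entries (made-up names again, not the actual Marzipan code):

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct Color { std::uint8_t r, g, b; };

// Blend the two palette entries surrounding the fractional iteration count nu
// (for instance nu = 13.4 mixes 60% of color 13 with 40% of color 14).
Color continuousColor(double nu, const std::vector<Color>& palette) {
    double integerPart = 0.0;
    double t = std::modf(nu, &integerPart);
    std::size_t lo = static_cast<std::size_t>(integerPart) % palette.size();
    std::size_t hi = (lo + 1) % palette.size();
    auto mix = [t](std::uint8_t a, std::uint8_t b) {
        return static_cast<std::uint8_t>(a + t * (b - a));
    };
    return {mix(palette[lo].r, palette[hi].r),
            mix(palette[lo].g, palette[hi].g),
            mix(palette[lo].b, palette[hi].b)};
}
```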

The other coloring algorithm that I started exploring today is coloring by orbit traps. Instead of considering when the iteration escapes, we look at how it escapes, and we try to represent that. The first idea is to take an arbitrary, “interesting” point (from an aesthetic point of view, and mostly found by trial and error at this stage), and to look at how close the iterations of the escaping points come to this fixed point. The colored values are the distances to this point. (And the image at the beginning of this post is a (low) zoom from this one 🙂 ) Note: on this one I also tweaked the palette computation to get more contrast. That was fun 🙂

Coloring of the distance to an orbit point (-0.25 + 0.5i)
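In code terms, an orbit trap on a single point is the escape loop again, except that we keep track of the closest approach to the trap (a sketch, not the exact Marzipan code):

```cpp
#include <algorithm>
#include <complex>
#include <limits>

// Iterate as usual, but record how close the orbit ever gets to a fixed "trap"
// point; that minimum distance is what gets mapped to a color afterwards.
double orbitTrapDistance(std::complex<double> c, std::complex<double> trap,
                         int maxIterations, double escapeRadius = 2.0) {
    std::complex<double> z = 0.0;
    double minDistance = std::numeric_limits<double>::max();
    for (int i = 0; i < maxIterations; ++i) {
        z = z * z + c;
        minDistance = std::min(minDistance, std::abs(z - trap));
        if (std::abs(z) > escapeRadius) {
            break;                       // the point escapes; keep the distance so far
        }
    }
    return minDistance;
}
```

For the image above, the trap point would be something like std::complex<double>(-0.25, 0.5).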

Generally speaking, this project is also a nice visual sandbox for me to play around in – on top of practicing my C++, my goal is to generate pretty images, but that typically requires a fair amount of “quality of life” updates:

  • a very basic set of command-line options so that I could generate images without hard-coding all the values
  • quicker than I would have thought: a minimal Qt UI that allows me to zoom and increase/decrease the number of iterations – and right now I kind of feel the need to expand that UI so that it’s… actually usable (being able to change parameters in the UI, re-scaling the window to fit the ratio of the rendered image, that sort of thing)
  • yesterday, I sped up the rendering by… well, adding threads 😛 (via the QtConcurrent library).

Generally speaking, it’s a fun project – and it’s actually something I can go back to quite quickly (once I get over the shock of “urgh C++” – I actually DO like C++, but I did AoC in Go, and it’s a fairly different language) and implement a thing or two here or there, which is nice. For instance, a few months ago, I went “given my current code, can I add support for Julia sets within 10 minutes before going to bed?” and the answer was yes:

Julia set for c = -0.4+0.6i
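For the curious, the 10-minute change is essentially this: the pixel becomes the starting value of z, and c becomes a fixed constant (a sketch, not the actual diff):

```cpp
#include <complex>

// Same escape loop as for the Mandelbrot set, except z starts at the pixel value
// and c is a fixed constant (e.g. c = -0.4 + 0.6i for the image above).
int juliaEscapeTime(std::complex<double> z, std::complex<double> c,
                    int maxIterations, double escapeRadius = 2.0) {
    for (int i = 0; i < maxIterations; ++i) {
        z = z * z + c;
        if (std::abs(z) > escapeRadius) {
            return i;
        }
    }
    return maxIterations;
}
```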

New year, new 52frames

A while ago, I participated somewhat regularly in the 52frames challenge. 52 weeks, 52 themes, 52 pictures. I kind of liked it: the themes are typically “challenging, but open enough”, and I quite like the “extra credit”, which lets you add another constraint – or not. But eventually I ended up missing a week, two weeks, … many weeks, and it disappeared from the “things I do”.

The new year is obviously a “common” time to re-start that kind of challenge, and since I’m the least original person in the world, well, I’m restarting a run this week – I checked earlier that I still had an active ID, and I intend to submit a picture for this week’s challenge, Self-Portrait.

And since I decided I’d put more content on my blog, I’ll also post what I do here whenever the album is published (typically on Wednesdays, if I remember correctly) – in a way, I’m adding some additional accountability 😉

So – see you next week – in the meantime, I need to shoot a self-portrait 🙂 (Well, another one. The one on top of this post is a few months old already.)

#balisebooks from the end of the year

(This post is translated into French here: #balisebooks de fin d’année)

I’m so behind on my reporting that it’s not even funny. So, the plan: clear the backlog before the new year, and start 2019 on a reasonably clean slate. Let’s go!

Crazy Rich Asians / China Rich Girlfriend / Rich People Problems – Kevin Kwan – the story starts with Rachel, whose boyfriend invites her to meet his family in Singapore, without even hinting that his family (and the people who gravitate around it) is richer than rich – and not necessarily behaving in a “not richer than rich” way. Drama ensues, and continues for two more books. I liked it way (way) more than I thought I would – it does have a Downton Abbey meets Gossip Girl in Singapore kind of feel, it’s generally pretty funny, many characters are likeable (and you love to hate the ones you do), the sprinkling of Malay and Chinese expressions in the dialogs is pretty well done, and that series made me SO HUNGRY, there is SO MUCH FOOD!

Site Reliability Engineering – How Google Runs Production Systems – edited by Betsy Beyer, Chris Jones, Jennifer Petoff and Niall Richard Murphy – a very nice collection of essays around the SRE theme, on topics that range from “how to organize on-call in your team” to “how to handle consensus in a distributed system” via “what’s a cascading failure and how to deal with it”, with an interesting mix of “organisational” topics and “highly technical” topics.

Ivy and Abe – Elizabeth Enfield – a book where Ivy and Abe, as soulmates as people can be, meet for the first time at different points in their lives, which makes their common story vastly different depending on the timeline and circumstances of their meeting. I really liked the idea, and the characters, and the whole view that the moment at which people meet and what they’ve lived through so far is at least as important as who they are. I am however a bit sad that there’s so many timelines in which things don’t work out, and that some of these timelines kind of lack closure.

Altered Carbon / Broken Angels – Richard K. Morgan – the first two books in the Takeshi Kovacs series, happening in a universe where people can store their consciousness in “stacks” and be revived in new bodies, borrowed or grown. In the first book, Takeshi Kovacs is hired by a (very) rich guy to investigate his own murder (the rich guy’s, not Tak’s); in the second one, he’s gathering a team to go explore a seemingly lucrative alien archaeological find. I had read Altered Carbon a while ago and hadn’t been convinced; but I liked the TV series a lot, so I was wondering if I had missed something in the book. I did like it a lot more on the second read; moreover, the few things that had bothered me in the series were actually different in the book, which is quite funny. I did dislike the second book, though – I’m not sure if it was me or the book or the moment, but I got so, so bored :/

Bad Blood: Secrets and Lies in a Silicon Valley Startup – John Carreyrou – the story of Theranos, a startup that wanted to revolutionize the biological testing industry, and of its CEO, Elizabeth Holmes. Spoiler: it did not end well – the product never quite worked, and the whole thing went from embarrassing failure to something described as a full-blown scam. Fascinating story, great storytelling – a very interesting and entertaining read. A novel with that plot would seem barely believable… and yet 🙂 Highly recommended.

Romancing the Duke – Tessa Dare – I was very pleasantly surprised by this one. I’ll admit that I have a fair amount of prejudice towards the romance genre, but this prejudice is chipping away one book at a time 😉 Izzy Goodnight inherits a castle, which is a good thing for her, because other than that, she has basically nothing (except an ermine). Problem: said castle is currently inhabited by the Duke of Rothbury, who a/ is not aware the castle has been sold b/ as current owner, would actually have something to say about it. Stuff ensues. Including a bunch of cosplayers. (No, really.) And it’s funny, and it’s cute as hell, and it’s entertaining, and I just loved that thing.

When a Scot Ties the Knot – Tessa Dare – technically in the same series as the previous one, but with unrelated stories and characters. Madeline is shy to the point of social anxiety, so when the time comes for her to make her début in London, instead, she invents a fake Scottish fiancé who tragically dies after a bunch of letters. Until the day when said fake, invented fiancé arrives on her doorstep, with a bunch of letters addressed to him. I did like it less than the previous one, but it was still a damn entertaining read.

Harry Potter and the Philosopher’s Stone – J.K. Rowling – yup, I started re-reading Harry Potter. Still great 🙂

The Calculating Stars / The Fated Sky – Mary Robinette Kowal – a series whose premise is that, in 1952, a huge meteorite fell to Earth and made things very awkward climate-wise. It completely changes the timeline of space exploration (“we need to go to Mars, and we need to do that sooner rather than later”), and in that context we follow Elma, mathematician and Lady Astronaut.
On the one hand, there are two points I do have an issue with:

  • I’m not sure I’m buying the premise (of “moving the hell out of here” vs “finding a way to make things work on Earth” – because in any case the environment on the Moon or on Mars is not going to be much better, is it?)
  • I’m not often bothered by sex scenes, but I was in the first book (the second one is better in that regard). They feel kind of awkward, too numerous, and either too long or too short (but then that would probably be marketed differently 😉 ).

Buuuuuuuuuuut. First, it was VERY, VERY hard to put down, and that’s a major factor. Second, it made me audibly chuckle AND drop a few tears here and there, and I’m a sucker for emotional reactions. Third – the anxiety depiction is so fucking spot on I can’t even, and I couldn’t help rooting for Elma – more than I would for myself 😉 – so it’s kind of therapeutic, in a way. All in all: definitely a series for which I’m looking forward to the third book.

I’d Rather Be Reading: The Delights and Dilemmas of the Reading Life – Anne Bogel – That was very neat, in a meta sort of way. What do we read, why do we read, how do we read? Nobody is unique in their reading (or non-reading) habits, which makes this small book very relatable – and funny. And I even snagged a few titles that I’ll have to put on my “to read” list 🙂 Also, it made me discover Anne Bogel’s blog, Modern Mrs Darcy, which I quite like 🙂 (And which eventually made me start journaling, sooo.)

The Great Gatsby – F. Scott Fitzgerald – this one counts as a classic, and there’s been a movie recently (which I haven’t seen) that made me want to read it. And honestly? I don’t know. I did like it, but I have no idea why. Probably mostly because of the mood and the writing (which are not necessarily what catch my attention usually; they’re more “nice to have”s, as far as I’m concerned). I don’t know.

The Technological Singularity – Murray Shanahan – “the singularity” is a term that any science-fiction fan and/or computer scientist will have heard. I will confess that its definition and implications weren’t that clear to me before starting this book. Shanahan does a very good job of defining it, of considering how artificial general intelligence could possibly be achieved, how it could lead to the singularity, and what the impact could be, covering both technical and philosophical questions at a very accessible and pretty engaging level. A thoroughly interesting read – although it definitely adds to the general sense of World Anxiety instead of alleviating it 😉

The Consuming Fire – John Scalzi – a great sequel to The Collapsing Empire. Still very entertaining characters (same ones, so if you didn’t like them in the first book, don’t expect to like them more here), a fair amount of smartassness and kickassery, cloak&dagger&treason, and IS THE THIRD BOOK AVAILABLE ALREADY? 😛

Shades of Milk and Honey – Mary Robinette Kowal – apparently, Pride and Prejudice with magic. I haven’t read Pride and Prejudice (yet, it’s on my list for next year 😉 ), but I still enjoyed this one a lot (I’m, at the time of writing this, reading the second book in the series, and it’s even better). It’s an historical romance where the characters are able to manipulate “glamour”, basically magical visual illusions. That was a very pleasant read.

Happier – Tal Ben-Shahar – an intro book about positive psychology. Nothing mind-blowing, but ties a few things together neatly. Pretty good, all considered.

A Semi-Definitive List of Worst Nightmares – Krystal Sutherland – Esther is convinced that her family is cursed, and that every member of her family has One Great Fear that will eventually kill them. Esther has escaped it so far, by keeping a list of “possible fears”, and carefully avoiding getting exposed to all of them – until her friend Jonah challenges her to tackle these fears, one at a time. Funny and heart-breaking and great and generally wow.

Wool – Hugh Howey – stories from the Silo, where a small community of people live, sheltered from a very dangerous Outside, to which occasionally someone gets sent (and dies quite quickly). Really loved the beginning, was less convinced by the “late middle”. Still, a very good read, and I’m looking forward to the other installments of the series.

Wild Hunger – Chloe Neill – the first book of Chicagoland, The Next Generation, following Elisa, who is related to the vampires from the first series and comes back from Paris to Chicago after her training. Scratched the UF itch, but I got slightly bored – and rolled my eyes more than usual, at least in the beginning. The ending was somewhat better.

An Absolutely Remarkable Thing – Hank Green – Gigantic Transformer statues appear all around the world, and April May is the first one to document their appearance on YouTube, and becomes Internet-famous because of it while the whole story about the Carls (after the name April gave “hers” on a whim) and their mysteries unfolds. A very entertaining read with a quite believable protagonist and an interesting depiction of “social network fame”.

The Kiss Quotient – Helen Hoang – Stella, a brilliant 30-something econometrician, is still single, at least partly because of her Asperger’s. She decides to hire an escort to teach her about sex and relationships. Basically a gender-swapped Pretty Woman; nothing very surprising, but very cute.

The Guernsey Literary and Potato Peel Pie Society – Mary Ann Shaffer and Annie Barrows – an epistolary novel set in 1946 where Juliet, a writer, starts corresponding with a man from Guernsey – who is part of the Guernsey Literary and Potato Peel Pie Society. Juliet is intrigued and ends up visiting her new-found friend. This was at times fun and lighthearted, at times poignant and moving. Really (really) liked it 🙂

Altered Traits – Daniel Goleman and Richard Davidson – a short book that aims at distinguishing scientifically validated facts from hypotheses that may not be (not necessarily wrong, but “more research required”) when it comes to meditation and the brain, particularly for long-term practitioners. It’s a very interesting summary of the research around the effects of meditation and its history, but it sometimes feels a bit messy/meandering.

And if you were to read only one of these… The Calculating Stars.

Advent of Code 2018

In the past few years, December has been for me “the month of Advent of Code”. Advent of Code is an advent calendar of puzzles that can mostly be solved by programming: you get a problem description and a user-specific input (as far as I know, there’s a set of a “few” pre-validated/pre-tested inputs), and you have to compute a number or a short string, and provide that as a result. If you have the correct result, you unlock a star and the second part of the puzzle, which allows you to unlock a second star. Over the first 25 days of December, you consequently get to collect 50 stars.

I like the puzzles, I like the fluff around it (typically a more-or-less-sensical story around the Christmas theme), I like the duration and effort involved, I like that I learn something every year – all in all, I’m a fan 🙂 There’s a “global leaderboard” that tracks all the users who are in the first 100 to solve each puzzle; although I’m in a “favorable” timezone for the puzzles (they come up at 6am for me), I have no hope of ever scoring points on the leaderboard; my claim to fame this year is that I was in the first 1000 players for 11 stars out of 50 🙂

This year, as last year, I solved the whole thing in Go; my Go is still quite bad, probably not-at-all Go-like, and I tend to avoid spending too much time second-guessing and refactoring my code in that kind of context (it’s good training against analysis paralysis 😉 ); the result is available on my GitHub if you’re not afraid of bad code and spoilers 🙂 (Oh, and also: truly horrifying code organization.)

After this general introduction, I want to talk a bit about my experience of the individual problems, so beware: spoilers ahead!
