What I learned this week: 17/2015

Statistics, Probability, Machine Learning, Data Science

1. Correlation coefficients beyond Pearson/Spearman/Kendall

  • I keep switching from Pearson to Spearman during exploratory data analysis (it is more robust to outliers and copes a bit better with non-linearities). This week I decided to look around for further options, and who knew, there is indeed work being done in the area. In particular, the Maximal Information Coefficient seems very promising (although it is computationally intensive and does have some problematic properties). However, I’m just looking for something to help me quickly orient myself in data sets with many predictors, and this looks up to the task for non-linearities. Will definitely try it next time. Good overview here (pdf).
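To make the Pearson/Spearman difference concrete, here is a minimal sketch in plain Python (the helper names `pearson`, `ranks` and `spearman` are mine, and the data is a made-up toy example, not anything from a real project). It shows a perfectly monotone but strongly non-linear relationship, where the rank-based coefficient sees a perfect association and the linear one does not.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data: y grows exponentially in x (monotone, far from linear).
x = list(range(1, 11))
y = [math.exp(v) for v in x]

print("Pearson :", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

On this toy data Spearman is 1 (up to rounding), while Pearson lands noticeably lower because the largest values dominate the covariance. This says nothing about MIC itself, which needs a dedicated implementation.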

2. Generalized additive models

  • Played a little bit with GAMs this week. On the one use case I tried, performance didn’t improve over my other models. Plus they are too complicated to implement on a SQL server… for the moment I’ll keep them on the back burner. Good practical review here (pdf).

3. Thinking about prediction intervals and metrics

  • Lots of things here. I’ll need to digest it a bit more.

General Science / Misc.

1. CRISPR/Cas9 edited human embryos

The big news of this week. When CRISPR/Cas9 first came out some time ago, some theorized that the Chinese would go forward with applications to human germ lines. Turns out they did, but it’s less alarming than it might sound. The results are very cautionary, but still very promising. The West cannot win anything by blanket-banning research in this area.

2. Shift in the String wars

3. Where Are The Big Ideas in Neuroscience?

4. The Wolf of Wall Tweet

  • Algorithmic trading based on news items is not new, but apparently somebody made a killing in recent weeks on nigh-expired stock options, trading within 1s of the newswire publication. Read all about it in a badly written article with an annoying “I have a friend…” structure (and the friend is annoying too), which despite its title actually has nothing to do with Twitter.


Smil, V. 2015. The visionary energy solution: Triple windows. IEEE Spectrum, March 2015: 28

  • Yeah, short ditty, but I like Smil.

Generalized Additive Models

Comparing Measures of Dependency

  • I accidentally stumbled on Michael Clark’s page twice this week, on two different topics. Very nice, practical reviews.

Videos / Lectures

Model Thinking @ Coursera

  • Gruellingly long, but finally over. The pros: definitely worth knowing many of the models. Cons: too long. Large chunks of the videos are spent working through simple arithmetic. If you can, go at your own pace, skipping a lot of content. For me, being formally signed up helps me finish courses, so I had to grin and bear it.

Robin Hanson – Attitudes to the Future

  • Not much new, but I always enjoy Robin Hanson. The part on the hype cycle around artificial general intelligence was particularly interesting. As somebody who was interested in AGI before it was cool, I sympathize (and this is half a joke, since the field predates me by a few decades 🙂 )


1. What is Transhumanism – Review the Future
2. Duggan on Strategic Intuition – Econtalk
3. Sunstein on Infotopia – Econtalk
4. Moynihan: What I’ve learned losing a million dollars – Tim Ferriss Podcast

Books: Non-Fiction

Books: Fiction


Suspended in the immense vastness of space and time

Watching us at 4K resolution at ~1000 frames/s, we realize how incredibly lonely we are.

Any signal takes millennia of milliseconds, whole aeons of femtoseconds to reach us. Any communication is impossible through the chasm of time.

The nearest human being can be thousands of attoparsecs away. How could a warm touch be possible through the deep abyss of space?

We are hopelessly suspended in the immense vastness of space and time.

We are hopeless vastness of space and time.

No man is an island. That doesn’t even begin to describe it.

What I learned this week: 16/2015

Statistics, Probability, Machine Learning, Data Science

1. Multiple comparisons

Finally had a look into multiple comparisons beyond Bonferroni’s correction. Haven’t yet got around to reading Gelman’s Why We (Usually) Don’t Have to Worry About Multiple Comparisons (pdf).
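As a rough illustration of what “beyond Bonferroni” can look like, here is a minimal pure-Python sketch comparing Bonferroni with two standard alternatives, the Holm step-down procedure and the Benjamini-Hochberg step-up procedure. Picking these two is my assumption about which corrections are of interest; the function names and the p-values are made up for the example.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject H_i if p_i <= alpha / m (controls the family-wise error rate)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def holm(pvalues, alpha=0.05):
    """Holm step-down: compare sorted p-values to alpha/m, alpha/(m-1), ...
    and stop at the first failure (also controls the family-wise error rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for step, idx in enumerate(order):
        if pvalues[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break
    return reject


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up: find the largest k with p_(k) <= k/m * alpha
    and reject the k smallest p-values (controls the false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for k, idx in enumerate(order, start=1):
        if pvalues[idx] <= k / m * alpha:
            cutoff = k
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


# Hypothetical p-values from eight tests.
pvals = [0.001, 0.007, 0.018, 0.041, 0.042, 0.06, 0.074, 0.205]
for name, method in [("Bonferroni", bonferroni), ("Holm", holm),
                     ("Benjamini-Hochberg", benjamini_hochberg)]:
    print(name, sum(method(pvals)), "rejections")
```

On these made-up numbers the three procedures reject 1, 2 and 3 hypotheses respectively. Note that Benjamini-Hochberg controls a different, weaker error criterion than the other two, so the extra rejections are not a free lunch.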

2. Robust regression, Quantile Regression

Do you know Warren Buffett’s adage “You get what you incentivize for”? Well, in machine learning:

You get what you optimize for.

After weeks of arguing for MAE instead of RMSE for model evaluation in a project, I finally had to eat my own dog food: not only evaluate a model on MAE, but actually optimize for it. This opened a new world for me, with robust versions of regression (Huber loss functions) and quantile regression.

There is a ton to learn here, looking forward to it.
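A toy sketch of “you get what you optimize for”, in plain Python with made-up data: if the model is just a single constant, minimizing squared error gives you the mean, minimizing absolute error gives you the median, and minimizing the pinball (quantile) loss gives you a quantile. The function names and the brute-force grid search are mine, purely for illustration; a real project would use a proper optimizer or library.

```python
def mean_squared_error(c, ys):
    return sum((y - c) ** 2 for y in ys) / len(ys)


def mean_absolute_error(c, ys):
    return sum(abs(y - c) for y in ys) / len(ys)


def pinball_loss(c, ys, q):
    """Quantile loss: under-prediction costs q, over-prediction costs 1 - q."""
    return sum(q * (y - c) if y >= c else (1 - q) * (c - y) for y in ys) / len(ys)


def huber(residual, delta=1.0):
    """Huber loss: quadratic near zero, linear in the tails."""
    r = abs(residual)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)


def best_constant(loss, ys, candidates):
    """Brute-force search for the constant prediction with the lowest loss."""
    return min(candidates, key=lambda c: loss(c, ys))


ys = [1, 2, 3, 4, 100]        # toy data with one large outlier
candidates = range(0, 101)    # integer grid of constant predictions

print("MSE-optimal constant:", best_constant(mean_squared_error, ys, candidates))
print("MAE-optimal constant:", best_constant(mean_absolute_error, ys, candidates))
print("Pinball (q=0.7) optimum:",
      best_constant(lambda c, data: pinball_loss(c, data, 0.7), ys, candidates))
print("Huber loss of residual 0.5 vs 10:", huber(0.5), huber(10))
```

The squared-error optimum gets dragged towards the outlier, while the absolute-error optimum stays with the bulk of the data. The Huber loss sits in between: it behaves like squared error for small residuals and like absolute error for large ones.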

3. Sorting through my thoughts on Knightian uncertainty

Here. Still some work left to do on extensions of expectation maximization.

General Stuff

Kerfuffle around Growth mindset

The Shortest-Known Paper Published in a Serious Math Journal

Anti-market opportunities

  • Good formulation – something to live by, if I ever return to academia.


  • Oh, the sweet, sweet naivety. I’d enjoy working like this, but one should read up on suboptimisation.

The button

Productivity, Life Advice

Summary of Ikigai by Sebastian Marshall 1

Summary of Ikigai by Sebastian Marshall 2

  • Not sure if I’ll read the book, but the summaries have good formulations of some well known thoughts.

What You’ll Wish You’d Known by Paul Graham 

  • Indeed.

How to Stick With Good Habits Every Day by Using the “Paper Clip Strategy”

  • Habit building strategies are a dime a dozen, but this one I haven’t heard before. It sounds cute and I can absolutely imagine it works for ugh-mine-fields.


Evaluating Trading Strategies

  • Pretty good overview of the multiple comparison corrections

Videos / Lectures

Randy Pausch Last Lecture: Achieving Your Childhood Dreams

  • No, I just got some mote in my eye.

Conversations with Tyler: Peter Thiel on the Future of Innovation

  • Cowen did a good job with the questions and steering the conversation. Highly recommended.

Books: Non-Fiction

The Undercover Economist Strikes Back: How to Run–or Ruin–an Economy

Books: Fiction

Schild’s Ladder: A Novel

The uncertainty around Knightian uncertainty

Definitions are due

Knightian uncertainty is the proposition that an agent can have a completely unknowable and incalculable uncertainty about an event. This type of uncertainty goes far beyond the colloquial meaning of “uncertainty”, i.e. an event with subjective probability 0<p<1, by refusing to ascribe any probability distribution to a given proposition.

While the little devil of common sense sitting on your shoulder might wisely nod in approval, the bayesian angel on the other shoulder screams: “Impossible!”. A proper bayesian agent is infinitely opinionated and can serve you a probability distribution for any proposition. Anything short of that leads to an exploitable flaw in your decision theory.

So are there fundamentally unknowable events, or is this just sloppy thinking? Are non-bayesian decision theories leaving money on the table, or are bayesians setting themselves up for ruin via a black swan?

Knightian uncertainty in humans

Let’s start with something uncontroversial: humans, even at their best, are only very weak approximations to a bayesian reasoner, so it might not surprise us that they could legitimately exhibit fundamental uncertainty. A good summary, as usual, can be found in Yudkowsky’s When (not) To Use Probabilities – humans are inherently bad at reasoning with probabilities and thus open to Dutch book exploits due to inconsistencies. While some see this as a failure, others say a prudent thinker can rightfully be stubborn and refuse to stick his neck out.
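For readers who haven’t met a Dutch book before, here is a minimal sketch of the simplest case, under assumptions that are mine rather than Yudkowsky’s: an agent quotes prices for a $1 ticket on A and a $1 ticket on not-A, and is willing to trade either way at those prices. If the two prices don’t sum to 1, a bookie can lock in a profit whatever happens. The function name is hypothetical, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def guaranteed_loss(price_a, price_not_a):
    """Sure loss of an agent who trades $1 tickets on A and on not-A
    at the stated prices. Exactly one of the two tickets pays out $1.

    If the prices sum to more than 1, the bookie sells both tickets to the
    agent; if they sum to less than 1, the bookie buys both from the agent.
    Either way the agent loses |price_a + price_not_a - 1| for certain.
    """
    return abs(price_a + price_not_a - 1)


# Incoherent beliefs: 60% on A and 60% on not-A.
print(guaranteed_loss(Fraction(3, 5), Fraction(3, 5)))   # sure loss per pair of tickets

# Coherent beliefs: the prices sum to 1, so no sure loss is available.
print(guaranteed_loss(Fraction(3, 5), Fraction(2, 5)))
```

Refusing to quote a price at all, which is what claiming Knightian uncertainty amounts to here, is one crude way to stay out of this particular game.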

As a side note, we don’t have to require a bounded reasoner to literally have a distribution for every event. But shouldn’t he/she be able to compute one when pushed hard enough?

For humans, claiming Knightian uncertainty can be a crude but useful heuristic for avoiding games in which we might be easy to exploit. Does the concept generalize beyond the quirks of human psychology?

The luxury of a posterior

The role of a decision theory of an optimizing agent is to help him maximize his utility function. The utility at any given time also depends on the environment, so under certain conditions it can be beneficial to tailor the decision theory of the agent to the specifics of a given environment.

And some environments might be more hostile to cognition than others. Evolutionary game theory simulations often have bayesian reasoners getting beaten by simpler agents that dedicate resources to aggressive expansion instead of careful deliberation (I’m quite sure I have this from Artem Kaznatcheev, but for the life of me can’t find the link). A similar situation also occurs in iterated prisoner’s dilemma tournaments.

While these simulations are somewhat artificial, we might come close to such harsh-for-cognition situations in, e.g., high frequency trading, where constructing careful posteriors might be a luxury and a less sophisticated but faster algorithm might win out. As an example, we have a quote (unsourced) from Noah Smith:

Actually, there are deep mathematical (information-theoretical) reasons to suspect that lots of HFT opportunities can only be exploited by those who are willing to remain forever ignorant about the reason those opportunities exist.

Interestingly, a sort of “race-to-the-cognitive-bottom” might play out in a multipolar artificial intelligence take-off. While a singleton artificial intelligence might nearly optimally allocate part of its resources to improving its decision theory, in a multipolar scenario (fragile as it might be), the winning strategy can be slimming down the cognitive modules to the bare minimum necessary to beat the competition. A biological mirror image of such a scenario is the breakdown of the Spiegelman Monster discovered by Eigen and Oehlenschlager.

Apart from these concerns, another motivation for Knightian uncertainty in algorithmic trading can be a split between internal and actionable probabilities in some market making algorithms, as a protection from adverse selection (more on this here).

In summary, not constructing a posterior for a proposition could be a reasonable strategy for a much wider class of reasoners than quirky humans, especially in scenarios bounded by resources or computation time. After all, there are no free lunches, including for bayesians.

While all this sounds reasonable, it still leaves me unclear about a general framework for selecting decision theories when a bayesian approach is too expensive.

Substrate level Knightian uncertainty

There is still one more possible step – moving the uncertainty out of the cranium of agents into the wild world, into physical reality itself. Scott Aaronson’s fascinating paper The Ghost in the Quantum Turing Machine is built on the thesis of “Knightian freedom”, an in-principle physical unpredictability that goes beyond probabilistic unpredictability and is inherent to the quantum nature of physics. As a poor bounded cogitor, I’ll proclaim here my own Knightian uncertainty and refuse to fabricate opinions on this thesis [1].

[1] Ok, I found the paper very interesting, but I don’t agree with most of it. Nonetheless, I don’t feel anywhere near knowledgeable enough to go into a much deeper critique.