These statistics plots are very cute (p<0.05)

My god, this was a long day… so I was happy to see this plot in my feeds (hat tip to Dave Giles):

The author is the great William S. Gosset (also known as “Student”), a true inspiration for us statisticians and data scientists, because he continued making important scientific discoveries after he left academia for industry.

In fact, his brilliant mind manifested also in his choice of industry: he became a proto-data scientist for the Guinness Brewery!

No. Actually I was thinking about how to tell whether the means of these two distributions are significantly different.

The platypus and kangaroo plot reminded me of Antoine de Saint-Exupéry’s drawing of a boa constrictor eating an elephant. It turns out that the intersection of Exupéry and statistics is a non-empty set:

Exupéry illustrates the collapse of retirement funding in an aging population.

And finally – puppies! The cutest textbook cover award goes of course to the excellent Doing Bayesian Data Analysis by John Kruschke.

Puppies! No matter how cute the frequentists are, the Bayesians are winning the cuteness competition too.

Startup culture as an ideology

Disclaimer: 

The article only argues about the “mechanics” of the startup culture. There is no value judgement or any implied relation to the political ideologies mentioned or alluded to below. I think this should be obvious, but you never know…  

I.

Most participants, twenty-somethings coming from all corners of Europe, never experienced the old era themselves. Political history wasn’t a big conversation topic. Yet the shadow of the complicated ideological heritage could be felt here and there in the background of the startup scrum in Poland. The very first presentation of the meeting even mentioned Lech Wałęsa, though I assume this was an unknown name to most.


For my generation and the younger ones, which didn’t experience the previous regimes first hand, they remain a weird enigma. They are present in the old buildings of our cities, in jokes, stories, the occasional TV documentary or old movies.

There is also a sense of bewilderment. How could reasonable people believe in, and even more importantly, act on mere ideas? Can we nowadays, in the safe and developed West, understand the allure of ideologies on the gut (alief) level?

II.

It would be naive to think that we are no longer submerged in ideologies on a daily basis. However, most of them feel more subdued, more passive. The most pervasive (and least noticeable) shapes our consumer behaviors and everything that enables them. This makes the broad “consumerist capitalism” memeplex particularly dangerous, but also not sufficiently visceral for a demonstration on the alief level.

Political ideology in the Western world is similarly largely in stealth mode. Most political action nowadays consists of reposting image macros on Facebook or the occasional flamewar. Not really the blood and sweat of bygone eras.

Religion? Not for most of the young generation of developed countries.


III.

That is what is striking about the “startup culture”. It captures the hearts and minds of the youth, gives them dreams and hopes, gives them a struggle and an enemy. There are revered thinkers, books that everybody should read and most can quote. Visionary pitches, lectures and essays. Memes and symbolism.

Details can be discussed and there are variations here and there, but the core message is usually intact.

And it doesn’t end with ideas.

It motivates vigorous action, endless hours of hard work, enduring months and years of hardship. All in the name of the greater good – creating value for the customer.

The end is noble and the means are obviously correct.


IV.

You are a twenty-something and wonder how reasonable people could have been in the throes of ideologies like communism and national socialism.

Now feel the rush of adrenaline, the excitement, the call for action, when you think about the world of startups. Feel the future you can build, the good you can create, the reward you can reap.

This is what an ideology feels like.


We have stumbled into the era of machine psychology

Emergent science in an emergent world

When describing complex emergent systems, science has to switch from lower to higher level descriptions. Here is a typical example of such transitions:

  1. We go from physics to chemistry when we encounter complex arrangements of large numbers of atoms/molecules.
  2. Complex chemistry in living systems is then described in terms of biology.
  3. Complex (neuro-)biology in human brains finally gives rise to the field of psychology.

Of course, the world is not discrete and the transitions between the fields are fuzzy (think about the chemistry-biology shoreline of bio-macromolecules and cytology). And yes, the (mostly infertile) philosophical wars on the ontology of emergence are still being waged. Yet, nobody would deny the epistemic usefulness of higher level descriptions. Every transition to higher order description brings its own ‘language’ describing the object as well as a suite of research methods.

In this game, it is however very easy to miss the forest (high-level) for the trees (low-level). One interesting example I’ve noticed recently is in the field of machine learning. When studying deep neural networks (DNNs), we have already unknowingly stumbled into such a transition. Historically, most of the research has been done on the “biology” of the DNNs – the architecture of the networks, activation functions, training algorithms etc. (and yes, saying “biology” is bio-chauvinistic on my part. We should find a better word!)

Recently, however, we are tapping more and more into the “psychology” of neural networks.

Machine psychology

The deep architectures now in use aren’t reaching anywhere near the complexity of human brains, yet. However, with connections in the billions (here is an early example), they are too complex, too opaque, for a low-level description to be sufficient for their understanding. This has led to a steady influx of research strategies that shift the approach from a bottom-up understanding of “machine biology” to a more top-down, “input-output” strategy typical of psychology.

Of course, neural networks are commonly, though not quite deservedly, described as “black boxes”. And historically, parts of psychology had their flirtations with cybernetics. But it is only recently that we see a curious methodological convergence between the two fields, as machine learning is starting to adopt the methods of psychology.

The interesting distinction between machine and human psychology is that we have direct access to the “brain” states of the network (the inputs and activation of each neuron). With machine psychology, we are now shifting attention to their “mental” states, something that is accessible only with higher-order, indirect methods.
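The distinction can be made concrete with a toy model: in a small artificial network, every “brain state” is directly readable out. A minimal NumPy sketch (the layer sizes and random weights are purely illustrative, not any real trained model):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

# A tiny two-layer network with fixed random weights (purely illustrative).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # input (4) -> hidden (3)
W2 = rng.normal(size=(3, 2))   # hidden (3) -> output (2)

def forward(x):
    """Return the output and every intermediate 'brain state'."""
    h = relu(x @ W1)           # hidden activations: directly observable
    y = h @ W2
    return y, {"input": x, "hidden": h, "output": y}

x = np.array([1.0, -0.5, 0.3, 2.0])
y, brain_states = forward(x)
# Unlike in a biological brain, nothing here is hidden from the experimenter.
```

The “mental” states, by contrast, are whatever the network represents across millions of such activations at once, and those we can only probe indirectly.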

Psychology of machine perception

A first example of the convergence comes from the psychology of perception.

Deep neural networks have revolutionized the field of computer vision by crushing competing approaches in all benchmarks (see e.g. last year’s ImageNet competition). Yet a deeper intuition for how the DNNs are actually solving image classification requires techniques similar to those used in the psychology of perception.

As an example: an “input-output” strategy recently yielded an attack on neural network image classification, developed by Szegedy et al. 2013. In this work, they took correctly classified images and modified them imperceptibly, so that the trained network got completely confused (see Fig 1a. below). While on the surface such a confusion seems alarming, one should just remind oneself of the many quirks of the human visual cortex (Fig 1b.).


Fig 1a: Example from Szegedy et al. 2013: the image on the left is correctly classified by a neural net as a school bus. The imperceptibly modified image on the right is, however, classified as an ostrich. The middle panel shows the pixel difference of the two images, magnified 10x.


Fig 1b: Your visual cortex classifies the colors of fields A and B as distinct. They are the same.
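Szegedy et al. found their perturbations with box-constrained optimization; a later, simpler variant conveys the same idea: move every pixel a tiny step in the direction that increases the model’s loss. A toy NumPy sketch against a linear “classifier” (the weights and the “image” are random stand-ins, not a real model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
w = rng.normal(size=100)      # weights of a pretend trained classifier
x = rng.normal(size=100)      # a pretend "image", flattened to a vector

p_before = sigmoid(w @ x)     # confidence that x belongs to class 1

# Perturb each pixel by at most eps, in the direction that pushes the
# prediction toward the opposite class (gradient of the logit is just w).
eps = 0.1
direction = -1.0 if p_before > 0.5 else 1.0
x_adv = x + direction * eps * np.sign(w)

p_after = sigmoid(w @ x_adv)
# The per-pixel change is tiny, yet the prediction moves toward the
# opposite class.
```

With a real DNN the gradient comes from backpropagation instead of a weight vector, but the logic of the attack is the same.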

Nguyen et al. 2014 then turned this game around and used genetic algorithms to purposely evolve abstract images that well-trained neural networks confound with real objects. Again, examples for a DNN and the human visual cortex are below (Fig. 2a and 2b).


Fig 2a: Image evolved so that a neural network misclassifies it as a guitar.


Fig 2b: An image of a lamp misclassified by your dirty, dirty mind.
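The paper’s setup is far more sophisticated, but the core evolutionary loop fits in a few lines: keep the “images” the classifier likes best, copy them, mutate the copies, repeat. A bare-bones sketch against a toy linear “classifier” (everything here is a made-up stand-in):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=50)        # weights of a pretend trained classifier

def confidence(x):
    """The model's confidence that 'image' x is, say, a guitar."""
    return 1.0 / (1.0 + np.exp(-(w @ x)))

# Evolve a population of random 'images' toward high classifier confidence.
population = rng.normal(size=(20, 50))
for generation in range(100):
    scores = np.array([confidence(x) for x in population])
    parents = population[np.argsort(scores)[-5:]]            # top 5 survive
    children = np.repeat(parents, 4, axis=0)
    children += rng.normal(scale=0.1, size=children.shape)   # mutation
    population = children

best = max(population, key=confidence)
# 'best' fools the toy model with near-certain confidence, despite never
# having been anything like a real guitar.
```

No gradient is needed at all, which is what makes this style of attack applicable even to models one can only query from the outside.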

Gestalt psychology for hierarchical feature extraction?

These confounding attacks on classifiers are very important, since deep neural nets are increasingly being employed in the real world. A better understanding of machine perception is required to make the algorithms more robust and to avoid fraud (some examples here).

The reason image classification works so well with deep architectures is their ability to automatically extract hierarchies of features from images. Making them more robust to attacks requires better integration of these hierarchies into “global wholes”, well summarized by Kurt Koffka’s gestalt-psychology mantra, “The whole is other than the sum of the parts” (not “The whole is greater than the sum of its parts”).

Psychometrics of neural networks

The cross-fertilization of machine learning by psychology doesn’t end with perception theory.

The measurement of psychological traits is the bread and butter of psychometrics, and its crown jewel is of course intelligence testing. This is even more salient for the field of artificial intelligence. In an early example, Wang et al. 2015 recently made headlines (e.g. here) by claiming to beat the average Amazon Mechanical Turk performance on a verbal IQ test.

Oddly enough, I haven’t yet found a reference using deep nets on Raven’s progressive matrices. This seems like a very obvious application for deep networks, as Raven’s matrices are small, high-contrast images, and a successful solution requires extraction of multi-level hierarchies of features. I expect that DNNs should very soon blow humans out of the water on this test.

Raven’s matrices are the go-to test for human intelligence, with a g-loading around 0.8 and virtually no cultural bias. Such an experiment would likely show the nets achieving IQ 200+, in a very vivid illustration of the relationship between proxies for g and the actual “general intelligence” – the holy grail of artificial general intelligence (AGI) research.

Here then is a nice summer project: put together a DNN for solving Raven’s matrices. I even recall a paper on machine generation of test examples, so getting enough training data will not be a problem!

Deep nets and Raven’s progressive matrices are made for each other.
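As a warm-up for that project, the training data itself can be machine-generated. A toy generator for Raven-like items under the simplest rule family I could think of (the third panel in each row is the XOR of the first two); the hand-written solver below cheats by being told the rule, which is exactly what a DNN would have to discover from pixels alone:

```python
import numpy as np

def make_raven_item(rng, size=6):
    """A toy 3x3 Raven-like matrix of binary panels.

    Rule: in each row, the third panel is the XOR of the first two.
    The bottom-right panel is masked out as the 'question'.
    """
    grid = np.empty((3, 3, size, size), dtype=int)
    for row in range(3):
        a = rng.integers(0, 2, size=(size, size))
        b = rng.integers(0, 2, size=(size, size))
        grid[row] = [a, b, a ^ b]
    answer = grid[2, 2].copy()
    grid[2, 2] = -1                 # hide the panel to be solved
    return grid, answer

def solve_by_rule(grid):
    """Cheating solver that is told the XOR rule in advance."""
    return grid[2, 0] ^ grid[2, 1]

rng = np.random.default_rng(7)
grid, answer = make_raven_item(rng)
```

Real Raven items combine several rule families (progression, XOR/AND/OR, counting), but each is just as mechanically generable, so training data really is unlimited.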

Machine psychotherapy, creativity and aesthetics

On a joking note – if there is machine psychology, could there also be machine psychotherapy? How could a venerable Freudian help his DNN clients?

There are some very playful examples done with generative models (based on recurrent deep networks), e.g. text generation à la Shakespeare/Graham/Wikipedia. A machine therapist will definitely be able to use the good old tools of word-association games and automatic writing to diagnose whatever will be the digital equivalent of the Oedipus complex in his machine patients.


Did you dream about electric sheep again, Mr. Android?

Even the good old cliché of dream interpretation can be brought out of retirement.
Geoffrey Hinton spoke about machine dreams a long time ago. And the psychologists are already picking up on this:

One of the areas that I’ve been looking at recently is machine dreaming, the question whether AI systems are already dreaming. There’s little question that they meet our criteria for what a dream is, they meet all our definitional criteria. There’s better evidence really that machines, AI systems, are dreaming, than there is that animals are dreaming that are not human.

— Associate Professor of Psychology, James Pagel on the “All in the Mind” podcast.

The excellent paper by Google researchers, Inceptionism: Going Deeper into Neural Networks, shows beautiful demonstrations of DNN fantasies, dreams and pareidolia. The psychology of the digital psychedelic experience is close too.


What deep neural nets actually dream about.

This section is of course tongue in cheek, but its aim is to illustrate that, already now, state-of-the-art DNNs can achieve very rich “mental” states.

Sidenote: speaking of machine therapy, the other way around, i.e. machines being therapists to humans, is a promising field of research. Indeed, they seem to have come a long way since the command-line therapist and `M-x doctor` (for the Emacs fans out there).
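For flavor, the pattern-matching core of such a command-line therapist fits in a dozen lines. A minimal ELIZA-style sketch (the rules below are invented for illustration; Weizenbaum’s original script was much richer):

```python
import random
import re

# A tiny subset of ELIZA-style transformation rules (illustrative only).
RULES = [
    (r"\bi need (.*)", "Why do you need {0}?"),
    (r"\bi am (.*)", "How long have you been {0}?"),
    (r"\bmy (.*)", "Tell me more about your {0}."),
]
FALLBACKS = ["Please go on.", "How does that make you feel?"]

def respond(utterance):
    text = utterance.lower().strip(".!?")
    for pattern, template in RULES:
        match = re.search(pattern, text)
        if match:
            return template.format(*match.groups())
    return random.choice(FALLBACKS)

print(respond("I am overfitting on the training data"))
# -> How long have you been overfitting on the training data?
```

The uncanny thing, then and now, is how little machinery is needed before people start attributing a mind to the other end of the conversation.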

Machine ethology. Machine sociology. Machine etiquette. Machine politics.

Machines are already talking to each other a great deal: think of the internet, communication networks, or the budding Internet of Things. For now, the conversation is only between agents of low sophistication using simple, rigid protocols. We could perhaps already talk about machine ethology, maybe even a nascent sociology. TCP/IP is an example of simple machine etiquette.

But the real deal will come when the artificial agents get more sophisticated (e.g. DNNs) and their communication bandwidth increases.

The final step will be achieved when the agents start to create mental self-models, and also models of the other agents they are communicating with. The gates of social psychology, sociology and politics will then be pried wide open for our machine comrades.

Future of hard science is soft science?

Will your AI team soon have to hire a machine psychologist? Maybe so.
It is fascinating that the hardest of hard fields – mathematics/statistics/AI research/software engineering – converges, in the area of AI, on the methods of soft science.

Soft sciences, mind you, not humanities.

About

I’m an extrophysicist and datanomer. An ex-astrophysicist and data scientist. The purpose of this blog is to quietly converse with myself, because a pathology becomes a charming eccentricity when done publicly. Right?

But really I just enjoy learning about the Universe. Here is a non-exhaustive list of things I’m interested in and might discuss in one form or another on the blog:

  • Artificial Intelligence – both narrow and general
  • Probability theory, statistics, Bayesian epistemology
  • Machine learning and data science
  • Physics & Cosmology
  • Rationality
  • Decision and game theory
  • Effective altruism
  • Economics, finance, econometrics, econophysics
  • Cognitive sciences, neuropsychology
  • Genetics
  • Philosophy and history of science – particularly physics and statistics
  • Linguistics
  • Distributed systems
  • Complex systems, chaos
  • Literature, poetry, music… and their relationship to science
  • Mindfulness and meditation
  • Meta-ethics – particularly of the consequentialist/utilitarian kind
  • Learning and education
  • and too many other things….

And I’m trying to learn about writing. By doing it.

Albrecht Dürer, Melancholia

Synapses are cheap, experiences are expensive

Here is a simple back-of-the-envelope calculation based on Hinton’s talk on deep learning:

You live about 10^2 years. A year is π × 10^7 seconds, so your life is ~10^9 seconds. Let’s say you receive 10–100 “experiences” (impressions) per second (brain activity is between a few tens and a few hundred Hz).

Thus, your life is about 10^10–10^11 experiences. Blink an eye and it’s gone.

Your brain has 10^11 neurons, with an average connectivity of almost 10^4. That is a total of 10^15 synapses.

You thus have about 10^4–10^5 synapses per experience. There is no way the brain could fit a proper model (in the statistical sense), since N_parameters ≫ N_data. Instead, it has to rely strongly on regularization and sparsity.
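The arithmetic can be sanity-checked in a couple of lines (using the low end of 10 impressions per second):

```python
# Back-of-the-envelope check of the synapses-vs-experiences ratio.
seconds_per_year = 3.14e7                    # ~ pi x 10^7 s
lifetime_s = 100 * seconds_per_year          # ~10^2 years -> ~10^9 s
experiences = lifetime_s * 10                # 10 impressions per second

neurons = 1e11
synapses_per_neuron = 1e4                    # average connectivity
synapses = neurons * synapses_per_neuron     # 10^15 synapses in total

ratio = synapses / experiences               # parameters per data point
print(f"~{ratio:.0e} synapses per experience")
```

At the high end of 100 impressions per second the ratio drops by a factor of ten, but the conclusion is the same: parameters vastly outnumber data points.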

Experiences are exceedingly expensive, a synapse is very cheap.

A synapse. Ugh, look at it… so cheap!

This lays all the tired blank-slate arguments to rest – brains have to come somewhat pre-trained/regularized via genetics.

Also relevant – the argument is actually just a reformulation and generalization of Noam Chomsky’s poverty-of-the-stimulus argument from linguistics (a field whose computational side is now incidentally dominated by deep learning).

The Inconvenient Truth About Data Science

Posted by Kamil Bartocha, a few excellent points. I agree with every single one of them:

  1. Data is never clean.
  2. You will spend most of your time cleaning and preparing data.
  3. 95% of tasks do not require deep learning.
  4. In 90% of cases generalized linear regression will do the trick.
  5. Big Data is just a tool.
  6. You should embrace the Bayesian approach.
  7. No one cares how you did it.
  8. Academia and business are two different worlds.
  9. Presentation is key – be a master of PowerPoint.
  10. All models are false, but some are useful.
  11. There is no fully automated Data Science. You need to get your hands dirty.

What I learned this week: 18/2015

This was a very busy week: co-organising a workshop, travelling, and then an extended weekend.

Statistics, Probability, Machine Learning, Data Science

Xgboost

I played with Xgboost, a parallelized gradient-boosting machine implementation. I managed to install it on both a Windows and a Linux machine, and it really is fast. I haven’t yet tested it directly against the standard GBM implementation, so I can’t say whether the advantage is purely speed (the claim is up to 20x) or whether you get extra predictive power per computational cycle.
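For intuition about what any GBM, Xgboost included, is doing under the hood: each round fits a weak learner to the current residuals and adds a shrunken copy of it to the ensemble. A minimal NumPy sketch with decision stumps (Xgboost layers regularization, second-order gradients and parallel split-finding on top of this core loop):

```python
import numpy as np

def fit_stump(x, residual):
    """Least-squares decision stump on one feature: a threshold and a
    constant prediction on each side."""
    best = None
    for t in np.unique(x):
        left, right = residual[x <= t], residual[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        loss = ((residual - pred) ** 2).sum()
        if best is None or loss < best[0]:
            best = (loss, t, left.mean(), right.mean())
    return best[1:]

def boost(x, y, n_rounds=100, lr=0.1):
    """Gradient boosting for squared loss: each stump fits the residual,
    and only a shrunken (lr) fraction of it is added to the ensemble."""
    pred = np.full_like(y, y.mean())
    for _ in range(n_rounds):
        t, lval, rval = fit_stump(x, y - pred)
        pred = pred + lr * np.where(x <= t, lval, rval)
    return pred

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = np.sin(x) + rng.normal(scale=0.1, size=200)

baseline = ((y - y.mean()) ** 2).mean()   # error of predicting the mean
mse = ((y - boost(x, y)) ** 2).mean()     # boosted ensemble does far better
```

The shrinkage factor `lr` is the same knob as Xgboost’s `eta`: smaller steps, more rounds, better generalization.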

Kaggle

After following the fora for quite some time, I decided to actually try it (I made just 2 submissions in 1 competition). I don’t expect to use it competitively, and it is definitely a somewhat stylized/artificial approach to data analysis/machine learning, but I think it is an interesting endeavor.

It is a martial-arts kata exercise compared to the dirty, non-linear street fighting of day-to-day data science. Where, as you know, street fighting is 50% knowing when to avoid the fight, 10% actual fighting and 40% administration and sitting in meetings. Wait, somehow my metaphor broke.

Kaggle Higgs boson search post-mortem

I mostly read up on Motl’s point of view (1, 2, and the xgboost solution 3) and the Kaggle forum. It turned out to be a bit less interesting and enlightening than I had thought, but it fit with the xgboost theme of the week.

General Science / Misc.

ER = EPR? 

Quanta has a series on recent developments here: 1, 2, 3. I’m a total dilettante, but this just feels so very right. God knows I’ve been excited in the past about many results that then just went away. But this time it is different (I always say that…). The firewall problem was just a precursor; this is the real deal.

The quantum entanglement wormhole octopus is my new favourite animal.

So you want to be a consultant…?

An excellent article. It focuses on freelance IT, but it is interesting for other areas too. And even if you don’t want to be a freelance consultant. Or a consultant at all.

The days are long but the decades are short

Sam Altman turns 30. Here is the wisdom from the elder. (The article is alright, it just strikes me as funny to get life advice from a 30-year-old.)

Prescriptions, Paradoxes, and Perversities

An alarming analysis by Scott Alexander on the state of the pharmaceutical industry.

Management Myth

An entertaining piece on MBA education and management.

Or: we hope this article will compensate you with a smug feeling of superiority, because although you have the hard-science doctorate, we’ll pay far more for the fresh MBA-graduate manager :).

Videos / Lectures

The Knowledge – Lewis Dartnell | Authors at Google

  • I have his book on my “to read” pile. The talk makes me want to move it up in the queue.

Podcasts

Phil Rosenzweig on Leadership, Decisions, and Behavioral Economics

  • Strongly recommended – lots of new ways to look at familiar experimental results and their (non-)implications in practice. Highlight of the week.

Triple H on Pre-Fight Rituals, Injury Avoidance, and Floyd Mayweather, Jr.

  • After enjoying the Schwarzenegger episode and even (gasp!) the Glenn Beck one, I’m hardly surprised at a wrestler coming out as a very reasonable, driven and articulate man. Ferriss is an excellent interviewer and his podcast is a very good time filler when I’m too tired for other stuff.