
Adventures in Reinforcement Learning

You may or may not have noticed that I've been working for DeepMind for a while, causing me to get exposed to lots and lots of cutting edge machine learning research. Most of that I can't share here, but there's plenty that's already public.

Reinforcement Learning

Firstly, yes, that's what all the fuss is about. There's a great book by Sutton and Barto, Reinforcement Learning: An Introduction, with an in-progress version of the second edition available for free from their website!

The book is very good at introducing and explaining RL itself, but does not cover how to combine it with function approximation (neural networks). The basic idea is very simple - just implement the action value function with a neural network - but doing it in practice is trickier.
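To make "implement the action value function with a function approximator" concrete, here is a minimal sketch of Q-learning with a learned Q(s, a). It makes several assumptions that are not from the book or from any particular paper: a linear approximator stands in for the neural network, the environment is a made-up four-cell corridor, and the names LinearQ, one_hot and train_corridor are purely illustrative. The update rule is the same one you would apply to a network's weights; only the gradient is simpler.

```python
import random


class LinearQ:
    """Action-value function Q(s, a) = w[a] . s, one weight vector per action.

    A linear stand-in for a neural network (an assumption of this sketch).
    """

    def __init__(self, n_features, n_actions, lr=0.1, gamma=0.9):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr
        self.gamma = gamma

    def q_values(self, state):
        """Return the estimated value of every action in the given state."""
        return [sum(wi * si for wi, si in zip(w, state)) for w in self.w]

    def act(self, state, epsilon, rng):
        """Epsilon-greedy action selection."""
        if rng.random() < epsilon:
            return rng.randrange(len(self.w))
        q = self.q_values(state)
        return q.index(max(q))

    def update(self, state, action, reward, next_state, done):
        """One semi-gradient Q-learning step; returns the TD error."""
        bootstrap = 0.0 if done else self.gamma * max(self.q_values(next_state))
        td_error = reward + bootstrap - self.q_values(state)[action]
        # For a linear Q, the gradient with respect to w[action] is the state.
        for i, si in enumerate(state):
            self.w[action][i] += self.lr * td_error * si
        return td_error


def one_hot(index, size):
    return [1.0 if j == index else 0.0 for j in range(size)]


def train_corridor(episodes=500, n_states=4, seed=0):
    """Hypothetical toy task: walk right along a corridor to reach the goal.

    Action 0 moves left (bounded at cell 0), action 1 moves right.
    Reaching the last cell gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    agent = LinearQ(n_features=n_states, n_actions=2)
    goal = n_states - 1
    for _ in range(episodes):
        pos = 0
        while pos != goal:
            state = one_hot(pos, n_states)
            # Q-learning is off-policy, so a fully random behavior policy works.
            action = agent.act(state, epsilon=1.0, rng=rng)
            next_pos = max(0, pos - 1) if action == 0 else pos + 1
            done = next_pos == goal
            reward = 1.0 if done else 0.0
            agent.update(state, action, reward, one_hot(next_pos, n_states), done)
            pos = next_pos
    return agent
```

After training, calling agent.q_values(one_hot(pos, 4)) for each non-goal cell should rate "right" above "left". The tricky parts alluded to above (stability with a real network, replay, target networks) are deliberately left out of this sketch.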

If you want some very basic examples on how to implement such an agent, you ...

Brains, Sex, and Machine Learning

A great explanation of why Dropout is really good for training large neural networks, and why it's actually the same thing your brain is doing:

Recent advances in machine learning cast new light on two puzzling biological phenomena. Neurons can use the precise time of a spike to communicate a real value very accurately, but it appears that cortical neurons do not do this. Instead they send single, randomly timed spikes. This seems like a clumsy way to perform signal processing, but a recent advance in machine learning shows that sending stochastic spikes actually works better than sending precise real numbers for the kind of signal processing that the brain needs to do. A closely related advance in machine learning provides strong support for a recently proposed theory of the function of sexual reproduction. Sexual reproduction breaks up large sets of co-adapted genes and this seems like a bad ...

Reading Drive: The Surprising Truth About What Motivates Us

After mentioning that I was reading Influence to my good friend Adrienne, she recommended Drive as my next book. The theme is quickly explained: While there are three different ways to motivate us - biological urges like hunger or sex, external reward & punishment, and intrinsic reward from performing a task - only intrinsic reward can consistently foster creative behavior.

Pink starts out by showing how traditional external motivators - cash bonuses et al. - overly constrain our focus, interfere with creativity, extinguish internal rewards and even lead to unethical behavior (think doctoring sales numbers to meet a bonus target). Only in special circumstances can external motivation still be useful: If a task is routine and boring to begin with, then there's not much creativity to lose.

He then makes the case for why intrinsic motivation is better suited to our highly evolved and demanding work environments. Everyone who's ever been ...

Relevant Reading

How much of what you read today would you still want to read if it was a week from now? A year? A decade, a century?

Too much of the information hammering us is ephemeral infotainment, not enough brings lasting value.

The test I apply is simple: would I still want to read this if I had to wait a year, even if it was the only thing I could read that day?

Taking Notes

I always thought taking notes was a waste of time. Surely you could just look back at the slides if you forgot something?

But I realized I might have had it backwards all this time. What if you take notes not to have something to refer to, but because it forces you to listen with greater attention and strengthens your memory?

To really take notes you need to understand the subject well enough to pick out the key phrases and concepts in real time, all while the lecturer is speaking on and on. It's similar to how actually doing the exercises in your textbook will make you realize very quickly which parts you've understood, and which parts just sounded vaguely plausible.

Reading Influence: The Psychology of Persuasion

The theme of Influence is clear: unable to keep up with the onslaught of information and decisions, we fall back to mental shortcuts and learned behavior.

Robert Cialdini orders them into six large groups:

  • Reciprocation. Giving a gift or doing a favor makes us much more likely to comply with a subsequent request, even if the favor was unwanted. The same concept applies to making an initial high demand and then "conceding" to a lower one. Example: charities giving you a free gift.
  • Commitment and Consistency. We will change our opinions and desires to be in line with our behavior; extracting initial admissions and behaviors can make us want to comply with later ones. Example: Claiming you would do public service when surveyed.
  • Social Proof. In uncertain situations we will look to others for guidance on how to behave. Example: Bystander effect, advertisements with "people from the street".
  • Liking ...

Neural Turing Machines

DeepMind just published a new paper, Neural Turing Machines, with some very interesting implications. For the ones pertaining to AI just read the paper; here I want to focus on what it means for the human brain.

The main achievement of the NTM paper is coupling a neural network to external traditional computer memory, implementing both content- and address-based access. This implies something similar could potentially work with the human brain.
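To illustrate what content-based access to memory means, here is a rough sketch: compare a query key against every memory row by cosine similarity, turn the similarities into softmax weights sharpened by a strength parameter, and read a weighted blend of the rows. The function names, the tiny memory matrix and the plain-Python style are assumptions of this sketch and not the paper's code. The paper's location-based addressing (interpolation and shifting) is left out; the one-hot weighting at the end only shows that an ordinary address lookup is a special case of the same read.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def content_weights(memory, key, beta):
    """Softmax over beta-scaled similarities between the key and each row."""
    scores = [beta * cosine(row, key) for row in memory]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Differentiable read: a weighted sum of the memory rows."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


# Hypothetical three-slot memory, each slot holding a three-element vector.
memory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Content-based access: a noisy key still retrieves the closest row.
weights = content_weights(memory, key=[0.0, 0.9, 0.1], beta=50.0)
recalled = read(memory, weights)

# Address-based access as a special case: a one-hot weighting picks row 2.
by_address = read(memory, [0.0, 0.0, 1.0])
```

Because the read is just a weighted sum, gradients can flow through the weights, which is what lets the whole thing be trained by gradient descent.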

There's already precedent for implanting electrodes and using them for control. The big open question here is the training - for NTM this was achieved by gradient descent, but it is not clear if a similar process would automatically take place in an adult brain. However, the brain seems to be exceedingly plastic, even being able to learn to see from electrical stimulation of the tongue.

Successful implementation would give us working memory of almost ...

A Walk around Umbrella Revolution

Umbrella Revolution sign

The Umbrella Revolution (named for the umbrellas used to protect against tear gas) has been going on for nearly a month, yet coverage is scarce, fading out almost completely in the last two weeks.

Which is a shame - here's a vibrant city struggling for a most basic right, fair and independent elections, and yet we in the west, so proud of our democracies, stand by and watch?

all that is necessary for evil to triumph is that good men do nothing

By chance I've been in Hong Kong for the past few days, and meeting these protesters - both young and old - really struck a chord in me. They risk their freedom and lives for something they passionately believe in, but all we know of it is a short clip in the evening news. No more!

I invite you to join me on a little journey to the protests:

Study Corner sign


First a quick overview of what the protests are about: In September, the government in ...

I Travel

I travel to explore the world, I travel for the thrill of the unknown, to broaden my mind, to taste the unexpected, to learn and keep growing. I travel for chance encounters and randomly meeting again halfway across the globe. I travel because it's the easiest way to multiply my (remembered) life span. I travel to break the daily routine, to upset hardened molds, to escape boredom, to make new friends and meet old ones again.

I travel not to feel alone, to meet new people and fall in love for a brief romance, to share moments and explore together, to forge bonds like only crazy adventures can.

I travel to be alone, to escape people and be thousands of miles from anyone who knows me, lost in the unknown with myself and my thoughts; no one who speaks my language, no one who looks like I do.

I ...

Censorship and the Great Firewall

I just arrived in China for my vacation and noticed one thing immediately: All kinds of useful sites are blocked. Google Search and Maps, Gmail and Facebook. This is a major pain in the ass for doing anything from looking up directions to uploading pictures for those back at home, not even to speak of any human rights aspects.

My initial approach was using OpenVPN on my phone, but that doesn't seem to work - Facebook is still blocked, etc. What does help is ssh -D <PORT> <HOST>, but even that seems to get slowed down after some use. It's enough to upload pictures, but still far from perfect.

Does anyone have other good workarounds?

Update: While trying to publish this post I discovered they also block Dropbox and Amazon S3. Fucking Asshats. Since you are reading this now their security is obviously crap, only wasting some of my ...

© Julian Schrittwieser. Built using Pelican. Theme by Giulio Fidente on github.