To celebrate the publication of our MuZero paper in Nature (full-text), I've written a high-level description of the MuZero algorithm. My focus here is to give you an intuitive understanding and general overview of the algorithm; for the full details please read the paper. Please also see our official DeepMind blog post; it has great animated versions of the figures!
MuZero is a very exciting step forward - it requires no special knowledge of game rules or environment dynamics, instead learning a model of the environment for itself and using this model to plan. Even though it …
I gave a detailed talk about MuZero at ICAPS 2020, at the workshop "Bridging the Gap Between AI Planning and Reinforcement Learning".
In addition to giving an overview of the algorithm in general, I also went into more detail about reanalyse - the technique that allows MuZero to use the model-based search to repeatedly learn more from the same episode data.
I hope you find the talk useful! I've also uploaded my slides for easy reference.
Ka-Ping Yee's blog post about election methods and how to visualize them has long been one of my favourites. As always, a well chosen diagram or picture is much easier to understand than a verbose description, and can make corner cases leap out at us. The human visual system is a powerful pattern detector; we should make use of it whenever we can!
Inspired by Yee's blog post, I made an interactive simulator to allow you to explore the four main voting systems directly in your browser: Plurality (aka first past the post), Approval, Borda and Instant-runoff (Hare …
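To make the differences between the four systems concrete, here is a minimal Python sketch. It is not the simulator's code: the function names, the fixed ranked ballots, the "approve your top two" rule for Approval, and the alphabetical tie-breaking are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical helpers for illustration only. Each ballot ranks all
# candidates from most to least preferred. Ties go to the candidate
# that comes first alphabetically (an arbitrary choice).

def plurality(ballots):
    """The candidate with the most first-place votes wins."""
    counts = Counter(ballot[0] for ballot in ballots)
    return max(sorted(counts), key=lambda c: counts[c])

def approval(ballots, approved=2):
    """Assumes each voter approves of their top `approved` candidates."""
    counts = Counter(c for ballot in ballots for c in ballot[:approved])
    return max(sorted(counts), key=lambda c: counts[c])

def borda(ballots):
    """With n candidates, the i-th ranked (0-based) earns n - 1 - i points."""
    scores = Counter()
    for ballot in ballots:
        n = len(ballot)
        for i, candidate in enumerate(ballot):
            scores[candidate] += n - 1 - i
    return max(sorted(scores), key=lambda c: scores[c])

def instant_runoff(ballots):
    """Eliminate the weakest candidate until someone holds a majority."""
    remaining = set(ballots[0])
    while True:
        counts = Counter({c: 0 for c in remaining})
        for ballot in ballots:
            # Each ballot counts for its highest-ranked surviving candidate.
            top = next(c for c in ballot if c in remaining)
            counts[top] += 1
        leader = max(sorted(remaining), key=lambda c: counts[c])
        if counts[leader] * 2 > len(ballots) or len(remaining) == 1:
            return leader
        loser = min(sorted(remaining), key=lambda c: counts[c])
        remaining.discard(loser)

# A made-up electorate of 20 voters and three candidates.
ballots = (
    [("A", "B", "C")] * 8
    + [("C", "B", "A")] * 7
    + [("B", "C", "A")] * 5
)

print(plurality(ballots))       # A
print(approval(ballots))        # B
print(borda(ballots))           # B
print(instant_runoff(ballots))  # C
```

On this made-up electorate the same ballots produce three different winners depending on the system, which is exactly the kind of corner case a visualization makes obvious.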
I've written and been paid to write code in a wide variety of languages - functional and imperative, statically and dynamically typed, verbose and concise. Over time, I came to appreciate the benefits of certain language features.
The one I want to talk about today is static typing. My programs in statically typed languages seem to be systematically more likely to work correctly on the first try, more robust against bugs introduced by refactoring or adding new features, and easier to reason about. In fact, the benefits are so large that I now use pytype even for Python scripts with just …
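As a small illustration of the kind of mistake static typing catches, here is a sketch in annotated Python. The function and the values are hypothetical, and the exact error message depends on the checker; the point is only that a checker such as pytype should flag the commented-out call before the script ever runs.

```python
def years_until_retirement(age: int, retirement_age: int = 65) -> int:
    """Return the number of years left until retirement, never negative."""
    return max(retirement_age - age, 0)

# Correct usage: both arguments are ints.
print(years_until_retirement(30))  # 35

# A string slipped in where an int is expected, for example a value read
# from a form or a CSV file. Without type checking this only fails at
# runtime with a TypeError; a static checker should reject it up front.
# years_until_retirement("30")
```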
The Rust WebAssembly book has a detailed introduction to WebAssembly in Rust; unfortunately its example setup is somewhat complicated and requires the use of npm just to show a simple Hello World! message in the browser.
Luckily, there's a simpler way to get started if you don't care about
Clone the example project template:
cargo generate --git https://github.com/rustwasm/wasm-pack-template
which will prompt you for a project name; in the following we'll assume you used …
My previous Getting into Machine Learning post is one of my most popular; since then much has changed.
There's a new kid on the block: JAX. A thin but powerful layer over Autograd and XLA, it makes it easy to concisely express algorithms with the same syntax as numpy while getting the full performance of TPUs and GPUs.
The resources I recommended in my previous …
Recently, I was once again looking for a note-taking solution. Over time, I had used many: Evernote, a Markdown-based one I wrote myself, Joplin. For one reason or another, I had stopped using each of them in turn.
My requirements were simple:
- Linux and Android support
- Markdown first, with syntax highlighting
- Automatic synchronization of all my devices
- Plaintext backup to simple files, for easy migration
- Copy & paste to insert images
- Full-text search.
And the most important: Does it spark joy?
The best note-taking solution is of no use if you don't consistently use it. After trying many alternatives …
From Musk's "Potentially more dangerous than nukes." tweet, increased funding for the Machine Intelligence Research Institute (MIRI) to the founding of cross-industry groups like the Partnership on AI, AI is being taken more seriously.
One worry that is sometimes cited, as in the book Superintelligence by Nick Bostrom, is that once we reach human-level AI, it might rapidly improve itself past anything humans can envision, becoming impossible to control. This is called the "Singularity", because anything after such a point is unforeseeable.
The argument for a Singularity rests on the fact that a hypothetical AI could devote all its resources …
I'm sometimes asked if I'm aware of any open-source implementations of MuZero. In addition to the official pseudocode, a variety of researchers have made their own implementations and shared them online.
I have not checked these for correctness or completeness, but I still believe that you may find them useful. In no particular order:
- How To Build Your Own MuZero AI Using Python from Applied Data Science
If you know any others, please let me know and I'll add them!
If you want to replicate MuZero or try other RL ideas on …