Thursday, November 6, 2014

Four years later

Dear reader,
It has been years since the last post and still I get emails and comments about this topic which is of interest to highly self-motivated learners around the world. Consequently, here is a post to let you know how I have progressed with SuperMemo and procedural learning over this time. I provide several links back to old posts, as much for my benefit as yours, because it's important to stitch it together with the rest of what I've written.

  1. I have continued to learn the following procedural items through SuperMemo: keyboard shortcuts, programming, mathematics (mainly engineering and financial), chemistry.
  2. I have not continued using SuperMemo to learn Rubik's cube, guitar or violin, although I have continued to practise each of these outside of SM.
The primary difference between these two categories is that the first is not very prop-dependent; at most I need pen, paper and calculator. Although I noticed this problem way back when I wrote How to set up a procedural collection, I didn't realise how much of an effect the props would have.

Keyboard shortcuts
As described here, I believe the trick to keyboard shortcuts is answering the item by touching the keys. Consequently, when the question is asked on SM, it shows the black keyboard. When I click Show Answer, it shows the coloured dots and the answer box. This has continued to work nicely.

By the same token, many programming items in my collection are of the type shown below.  

Whichever language you are using, syntax is a type of basic knowledge that you'll need. However, watch out for interference if you're learning multiple similar languages. As Piotr Wozniak says in 20 rules, "Interference is probably the single greatest cause of forgetting in collections of an experienced user of SuperMemo". Some solutions that I have applied are: 1) master one language above all others (like a first language), 2) by all means add items from other languages, but if there's interference with your 'first' language, delete the secondary language item, 3) use screenshots of surrounding code and the language's main logo to build stronger context.

The following items were also useful, for building stronger recognition of common coding errors. Brackets are pretty ubiquitous, so just remembering to look for such things is useful.

In general, I use the spell-pad answer component for programming items, because it lets me type them in, which is always good practice.

Mathematics is definitely my best achievement in using procedural SuperMemo. Most of my procedural items are mathematical, and the effectiveness of these is very high. I only need to keep a pen and paper near the computer, and I can use the computer calculator to do anything I need.

Below is an example of a calculus question. The key to making this work is writing down the question and solution on paper, thereby simulating the situation in which I would need to use this. I think it's worth experimenting with numbers in place of pronumerals. For example, you could also write this question as sin[k(x)] instead, but I don't think it makes a difference to learning.
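As a rough illustration only: the specific inner function k(x) = 3x^2 below is an assumption chosen for this sketch, not the actual item from the collection. A chain rule question of this kind would be worked on paper along these lines:

```latex
\frac{d}{dx}\,\sin\bigl[k(x)\bigr] = \cos\bigl[k(x)\bigr]\cdot k'(x)
\qquad\text{so, with } k(x) = 3x^{2}:\qquad
\frac{d}{dx}\,\sin\bigl(3x^{2}\bigr) = 6x\,\cos\bigl(3x^{2}\bigr)
```

Writing out both the general rule and the specific substitution each time is one way to keep the procedure identical from repetition to repetition.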

The next example shows a more in-depth example. Basically, every time I see this question I need to follow the same procedure, or else I won't get the right result. In practice, the only difference between this question and the previous chain rule question is that it takes longer to solve. Otherwise, it works the same.

The most complex type of item is shown below. Basically, it is the sort of question you would find in an end of semester engineering exam. I am still convinced that if you want to retain the ability to solve this type of question over the long term, you ultimately need an item such as this, which requires you to solve it from start to finish (the same goes for playing musical pieces). For procedural items, unlike declarative ones, I propose that the minimum information principle does not refer to the complexity of the item; it refers to the step-up in difficulty from what you can already do.


I eventually stopped using SuperMemo for guitar and violin because of lifestyle. To me they were much more leisurely activities. I didn't practise as consistently as I would have liked because of work hours rather than laziness. When I did pick up the instrument I didn't particularly want to spend my time drilling skills. However, I think it is worth noting that of the skills I did import to SuperMemo and practise for a couple of years, I can definitely recall these better even now. Hence, if/when I had the time for a more serious practice schedule, I would return to SuperMemo.

Rubik's Cube
Although this is hardly important, I thought it relevant to say that I tried to solve the cube two days ago and got stuck. This never happened in the past, but I have not used it for ages. (Note that I was following a specific procedure to solve it, rather than spatial brilliance). Just another data point to remind me of the importance of spaced repetition.

Future of Procedural Learning with SuperMemo
At this point in time I believe that if and when SuperMemo becomes super-portable and mobile-compatible (like Evernote) it will be much easier to overcome the prop-problem. You could just take your phone to the karate dojo, basketball court or music practice room. Until then, only the most dedicated students will continue to use SuperMemo for more active procedural learning. And good on them!

Wednesday, September 8, 2010

Action items vs Coordination items

The purpose of this post is to answer the following question, which was previously left as a comment: "Why should we learn procedures in full, instead of cutting them into their smallest components as we do with declarative knowledge?"


The short answer is: fluency. By learning a single, specific method for achieving your desired outcome and training yourself in that method from start to end, you will be able to perform it as easily as breathing when the need arises (at least, you will if you haven't forgotten how to perform it - hence the need for SuperMemo).

On the other hand, if you have learned all the components of a skill, but don't regularly practice them as a combined activity, then you may need to invest considerable thought towards designing an effective method every time you encounter a new situation. Or, you may simply need to train the combined activity as a separate skill. Either way is more time-consuming than simply learning a particular method by heart to begin with.

Principles vs Methods

A method is a specific set of steps that can be performed in order to achieve a desired outcome. For example: "If you want the violin to sound louder, pull the bow faster, and put more weight on the strings"; "If you are about to be kicked in the face, pull your arm up and block". As usual, such rules are most effective when they are non-verbal and automatically applied, rather than thought about beforehand. Unfortunately, thinking as much as Sherlock Holmes does when he fights distracts rather than focuses effort and attention.

In an abstract sense, whereas we normally tend to consider declarative knowledge as a sort of "web" of associations, procedural methods are more like algorithms or enumerations - "if this, do that". Of course, since there are usually many ways to achieve the same outcome, you could easily end up confused by a thought-process sounding something like: "If this, do that... or that ... or that ... or...". Back in one of my first posts (Procedural Knowledge) I said that "a procedural knowledge item is characterised by the purpose the procedure serves... if you know two ways of achieving exactly the same thing then one of those ways is redundant (once again, redundancy is not necessarily bad)"

In contrast, a principle refers to a basic criterion which must be met by any specific method. For example, when playing the violin (or most musical instruments) it is important to keep as relaxed as possible, even in a performance, so that 1) you don't get sore muscles or RSI from playing, and 2) the music does not get affected (eg scratchy, squeaky or otherwise distasteful sounds). In order to relax this way, some people use the Alexander Method, some meditate beforehand, and some just breathe deep. Each of these is a different method, but they all serve the same principle - to relax in order to play better music.

In general, principles are declarative concepts and must be understood sooner or later. But when it is time to build up skill, methods are the way to go. Furthermore, it doesn't matter which method you use, as long as it works.

Action Items
An action item is a "micro-skill" (rather than a skill, per se), which is used to achieve a small, specific effect during application of a more advanced skill. In a way, these are like the "minimum information" elements you would be used to from SuperMemo.


  • a martial artist can learn a simple kick, and then combine it with many other actions for different purposes or effects
  • an artist can learn how to draw simple shapes such as ovals, circles and squares, and then use these micro-skills when producing much more advanced drawings
  • a violinist can first learn staccato bowing on the C major scale (for example), before integrating this bowing technique into endless numbers of real pieces
As you can see, these sorts of abilities are quite useless alone, but powerful once enough of them have been mastered in a particular field. However, there is an inherent problem with learning many action items in isolation, and this is what this post is all about. The problem is that while each of these individual actions is a skill, the ability to combine these actions into a fully-formed procedure (i.e. step 1, step 2, step 3 - not just step 1!) is a skill unto itself. That's where coordination items come in.

Coordination items
These items link many action items together to form a specific method for accomplishing a goal. As the name implies, the focus is not on the individual actions, but on integrating the many discrete parts into a smooth, continuous output. I have often found when studying second languages that though a word might be easy to say in isolation, it can still trip me up in a sentence. In such cases, once I master the word, I also try to master the sentence.

  • Any musical piece, or any piece of artwork, consists of many individual steps taken that make an impression on the onlooker through their overall, combined effect
  • A soccer player often makes use of his running, tackling, dribbling (and acting) skills in a game, even while practicing each in isolation during practice sessions
  • A computer programmer writes a fully functional piece of software, using many small tricks and methods accumulated over many hundreds of hours
This last one has been of particular interest to me lately. That is, I have been trying to work out how best to formulate the ability to write software programs. While SuperMemo has often been used by others to practice writing computer code correctly (i.e. learning syntax) the ability to frame real world problems as well-defined steps (i.e. algorithms) is a much higher-order ability, and much more interesting and more powerful to learn.

In particular, I have found that a very straightforward way to formulate such knowledge in SuperMemo is to enter simple programming exercises, and thereafter to always answer them in the same way. Although this requires hardly any creativity, we are not trying to retain creative ability (for now!). What we are trying to do is retain the ability to write effective programs. For example, in order to retain the ability to write a simple recursion formula, I have a SuperMemo item which requires me to write a program to find the nth Fibonacci number. I then test the program by finding the 11th number (i.e. n = 11) and check the answer field to see if I am correct. If I ever forget how to do it, I have a screenshot (also in the answer) of the correct code. Once again, although there are many ways to achieve the same effect in a program, in order to learn to be effective you only need to learn one of these methods well. Obviously, if the effect is very important you can learn more than one method, but you should formulate this as a clearly different SuperMemo exercise.
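As a minimal sketch of what the answer to such an item might look like (the original screenshot is not reproduced here, so the language, the function name `fib` and the indexing convention are all assumptions), a simple recursive version in Python could be:

```python
def fib(n: int) -> int:
    """Return the nth Fibonacci number.

    Assumed convention (not stated in the post): fib(1) == fib(2) == 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n <= 2:
        # Base cases: the first two numbers in the sequence are both 1.
        return 1
    # Recursive step: each number is the sum of the two before it.
    return fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    # Self-check used during a repetition: the 11th number.
    print(fib(11))  # 89 under the convention assumed above
```

Under a different convention (for example, starting the sequence at 0), the 11th number would differ, so the answer field of the item needs to match whichever convention the stored solution uses.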

As a summary, when you are training a new skill and retaining it through SuperMemo don't break it down to its smallest parts unless it is useful to do so; that is, if those "micro-skills" are fundamental to your art and are likely to be used and re-used in different scenarios. Even if this is the case, don't limit yourself only to learning action items. Learn how to fluently perform extended applications of your skill, such as playing a whole musical piece, writing a short computer program or speaking a whole page out loud in your second language (not just individual words). This will make you more fluent in practice and make it easier to respond to new situations on the fly using fully internalised responses.

Friday, July 23, 2010

Rote Learning

One of the reasons for introducing the contrast between Intuitive Declarative and Blind Procedural learning was to deal with the concept of rote learning. In particular, rote learning can be considered a primary form of procedural learning.

What is rote learning?
All users of SuperMemo know the importance of repetition in learning. However, there are two distinctly different functions for which repetition can be used. On the one hand, rote learning involves the acquisition of knowledge through repetition. Conversely, SuperMemo uses repetition as a maintenance tool for previously-acquired knowledge.

Examples of rote learning:
  • In school, most children learned their multiplication tables through an endless (and mindless) process of re-reading and recital. Despite any work done to reconcile multiplication with arithmetic (i.e. that 3x = x + x + x) it was primarily this mundane repetition that bored the numbers into their heads
  • Every budding musician's first few years consist at least partly of practicing scales, arpeggios, etudes and other such exercises - over and over (and over) - until the fingers know how to play them, and the mind doesn't need to get in the way
Whatever your skill is, training for it will generally involve practicing it over and over until the slow but inexorable process of trial and error can chisel the neural script for it into your brain. As you will notice from all of these cases and any others that you may have in your experience, such rote learning does not culminate in any great amount of understanding. To the extent that any understanding is gained at all, it is only that amount that is required directly during execution of the skill.

This method of learning is focused entirely on practical application. As long as you can defend yourself in a fight, it is not required that you explain the theory behind your technique. As long as you can play your piece in the orchestra, in tune and in time, it's not necessary that you be able to name and describe all the muscles involved in playing. This type of learning does not require strong understanding.

In contrast to this process, repetitions in SuperMemo occur only after skill or knowledge has been acquired. In the case that you have thoroughly studied something such that you can fully understand and apply it, SuperMemo simply ensures that you "lock in" this improvement so that you need never return to your notes and wonder what on Earth they mean (which otherwise happens very often to students studying large amounts of material). Spaced repetition does not increase your understanding, nor does it reduce it - it simply maintains it at whatever level you have achieved by the time you formulate your items.

The somewhat subtle difference between these two forms of repetition is the first point of misunderstanding with potential new users of SuperMemo. The assumption is that "repetition = rote learning". In fact there are two types of repetition, each with different functions. In particular, rote learning is the acquisition of skill through repetition.

How it Works
The basic principle which makes rote learning effective for acquiring a skill is the variation principle. This is the fundamental principle that a new skill is easiest to learn when it is but a slight variation on an already-learned skill. In rote learning, the first time you try to execute something, you only get part of it right. However, you then repeat the part that you know, and then try adding a bit more to it.

For example, you try to play the first line of a musical piece but only get the first two bars sounding good. So what do you do? You practice the third bar in isolation, and then try to add it on to the first two bars. Then you learn the fourth bar, and so on. Overall, it looks something like:
  1. Play the first bar ok
  2. Play the first bar again, and also play the second bar
  3. Play the first and second bar again, this time getting the dynamics right in the second
  4. and so on
This is of course the simplest situation - no new techniques to be learned, no difficult passages, just learning of notes. However, it illustrates the basic rationale, which is that each time you repeat the skill you add on a little bit extra.
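The add-a-bit-each-time pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea; the function name `cumulative_practice` and the bar labels are hypothetical, and real practice would of course also involve repeating each run until it sounds right.

```python
from typing import Iterator, List, Sequence


def cumulative_practice(segments: Sequence[str]) -> Iterator[List[str]]:
    """Yield practice runs, each adding one new segment to those already known."""
    for end in range(1, len(segments) + 1):
        # Repeat everything learned so far, plus exactly one new segment.
        yield list(segments[:end])


if __name__ == "__main__":
    bars = ["bar 1", "bar 2", "bar 3", "bar 4"]
    for run in cumulative_practice(bars):
        print(" + ".join(run))
```

Each run differs from the previous one by a single segment, which is the sense in which every repetition is only a slight variation on something already learned.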

When you first feel the thrill of riding a bike, it is all you can do to keep in a straight line. However, after a few more goes you can also steer and brake without smashing into things or jumping off the bike.

When you learn your multiplication tables, it often happens that you can remember the first 2 or 3, but have trouble with the rest. e.g.
  • 1 x 3 = 3
  • 2 x 3 = 6
  • 3 x 3 = 9
  • 4 x 3 = ...?
However, after some time of saying the whole x3 table from 1 x 3 to 12 x 3 over and over (and over) again, you add on a few more lines each time. The parts you learn may not necessarily be in order. So, you might learn 10 x 3 before you learn 7 x 3. However, you are still using repetition and the variation principle to add on, bit by bit, to the basic skill that you started with.

Now, it is of interest to note that in mathematics education "Children have a tendency to learn algorithms by rote without developing any understanding of what they are doing" (Hiebert, 1986). This statement beautifully illustrates that skills can be acquired through simple rote repetition, without any supporting understanding. Now, this quote is obviously pointing to the negative effects of learning maths without declarative support. However, the cure is not to stop children learning procedures by rote - the method is obviously effective - it is simply to make sure to also teach the context and rationale of those procedures. Apart from being able to apply these, students should also be able to choose when to apply them, and understand why they work.

Terminology and other Meaningless Verbal Knowledge
The last part of this post may turn out to be the strangest. And that is because I propose that any knowledge acquired through repetition alone (i.e. rote), without association to other concepts (which is the primary indicator of understanding), is procedural. As a simple test, if there is something that you cannot remember on the first attempt, but need several goes in the Final Drill to learn it, it is probably procedural.

A particularly large and relevant example of this type of knowledge (especially in SuperMemo) is terminology and other general vocabulary. Knowledge of which word is used to represent which idea is almost always learned through repetition, rather than through an intuitive understanding that the word makes sense. For example, you learn that "orange" is the word for that large fruit with the unique colour to its skin.

Why is this fruit called an "orange", and not a "coconut"? It just is, and you had better practice the word a few times or you will forget what it means the next time someone says it. Certainly, you can learn the etymology of the word and understand (declaratively) where it was originally derived from, but you don't have time to do this with every new word. And let's face it, how many children know or care about the etymology of words like "mum" and "dad"? That's just what they're called, and with a bit of effort, they're not too hard to say.
  • This continent is "Africa". Why?
  • This person is called "John". Why?
  • This mathematical symbol is called "pi". Why?
  • This bone is called the "tibia". Why?
As you can see, asking "why?" to any of these is irrelevant in most situations. Even if you did find out why, you would still have to learn the word, separately.

The only real exception is words whose etymology is obvious. Because I am part Greek, and have lived several years in Greece, there are many scientific words that I can understand immediately. For example, I knew the meaning of the word "polychromatic" the very first time I heard it, because it is made of the Greek words "poly" (many) and "chromatic" (colours). So something that is polychromatic has "many colours". In this case, I learned this word declaratively. Unfortunately there are not so many other cases where this can be done. Even compound words like "pancake", which ostensibly comes from "pan" and "cake", don't really make that much sense, since a pancake is not really a cake anyway.

This Wired article on SuperMemo and Wozniak describes how Ebbinghaus experimented with nonsense syllables such as "bes, dek, fel, gup, huf, hes". Though many psychologists would probably treat these as declarative, I contend that they are probably not. At least in practice, they can be treated as procedural, and that is what's important for us. However, I would also hazard a guess that an fMRI would confirm that this type of knowledge is not processed in the brain in the same way as when you think about something that you understand and that has many associations to other knowledge. The intuitive justification is that when learning vocabulary, you are simply acquiring the ability to say the right word when presented with certain information.

Thus, the names of countries, capital cities, parts of anatomy, and any other technical jargon all form procedural knowledge. In fact, practically all words in language are learned procedurally, except perhaps those words that can be readily and obviously derived from other words.

Redundancy is an important and useful tool for maximising the stability of your memories through association. When I separated the ideas of Intuitive Declarative and Blind Procedural, I did not mean that you should only learn one or the other. In practice you must learn both, in order to reinforce the same knowledge from different perspectives. It is important that you learn these as two different representations of the same knowledge, rather than learn a single all-encompassing "intuitive procedural", so to speak. As I said in response to a comment on another post, you should keep the training of skills and learning of knowledge separate. It's just like saying that in language learning, vocabulary, grammar, spelling, punctuation etc should all be formulated into separate SuperMemo elements - not because one is more important than the other, or because they are unrelated, but because it allows for better focus and more efficient learning.

Thus, you will naturally find instances of procedural learning that seem to require a lot more thought than others, even once they have been mastered. The trick is to artificially separate out the different components so that they can be learned separately and then used to support each other.

In summary, rote learning is when you acquire knowledge primarily through repetition rather than through association to other knowledge. If you learn something by rote, you should thereafter treat that knowledge as procedural. Also, all knowledge is inherently fluid and often contains parts that require blind execution and parts that require intuitive understanding. Learn both, and use them to support each other.