Why Learning Should Be Difficult

(But not too difficult)

Whichever definition of learning we choose to adopt, they all share the notion that learning involves a relatively permanent change. Robert Gagné, for example, defines learning as ‘a change in human disposition or capability that persists over a period of time and is not simply ascribable to processes of growth’, while Robert Bjork views it as ‘the more or less permanent change in knowledge or understanding that is the target of instruction’. John Anderson prefers ‘the process by which relatively permanent changes occur in behavioural potential as a result of experience’. More recent definitions specifically highlight the role of changes in long-term memory. However, this latter type of definition can be problematic.

Learning and memory are like two sides of the same coin: they are different but also deeply intertwined. We can’t have memory without learning, but we can learn things that we then forget. The aim of learning, therefore, is to ensure that what we have learned stays put and can be accessed when it is required. This, according to Anderson, represents the difference between learning (or behavioural) potential and performance.

In research terms, learning is the difference between time 1 (before learning) and time 2 (after learning). This is why learning (and, therefore, changes in long-term memory) is generally inferred from some kind of assessment or test. Tests are a measure of performance at time 2, and they are how we know whether learning has or hasn’t taken place. Learning itself cannot be seen; it can only be inferred by observing performance. The same is true for long-term memory. In other words, learning and memory are hypothetical constructs.

We, therefore, distinguish between our potential (for example, our ability to speak Dutch) and our performance (ordering coffee in an Amsterdam cafe). These permanent changes, however, aren’t confined to learning. Developmental processes also lead to changes in behaviour, as can brain trauma. In addition, the ability to perform can fluctuate as the result of internal factors such as fatigue, drugs or anxiety, and environmental conditions like temperature and noise.

The problem is that sometimes our performance can give the impression that learning has taken place when, in reality, we are missing the ‘more or less permanent’ part. Say I gave a group of participants a list of random words to learn and then, after presenting them with the list, tested their ability to recall the words. Results will certainly fluctuate between participants, but we can assume that they will all recall at least some of the words. This recall would be a measure of performance: the observed change from time 1 (the presentation) to time 2 (the recall test).

I then decide to wait a week and test them again on the same list of words (time 3). Chances are, their performance will have declined between time 2 and time 3, meaning that learning has failed the ‘more or less permanent’ criterion. Nevertheless, success at time 2 created the illusion of learning because information like this takes time to decay.

This isn’t a ground-breaking revelation: Hermann Ebbinghaus had already noted this back in the 1800s. If the list of words isn’t actively rehearsed between time 2 and time 3, the information is unlikely to be recalled as successfully, a pattern described by the forgetting curve (see below). This might indicate that the information wasn’t fully encoded into memory in the first place (or, at least, that storage strength was inadequate).

Source: https://www.growthengineering.co.uk/what-is-the-forgetting-curve/

According to Ebbinghaus, the speed of forgetting is dependent on a number of factors, including the difficulty of the learned material, physiological factors such as stress and sleep, and the representation of the material. Better representation, such as the use of mnemonic techniques and representation based on active recall, supports long-term retention.
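The shape of the forgetting curve can be sketched with a simple exponential decay model, in which retention falls as time passes and falls more slowly when the memory is more ‘stable’. This is a common textbook simplification rather than Ebbinghaus's own formula, and the `stability` parameter and the values used below are illustrative assumptions, not measured data.

```python
import math


def retention(days_elapsed: float, stability: float) -> float:
    """Estimate the fraction of material retained after `days_elapsed` days.

    Assumes a simple exponential decay, R = exp(-t / S), where S (`stability`)
    is a hypothetical parameter standing in for factors such as material
    difficulty and quality of representation.
    """
    if days_elapsed < 0:
        raise ValueError("days_elapsed must not be negative")
    if stability <= 0:
        raise ValueError("stability must be positive")
    return math.exp(-days_elapsed / stability)


# Compare a weakly encoded memory with a more strongly encoded one.
for stability in (2.0, 10.0):
    row = ", ".join(
        f"day {day}: {retention(day, stability):.0%}" for day in (0, 1, 7)
    )
    print(f"stability={stability}: {row}")
```

Under these assumptions, both memories start at full retention, but the one with the lower stability value has almost vanished after a week while the other is still partly intact.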

One way of doing this is by incorporating into our learning what Robert Bjork describes as desirable difficulties. By making retrieval difficult (but not too difficult) we trigger encoding and retrieval processes that support learning, comprehension and remembering. However, if we do not possess the background knowledge and skills to respond to these difficulties, they become undesirable.

There are a number of different effects that can give us an idea of how these desirable difficulties operate.

The generation effect refers to the long-term benefit of generating an answer, solution or procedure versus the relatively poor retention seen when we are simply presented with it. The retrieval of learned information, therefore, is a more effective strategy than, say, re-reading the material, because retrieval requires more cognitive effort. One study, for example, found that using cues (such as the first two letters of a learned item or an anagram) resulted in better retention than re-reading the item, while another found similar results with blurry fonts.

The generation effect is related to the testing effect: by regularly testing ourselves on the material (perhaps through a formal quiz or the use of flashcards), we further strengthen our ability to recall it.

A number of studies have also highlighted the association between perceptual processing and improved retention. For example, a 2010 study found improved recall for information presented in hard-to-read fonts.

Two interrelated strategies, however, appear to produce the best results: distributed practice (an example of the spacing effect) and interleaving.

Distributed practice is perhaps best explained by referring to its alternative, massed practice, of which cramming is the classic example. Learning something new or revising something we have already learned often takes place in a single block and over a relatively brief period of time, such as the night before an exam. This can result in rapid gains, but such gains are usually short-lived. Distributed practice, on the other hand, involves shorter study sessions distributed over a longer period.

One of the reasons why distributed practice appears to be so successful is reminding mechanisms: each time the information is presented, the effort required to remember what was previously learned enhances retention. This means that the more often the information is revisited, the stronger learning becomes. As the lag (the time between each learning episode) increases, so does the difficulty, and learning becomes more effortful. However, at some point the lag increases to such an extent that any benefit from the spacing effect disappears. When retention is plotted against lag on a graph, it takes the form of an inverted U.

Interleaving takes the form of inserting different topics or tasks within these lags. We might, therefore, interleave our practice of ordering a coffee in a Dutch cafe with asking for directions to the railway station, or (in the case of a psychology course) studying attachment on day 1, memory on day 2, attachment on day 3, and so on.
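The difference between a blocked and an interleaved timetable can be sketched in a few lines of code. The function names and the one-session-per-day layout are assumptions made purely for illustration; the topics are the ones from the psychology example above.

```python
from itertools import cycle, islice


def interleaved_schedule(topics, sessions):
    """Return (day, topic) pairs that rotate through the topics in turn."""
    if not topics:
        raise ValueError("at least one topic is required")
    rotation = islice(cycle(topics), sessions)
    return list(enumerate(rotation, start=1))


def blocked_schedule(topics, sessions_per_topic):
    """Return (day, topic) pairs that finish one topic before starting the next."""
    order = [topic for topic in topics for _ in range(sessions_per_topic)]
    return list(enumerate(order, start=1))


topics = ["attachment", "memory"]

print("Interleaved:")
for day, topic in interleaved_schedule(topics, sessions=4):
    print(f"  day {day}: {topic}")

print("Blocked:")
for day, topic in blocked_schedule(topics, sessions_per_topic=2):
    print(f"  day {day}: {topic}")
```

Both timetables contain the same number of sessions per topic. The interleaved version simply places the other topic inside the lag between two sessions on the same material.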

Using distributed practice and interleaving doesn’t lead to the same rapid gains that we see in massed practice and blocked learning. Indeed, it may at first appear that very little learning is taking place. However, the gains become more apparent over time in terms of long-term retention. 
