The problem of linear A-levels and how cognitive load theory may help us.

Whilst I have many anxieties about moving to linear A-levels (see previous blog), I think this period of change is an ideal time for us to use evidence-based approaches to improve our practice.  As our schemes of learning evolve, we can reshape the what and how of our curriculum.

There is a coherent canon of strategies emerging from cognitive science that provides a clear evidence-based approach to curriculum design. Not all pedagogical methods are equal, and there is an opportunity cost to relying on discovery methods with novices. The research evidence is helping us shine a light on how students learn. This knowledge of pupil learning can seep into our schemes of work, lesson plans and pedagogical discussions. Finally, how we learn is being linked to what we learn and why we learn it. Sweller’s cognitive load theory helps us think about the relationship between learning and memory and provides clear examples of what might help and why.

What is the relationship between learning and memory?

My own PGCE was bereft of any mention of memory or the relationship between learning and memory.  On reflection, it seems difficult to define learning at all without some mention of it.

David Didau: ‘Learning is the long term retention of knowledge and skills and the ability to transfer between them.’

Kirschner, Sweller & Clark: “If nothing has changed in long-term memory, nothing has been learned.”

Dan Willingham: “Memory is as thinking does … whatever students think about is what they will remember.  Memory is the residue of thought.”

How can we adapt our teaching to improve long term memory and develop understanding?  Recent ideas surrounding cognitive load theory may help us as we rethink our curriculum design and pedagogical preferences. Not all cognitive load is bad; however, our ability to learn is impaired when the working memory is overburdened with intrinsic or extraneous cognitive load.



Cognitive Load Theory

Sweller’s (1988) Cognitive Load Theory (CLT) suggests that our working memory is only able to hold a limited amount of information (around 4 chunks) at any one time, and that to maximise learning our teaching methods should avoid overloading it.

Human cognitive architecture.

Building on the work of Baddeley & Hitch (1974), CLT views human cognitive architecture as comprising a Working Memory (WM) and a Long Term Memory (LTM).



Our WM is limited but has many different components that are responsible for directing attention and coordinating cognitive processes.  It is like a mental post-it note, with a limited capacity to hold and use information for a short amount of time.  It is fragile, unstable and easily distracted.

LTM, on the other hand, has an endless capacity for storage and works with WM to retrieve information. Our LTM holds all of our schemas, or pockets of information about the world, and if we can link our explanations to these schemas we reduce the cognitive load.



Neuroscience makes a clear distinction between the two. It holds that working memory is related to temporary activation of neurons in the brain. In contrast, long-term memory is thought to be related to physical changes to neurons and their connections. This can explain the short-term nature of working memory as well as its greater susceptibility to interruptions or physical shocks.

If the cognitive load overwhelms our processing ability we will struggle to complete an activity successfully.  Learning is hampered when the working memory capacity is exceeded in a learning task.  We need to help students transfer the knowledge to long term memory, which will reduce their cognitive load when asked to draw on this material.  However, if subject knowledge is incomplete, the student will not be able to fall back on their long term memory, which will lead to further cognitive overload, failure to follow instructions, incomplete recall and task abandonment.

What can we do to reduce cognitive load?

Activate existing schemas and prior knowledge before sharing information – this makes it easier to accommodate and assimilate new information.

Use both verbal and visual information when presenting to students. Our working memory processes these types of information in different subsystems (the visuospatial sketchpad and the phonological loop), so using both pathways will reduce the cognitive burden on working memory (the modality effect).

Types of cognitive load.

Sweller identifies three types of cognitive load:


Intrinsic Load – inherent difficulty of the subject matter being learnt.  It seems as though there is a sweet spot between Bjork’s desirable difficulties and Sweller’s cognitive overload.  If the task is too difficult it will overload the working memory and make it difficult to process the information.  The intrinsic load can be decreased by instructional techniques such as ‘simple to complex’ explanations or part-whole explanations where individual elements are taught first before an integrated task.

Extraneous Load – How the subject matter is taught. Described as “a load that is not necessary for learning”, this is the bad type of cognitive load as it does not directly contribute to learning.  This includes poor instructional techniques or any activities which distract from the learning task itself.

Harry Fletcher-Wood suggests we must be aware of cognitive load theory in the planning and preparation of our lessons.  Extraneous cognitive load is a distraction caused by tasks which occupy working memory but do not contribute to the formation of long-term memories. This includes:

  • Splitting attention: asking students to refer to two sources of information simultaneously; we can avoid this by ensuring information, such as labels, is where it will be needed.
  • Redundancy: information which adds nothing detracts from learning; we can avoid this by removing irrelevant and distracting labels and text.
  • Expertise reversal: support which helps novices, like model answers, can hinder experts; we can monitor students’ success closely and reduce support as students become more skilled.

Germane Load – This is the good type of cognitive load.  This refers to the load imposed on working memory when it is processing information and transferring it to LTM.  We should choose teaching methods that create germane cognitive load.  We must redirect learners’ attention away from processes that are a distraction and not relevant to learning, and towards processes that are relevant to learning and new schema construction.

How to reduce Cognitive Load?

There seem to be two obvious ways to implement the ideas surrounding CLT.  First, in the planning and sequencing of learning.  Our schemes must break down subject content and consider moving from the simple to the complex.  Dan Willingham tells us that stories are psychologically privileged as the mind thinks and remembers in stories.  We must use this human mental architecture to benefit our students.  What stories are we going to tell?

Second, the ideas of CLT will also influence how we tell these stories.

1. Explicit Teaching / Direct Instruction

“Decades of research clearly demonstrate that for novices, direct instruction is more effective and more efficient than partial guidance. So when teaching new content and skills to novices, teachers are more effective when they provide explicit guidance accompanied by practice and feedback”

Clark, Kirschner & Sweller, 2012, p.6

CLT suggests that, when learning new material, explicit or direct teaching methods are best as they reduce cognitive load. Novices should not be left to ‘discover’ the learning for themselves. Teaching novices is most effective when new content and skills are explicitly taught alongside practice and feedback.  One of the best blogs I have read on explicit teaching is Ben Newmark’s Ten Principles for Great Explicit Teaching.

Ben celebrates the welcome return of the sage on the stage approach and gives us a comprehensive run through of the essential ingredients needed for explicit teaching, with the rallying cry that we should all practise our explanations. Ben bravely films himself and asks for student feedback on the quality of the explanation. I am not sure I am that brave, but I can see that time spent planning and refining how I am going to explain a concept can be more worthwhile than other PPA activities. We should all update our subject-specific knowledge and teach children to listen silently and not interrupt. We should use storytelling techniques, rhetorical questions, cliff-hangers, metaphors and analogies, and we should practise and rehearse delivering explanations. Ben argues we should see explanations as short theatrical performances which need refining and honing. We should use boardwork to structure learning, with clear, neat illustrations, and vary our tone, inflection and cadence. We should repeat explanations and link back, teach from the front and ask for feedback. Here is an Oliver Caviglioli visual summary:


2. Worked Examples

Sweller’s research suggests that students need cognitive support when learning how to solve problems.  Unguided problem-solving puts too much burden on the working memory, preventing students from transferring information into their long term memories.

“Worked examples can effectively provide us with problem-solving schemas that need to be stored in the long-term memory using the information store principle”

“Worked examples impose a relatively low working memory load … compared to solving problems with a means-end search”

Sweller, Ayres & Kalyuga (2011) via Oliver Caviglioli.

Students benefit from model answers and worked examples, particularly step-by-step guides.  Providing students with models and worked examples can help them learn to solve problems faster.


My favourite example of this is a recent blog and YouTube clip by John Tomsett on walking through a mock paper and talking through all the meta-cognitive thinking with a visualiser.  John provides further examples in a later blog on the art of the paragraph, where live writing is used to help structure extended writing.

This technique will not surprise most teachers, as model answers and worked examples are a staple of most classrooms.  For me, most lessons will involve essay plans, model answers and what sociology examiners like to call – a clear chain of reasoning.  In the social sciences, the mark schemes do not require or provide a specific structure – so we have become good at developing our own structures and writing frames – PERC, PEEL, PERVERT, PEE.

I am attempting to use knowledge organisers more effectively to reduce cognitive load – with a clear structure using the assessment objectives AO1, AO2 and AO3.  This gives students clear hooks to hang the learning on.

3. Expertise Reversal Effect.

Sweller suggests that we need to be aware of the differences between novices and experts.  The models and worked examples that provide effective cognitive support for novices become a hindrance for experts.  When a student has become an effective learner, the additional support becomes redundant and a distraction.  We need to monitor our students closely and take away the scaffolding / stabilisers when it is apparent that they are no longer needed or that they are hindering further progress.

Both of these points are applied in the classroom when a teacher shows a model answer, then provides a writing frame to help students structure an answer, and eventually sets just a question as the students retain more knowledge and become more expert at the task.

4. The Redundancy Effect.

Students do not learn when their limited working memory is directed to unnecessary or redundant information. This happens when learners are presented with additional information that is not directly relevant to learning, or with the same information in different forms. Providing learners with additional information is not harmless; it can be a cause of instructional failure. Examples include a textbook that has both a diagram and text carrying the same information, or a PowerPoint presentation in which the presenter reads out the text on the screen. This inhibits learning as it overloads the working memory.

5. Split-Attention Effect

The split-attention effect occurs when learners are required to process two or more sources of information simultaneously in order to understand the material, for example when a diagram cannot be understood without reading the accompanying text at the same time. The learner is required to hold both pieces of information in mind and is left to amalgamate them, which can contribute to cognitive overload. If one of the sources adds nothing new, it should be eliminated; otherwise the information should be physically integrated, following a dual coding approach, which will reduce cognitive load.

This year I have been focusing on my board work and abandoning many of my PowerPoint slides where possible.  This ‘retro’ approach has helped me finesse my explanations and think about my use of dual coding: what words and images will help embed this concept?  Whilst my artwork is not perfect, I feel it encourages the students to adopt a similar approach in their own notes and prevents split attention.

6. Modality effect.

It is possible to reduce extraneous load by using more than one mode of communication. Evidence on working memory suggests that presenting information through both its auditory and visual channels can increase its effective capacity. For example, when using a diagram and text to explain a concept, the text can be communicated in spoken form.

Oliver Caviglioli summarises the evidence for dual coding in his blog about the evidence surrounding visuals and learning.  Visuals support attention and bring essential information to the fore.  Visuals activate prior knowledge and help build mental models.  By using both channels, students can absorb more information than is usually possible, avoiding cognitive overload.

Final thoughts.

There is a range of problems and challenges with linear A-levels. Two years in, I am still developing my thinking about how best to deliver them.  Nonetheless, this might also be an opportunity for us to re-shape and embed the lessons of evidence-based practice and to start creating the type of learning that we want to see in our classrooms.  We need to stop pretending that all teaching methods are equal and start putting explicit teaching front and centre, in the same way that phonics has become a staple of the primary classroom.  We need to spend less time marking and data tracking and more time developing our subject-specialism and refining our explanations. We need to not be shy about getting feedback on those explanations. We also need to know more about how memory works and what we can do to aid the transfer of knowledge and skills between different contexts.


Caviglioli, O. (2018) Six Ways visuals help learning.

Didau, D. (2016) When do novices become experts.

Clark, R., Kirschner, P. & Sweller, J. (2012) Putting students on the path to learning: The case for fully guided instruction. American Educator, Spring, pp. 6-11.

Fletcher-Wood, H. (2018) Planning Lessons Using Cognitive Load Theory.

Newmark, B. (2017) Ten Principles for Great Explicit Teaching.

Proffitt, C. (2018) Danger – Overload.

Shibli, D. & West, R. (2018) Cognitive Load Theory and its application in the classroom. Impact: The Science of Learning. Chartered College of Teaching, pp. 18-20.

Sweller, J. (1988). Cognitive Load during Problem Solving: Effects on Learning. Cognitive Science, 12, p.257-285.

Sweller, J., Ayres, P. & Kalyuga, S. (2011) Cognitive Load Theory (Explorations in the Learning Sciences, Instructional Systems and Performance Technologies). Springer.