The Biology of Reward and Punishment (Part 2 of 3)
The human brain is designed to facilitate learning. When we understand this design, and its drivers, we can achieve self-mastery, happily.
In the first article of this 3-part series, we explored the risks associated with big change, which we see all too frequently in failed diets and New Year’s resolutions. We unearthed the sad reality that we fail because we don’t appreciate the underlying biology of learning. In this article, we explore the Biology of Reward and Punishment …
To survive, all animals must adapt their behavior in response to their experiences. Humans are no different. We call this learning.
To do this, we need the computational capacity (or brain power) to process the results of our behavior and actions. If you do something that is good for you, your brain measures the benefit, records it, and then integrates it into your future decisions.
Let’s imagine that you decide to have a good night’s sleep. That’s a healthy thing to do. Your brain records the positive consequences (or reward) of the good night’s sleep, stores this in your memory, and then serves it up again the next time you debate whether or not to go to bed early.
Similarly, our brain records, stores and retrieves bad experiences (or punishment) for future use. If you place your hand on a hot stove, you get burnt, and it is a very good idea not to do it again. Your brain logs this event, stores it, and retrieves it the next time you’re in the proximity of a hot stove.
The simplest form of these complex thought patterns was described by Edward (Ted) Thorndike, a psychologist and researcher from Columbia University. In the late nineteenth century, he described his “law of effect”. You will recognize it as a stimulus-response model of the kind made famous by Pavlov and his dogs, and used by many subsequent scientists. This law states that each of us has the ability to learn an appropriate response to a specific stimulus. If the outcome is positive, the association between stimulus and response is strengthened; if the outcome is negative, the association is weakened.
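To make the law of effect concrete, here is a minimal sketch in Python. It is an illustration only, not Thorndike's own formulation: the function name, the 0-to-1 strength scale, and the learning rate are all assumptions chosen for the example.

```python
def update_association(strength, outcome, learning_rate=0.2):
    """Return the new stimulus-response strength after one experience.

    Assumptions for this sketch: outcome > 0 stands for reward,
    outcome < 0 stands for punishment, and strength is kept
    between 0.0 (no association) and 1.0 (fully learned).
    """
    new_strength = strength + learning_rate * outcome
    # Clamp so the strength never leaves the 0.0 to 1.0 range.
    return max(0.0, min(1.0, new_strength))


# Hypothetical example: the link between "see hot stove" and "touch it".
strength = 0.5
strength = update_association(strength, outcome=-1.0)  # burnt: link weakens
strength = update_association(strength, outcome=-1.0)  # burnt again: weaker still
```

A positive outcome nudges the number up and a negative outcome nudges it down, which is all the simple stimulus-response framework asks of us.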
There are far more complex pathways, including deliberate actions, and even situational variations, that our powerful brains can process, but it’s enough for our purposes to understand this simple stimulus-response framework. And I’m sure that you’re beginning to appreciate that the currency of learning is reward and punishment.
For survival, many choices need to be made instantaneously. Quick action requires an instantaneous evaluation of the value of each possible future action based on both expected rewards and expected punishments. For less urgent decisions, we get more time to weigh the likely balance of reward and punishment of our future actions or behavior. But the principle is the same; all of our important decisions are forward-looking. As we make that decision, we are anticipating expected reward and punishment.
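This weighing of expected reward against expected punishment can be sketched in a few lines of Python. The sketch assumes that each option can be summarized by two made-up numbers; the function name and the bedtime values are hypothetical, and real brains are certainly not this tidy.

```python
def choose_action(options):
    """Pick the action with the best expected balance.

    `options` maps an action name to a pair:
    (expected_reward, expected_punishment).
    The balance is simply reward minus punishment.
    """
    return max(options, key=lambda name: options[name][0] - options[name][1])


# Hypothetical bedtime decision; the numbers are illustrative only.
options = {
    "go to bed early": (0.8, 0.1),
    "stay up late": (0.5, 0.6),
}
best = choose_action(options)
```

Note that the decision looks only forward: it uses what we expect to happen, which is itself built from what happened before.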
Fortunately, Mother Nature has provided us with an elaborate system for integrating the results of previous actions (whether reward or punishment) into our behavior.
The Anatomy of Reward
Understanding the intricate biology of learning and behavioral adaptation has been slow work. In part, this is the consequence of the very complexity of the problem and of Mother Nature’s beautiful solution. In part, it’s a function of the tiny size of the nerve centers that are controlling reward and punishment. In part, it relates to the difficulty in evaluating a fully functioning brain in a totally responsive animal or human under controlled circumstances. We can’t just take out a brain, take it apart, and understand how it works!
One of the tools that has helped enormously is Functional Magnetic Resonance Imaging (or fMRI). These are brain scans that track changes in blood flow and oxygenation, an indirect measure of neural activity, within the awake brain as it performs its duties. Much of our recent insight comes from studies using this specific technology.
The circuits of reward and punishment, although intimately related, are housed in different parts of the brain. As they drive a critical survival skill, it shouldn’t surprise you to find out that they both reside deep within our primitive, instinctual brains, and are amongst the earliest centers in our brain to have evolved. Both centers are common to all vertebrates (animals with a backbone), and serve as functional links between the pathways of sensory input and the pathways of behavioral output.
The primary brain structures responsible for reward include the ventral tegmental area, the ventral striatum, the dorsal striatum, and the substantia nigra. Activation of the reward circuitry triggers neurons in the midbrain to release dopamine, a chemical messenger (often called a brain hormone) that elevates mood and influences motivation.
The primary brain structure responsible for punishment is known as the habenular complex (sometimes simply known as the habenula). This tiny structure encodes negative feedback and drives the aversive learning process, helping us to avoid unpleasant experiences. It also sends signals deep into the midbrain, where it works in opposition to the reward system, impeding the release of the “happy hormones” dopamine and serotonin. Notably, this structure not only reacts to a direct punishment, but has also been shown to register the absence of reward.
These two components perform mirrored roles in the symmetric reward-punishment system. Good stimuli (and the expectation of reward) result in the release of “happy hormones”. Bad stimuli (and the expectation of punishment) inhibit the release of “happy hormones”.
The Expectation of Reward and Punishment
The element of expectation is critical to our understanding of happiness. The reason is simple, if not intuitive. When we expect reward, but it doesn’t happen, the brain senses this as punishment. Similarly, if we expect punishment, and it doesn’t happen, the brain registers this as reward. All this at the deep, chemical level.
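One way to picture this is as a simple subtraction: what actually happened, minus what we expected. The Python sketch below is an illustration under stated assumptions, not a model of the real chemistry. It assumes outcomes sit on a single made-up scale where positive numbers are rewards, negative numbers are punishments, and zero means nothing happened; the function names are hypothetical.

```python
def prediction_error(expected, actual):
    """Return how much better (positive) or worse (negative)
    the outcome was than we expected."""
    return actual - expected


def felt_as(expected, actual):
    """Label the experience by the sign of the prediction error."""
    error = prediction_error(expected, actual)
    if error > 0:
        return "reward"
    if error < 0:
        return "punishment"
    return "neutral"


# Expected a reward (+1.0) but nothing happened (0.0).
missed_reward = felt_as(expected=1.0, actual=0.0)

# Expected a punishment (-1.0) but nothing happened (0.0).
avoided_punishment = felt_as(expected=-1.0, actual=0.0)
```

In this sketch the same non-event is labeled a punishment in the first case and a reward in the second, purely because the expectation was different.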
A deep understanding of this complex regulatory circuitry and its chemical expression helps us to better manage our own aspirations, plans and expectations. We are better able to secure long-term happiness.
In Part 3 of this series, we explore the Activation of Reward … what you need to do (differently) to harness the Power of Reward.
Dr. Roddy Carter, MD, has over 30 years of experience working across a range of medical disciplines and corporate settings.
At the height of his successful career, Roddy experienced his own health and happiness crisis. During this profoundly transformative time, he began applying his deep knowledge of performance neuroscience to his everyday life. He discovered that, in moments of trauma, the brain develops intricate psycho-protective adaptations to ensure our short-term survival; however, these adaptations often impose substantial residual limitations, create distress, and prevent us from reaching our full innate potential.