Associative Conditioning

Classical Conditioning

Pavlov’s experiment

Pavlov was interested in the digestive systems of dogs. He noticed that when dogs are shown food, they start to salivate. If the food is repeatedly paired with a sound (a tuning fork), the dogs start to salivate even when they only hear the sound. He distinguishes three stages:

  • Before conditioning
  • During conditioning
  • After conditioning

The important terms are conditioned stimulus and conditioned response, and their opposites (unconditioned stimulus and unconditioned response). It is important that the pairing is quite consistent.

Associate unconditioned stimulus with conditioned stimulus.

Importance of reliability

If the unconditioned stimulus is not reliably paired with the conditioned stimulus, the conditioning is not strengthened well.

Rescorla-Wagner model

The expected reward is modeled as a linear function of the stimulus: $V = wu$, where $u$ is a binary variable that indicates whether the conditioned stimulus is present, and $w$ is the strength of the association (initially 0). The prediction error is $\delta = R - V$, where $R$ is the reward actually received, and $w$ is updated in proportion to this error, $w \leftarrow w + \epsilon \delta u$ with a small learning rate $\epsilon$, similar to The Perceptron Model.

It already explains many phenomena:

  • Blocking: if a first tone already predicts the reward, a second tone presented together with it is not learned, because the first tone alone leaves no prediction error to explain (both tones are still present).
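
Blocking can be reproduced with a few lines of simulation. The sketch below is only an illustration under stated assumptions: it extends the scalar $V = wu$ to several stimuli by summing their contributions ($V = \sum_i w_i u_i$), and the function name `rescorla_wagner`, the learning rate `lr`, and the trial counts are hypothetical choices, not values from the lecture.

```python
import numpy as np

def rescorla_wagner(stimuli, rewards, lr=0.1):
    """Apply the delta rule trial by trial.

    stimuli: array (n_trials, n_stimuli), 1 where a stimulus is present.
    rewards: array (n_trials,), reward delivered on each trial.
    Returns the weight history, shape (n_trials + 1, n_stimuli).
    """
    stimuli = np.asarray(stimuli, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    w = np.zeros(stimuli.shape[1])      # association strengths start at 0
    history = [w.copy()]
    for u, r in zip(stimuli, rewards):
        v = w @ u                       # expected reward (assumed summed over stimuli)
        delta = r - v                   # prediction error
        w = w + lr * delta * u          # only stimuli that are present are updated
        history.append(w.copy())
    return np.array(history)

# Phase 1: tone A alone is rewarded. Phase 2: tones A and B together are rewarded.
phase1 = np.tile([1, 0], (100, 1))
phase2 = np.tile([1, 1], (100, 1))
stimuli = np.vstack([phase1, phase2])
rewards = np.ones(len(stimuli))

weights = rescorla_wagner(stimuli, rewards)
w_a, w_b = weights[-1]
print(f"w_A = {w_a:.3f}, w_B = {w_b:.3f}")
```

Because tone A already predicts the reward almost perfectly at the start of phase 2, the prediction error is close to zero and tone B acquires almost no weight.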

Time Conditioning

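Experiment with flies, electric shock, and odor. The timing of the odor relative to the shock determines what is learned.

  • If the odor comes after the shock, it signals relief and the flies come to like it.
  • If the odor comes before the shock, or only shortly after it, the flies come to dislike it.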

Conditioned galvanic response? Prof. thinks it is innate but difficult to test.

Operant Conditioning

Animals associate their own actions with rewards or with aversive outcomes. The classical example is the one with mice. We first fix some vocabulary, since there has been some confusion around these terms over the years.

  • Reinforcement (increases behaviour)
    • Negative reinforcement: remove an aversive stimulus (escape), or perform some behaviour that prevents the aversive stimulus (avoidance)
    • Positive reinforcement: add something appetitive (reward, e.g. food)
  • Punishment (decreases behaviour)
    • Positive punishment: add an aversive stimulus (e.g. shock)
    • Negative punishment: remove something appetitive
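
The vocabulary above is a two-by-two table: the kind of stimulus (appetitive or aversive) crossed with what happens to it (added or removed). A minimal lookup sketch, where the function name `operant_term` and the string labels are hypothetical and chosen only for illustration:

```python
def operant_term(stimulus: str, change: str) -> str:
    """Name the operant-conditioning case.

    stimulus: 'appetitive' or 'aversive'
    change:   'added' or 'removed'
    """
    table = {
        ("appetitive", "added"): "positive reinforcement",   # behaviour increases
        ("aversive", "removed"): "negative reinforcement",   # behaviour increases
        ("aversive", "added"): "positive punishment",        # behaviour decreases
        ("appetitive", "removed"): "negative punishment",    # behaviour decreases
    }
    return table[(stimulus, change)]

print(operant_term("aversive", "removed"))
```

"Positive" and "negative" refer to adding or removing a stimulus, not to whether the outcome is pleasant.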

Dopaminergic neurons

Neuromodulators’ pathways

Neuromodulators are produced in specific parts of the brain, but from there they influence the whole brain.

Synthesis pathways

  • Dopamine: synthesized from the amino acid tyrosine (tyrosine -> L-DOPA -> dopamine).
  • Norepinephrine: also synthesized from tyrosine, directly from dopamine (so it is closely tied to dopamine, but slightly different from it).

Neuromodulators’ influence

  • Modulation can facilitate or invert neuronal activation.
  • Neuromodulators affect brain plasticity (some kind of push-pull behaviour).
  • They influence ion channel excitability.
  • Different neuromodulators can have the same effect:
    • Linear effect (makes the firing rate higher)
    • Nonlinear effect (changes the firing pattern)

Dopamine

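Dopamine neurons encode both whether a reward occurs and when it occurs!

  • Before conditioning (no predictive stimulus), an unexpected reward makes the dopamine neurons fire.
  • After conditioning, the dopamine neurons fire right after the conditioned stimulus.
  • If the predicted reward is then omitted, dopamine firing drops (less reward than predicted), which is nicely explained by the Rescorla-Wagner model.
  • Dopamine acts as a reward signal for learning: once the association is learned, the reward itself no longer triggers firing, which is nice. The neurons also predict the interval until the reward.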
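
The firing pattern above can be reproduced in a small simulation. The sketch below is an illustration under assumptions, not the model from the lecture: it uses temporal-difference learning, which extends the Rescorla-Wagner delta rule to time steps within a trial, with no discounting. The names `run_trial`, `t_cs`, `t_reward` and all parameter values are hypothetical.

```python
import numpy as np

def run_trial(v, t_cs, t_reward, reward, lr=0.2):
    """Run one trial and update the value estimates v in place.

    v[t] is the predicted future reward at time step t. It is learned only
    from CS onset onward, because nothing predicts the CS itself.
    Returns delta, the prediction error at every time step.
    """
    n = len(v)
    r = np.zeros(n)
    r[t_reward] = reward
    delta = np.zeros(n)
    for t in range(1, n):
        # Error on arriving at step t: reward now, plus the change in prediction.
        delta[t] = r[t] + v[t] - v[t - 1]
        if t - 1 >= t_cs:
            v[t - 1] += lr * delta[t]
    return delta

n_steps, t_cs, t_reward = 20, 5, 15
v = np.zeros(n_steps)

first = run_trial(v, t_cs, t_reward, reward=1.0)       # before conditioning
for _ in range(1000):
    trained = run_trial(v, t_cs, t_reward, reward=1.0)  # after conditioning
omitted = run_trial(v, t_cs, t_reward, reward=0.0)      # reward left out

for name, d in [("first", first), ("trained", trained), ("omitted", omitted)]:
    print(f"{name:8s} error at CS = {d[t_cs]:+.2f}, at reward time = {d[t_reward]:+.2f}")
```

In this sketch the error starts at the reward, moves to the conditioned stimulus after training, and becomes negative at the expected reward time when the reward is omitted.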

Magnitude of predicted reward

A larger predicted reward gives a higher response. A stimulus that predicts the reward with a smaller probability gives a lower activation (consistent with Rescorla-Wagner!).
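
Why the Rescorla-Wagner rule fits this can be seen from a short calculation. It is a sketch under assumptions: the stimulus is present on every trial ($u = 1$), a reward of size $R$ follows with probability $p$ (otherwise the reward is 0), and $\epsilon$ is an assumed symbol for the learning rate. The weight stops changing on average when the expected prediction error is zero:

$$
\mathbb{E}[\Delta w] = \epsilon \, \mathbb{E}[\delta] = \epsilon \, (pR - w) = 0
\quad\Longrightarrow\quad w^{*} = pR
$$

So the learned prediction grows with both the reward magnitude $R$ and the reward probability $p$.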

Difference between rewarded stimulus and unrewarded stimulus.

Pathways of reward-based and avoidance learning

People with Parkinson’s disease cannot learn much from trial and error. L-DOPA increases dopamine: with the medication they learn to choose rewarded options, but without it they learn only from aversive stimuli.

  • Dopamine helps learn active reward seeking.
  • Absence of dopamine prevents avoidance behaviours. VTA has some dopaminergic input.
  • VTAx neurons fire according to the reward prediction error (more if the outcome is better than expected, less if it is worse than expected).
  • An optogenetic experiment tests the above hypothesis (by experimentally manipulating the firing of some neurons). See Birdsong and Song System.
  • VTAx neurons encode relative syllable quality compared to a tutor’s song.

Delayed rewards are discounted.
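
One common way to write this down is exponential discounting. This form is an assumption, since the notes do not say which discount function is meant, and the symbols $\gamma$ (discount factor) and $\tau$ (delay in time steps) are introduced here only for illustration:

$$
V = \gamma^{\tau} R, \qquad 0 < \gamma < 1
$$

For example, with $\gamma = 0.9$ a reward $R$ that arrives after $\tau = 5$ steps is worth $0.9^{5} R \approx 0.59\,R$ now.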