The prefrontal cortex has long been thought to subserve both working memory and "executive" functions. Although many computational models of working memory have been developed, the mechanistic basis of executive function remains elusive. PBWM is a computational model of the prefrontal cortex to control both itself and other brain areas in a strategic, task-appropriate manner. These learning mechanisms are based on subcortical structures in the midbrain, basal ganglia and amygdala, which together form an actor/critic architecture. The critic system learns which prefrontal representations are task-relevant and trains the actor, which in turn provides a dynamic gating mechanism for controlling working memory updating. Computationally, the learning mechanism is designed to simultaneously solve the temporal and structural credit assignment problems. The model's performance compares favorably with standard backpropagation-based temporal learning mechanisms on the challenging 1-2-AX working memory task, and other benchmark working memory tasks.
Model
First, there are multiple separate stripes in the prefrontal cortex and striatum layers. Each stripe can be independently updated, such that this system can remember several different things at the same time, each with a different "updating policy" of when memories are updated and maintained. The active maintenance of the memory is in prefrontal cortex, and the updating signals come from the striatum units. PVLV provides reinforcement learning signals to train up the dynamic gating system in the basal ganglia.
The sensory input is connected to the posterior cortex which is connected to the motor output. The sensory input is also linked to the PVLV system.
Posterior cortex
The posterior cortex form the hidden layers of the input/output mapping. The PFC is connected with the posterior cortex to contextualize this input/output mapping.
PFC
The PFC has a localist one-to-one representation of the input units for every stripe. Thus, you can look at these PFC representations and see directly what the network is maintaining. The PFC maintains the working memory needed to perform the task.
Striatum
This is the dynamic gating system representing the striatum units of the basal ganglia. Every even-index unit within a stripe represents "Go", while the odd-index units represent "NoGo." The Go units cause updating of the PFC, while the NoGo units cause the PFC to maintain its existing memory representation. There are groups of units for every stripe. In the PBWM model in Emergent, the matrices represent the striatum.
PVLV
All of these layers are part of PVLV system. The PVLV system controls the dopaminergic modulation of the basal ganglia. Thus, BG/PVLV form an actor-critic architecture where the PVLV system learns when to update.
SNrThal
SNrThal represents the substantia nigra pars reticulata and the associated area of the thalamus, which produce a competition among the Go/NoGo units within a given stripe and mediates competition using k-winners-take-alldynamics. If there is more overall Go activity in a given stripe, then the associated SNrThal unit gets activated, and it drives updating in PFC. For every stripe, there is one unit in SNrThal.