The authors have declared that no competing interests exist.
The activity of a border ownership selective (BOS) neuron indicates where a foreground object is located relative to its (classical) receptive field (RF). A population of BOS neurons thus provides an important component of perceptual grouping, the organization of the visual scene into objects. In previous theoretical work, it has been suggested that this grouping mechanism is implemented by a population of dedicated grouping (“G”) cells that integrate the activity of the distributed feature cells representing an object and, by feedback, modulate the same cells, thus making them border ownership selective. The feedback modulation by G cells is thought to also provide the mechanism for object-based attention. A recent modeling study showed that modulatory common feedback, implemented by synapses with N-methyl-D-aspartate (NMDA)-type glutamate receptors, accounts for the experimentally observed synchrony in spike trains of BOS neurons and the shape of cross-correlations between them, including its dependence on the attentional state. However, that study was limited to pairs of BOS neurons with consistent border ownership preferences, defined as two neurons tuned to respond to the same visual object, in which attention decreases synchrony. But attention has also been shown to increase synchrony in neurons with inconsistent border ownership selectivity. Here we extend the computational model from the previous study to fully understand these effects of attention. We postulate the existence of a second type of G-cell that represents spatial attention by modulating the activity of all BOS cells in a spatially defined area. Simulations of this model show that a combination of spatial and object-based mechanisms fully accounts for the observed pattern of synchrony between BOS neurons. Our results suggest that modulatory feedback from G-cells may underlie both spatial and object-based attention.
Vision allows us to make sense out of a very complex signal, the patterns of light rays reaching our eyes. Two mechanisms are essential for this: perceptual organization which structures the input into meaningful visual objects, and attention which selects only the most important parts in the input. Prior work suggests that both of these mechanisms are implemented by neurons called grouping cells. These organize the object features into coherent entities (perceptual grouping) and access them as needed (selective attention). For technical reasons it is difficult to observe grouping cells but their effect can be seen in the influence they have on responses of other classes of cells. These responses have been measured experimentally and it was found that they depend in unexpected ways on where the subject was attending. Using a computational model, we here demonstrate that the responses can be understood in terms of the interaction between two kinds of selective attention, both of which are known to occur in primate perception. One is attention to a specific area in the environment, the other is to specific objects. A model including both of these attentional mechanisms generates neuronal responses in agreement with the observed patterns of neural activity.
In this study, we focus on the interplay of two features of intermediate vision. The first is selective attention, which enhances perception of particular sensory stimuli [1–5]. Top-down visual attention can be categorized into at least three distinct types: spatial, feature-based, and object-based attention [6–8]. Here we consider spatial and object-based attention. It has been suggested that these types of attention rely on distinct cortical pathways [9] for enhancing the related neural responses and for improving the discriminability of visual stimuli [10–12].
The second feature is figure-ground segregation, the integration of visual features into objects and the segmentation between different objects and the background. This is an important step in understanding complex scenes, with images of many objects projected simultaneously onto the retinae. It has been proposed that an early step in figure-ground segregation is to establish on which side of the border of a foreground object (figure) this object is located. This has been called border ownership since the foreground object, which is closer to the observer, determines the fate of the border (e.g. when the object is moving) and therefore “owns” it [13–15].
The majority of neurons in intermediate-level visual area V2 are border ownership selective, with their responses depending on which side of a border owns the border. These are the Border Ownership Selective (BOS) neurons [16]. Various characteristics and mechanisms of BOS neurons have been investigated through physiological methods [17–20], studies of human perception [21–23] and computational models [24–31]. One computational method that is designed to draw conclusions on the structure of the neuronal circuitry underlying the observed activity patterns is the analysis of neuronal correlations. In particular, common input plays a critical role in inducing synchronized responses between postsynaptic neurons. For this reason, it is believed that analyses of spike train correlations and spike synchrony between neurons can provide insights into neuronal connectivity [32–34] (but see ref [35] for a cautionary note). Intriguingly, several studies have indicated that common input may modulate the activity of postsynaptic neurons rather than drive it (i.e. increase their mean firing rate by itself). For example, modulatory input from higher visual areas increases the firing rates of striate (V1) and extrastriate (V2) neurons [25, 36] related to top-down attention [37] and figure-ground segregation [16, 38]. In this case, common input spikes from a higher area may not evoke action potentials in the target neurons by themselves but will transiently enhance the firing rate increase that is caused by feedforward input from bottom-up visual stimuli.
Recently Martin and von der Heydt [39] have physiologically characterized effects of grouping structure and attention on spike train correlation between BOS neurons (Fig 1). Building on previous work [18] that failed to support the “binding-by-synchrony” hypothesis (review: ref [40]), their work showed that spiking synchrony between neurons depends on their border ownership selectivity. For pairs of neurons with consistent border ownership preference, stimulation by a common object increased spiking synchrony, but selective attention to the object, while increasing firing rates, decreased spiking synchrony. A recently proposed computational model explains both the firing rate changes and synchrony structure in these neurons [41]. The model is based on the assumption that feedback from hypothetical grouping cells (G-cells) at higher visual areas modulates the activities of BOS neurons. Importantly, the feedback does not drive the activity of neurons by itself but rather shapes the activity caused by visual input. The model postulates that this modulatory feedback is implemented by glutamatergic synapses of the N-methyl-D-aspartate (NMDA) type [42, 43]. Activation of NMDA receptors by itself does not increase firing rates of postsynaptic neurons substantially, but it increases the effect of excitatory input from other types of receptors, typically of the glutamatergic α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) type.
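The modulatory character of NMDA-mediated input described above can be illustrated with a commonly used phenomenological description of the voltage-dependent magnesium block of NMDA receptors. The sketch below is a toy illustration only: the source does not state which synapse formulation the model uses, and the function names, the magnesium concentration of 1 mM, the reversal potential of 0 mV and the unit conductance are all assumptions made here.

```python
import math


def nmda_unblocked_fraction(v_mv, mg_mm=1.0):
    """Fraction of NMDA conductance not blocked by Mg2+ at potential v_mv (mV).

    Standard sigmoidal magnesium-block expression; parameters are textbook
    values, not values taken from the model in the text.
    """
    return 1.0 / (1.0 + (mg_mm / 3.57) * math.exp(-0.062 * v_mv))


def nmda_current(v_mv, g_max=1.0, e_rev_mv=0.0):
    """Synaptic current in arbitrary units; negative values are inward."""
    return g_max * nmda_unblocked_fraction(v_mv) * (v_mv - e_rev_mv)


# Near rest the channel is mostly blocked; depolarization (for example by
# AMPA-mediated feedforward input) relieves the block.
for v in (-70.0, -55.0, -40.0):
    print(v, round(nmda_unblocked_fraction(v), 3), round(nmda_current(v), 2))
```

Under these assumptions the same NMDA conductance carries little current near the resting potential and considerably more once the cell is depolarized, which is one way to read the statement that the feedback shapes, rather than drives, the activity caused by visual input.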

![Pairs of BOS neurons and conditions for visual input and attention (modified from ref [39]).](/dataresources/secured/content-1766021930388-ea36e916-5e4f-487c-815e-6c3993d1358d/assets/pcbi.1008829.g001.jpg)
Pairs of BOS neurons and conditions for visual input and attention (modified from ref [39]).
A: Stimulus displays for testing the effects of object integration and selective attention. Ellipses on the borders of the keystone-like objects represent the receptive fields (RFs) of border ownership selective (BOS) neurons. In each display, three separate objects are presented. In the left display the RFs of two neurons are on the borders of two different objects (“unbound” condition). In the middle and right displays the two RFs lie on the borders of the same object (“bound” condition). Note that the visual stimuli in and around the RFs are identical in all three conditions, but represent parts of two separate objects in the left display and parts of the same object in the other two displays. In these experiments, the monkey attended one of the objects, as shown by a red star (not part of the display). Such an object is called “attended” while objects that are not attended are referred to as “ignored.” B: Consistent and inconsistent pairs. Arrows from an RF point toward the preferred side of the corresponding BOS neuron. Subscripts L and R indicate left and right side-of-figure preferences, respectively, while retinotopic position is represented by the superscripts “1” and “2”. RFs of neurons whose border ownership preferences are consistent with representing a common object are connected by black dashed lines (“consistent pairs”), while RFs of neurons with inconsistent preferences are connected by gray dashed lines (“inconsistent pairs”).
A second major result of the study by Martin and von der Heydt [39] is that for pairs of neurons with inconsistent BOS preference, attention to the common object increased spike synchrony. This observation was not addressed in the computational model proposed in ref. [41] which focused strictly on consistent neuronal pairs and the paradoxical reduction of synchrony by attention. The main goal of the present study is to understand why attention can increase synchrony between inconsistent pairs.
As in previous models, we assume that responses of model BOS neurons are modulated by feedback from G-cells which organize the responses of BOS neurons and mediate top-down attention [25, 26, 28]. Going beyond earlier work, we assume two distinct classes of G-cells, one responsible for spatial attention and one for attention to objects (Fig 2). Both classes of G-cells provide modulatory feedback to BOS neurons via NMDA synapses. Self et al. [42] have reported that feedforward input to V1 is mainly provided by AMPA-type synaptic currents whereas feedback signals mediated by NMDA synaptic receptors underlie figure-ground modulation. Herrero et al. [43] showed the importance of NMDA receptors for mediating feedback signals, including those of selective attention. Simulations of the proposed model indicate overall agreement with responses of BOS neurons as reported by Martin and von der Heydt [39]. These results suggest that feedback signals modulate the responses of feature selective neurons in lower-level visual areas, and that there are two types of feedback signals, one that serves spatial attention, and another that facilitates object-based attention.


Model architecture.
Two different types of G-cells (balls with “G”) are part of the model: spatial grouping cells (Gsp) and object-based grouping cells (Gobj). Whereas Gsp-cells implement spatial attention, i.e. attention to everything within a circumscribed spatially defined area, Gobj-cells impart grouping structure of objects in the scene and mediate object-based attention. The latter are similar to the grouping cells from refs [25, 26, 41]. Feedback signals from these G-cells modulate activity of BOS neurons (balls with “BOS”) by NMDA-type connections (gray downward pointing arrows). Black and gray ellipses represent the locations of RFs of BOS neurons which are driven by visual input through AMPA-type synapses (black upwards pointing arrows). Black and gray horizontal arrows from RFs point toward the preferred side of the corresponding BOS neuron. For description of subscripts and superscripts of BOS cells see text.
In this study we hypothesize, and support by computational studies, that the feedback signals are responsible for the neurophysiologically observed correlation structure between BOS neurons. All BOS neurons that represent different parts of the same perceptual object receive input from those G cells that represent this object. This common input generates correlation between their spike trains. The situation becomes more complicated, however, when attention is taken into account. We showed in our previous study [41] the non-monotonic effect of G-cell firing rates on synchrony between border ownership selective (BOS) cells (see Fig 4 of [41]). The observed lower correlation between BOS neurons representing parts of the same object when this object is attended compared to when it is not attended could then be understood (see Fig 3 of [41]) since both grouping and attention to objects modify the firing rates of BOS cells. That study did not, however, deal with inconsistent BOS cell pairs. In the present report we show that increased synchrony due to attention can be explained as a consequence of spatial attention, implemented in a separate class of G cells.
To investigate the neuronal mechanisms of border ownership selectivity in visual cortex, we study the behavior of our model (Fig 2) with simulated versions of visual input and a simple model for the animal’s attentional state. Border ownership relations are represented by BOS neurons, four of which (,
Per the definition in Fig 1B,
All results of our study are formulated in terms of changes of mean firing rates of BOS neurons and of the spike-spike correlation functions of pairs of BOS neurons. Any pair of BOS neurons can be in one of four possible states with respect to the activating visual object(s). These four states are the combinations of two binary factors: bound vs. unbound, and attended vs. not attended. The first factor regards grouping: the two members of the BOS neuron pair can represent parts of the same object, or of different objects. In the first case, the neurons participate in the representation of an integrated, or bound, object. In the second case, they represent parts of different objects which are not bound to a coherent object. For simplicity, we call this an “unbound” situation, see Fig 1. The second factor regards attention. Attention may be either on an object with some of its parts in the RF of one or both of the neurons, or at a position elsewhere in the visual field, far from the RFs of the considered neurons. In the first case, the object represented by the BOS neuron(s) is attended, while in the second case both neurons represent ignored objects. These two binary factors (attended vs. ignored, bound vs. unbound) are considered independent of each other, so there are four possible combinations. We do not consider one of these conditions, unbound-attended, since we are not aware of any neurophysiological experiments addressing situations in which attention is directed towards disconnected visual features.
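The enumeration of conditions described above can be written out explicitly. The following is a minimal sketch; the variable and function names are illustrative choices, not identifiers from the model code.

```python
from itertools import product

GROUPING = ("Unbound", "Bound")
ATTENTION = ("ignored", "attended")

# Excluded in the text: no known experiments direct attention
# towards disconnected visual features.
EXCLUDED = {("Unbound", "attended")}


def simulated_conditions():
    """Return the condition labels that remain after the exclusion."""
    return [
        f"{grouping}-{attention}"
        for grouping, attention in product(GROUPING, ATTENTION)
        if (grouping, attention) not in EXCLUDED
    ]


print(simulated_conditions())
```

The three remaining labels are the conditions used throughout the Results.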
In our model, the attentional state is represented by the activity level of G cells. We assume that G cells representing an attended object have a higher firing rate than those representing an ignored object, see Table 1. Our model does not describe how these activity levels arise; many models of the control of selective attention describe possible mechanisms, e.g. ref [44] for bottom-up attention or ref [45] for top-down attention.

| Stimulus in ![]() | ![]() | ![]() | ![]() | Gsp |
|---|---|---|---|---|
| Unbound-ignored | 5Hz | 30Hz | 30Hz | 3Hz |
| Bound-ignored | 30Hz | 5Hz | 5Hz | 3Hz |
| Bound-attended | 60Hz | 2.5Hz | 2.5Hz | 15Hz |
Likewise, we assume that a G cell has a higher firing rate in the bound condition than in the unbound condition, because in the bound condition it integrates features all around the boundary of the object, each of which is represented by the activity of individual BOS neurons, whereas in the unbound condition, it integrates only two edges, each from one of the lateral objects (dashed outlines in Fig 2). Also the two grouping cells which would represent the lateral objects,
Mechanisms leading to increased G cell activity are quantitatively described in several published computational studies [25, 26, 28–30, 46]. While the argument presented so far applies only to isolated objects, new experimental results [47] show that G cells representing foreground objects in cluttered scenes also likely have substantially higher firing rates than G cells representing the partially obscured objects, see the section on cluttered scenes in the Discussion. In this study, we therefore assume that differential activity levels (firing rates) are present for bound (higher rates) and unbound conditions (lower rates), without explicitly implementing how these differences arise.
The firing rates of model G-cells for all three stimulus and attention conditions are summarized in Table 1 and in the Materials and methods section. For cell
Columns ‘
Finally, column ‘Gsp’ of Table 1 shows the firing rates of the second type of G cell, which represents the effect of spatial attention, see Fig 2. If attention is on the center region of the scene, as shown in that figure, this cell fires at 15 Hz (row ‘Bound-attended’); if this region is not attended, its firing rate is 3 Hz (rows ‘Unbound-ignored’ and ‘Bound-ignored’).
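For reference, the G-cell firing rates of Table 1 can be collected in a small lookup structure. This is a sketch only: the numbers are transcribed from Table 1, but the keys `gobj_col1` to `gobj_col3` are hypothetical placeholders for the three object-based G-cell columns, whose symbols are not legible in this version of the table, and `attention_ratio` is a helper introduced here for illustration.

```python
# Firing rates in Hz, transcribed from Table 1.
# Keys gobj_col1..gobj_col3 are placeholder names (assumption), given in the
# column order of the table; "gsp" is the spatial grouping cell.
G_CELL_RATES_HZ = {
    "Unbound-ignored": {"gobj_col1": 5.0, "gobj_col2": 30.0, "gobj_col3": 30.0, "gsp": 3.0},
    "Bound-ignored": {"gobj_col1": 30.0, "gobj_col2": 5.0, "gobj_col3": 5.0, "gsp": 3.0},
    "Bound-attended": {"gobj_col1": 60.0, "gobj_col2": 2.5, "gobj_col3": 2.5, "gsp": 15.0},
}


def attention_ratio(cell):
    """Ratio of a cell's rate in Bound-attended to its rate in Bound-ignored."""
    return G_CELL_RATES_HZ["Bound-attended"][cell] / G_CELL_RATES_HZ["Bound-ignored"][cell]


for cell in ("gobj_col1", "gobj_col2", "gobj_col3", "gsp"):
    print(cell, attention_ratio(cell))
```

Written this way, it is easy to see that attention changes the rate of every G cell in the table, and that the spatial cell undergoes the largest relative change between the two bound conditions.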
In the following sections, we report numerical results of simulations of the model defined above. Population results are given for two measures. One is the firing rate of subsets of BOS neurons; the other is the correlation (loose and tight, see below) between sets of pairs of neurons. For the latter, we already have defined consistent and inconsistent pairs of neurons, see Fig 1. For the former, it is important to note that firing rates are a property of neurons, not pairs of neurons. We differentiate between those neurons whose border ownership preference is towards the object centered between their RFs (gray keystone-like object in Fig 1 and gray parallelogram in Fig 2) and those whose border ownership preference is towards another object. We call the former “preferred neurons” and the latter “non-preferred” neurons. Note that both members of a consistent pair are preferred neurons. At least one of the members of an inconsistent pair is a non-preferred neuron; the other can be preferred or non-preferred, see Fig 1.
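The distinction between preferred and non-preferred neurons, and between consistent and inconsistent pairs, can be made concrete with a short sketch. It assumes that RF position 1 lies on the left border and position 2 on the right border of the central object, so that the central object is "owned" by the neuron at position 1 preferring the right side and the neuron at position 2 preferring the left side. This geometric assignment and all names are assumptions made here for illustration.

```python
from itertools import product

SIDES = ("L", "R")

# Assumption: position 1 is left of the central object, position 2 right of it.
# A neuron is identified by (retinotopic position, preferred side of figure).
PREFERRED = {(1, "R"), (2, "L")}


def is_preferred(neuron):
    """True if the neuron's preferred side points at the central object."""
    return neuron in PREFERRED


def classify_pair(neuron_a, neuron_b):
    """A pair is consistent only if both members are preferred neurons."""
    if is_preferred(neuron_a) and is_preferred(neuron_b):
        return "consistent"
    return "inconsistent"


def cross_position_pairs():
    """All pairs with one neuron at position 1 and one at position 2."""
    return [((1, s1), (2, s2)) for s1, s2 in product(SIDES, SIDES)]


for pair in cross_position_pairs():
    print(pair, classify_pair(*pair))
```

With four model neurons this yields one consistent and three inconsistent pairs, and every inconsistent pair contains at least one non-preferred neuron.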
First, we investigate the influence of G-cell activity levels on the spiking frequencies of model BOS neurons. Fig 3A shows spike raster plots of BOS neurons for 100 simulated trials where G-cells are activated in the Unbound-ignored condition between 0 and 1000 ms, in the Bound-ignored condition between 1000 and 2000 ms, and in the Bound-attended condition between 2000 and 3000 ms. The feedback from G-cells modulates the activities of all BOS neurons.


Responses of BOS neurons.
A: Raster plots showing 100 spike trains of BOS model neurons. For these plots, G-cells were activated to represent the Unbound-ignored condition between 0 and 1000 ms, the Bound-ignored condition between 1000 and 2000 ms, and the Bound-attended condition between 2000 and 3000 ms. Feedback from G-cells modulates the firing rates of BOS neurons. Identities of BOS neurons are shown next to each plot. B: Firing rates of preferred neurons (
Fig 3B and 3C show the mean firing rates of BOS neurons for the Unbound-ignored (gray), Bound-ignored (black) and Bound-attended (red) conditions (see Fig 1 for definitions of these terms), for preferred neurons (Fig 3B) and non-preferred neurons (Fig 3C). Note that the firing rates of preferred neurons are significantly higher in the bound condition than in the unbound condition (Fig 3B; t-test, p = 4.9 × 10⁻⁴⁴, effect size r = 1.0). They are also significantly higher in the Bound-attended condition than in the Bound-ignored condition (t-test, p = 3.6 × 10⁻⁴⁴, r = 1.0), which is in agreement with physiological results [17, 39]. In contrast to the preferred neurons, the firing rates of the non-preferred neurons are significantly higher in the unbound condition than in the bound condition (Fig 3C; t-test, p = 2.7 × 10⁻⁴⁵, r = 1.0). The firing rates of these model neurons are also slightly but significantly increased in the Bound-attended condition relative to the Bound-ignored condition (t-test, p = 2.1 × 10⁻³⁰, r = 1.0). These results imply that the activities of G-cells significantly modulate the spike frequency of BOS neurons in our proposed network.
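The text reports an effect size r alongside each t-test but does not state how it was computed. One common convention derives it from the t statistic and the degrees of freedom as r = sqrt(t² / (t² + df)). The sketch below uses that convention together with a pooled-variance two-sample t statistic; both choices are assumptions, and the sample values are made-up numbers for illustration, not data from the study.

```python
import math
from statistics import mean, variance


def t_and_effect_size(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and effect size r.

    r = sqrt(t^2 / (t^2 + df)), with df = n_a + n_b - 2 (assumed convention).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    r = math.sqrt(t * t / (t * t + df))
    return t, r


# Illustrative, made-up mean firing rates (Hz) from repeated simulation sets.
bound = [20.1, 19.8, 20.3, 20.0, 19.9]
unbound = [15.0, 15.2, 14.9, 15.1, 14.8]
print(t_and_effect_size(bound, unbound))
```

When the variability across simulation sets is small relative to the difference in means, as is typical for large simulated data sets, r computed this way approaches 1.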
We next quantify the loose (correlations on the order of tens of milliseconds) and tight (order of milliseconds) synchrony between BOS neurons of consistent and inconsistent pairs (Fig 1B) using the methods of ref [41].
We define loose synchrony as the integral of spike correlations in the range of a ±40ms interval around lag zero (see Materials and methods section). Loose correlations between


Correlations between BOS neurons.
The gray, black, and red lines/bars represent the correlation (loose synchrony) of the Unbound-ignored, Bound-ignored, and Bound-attended conditions, respectively. A, C: Experimentally observed mean spike train cross-correlation and loose synchrony for consistent BOS neurons, modified from [39]. B, D: Model spike train cross-correlation and loose synchrony for the consistent pair. Confidence intervals of loose synchrony of this pair for the Unbound-ignored, Bound-ignored, and Bound-attended conditions (panel D) were 1.13 ± 0.01 (SD = 0.01), 1.39 ± 0.02 (SD = 0.03), and 1.21 ± 0.01 (SD = 0.02) coincidences/s, respectively. E, G: Experimentally observed mean spike train cross-correlation and loose synchrony for inconsistent BOS neurons. F, H: Model spike train cross-correlation and loose synchrony for the inconsistent pairs. Inset in F shows detail of center region at higher scale. Confidence intervals of loose synchrony of these pairs for the Unbound-ignored, Bound-ignored, and Bound-attended conditions (panel H) were 0.30 ± 0.01 (SD = 0.02), 0.23 ± 0.01 (SD = 0.01), and 0.34 ± 0.01 (SD = 0.01) coincidences/s, respectively. Curves in all panels are normalized by the maximum correlation value of the consistent pair. The observed maximum values were: A: 54, B: 24, E: 26, F: 5 coincidences/s². Asterisks indicate significant differences between conditions (** p < 0.01, t-test). Error bars indicate SDs.
Martin and von der Heydt [39] also reported synchrony results for inconsistent pairs (Fig 4E and 4G). These pairs were defined as all possible pairs of BOS neurons with the exception of the consistent pair, as illustrated in Fig 1B. In our model, the inconsistent pairs are represented by three pairs;
We also computed loose synchrony of the inconsistent pairs by integrating the correlation over a ±40 ms interval around lag zero (Fig 4H). Conventions are the same as those in Fig 4D. The level of loose synchrony for the Bound-attended condition (red bar in Fig 4H) is significantly higher than that for the Bound-ignored condition (black bar in Fig 4H) (t-test, p = 2.8 × 10⁻¹⁶, r = 0.99). Even though loose synchrony for the Unbound-ignored condition (gray bar in Fig 4H) is at a similar level to that for the Bound-attended condition, the difference is statistically significant (t-test, p = 3.9 × 10⁻⁶, r = 0.84) because of the large amount of simulated data (10 sets of 100 simulated trials). In contrast to the consistent pair (Fig 4D), we found that the loose synchrony for the Bound-ignored condition is the lowest among these three conditions. These modulation patterns of loose synchrony for the inconsistent pairs are opposite to those for the consistent pair. This attentional enhancement of loose synchrony for the inconsistent pairs agrees with that of experimentally observed BOS neurons [39]. Our simulation results show that the interaction between two distinct types of attentional feedback signals can explain why the effects of attention on correlations between consistent and inconsistent pairs of BOS neurons are of opposite polarity. We investigate the mechanisms underlying this surprising result in sections below.
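The synchrony measure used here, the integral of a cross-correlation function over a symmetric window around lag zero, can be sketched as follows. The function name, the trapezoidal rule and the triangular test curve are assumptions made for illustration; the actual numerical procedure is described in the Materials and methods section. Lags are given in milliseconds and converted to seconds, so a correlation in coincidences/s² integrates to coincidences/s.

```python
def integrate_correlation(lags_ms, corr, half_width_ms=40.0):
    """Trapezoidal integral of corr over [-half_width_ms, +half_width_ms].

    lags_ms must be sorted in ascending order; only sampling intervals lying
    entirely inside the window contribute.
    """
    total = 0.0
    for i in range(len(lags_ms) - 1):
        left, right = lags_ms[i], lags_ms[i + 1]
        if left >= -half_width_ms and right <= half_width_ms:
            total += 0.5 * (corr[i] + corr[i + 1]) * (right - left) / 1000.0
    return total


# Synthetic example: triangular peak of height 10 at lag zero, reaching zero
# at +/-20 ms, sampled at 1 ms resolution.
lags = [float(lag) for lag in range(-100, 101)]
curve = [max(0.0, 10.0 * (1.0 - abs(lag) / 20.0)) for lag in lags]

loose = integrate_correlation(lags, curve, half_width_ms=40.0)
tight_window = integrate_correlation(lags, curve, half_width_ms=5.0)
print(loose, tight_window)
```

The same function with `half_width_ms=5.0` corresponds to the narrower window used for tight synchrony, applied there to the jitter-corrected correlation rather than the raw one.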
To further quantify the responses of our model, we compute the noise correlation between BOS neurons of consistent and inconsistent pairs (S1 Fig). The noise correlations for the consistent pair in the Unbound-ignored, Bound-ignored and Bound-attended conditions are summarized in S1(A) Fig. There is no significant difference in the noise correlation between ignored conditions (Unbound-ignored and Bound-ignored) (t-test, p = 0.97, r = 0.01). By contrast, we found a significant decrease in noise correlation from the Bound-ignored condition to the Bound-attended condition (t-test, p = 9.4 × 10⁻³, r = 0.57). S1(B) Fig shows the noise correlation of the inconsistent pairs. As for the consistent pair, we found no significant difference between ignored conditions (Unbound-ignored and Bound-ignored) (t-test, p = 0.35, r = 0.22). Unlike for the consistent pair, there is also no significant difference between the noise correlation in the Bound-attended and the Bound-ignored conditions (t-test, p = 0.48, r = 0.17).
One plausible cause of synchrony between spike trains of two BOS neurons is common input to both. Correlation functions of the inputs to BOS neurons are shown in S2 Fig. There are no significant differences in the means of the loose synchrony between Gsp- and
The neurophysiological data in Fig 4A and 4E show some evidence of periodicity in the beta/gamma range which is absent in the simulated data. Martin and von der Heydt [39] did not find statistically significant differences in this spectral range between the different experimental conditions (see their discussion of their Fig 4 on which our Fig 4 is based). This was the case for both spike-spike and spike-field coherence. We therefore do not assign significance to these oscillations.
Martin and von der Heydt [39] investigated tight synchrony [50, 51] between BOS neurons with respect to consistent and inconsistent pairs using a transformation of the original spike trains in which spikes are distributed randomly within a jitter window of width Δ = 20 ms (jittered spike train). A large number of correlations between jittered spike trains were computed, and their mean was subtracted from the original spike correlation, resulting in the jitter-reduced correlation (tight correlation). This procedure removes all correlations at time scales larger than the jitter window Δ, revealing the underlying tight synchrony [50, 51]. A detailed description of the procedure for computing tight correlation and synchrony is given in the Materials and Methods section. We computed tight synchrony by integrating the tight correlation over a ±5 ms interval around lag zero.
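The jitter procedure described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation used in the study: spikes are re-drawn uniformly within fixed, non-overlapping windows of width Δ, correlograms are raw coincidence counts in 1 ms lag bins (not normalized to coincidences/s²), and the function names, the number of surrogates and the synthetic spike trains are all chosen here for illustration.

```python
import random


def cross_correlogram(train_a, train_b, max_lag_ms=50):
    """Count spike pairs by integer lag (t_b - t_a), in 1 ms bins."""
    counts = {lag: 0 for lag in range(-max_lag_ms, max_lag_ms + 1)}
    for t_a in train_a:
        for t_b in train_b:
            lag = round(t_b - t_a)
            if -max_lag_ms <= lag <= max_lag_ms:
                counts[lag] += 1
    return counts


def jitter(train, window_ms, rng):
    """Re-draw each spike uniformly within its own window of width window_ms."""
    return [window_ms * (int(t // window_ms) + rng.random()) for t in train]


def tight_correlogram(train_a, train_b, window_ms=20.0, n_surrogates=20,
                      max_lag_ms=50, seed=0):
    """Raw correlogram minus the mean correlogram of jittered surrogates."""
    rng = random.Random(seed)
    raw = cross_correlogram(train_a, train_b, max_lag_ms)
    mean_jittered = {lag: 0.0 for lag in raw}
    for _ in range(n_surrogates):
        surrogate = cross_correlogram(jitter(train_a, window_ms, rng),
                                      jitter(train_b, window_ms, rng), max_lag_ms)
        for lag, count in surrogate.items():
            mean_jittered[lag] += count / n_surrogates
    return {lag: raw[lag] - mean_jittered[lag] for lag in raw}


def tight_synchrony(tight, half_width_ms=5):
    """Sum of the jitter-corrected correlogram within +/- half_width_ms."""
    return sum(value for lag, value in tight.items() if abs(lag) <= half_width_ms)


# Synthetic pair with millisecond-precise synchrony: train_b copies train_a
# with a small random offset of less than 1 ms per spike.
demo_rng = random.Random(1)
train_a = sorted(demo_rng.uniform(0.0, 10000.0) for _ in range(150))
train_b = [t + demo_rng.uniform(-1.0, 1.0) for t in train_a]
value = tight_synchrony(tight_correlogram(train_a, train_b))
print(value)
```

Because jittering destroys timing structure finer than Δ while preserving slower rate covariation, the subtraction leaves a positive residual for the synthetic pair above, which shares millisecond-precise spike times.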
The experimentally observed tight correlations for the consistent pair show significant peaks at zero lag in the bound conditions, but not in the unbound condition [39]. Data are reproduced in Fig 5A which shows the tight correlations of the consistent pair for three conditions. Under the bound conditions (black and red lines in Fig 5A), there are marked peaks of tight correlation around zero lag. Fig 5B shows the corresponding curves from our simulation of the model, obtained by applying the same jitter method as in the experimental results to our simulation data. Tight synchrony for the consistent pair in physiological experiments and our simulations is summarized in Fig 5C and 5D, respectively. Fig 5E shows experimental tight correlation for the inconsistent pairs and Fig 5F the corresponding simulation results. Experimentally observed tight synchrony for inconsistent pairs is shown in Fig 5G. Fig 5H summarizes model tight synchrony for inconsistent pairs.


Tight synchrony between model BOS neurons.
Reduced cross-correlations after subtraction of Δ = 20 ms interval jitter cross-correlation. A, C: Experimentally observed mean tight correlation and tight synchrony of the consistent pair for the Unbound-ignored (gray), Bound-ignored (black), and Bound-attended (red) conditions from [39]. B, D: Simulated tight correlation and tight synchrony of the consistent pair. Conventions are the same as those in A and C. Confidence intervals of tight synchrony of this pair for the Unbound-ignored, Bound-ignored, and Bound-attended conditions (panel D) were 0.0028 ± 0.0008 (SD = 0.0013), 0.0061 ± 0.0013 (SD = 0.0020), and 0.0076 ± 0.0018 (SD = 0.0029) coincidences/s, respectively. E, G: Experimentally observed mean tight correlation and tight synchrony of the inconsistent pair. F, H: Simulated tight correlation and tight synchrony of the inconsistent pairs. Inset in panel F shows central area at higher scale. Confidence intervals of tight synchrony of these pairs for the Unbound-ignored, Bound-ignored, and Bound-attended conditions (panel H) were 0.0005 ± 0.0005 (SD = 0.0008), 0.0009 ± 0.0006 (SD = 0.0010), and 0.0012 ± 0.0005 (SD = 0.0007) coincidences/s, respectively. Curves in all panels are normalized by the maximum value of the “Bound-attended” condition of the consistent pair. The observed maximal values were: A:17.4, B:3.7, C:3.4, D:0.5 coincidences/s². Asterisks indicate significant differences between conditions (* p < 0.05, ** p < 0.01, t-test). Error bars indicate SDs.
Irrespective of grouping structure and attention conditions, inconsistent pairs in the experiment did not show a significant peak of tight correlation at zero lag [39] (Fig 5E). We computed tight correlation based on our simulation data for inconsistent pairs (Fig 5F). There was no marked peak around zero lag under the Unbound-ignored condition (gray line). By contrast, under the bound conditions, we found peaks of tight correlation for inconsistent pairs (black and red lines). However, the peak value of tight correlation for the inconsistent pairs under the Bound-attended condition (see caption of Fig 5) was markedly weaker than that for the consistent pair. Overall, tight correlation for inconsistent pairs in our model is similar to that observed physiologically. No significant differences between the levels of tight synchrony of inconsistent pairs were observed in Martin and von der Heydt’s electrophysiological recordings [39] (Fig 5G). They found, however, that in the Bound-ignored condition, tight synchrony between members of consistent pairs was significantly higher than that of inconsistent pairs (Fig 5C and 5G). Likewise, simulation of the Bound-ignored condition showed a significantly higher tight synchrony between consistent pairs (the black bar in Fig 5D) than between inconsistent pairs (the black bar in Fig 5H) (t-test, p = 1.8 × 10⁻⁶, r = 0.85). Additionally, there were no significant differences between levels of tight synchrony of inconsistent pairs in our simulation results (Fig 5H; t-test for Unbound-ignored vs. Bound-ignored, p = 0.35, r = 0.22; t-test for Bound-ignored vs. Bound-attended, p = 0.52, r = 0.15), in line with the electrophysiological results [39]. Note the different observed maximal values for each subplot, which are listed in the caption of Fig 5. We next explore the differential role of the two types of grouping cells in our model and their contributions to the observed loose and tight synchrony results.
Results so far have characterised activity levels (firing rates) and pairwise correlations of BOS neurons for a fixed set of firing rates of the G cells that were either chosen according to previous firing rate models (
Firing rates of BOS neurons for a variety of combinations of


Firing rates for BOS neurons as functions of the mean rates of G-cells.
A: Averaged firing rates of
Fig 6C shows the firing rates of the preferred
Fig 6E shows the mean rates of preferred BOS neurons as a function of the firing rate of
The influence of G-cell firing rates on loose synchrony between BOS cells is summarized in Fig 7. Fig 7A and 7B show loose synchrony of BOS neurons with various combinations of


Loose synchrony for model BOS neurons as function of the mean rates of G-cells.
Shown is the integral of the spike train cross-correlation in the range of a ±40 ms interval around lag zero. Each data point is the average of 100 simulated trials, each of 200 biological seconds duration. A: Loose synchrony between
Details of modulation in loose synchrony of the consistent and inconsistent pairs as a function of the Gsp-cell activity are shown in Fig 7C and 7D, respectively. As in the previous section, in these simulations mean rates of
Loose synchrony for the consistent and inconsistent pairs is a non-monotonic function of
Finally, we compute tight synchrony between model BOS neurons by systematically varying the rates of G-cells (Fig 8). The magnitude of tight synchrony was computed by integrating the tight correlation in the range of ±5 ms around lag zero (see also Materials and methods section). Tight synchrony of BOS neurons with various combinations of rates of


Tight synchrony for model BOS neurons as function of the mean rates of G-cells.
Shown is the integral of the tight correlation in the range of ±5 ms interval around lag zero. A: Tight synchrony between
Fig 8C and 8D show details of tight synchrony modulation for the consistent and inconsistent pairs as a function of Gsp-cell firing rates, respectively. Detailed modulations of tight synchrony for the consistent and inconsistent pairs as a function of
Our overall goal is to understand the neuronal circuitry responsible for scene understanding and selective attention in primate visual cortex. In this computational study we extend a previously developed neuronal network model with spiking neurons [41] which is based on the grouping hypothesis [16, 25, 26, 28]. Electrophysiological recordings have elucidated the interaction between perceptual organization implemented by border ownership selective neurons and selective attention [17], and the cited grouping models take this “attention to objects” [7, 52] mechanism into account. There is, however, evidence that there are mechanisms of selective attention that act purely spatially, without reference to visual objects [53–55]. It is therefore necessary to expand grouping models to include such purely spatial mechanisms that operate independently of object features.
Our previous grouping model [41] took into account attentional top-down influences of the first kind described above, i.e. attention to objects. In that model, G cells project only to those BOS cells that are consistent with an object in the attended position; we call these Gobj cells here. In the present study, we extend this model by adding an implementation of the second type of attentional top-down projection, which is purely spatial. This projection, implemented by a separate class of G cells (Gsp), modulates all BOS cells in a spatially defined area. This is seen in Fig 2, which shows that the spatial grouping cell Gsp projects to all BOS cells, irrespective of their RF location (in this area) and their border ownership preferences. What is common to both types of grouping cells is that their feedback is modulatory via NMDA receptors [42, 43].
To understand the neuronal circuitry we follow a rich and active tradition of theoretical work [32–34] by analyzing the correlation structure of neuronal spike trains, using first-order (mean firing rate) and second-order (spike-spike) correlation functions. In particular, common input plays a critical role in inducing synchronized responses between postsynaptic neurons. We focus on the analysis of the observed spike-spike correlations in (mainly) extra-striate cortex. Specifically, we focus on synchrony observed in consistent and inconsistent pairs of BOS neurons (Fig 1).
Our simulations of the new model reproduce the shapes of the cross-correlation functions observed in a recent neurophysiological study [39], for both consistent and inconsistent pairs of BOS neurons, and for all conditions of feature binding and attention. In addition, we showed that: 1) firing rates of BOS cells increase monotonically with increasing G-cell activity (cells of inconsistent pairs only with Gsp cell input because they do not receive Gobj cell input), 2) loose synchrony between BOS cells is a non-monotonic function of G-cell activity if they both receive common input from the G-cell and is stronger in consistent versus inconsistent BOS pairs, and 3) tight synchrony results were more variable, with a higher magnitude of tight synchrony in consistent versus inconsistent BOS pairs. In the next section, we discuss how the top-down influences in our model generate correlation structures that we describe. Overall, our results support the hypothesis that figure-ground organization and attentive selection are both produced by grouping feedback modulation to the early feature representation levels of the visual cortex and suggest that the modulation is mediated by the NMDA-type receptor.
Neurophysiological data [39] show that attention to a target stimulus has opposite effects on the level of synchrony in two neuronal populations. (In this section, we use synchrony synonymously with loose synchrony except where specifically noted otherwise.) The spike-spike correlation between two neurons whose border ownership preferences point to a common object (which makes them consistent neurons, see Fig 1B) is lowered when attention is directed toward the object. In contrast, the correlation between pairs of neurons whose border ownership preferences do not both point to that object, i.e. inconsistent neurons, goes up when that object is attended. Our previous work [41] explained the correlation structure for consistent neurons but did not address responses of inconsistent neurons. In the current study, we include this group of neurons and we also specify in more detail the type of attentional influences in our model. As we have discussed earlier, there are multiple mechanisms of top-down visual attention. We here focus on attention to a spatially defined portion of the visual field, i.e. spatial attention, and attention to objects.
A fundamental hypothesis of our approach is that these two types of attentional influences have anatomical implementations in the form of separate types of G cells. One type consists of object-based G (Gobj) cells, similar to those in previous studies [25, 26, 41]. Neurons of this type are responsible for grouping (binding) features together to a coherent object and also serve as a “handle” for attention to such an object. This is to be distinguished from spatial attention where everything within a spatially circumscribed area is attended. Our hypothesis is that this is implemented by top-down feedback that activates a separate type of neurons, the spatial G (Gsp) cells. In our model, both types of G cells exert attentional influence on BOS cells by NMDA-type projections [43], Fig 2. Furthermore, we assume that top-down attention to one of the target stimuli (shown in schematic form in Fig 1) engages both types of G cells, resulting in the elevated firing rates of both cell classes in the attended vs. unattended condition, Table 1.
At the level of BOS cells, our simulations show that top-down attention depresses synchrony for the consistent pair, as observed [39]. Our simulation also reproduces the observed increased synchrony of inconsistent cell pairs when the object is attended. This difference in population responses can be understood from our model results shown in Fig 7. It shows how synchrony between members of consistent (Fig 7A, 7C and 7E) and inconsistent (Fig 7B, 7D and 7F) pairs varies with the level of common input from Gsp (Fig 7C and 7D) and Gobj (Fig 7E and 7F) cells. As is expected on theoretical grounds, synchrony is a smoothly varying function with a single peak which can be pronounced (Fig 7E) or weak (Fig 7D), except for the inconsistent pairs in which synchrony does not vary with the Gobj-cell firing rates (Fig 7F). Top-down attention is assumed to increase firing rates of both G cell types, Table 1, which corresponds to a rightward shift along the curves in Fig 7C, 7D, 7E and 7F. The starting point of this shift differs, however, between the two types of G cells. Gobj-cells fulfill the dual role of mediating attention to objects and of grouping object features pre-attentively. For the center object in Fig 1, the second of these functions (grouping) requires them to fire at a high rate (30Hz, see Table 1) even without attention. This is already to the right of the peak in synchrony, Fig 7E. Attention further increases (doubles) this rate, i.e. shifts it to the right in that figure, resulting in a substantially lower synchrony level. Since only consistent BOS cells receive input from the (center) G cell,
Gsp-cells provide input to all BOS cells in their projective fields. Since they are not involved in feature binding, their firing rates are much lower than those of Gobj-cells. This has the important effect that the rightward shift in Fig 7 occurs from the left of the maximum, i.e. it increases the synchrony level at the level of BOS cells. This explains the increased synchrony between inconsistent cells because, except for low levels of spontaneous firing that is always present in all G cells, their only common top-down input is from Gsp-cells. This slight increase is also present in consistent cells but it is masked by the larger decrease resulting from the input from the Gobj-cells discussed above.
Martin and von der Heydt [39] also reported that tight synchrony shows peaks at zero lag for the consistent pair in the bound condition, but not for the inconsistent pair (Fig 5A and 5E). Our model reproduces these characteristics of tight synchrony, provided we choose synaptic weights of Gobj-cells twice as large as those of Gsp-cells (see Materials and methods section), which is a prediction of our model. The difference in synaptic weights between Gobj- and Gsp-cells underlies the difference in tight synchrony between consistent and inconsistent pairs.
The relative rates of object-based and spatial G cells in Table 1 are thus an important prediction of our model (and more specific than the prediction that these two neuronal populations exist in the first place). In this view, the primary function of the object-based G cells is grouping of object features, and this functionality is modulated by attention to objects. In contrast, the spatial G cells only have one function. This difference in functionalities is reflected in their mean firing rates and leads, indirectly, to the observed difference in correlations at the BOS cell level.
A variety of studies have investigated the role of synchrony between neuron pairs for processing visual information and for organizing visual perception [18, 56, 57]. By contrast, our work suggests the possibility that the observed synchrony between BOS neurons is epiphenomenally induced via modulatory feedback mediated by currents through NMDA receptors. Further studies are necessary for understanding the roles of spike synchrony between pairs of BOS neurons.
In our model, the state of object representations (attended vs. ignored, bound vs. unbound) is represented by the mean firing rates of their associated G cells, see Table 1. We do not model in detail the mechanisms that create the differences in G cell firing rates since we focus in this study on the effects they have on the activity of border ownership selective cells and their correlations.
As far as the effect of attention is concerned, the assumption (made explicit in Table 1) is that attention to an object doubles the firing rate of its associated object-based G cell population (from 30Hz to 60Hz) and it also halves the firing rates of nearby G cells of the same type (from 5Hz to 2.5Hz) because of interactions between G cells [26]. The increase is assumed to result from top-down input from the attention control system that provides additional excitatory input to the G cells representing attended objects.
There is also a strong difference between object-based G cells representing bound and unbound regions of the RF, with firing rates assumed as 30Hz in the former and 5Hz in the latter case (in the absence of attention; attention to an object doubles the rate as discussed). The origin of this difference is the different input from BOS cells in the bound vs. unbound conditions. As Fig 2 shows,
However, one might argue that the latter argument, that firing rates differ substantially between bound and unbound object representations, may hold for simple scenes with isolated objects as in Fig 2 but not necessarily for more complex scenes. In particular, we need to consider the presence of clutter and overlapping objects. An example is the simple scene of partially overlapping rectangles shown in Fig 9. If the circuitry shown in Fig 2 receives input from this or a similar scene, the object-based G cells that receive input from the background figure (dark gray rectangle) will have a firing rate similar to that of the object-based G cell that represents the foreground figure (light gray rectangle), since both receive input from about the same number of line segments. Thus, our assumption that the G cell of one of them (the foreground figure) has a substantially higher firing rate does not seem justified.


Overlapping squares as an example of cluttered scenes.
In this illustration, the light-gray “foreground” square appears in front of the darker “background” rectangle; the perception is not that of a light-gray square adjacent to a darker L-shaped object. Red and black circles mark T and L junctions, respectively, and are not part of the visual display. Such local cues modulate the responses of BOS cells in addition to the grouping cell inputs discussed in the present study [25].
This is certainly the case for the model described so far. However, again for the purpose of focusing on the main topic of this study, we have simplified the model of perceptual grouping to only include edge segments. As was recently argued by von der Heydt and Zhang [47], the distribution of edges (contours) is sufficient to group features of isolated compact objects but does not disambiguate the assignment of borders between two overlapping objects. It was found in earlier modeling work [25] that addition of local features, in particular T-junctions (physiologically represented by end-stopped cells [58]), can resolve this ambiguity. These model assumptions are strongly supported by recent neurophysiological results showing that even single feature elements, like T or L junctions, strongly influence the firing rates of BOS cells in the case of overlapping figures [47]. The firing rates of BOS cells that represent features of a foreground object are increased by the presence of an L junction that is part of this object, and suppressed by a T junction that is inconsistent with the presence of this object. Remarkably, single occurrences of these local features have a strong effect on BOS cell activity, and additional consistent features only make small additional contributions (ibid).
One possible way in which local cues influence BOS cell firing rates is by increasing the firing rates of the object-based G cells representing foreground objects. The prediction is that the firing rate of the foreground G cell is substantially higher than that of the G cells representing the objects partially occluded by it. An additional prediction motivated by the data in ref [47] is that G cell activity is subject to a strong compressive (saturating) nonlinearity. Another possibility is that neurons responding to local cues directly project to BOS neurons to modify their firing rates [25, 26]. Of course, these two possibilities are not mutually exclusive; both could occur. In addition, there are local cues in the spectral domain that can be used to distinguish figure from ground, and therefore border ownership [59, 60]. Incorporating the influence of strong local features in our model as well as the influence of strong saturation is an obvious topic for future work.
Several models have been proposed to account for the mechanism of the modulations of BOS neuronal activity during object perception. Craft et al. [25] developed a computational model of the BOS mechanism based on the hypothetical grouping circuit. Mihalas et al. [26] proposed a computational model explaining how selective attention mediated by top-down projections from G-cells modulates the responses of BOS neurons. Russell et al. [28] opened the feedback loop between BOS and G cells and implemented an efficient feedforward model that can process arbitrary visual input, including natural scenes and, in later work, video [61]. However, these models mainly demonstrated how top-down selective attention modulates the firing frequency of BOS neurons, without considering neuronal dynamics and spike synchrony. Furthermore, in these models, top-down signals for modulating the activity of BOS neurons are implemented functionally, without regard to the biophysical mechanism by which the modulation is achieved. In contrast, this model, as well as its predecessor [41], suggests that the bottom-up input from the visual periphery drives BOS cells by AMPA-type synapses while feedback signals mediating grouping and attention rely on NMDA synapses (we discuss other types of synapses below). An additional major advance of the current study compared to that in ref [41] is that we here add an explicit mechanism for spatial attention.
In our model, modulatory feedback from G-cells is implemented by glutamatergic synapses of the NMDA type and feedforward input representing visual stimuli relies on AMPA type synaptic currents. We show that this combination of synaptic circuitry is sufficient to explain a substantial part of neurophysiological observations. However, it is believed that other neurotransmitters and neuromodulators also play a role in the control and implementation of selective attention, in particular acetylcholine and dopamine [62–64], which we do not consider in this study. Dopamine-mediated activity within the frontal eye field (FEF) may be involved in the determination of saccadic target selection and the modulation of responses of V4 neurons [64]. Signals from FEF modulate responses in visual cortices during tasks requiring spatial attention [65, 66]. It is therefore possible that dopamine plays a role in modulating the neuronal activities in visual areas including V2 where most BOS neurons are located. It was also found that cholinergic modulation participates in bottom-up attention and saliency-based selection in cortex [67] and midbrain [62, 68, 69]. Further studies for synapses mediated by these neuromodulators are necessary for understanding the detailed mechanism of attentional modulation in BOS neurons. The present study focuses on the cortical circuitry (Fig 1) while these modulatory influences originate in subcortical structures which we do not consider here. However, they will need to be included in a more complete model.
Another limitation of our study is that we include only the minimum number of BOS neurons needed to understand the fundamental mechanism underlying temporal correlations between BOS neurons. In particular, we do not consider connections between BOS neurons. However, the related Craft et al. model did incorporate recurrent connections between excitatory and inhibitory BOS neurons [25].
The Craft et al. model also includes recurrent connections between BOS neurons and G-cells which autonomously generate activity patterns that implement the object grouping signals in G cells as well as the observed physiological responses of BOS neurons. Since that study showed that it is possible to construct circuitry with G cell activity that results in BOS cell firing rate patterns consistent with experimental findings, we take the existence of such circuitry as given. We therefore assume that the G cells already have the firing rates corresponding to the binding and non-binding conditions, respectively, rather than building the circuitry that generates these firing rates. This simplifies our already quite complex simulations and allows us to systematically explore the influence of the different G cell types on the BOS cells in the object grouping and attention conditions which are the focus of this study.
Finally, a limitation of our model is that it does not require a role for inhibitory interactions even though they are clearly present in cortical circuitry. Synaptic inhibition modulates short-timescale correlations, such as synchrony between groups of excitatory neurons (review: [70]). Previous computational studies suggested that attentional modulation of cortical activity is induced by feedback projections to classes of inhibitory neurons [71, 72]. In contrast, the role of inhibitory neurons is not addressed in our model. As mentioned, however, a previous model [25] that shares many features with the present one does employ recurrent inhibitory interactions between BOS cells.
We extend our previous model [41] to understand the neural mechanisms underlying border ownership selectivity. Fig 2 shows the architecture of our network model. It consists of four border ownership selective cell populations, (
In our network model, we include only the neurons and synaptic connections necessary to understand the fundamental mechanism for modulating consistent and inconsistent pairs of physiological BOS cells [39](see also Fig 1A). The arrows to model populations in Fig 2 indicate synaptic connections in our network model. The BOS neuron whose RF is shown by the left black (gray) oval has right (left) side-of-figure preference and is therefore named
Three types of inputs are applied to these BOS neurons: bottom-up visual input and top-down signals from two distinct types of G-cells, object-based grouping (Gobj) and spatial grouping (Gsp) cells (Fig 2). Bottom-up input arises from visual stimuli. Gobj-cells impart grouping structure and mediate object-based attention, whereas Gsp-cells implement the influence of spatial attention. According to our previous work [41], all inputs to model BOS neurons were modeled as stochastic random processes with Poisson statistics, where each event stands for an incoming action potential. As shown by the arrows to model populations in Fig 2, bottom-up inputs for visual stimuli and top-down signals from
Previous computational studies [25, 26] have hypothesized that the receptive fields of the Gobj-cells have a variety of sizes, so that they can respond to objects of different scales in the visual scene. Zhang and von der Heydt have investigated how responses in physiological BOS neurons are modulated depending on object size [20]. By contrast, in the present study, we have focused on one scale only, and assume that the mechanisms are identical at all scales. The same applies to Gsp-cells. We hypothesize that populations of Gobj- and Gsp-cells with receptive fields of various sizes, covering the full range of object scales, exist in visual cortices, which functionally results in the zoom lens model of attention [54] that has been supported by behavioral and neurophysiological studies [74, 75]. The zoom lens model proposes that attention can be narrowed to as little as a fraction of a degree of visual angle and, at the other extreme, can be dilated to an even distribution over the entire visual field, with a concomitant loss of spatial resolution for larger scales. Gobj-cells and Gsp-cells with appropriately scaled receptive fields may tile the visual scene for representing objects with a variety of sizes.
In our model, the BOS neurons are integrate-and-fire neurons as in previous models [41, 76, 77], and described in detail as follows. The dynamics of the subthreshold membrane potential (V) of a model BOS neuron are


According to our previous model [41], bottom-up excitatory postsynaptic currents (Ivis) are mediated by glutamatergic receptors of the AMPA type,


In our network model (Fig 2), whereas

The fraction of open NMDA channels in a synapse is sNMDA, defined as


In our network model, model BOS neurons integrate bottom-up inputs, representing object borders, with top-down influences mediating the perceptual grouping structure and selective attention (Fig 2). Since the contents of the RFs are identical for all visual inputs considered (see the three configurations in Fig 1B), the bottom-up input has the same statistics in all three conditions. It was modeled as a Poisson spike train with a mean rate of 200 Hz. This input should be interpreted as originating from a population of visually responsive neurons rather than from a single neuron. This rate is chosen according to our previous theoretical work [41].
Gobj-cell activity is based on the integration of the responses of BOS neurons and represents the visual scene in terms of objects, thus providing a fast sketch of the location and rough shapes of objects in the scene [25, 26]. In this work, we focus on the interaction of modulatory top-down influences with the driving bottom-up input. As in our previous model [41], we increased the Gobj-cell activity in the bound condition and observed the influence of its activity on BOS neurons. Likewise, we increased the activity of G-cells to represent attentional selection of the target object and location without being concerned about the source of attentional input.
The activity of Gsp- and Gobj-cells was simulated as Poisson-distributed spike trains. The rates of G-cells for representing stimulus and attention conditions are summarized in Table 1. In the unbound condition (left panel in Fig 1B), we assume that
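As an illustration of the bottom-up drive described above, the following minimal sketch draws a homogeneous Poisson spike train by summing exponentially distributed inter-spike intervals. Only the 200 Hz mean rate comes from the text; the function name `poisson_spike_times`, the 10 s duration and the seed are assumptions made for this example, and the sketch is in Python rather than the C of the original simulation code.

```python
import random

def poisson_spike_times(rate_hz, duration_s, rng):
    """Spike times (in seconds) of a homogeneous Poisson process.

    Inter-spike intervals of a Poisson process are exponentially
    distributed with mean 1 / rate_hz, so we accumulate such intervals
    until the requested duration is exceeded.
    """
    times = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_hz)
        if t >= duration_s:
            return times
        times.append(t)

rng = random.Random(0)        # fixed seed, chosen only for reproducibility
duration_s = 10.0             # hypothetical duration for this example
spikes = poisson_spike_times(200.0, duration_s, rng)  # 200 Hz as in the text
empirical_rate_hz = len(spikes) / duration_s
print(f"empirical rate: {empirical_rate_hz:.1f} Hz")
```

The empirical rate fluctuates around 200 Hz from run to run, as expected for a Poisson process; the same generator could be reused for the G-cell inputs with the rates listed in Table 1.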
We integrated the differential equations using a fourth-order Runge-Kutta algorithm with a time step of 0.1 ms. We simulated 100 trials of a length of 200 sec each, for a total of 20,000 simulated biological seconds per condition. The first 750 ms of simulated results was always discarded to minimize the effect of transients (in analogy to the onset transients that are routinely removed in electrophysiological experiments, including in ref [39]). We also extended the simulation beyond 200 sec by the length of the correlation window to allow computation of the correlation function (see below). The code for the simulations was written in the C programming language (source code available as S1 Code).
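The integration scheme can be sketched as follows. This is a minimal illustration of a classical fourth-order Runge-Kutta step with the stated 0.1 ms time step, applied to a generic leaky membrane equation. The membrane equation and its parameters (`TAU_MS`, `E_LEAK_MV`, `DRIVE_MV`) are placeholders assumed for this example, not the model equations or parameter values of this study, and the sketch is in Python rather than C.

```python
import math

DT_MS = 0.1                      # time step stated in the text

def rk4_step(f, t, v, dt):
    """One classical fourth-order Runge-Kutta step for dv/dt = f(t, v)."""
    k1 = f(t, v)
    k2 = f(t + dt / 2.0, v + dt * k1 / 2.0)
    k3 = f(t + dt / 2.0, v + dt * k2 / 2.0)
    k4 = f(t + dt, v + dt * k3)
    return v + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Hypothetical leaky membrane with constant drive (illustration only).
TAU_MS = 20.0                    # assumed membrane time constant
E_LEAK_MV = -70.0                # assumed resting potential
DRIVE_MV = 10.0                  # assumed steady-state depolarization

def dv_dt(t, v):
    return (-(v - E_LEAK_MV) + DRIVE_MV) / TAU_MS

def integrate(duration_ms):
    """Integrate the membrane potential from rest for duration_ms."""
    n_steps = int(round(duration_ms / DT_MS))
    v = E_LEAK_MV
    for i in range(n_steps):
        v = rk4_step(dv_dt, i * DT_MS, v, DT_MS)
    return v

v_numeric = integrate(50.0)
# Closed-form solution of the same linear equation, used as a check.
v_exact = E_LEAK_MV + DRIVE_MV * (1.0 - math.exp(-50.0 / TAU_MS))

# Bookkeeping from the text: 100 trials of 200 s at a 0.1 ms step.
steps_per_trial = int(round(200_000.0 / DT_MS))
total_simulated_s = 100 * 200
```

For this linear test equation the numerical and closed-form solutions agree to well below a microvolt, which is one simple way to check an integrator before spike threshold, reset and synaptic currents are added.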
We quantified spike synchrony by first dividing time into bins of 1 ms width, each containing either 0 or 1 spike. A spike train was thus transformed into a stochastic process (
The correlogram between two spike trains

Changes in firing rate, e.g. those produced by attention, will change the correlations. We compensated for any such effects by subtracting the average spike frequency from the neuron spike train for each trial [36]. The mean spike count per bin of spike train of




The magnitude of synchrony between model BOS neurons (Mi) is represented as the integral of the correlogram (Eq 10) in the range of ±T:


Loose synchrony (correlations on the order of tens of milliseconds) was computed using T = 40 ms.
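The procedure described above (1 ms binning, subtraction of the mean, correlogram, integration over ±T) can be sketched as follows. This is a minimal illustration under stated assumptions: the normalization of the correlogram (a plain mean over overlapping bins) is chosen for simplicity and is not necessarily that of Eq 10, and all function names and the test spike trains are hypothetical.

```python
import random

def bin_spikes(spike_times_ms, n_bins):
    """Binary spike train with 1 ms bins (at most one spike per bin)."""
    binned = [0] * n_bins
    for t in spike_times_ms:
        idx = int(t)
        if 0 <= idx < n_bins:
            binned[idx] = 1
    return binned

def correlogram(x, y, max_lag):
    """Mean-subtracted cross-correlogram for lags -max_lag..max_lag (bins).

    Assumed normalization: average over the bins in which both shifted
    trains overlap. A positive lag means y follows x.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    ccg = {}
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = max(0, -lag), min(n, n - lag)
        ccg[lag] = sum(xc[t] * yc[t + lag] for t in range(lo, hi)) / (hi - lo)
    return ccg

def synchrony_magnitude(ccg, half_width):
    """Sum of the correlogram over lags within +/- half_width bins."""
    return sum(ccg[lag] for lag in range(-half_width, half_width + 1))

# Hypothetical test data: y is a copy of x delayed by 3 ms.
rng = random.Random(1)
n_bins = 5000
x_times = [t + 0.5 for t in range(n_bins) if rng.random() < 0.05]
y_times = [t + 3.0 for t in x_times]
x = bin_spikes(x_times, n_bins)
y = bin_spikes(y_times, n_bins)

ccg = correlogram(x, y, max_lag=40)
peak_lag = max(ccg, key=ccg.get)            # lag with the largest correlation
loose = synchrony_magnitude(ccg, 40)        # T = 40 ms, as in the text
narrow = synchrony_magnitude(ccg, 5)        # +/- 5 ms range, no jitter correction
```

Because the second train is a delayed copy of the first, the correlogram peaks at the imposed delay. Note that this sketch trims the overlap at the ends of the trains, whereas the simulations in the text extend each run by the length of the correlation window instead.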
Jitter methods were used to test the hypothesis that neurons operate at or below a specific temporal resolution [50, 51]. In this method, the data from each neuron are divided into bins based on the jitter window, starting at the stimulus onset. Each spike of each neuron is then independently moved to a new location, selected from the uniform distribution on the jitter window to which it belonged in the original data (see also Fig 2 in Amarasingham et al. [50]). In this way, the number of spikes within each bin is preserved in the resampled data. The advantage of this method is that it helps to disambiguate short-term from long-term correlations in the correlograms. Shorter jitter windows remove more of the long-timescale correlation between the neurons (the loose synchrony) while preserving short-timescale correlation (the tight synchrony). In order to compute the tight synchrony, following the physiological work [39], the influences of spikes outside 20 ms were removed by implementing an interval jitter method.
For each spike train, spikes were jittered according to a uniform distribution within disjoint, contiguous jitter windows of 20 ms. Whereas the original spike trains were binned in 1 ms bins with a maximum of 1 spike/bin, the jittered spike trains could have as many spikes in a bin as were present in each 20 ms jitter window of the original binned spike train. By shifting the spikes to new positions in each 20 ms jitter window, the overall firing rate profile of each trial was preserved at the resolution of the width of the jitter window. Repeating this jittering produced a sequence of surrogate spike trains. The cross-correlation of each of the surrogates produced a distribution of correlograms. The mean of this distribution was subtracted from the mean of the correlograms of the original spike trains (see also Eq 11). The r jittered correlograms were found by taking the trial-wise mean cross-correlation of each jittered spike train
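The resampling step of the interval jitter method described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: spikes are given as integer 1 ms bin indices, the function name `interval_jitter` and the example spike train are hypothetical, and only the 20 ms window comes from the text. The subsequent subtraction of the mean surrogate correlogram is not shown.

```python
import random
from collections import Counter

def interval_jitter(spike_bins, window, rng):
    """Return one surrogate spike train.

    Each spike (an integer 1 ms bin index) is moved independently to a
    uniformly chosen bin inside the fixed window of `window` bins that
    contained it, so the spike count per window is preserved. Several
    spikes may land in the same bin of the surrogate.
    """
    jittered = []
    for b in spike_bins:
        window_start = (b // window) * window
        jittered.append(window_start + rng.randrange(window))
    return sorted(jittered)

rng = random.Random(0)                       # fixed seed for reproducibility
original = [1, 5, 19, 20, 41, 59, 60]        # hypothetical spike bins (ms)
surrogate = interval_jitter(original, window=20, rng=rng)

# Spike counts per 20 ms window before and after jittering.
counts_before = Counter(b // 20 for b in original)
counts_after = Counter(b // 20 for b in surrogate)
```

Comparing `counts_before` and `counts_after` confirms the defining property of the method: the firing rate profile at the 20 ms resolution is unchanged, while spike timing within each window is randomized.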

The tightened, jitter-corrected correlogram, CCG*, was found by subtracting the mean of the r jittered correlograms, 〈Jr〉r, for the amount of overlap, as follows:

We also computed the integral of the tight synchrony (Eq 16) in the range of ±5 ms:

We thank Ko Sakai for discussions.