Change blindness
From Wikipedia, the free encyclopedia
Change blindness is a phenomenon in visual perception where apparently
large changes within a visual scene are undetected by the viewer.
Typically for change blindness to occur, the change in the scene
has to coincide with some visual disruption such as an eye movement
or a brief obscuration of the observed scene or image.
When change blindness was first explored systematically by George McConkie
and his colleagues in the late 1970s, the phenomenon was largely
limited to the study of changes introduced to words and text during
eye movements. A student of McConkie's, John Grimes, was the first
to extend this phenomenon to the domain of scene perception (in
a conference presentation in 1992, not published until a book
chapter in 1996). Grimes showed that people miss large changes
to scenes when the changes are introduced during an eye movement.
For example, many people failed to notice when two people in a
scene exchanged heads! In these saccade-contingent change blindness
studies, changes to the scene were synchronized with measured
movements of the observer's eyes, so that the changes occurred
only when the eyes were moving. Under these conditions, changes
are often hard to detect. A number of studies since then have
explored saccade-contingent change blindness (e.g., Henderson
& Hollingworth, 1999; McConkie & Currie, 1996).
Later experiments showed that change blindness was not specifically
related to eye movements; other forms of visual disruption could
also induce change blindness. Rensink et al. popularized the "flicker"
technique, in which two images alternate repeatedly with a brief
(80 ms) blank screen after each image (giving the display a flickering
appearance). With the blank screen in place, surprisingly large
changes could be made to the scene without the observer noticing.
Rensink et al. (1997) also introduced the term "change blindness."
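The frame order of the flicker technique can be sketched in Python. This is only an illustration of the alternation described above: the 80 ms blank comes from the text, while the 240 ms image duration, the frame labels, and the function name are placeholder assumptions.

```python
from itertools import cycle, islice

def flicker_sequence(image_ms=240, blank_ms=80):
    """Return an endless iterator of (frame_name, duration_ms) pairs.

    blank_ms=80 follows the text; image_ms=240 is an assumed placeholder.
    """
    # One full cycle: original image, blank, modified image, blank.
    return cycle([
        ("original", image_ms),
        ("blank", blank_ms),
        ("modified", image_ms),
        ("blank", blank_ms),
    ])

# Take the first full cycle of the display.
first_cycle = list(islice(flicker_sequence(), 4))
cycle_ms = sum(duration for _, duration in first_cycle)
```

The observer's task is to find the difference between the "original" and "modified" frames while the sequence repeats.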
Other studies showed that change blindness occurs when the change is
introduced during a cut or pan in a motion picture, even when
the change is to the central actor in a scene (Levin & Simons,
1997). People also regularly fail to notice editing errors in
commercial movies, despite the intense scrutiny of movies during
the production process.
Change blindness can be particularly dramatic when changes occur unexpectedly,
with many observers even failing to notice when a person they
were talking to was surreptitiously replaced by a different actor
(Simons & Levin, 1998). Change blindness has now been shown
to occur with a wide variety of visual disruptions (e.g., blinks
and transient noise flashed on a display).
Change blindness is related to other induced failures of awareness, such
as inattentional blindness. A crucial difference is that successful
change detection in the presence of a visual disruption requires
a comparison of one image to another one held in memory. Consequently,
change blindness can occur due to a failure to store the information
in the first place or to a failure to compare the relevant information
from the current scene to the representation (hence models of
visual short term memory may be important for understanding the
phenomenon). In contrast, inattentional blindness reflects the
failure to detect an unexpected stimulus that is fully visible
in a single display -- it does not require a comparison to memory.
In the real world, change blindness may be responsible for traffic
accidents if drivers fail to notice significant changes around
them (e.g., the presence of a pedestrian in their path).
Choice blindness
In psychology, choice blindness is a phenomenon in which subjects
fail to detect conspicuous mismatches between their intended (and
expected) choice and the actual outcome.
Writing in Science, psychologist Petter Johansson and coworkers describe
choice blindness in an ingenious experiment.
The subject is presented with two cards, on which different (female)
faces appear. The subject is asked to choose which one he finds
more attractive. In the non-manipulated (NM) version, the subject
is handed the card that he chose and asked to say why he chose
that one. In the manipulated (M) version, the experimenter uses
sleight of hand techniques to switch the cards without the subject's
knowledge and give the subject the other card.
The researchers found that most subjects failed to notice the switch,
and furthermore justified their decision using post-hoc confabulated
evidence. For example, in an M trial, a subject might say "I
preferred this one because I prefer blondes" when he had
in fact chosen (and pointed to) the dark-haired woman, but was
handed a blonde.
They point out that this experiment allows one to investigate the relationship
between choice and introspection.
Johansson concludes that some normal participants unequivocally
produce confabulatory reports when asked to describe the reasons
behind their choices and suggests that choice blindness affords
some insight into the mechanisms behind truthful report.
Inattentional blindness, closely related to change blindness,
is the observed inability to perceive features in
a visual scene when they are not being attended to. That is to say
that humans have a limited capacity for attention which thus limits
the amount of information processed at any particular time. Any
otherwise salient feature within the visual field will not be
observed if not processed by attention.
Experiments demonstrating inattentional blindness
The best-known study demonstrating inattentional blindness was
conducted by Daniel Simons of the University of Illinois at Urbana-Champaign
and Christopher Chabris of Harvard University. In their study,
subjects are asked to watch a short video in which two groups
of people (wearing black and white t-shirts) pass a basketball
back and forth among themselves. The subjects are told to either count the
number of passes made by one of the teams, or to keep count of
bounce passes vs. aerial passes. In different versions of the
video a woman walks through the scene carrying an umbrella, or
wearing a full gorilla suit. In one version the woman in the gorilla
suit even stops in the middle, faces the camera, and pounds her
chest before walking out of the scene. After watching the video
the subjects are asked if they saw anything out of the ordinary
take place. In most groups 50% of the subjects did not report
seeing the gorilla. Simons interprets this as showing that we
are mistaken in assuming that important events will automatically
draw our attention away from current tasks or goals. This result
indicates that the relationship between what is in our visual
field and perception is based much more significantly on attention
than was previously thought.
Another study was conducted by Steve Most, Chabris, and Scholl. They
had objects moving up and down on a computer screen. Participants
were instructed to attend to the black objects and ignore the
white, or vice versa. After several trials, a red cross unexpectedly
appeared and traveled across the display, remaining on the computer
screen for five seconds. The results of the experiment showed
that even though the cross was distinct from the black and
white objects in both color and shape, about a third of participants
nonetheless missed it. They found that people may be attentionally
tuned to certain perceptual dimensions, such as brightness or
shape.
Thus the common saying "out of sight, out of mind", meaning
that people usually don't think about what they do not see, could
be reversed into: "out of mind, out of sight".
Visual short term memory
In the study of vision, visual short-term memory (VSTM) is one
of three broad memory systems including iconic memory and long-term
memory. VSTM is a type of short-term memory, but one limited to
information within the visual domain.
The Visuospatial Sketchpad is a VSTM subcomponent within the theoretical
model of working memory proposed by Alan Baddeley. However, the
term VSTM refers in a theory-neutral manner to the non-permanent
storage of visual information over an extended period of time.
Whereas iconic memories are fragile, decay rapidly, and are unable to
be actively maintained, visual short-term memories are robust
to subsequent stimuli and last over many seconds.
Overview
The introduction of stimuli which were hard to verbalize, and unlikely
to be held in long-term memory, revolutionized the study of visual
short-term memory (VSTM) in the early 1970s (Cermak, 1971; Phillips,
1974; Phillips & Baddeley, 1971). The basic experimental technique
used required observers to indicate whether two matrices (Phillips,
1974; Phillips & Baddeley, 1971), or figures (Cermak, 1971),
separated by a short temporal interval, were the same. The finding
that observers were able to report that a change had occurred,
at levels significantly above chance, indicated that they were
able to encode some aspect of the first stimulus in a purely visual
store, at least for the period until the presentation of the second
stimulus. However, as the stimuli used were complex, and the nature
of the change relatively uncontrolled, these experiments left
open various questions, such as: (1) whether only a subset of
the perceptual dimensions comprising a visual stimulus are stored
(e.g., spatial frequency, luminance, or contrast); (2) whether
some perceptual dimensions are maintained in VSTM with greater
fidelity than others; and (3) how these dimensions
are encoded (i.e., are perceptual dimensions encoded within separate,
parallel channels, or are all perceptual dimensions stored as
a single bound entity within VSTM?).
Psychophysical approaches to VSTM
In a typical psychophysical VSTM experiment, observers' ability to
discriminate between sequentially presented test and reference
patterns is measured using a two-interval forced-choice (2-IFC)
paradigm. For example, in a study involving spatial frequency,
observers might be required to make a judgment as to whether the
first or second pattern presented was of higher (or lower) spatial
frequency. Typically the test and reference patterns are separated
by interstimulus intervals (ISIs) in the range of 0 s to 30 s. The properties of stimulus
pairs can be controlled by using a psychophysical staircase procedure,
or via the method of constant stimuli (for details of these techniques,
see Regan, 2000). When a staircase procedure is used, the properties
of the stimulus pairs are altered until a criterion threshold
level of performance is achieved (e.g., 75% correct).
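As an illustration of how a staircase can settle at a criterion such as 75% correct, here is a minimal Python sketch of one possible rule, a weighted up-down staircase. The text does not say which staircase rule the cited studies used, so the rule, the function name, and the step sizes are all assumptions made for illustration.

```python
def weighted_up_down(delta, correct, step_down=1.0, target=0.75):
    """Return the next test-reference difference after one trial.

    Assumed rule: shrink the difference after a correct response and
    enlarge it after an error, with step sizes chosen so that the
    expected change is zero when the proportion correct equals `target`.
    """
    # Balance condition: target * step_down == (1 - target) * step_up
    step_up = step_down * target / (1.0 - target)
    if correct:
        return max(delta - step_down, 0.0)  # make the task harder
    return delta + step_up                  # make the task easier

# Three correct responses and one error (75% correct) leave the level unchanged.
delta = 10.0
for correct in [True, True, True, False]:
    delta = weighted_up_down(delta, correct)
```

With a target of 0.75 the upward step is three times the downward step, so the stimulus difference drifts until the observer is correct on about three trials in four.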
Fidelity of memory representations
A series of studies over the last decade and a half (for good reviews,
see Magnussen, 2000; Magnussen & Greenlee, 1999) have demonstrated
that VSTM stores various perceptual dimensions (e.g., spatial
frequency, orientation, hue) with a remarkable degree of fidelity
and stability (Magnussen & Greenlee, 1992; Magnussen, Greenlee,
Asplund, & Dyrnes, 1991; Magnussen, Idas, & Myhre, 1998;
Regan, 1985). It has been shown, for instance, that with a reference
frequency of 10 c/deg, spatial frequency thresholds tested with
ISIs of up to several seconds (measured as Weber fractions, Δf/f)
differ by only three to six percent from those recorded when gratings
are presented simultaneously (Regan, 1985). At 10 c/deg the grating
period is 360 arcseconds, so a threshold of 0.04 Δf/f implies that observers
are able to distinguish differences in period of about 14.4
arcsec (Magnussen & Greenlee, 1999). As this is approximately
half the average cone spacing on the fovea, it implies that observers
are able to store spatial frequency information within the hyperacuity
range for upwards of 60 s (Bennett & Cortese, 1996).
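The arithmetic behind the 14.4 arcsec figure can be checked with a short Python sketch. It assumes that the 360 arcsec value is the period of a 10 c/deg grating (3600 arcsec per degree divided by 10 cycles), and it applies the Weber fraction directly to the period, which is only a first-order approximation for small fractions. The function names are illustrative.

```python
ARCSEC_PER_DEGREE = 3600

def period_arcsec(cycles_per_degree):
    """Spatial period of a grating, in arcseconds per cycle."""
    return ARCSEC_PER_DEGREE / cycles_per_degree

def discriminable_difference_arcsec(cycles_per_degree, weber_fraction):
    """Smallest discriminable change in period, treating the Weber
    fraction as a fraction of the period (first-order approximation)."""
    return weber_fraction * period_arcsec(cycles_per_degree)

period = period_arcsec(10)                                  # 360 arcsec
difference = discriminable_difference_arcsec(10, 0.04)      # about 14.4 arcsec
```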
A series of psychophysical studies have found that many perceptual
dimensions (e.g., spatial frequency, hue, orientation, speed)
are stored with little or no loss in VSTM. As already mentioned,
spatial frequency can be stored for upwards of 60 s with no increase
in thresholds (Bennett & Cortese, 1996; Magnussen & Greenlee,
1997; Magnussen et al., 1991; Magnussen, Greenlee, & Thomas,
1996; Regan, 1985). Other studies have shown that colour (Nilsson
& Nelson, 1981), speed (Magnussen & Greenlee, 1992), and
orientation (Magnussen et al., 1998), are also stored in VSTM
for upwards of 10 s with no significant decay.
The one notable exception to this rule is contrast. Several studies
have shown that thresholds for contrast discrimination grow rapidly
as ISIs increase, with thresholds doubling as ISIs are raised
from 0 s to 10 s (Lee & Harris, 1996; Magnussen et al., 1991;
Magnussen et al., 1996). This appears to be due to a loss of information
about contrast as ISI increases, which correspondingly makes it
increasingly unlikely that a change will be reported as ISIs increase.
The decay in contrast information is likely to underlie the apparent
decay in information for VSTM experiments using matrix patterns
(e.g., Phillips, 1974; Phillips & Baddeley, 1971).
Structure of memory representations
With the exception of contrast, basic perceptual attributes are similar
in terms of both the accuracy and stability with which they are
stored in VSTM. However, it is unclear whether the information
from each perceptual stream is encoded separately within parallel
channels, or whether information for different perceptual dimensions
is represented within VSTM as a single, bound set of features.
Two different lines of evidence – one derived from the experimental
paradigm known as memory masking, the other associated with the
differential effects observed for decisions made either within
or between perceptual dimensions – suggest that VSTM stores
information within multiple parallel perceptual channels.
Memory masking
Memory masking refers to an experimental technique in which the addition
of a "masking" grating, placed between the reference
and test stimuli in a psychophysical VSTM experiment (Bennett
& Cortese, 1996; Magnussen & Greenlee, 1992; Magnussen
et al., 1991), leads to an increase in psychophysical thresholds.
It is important to note, however, that the use of the term "masking"
here is somewhat misleading, as the temporal placement of the
additional grating is such that it acts neither as a pattern mask
nor as an energy mask for the test or reference stimuli (Breitmeyer,
1984).
If the masking grating matches the test or reference grating in the
perceptual dimension being discriminated, no increase in threshold
is observed relative to a no-mask control condition. However,
the more the masking stimulus differs from the reference grating
on the dimension being discriminated, the more thresholds increase,
until thresholds are approximately double those recorded in the
absence of a mask (Bennett & Cortese, 1996; Magnussen &
Greenlee, 1992; Magnussen et al., 1991).
The short presentation times of the masking stimuli (e.g., 200 ms),
coupled with the relatively long time periods between the mask
and both the test and reference stimuli (e.g., Magnussen et al.,
1991) argue against the possibility that spatial adaptation is
an explanation for the increase in thresholds caused by the presence
of the mask (e.g., Blakemore & Campbell, 1969).
Another feature of the memory-masking paradigm is that the effects of
the mask are specific to the type of discrimination being made.
For instance, when performing a spatial frequency judgment, the
orientation of the masking grating has no effect on threshold
levels. Likewise, the spatial frequency of the masking grating
does not alter thresholds obtained when orientation is being discriminated
(Magnussen et al., 1991). This specificity of the masking effect
on thresholds is evidence against its being mediated either through
distracting the observer, or by adding an additional non-specific
burden to memory. Since orientation and spatial frequency are
conjointly coded early in the visual system (DeValois & DeValois,
1990), this result supports the view that the neurophysiological
site affected by memory masking occurs post-V1, at a locus where
orientation and spatial frequency information are coded into independent
perceptual channels. This argument is supported by the finding
that masking by spatial frequency follows perceptual, rather than
retinal coordinates (Bennett & Cortese, 1996), as size constancy
is also thought to occur at a point post-V1 in the visual processing
hierarchy (Magnussen, 2000), perhaps in V4 (see, for instance,
Schiller, 1995).
Dual discrimination costs
It is well established that observers are able to make independent
decisions about multiple stimulus dimensions (e.g., spatial frequency,
contrast, orientation) with little or no cost (e.g., Chua, 1990;
Greenlee & Thomas, 1993; Vincent & Regan, 1995). These
studies support the view that spatial frequency, orientation,
and contrast are encoded within independent, parallel channels.
Since individual neurons in striate cortex conjointly code spatial
frequency and orientation (DeValois & DeValois, 1990), these
channels are likely to exist at a point later in the visual processing
hierarchy than V1.
The memory masking literature supports the view that different perceptual
properties are encoded independently within parallel channels.
This view is further supported by evidence from experiments examining
the costs of making dual decisions for attributes that are encoded
either within the same or between different perceptual channels
(Greenlee & Thomas, 1993; Magnussen & Greenlee, 1997;
Magnussen et al., 1996; Thomas, Magnussen, & Greenlee, 2000).
The Greenlee-Thomas model assumes that different perceptual dimensions
(e.g., spatial frequency, contrast, orientation, movement) are
encoded within independent channels (Greenlee & Thomas, 1993).
According to this model, making dual judgments about different
perceptual dimensions will lead to only a moderate increase in
threshold, associated with the increased uncertainty of making
two independent judgments (i.e., decision-noise). However, if
the two judgments made are not independent — as might be
expected if observers were required to make two decisions which
draw on the same limited resource — thresholds are predicted
to increase to a greater extent than can be explained on the basis
of decision-noise alone.
Magnussen and Greenlee (1997) performed a series of VSTM experiments in
which the relative costs associated with making dual discriminations
within and between stimulus dimensions were compared. Their results
can be summarized as follows: (1) when making discriminations
regarding both contrast and spatial frequency, observers' thresholds
rise by an amount predicted by the additional uncertainty in making
two independent decisions; (2) when judgments are made within
the same perceptual dimension, there is a much greater increase
in associated thresholds than predicted on the basis of the increased
uncertainty associated with making multiple independent decisions,
suggesting that these judgments are not made independently.
Summary of results from psychophysical experiments
Psychophysical experiments in VSTM suggest that most perceptual dimensions (e.g.,
spatial frequency, orientation, colour, speed) are stored with
remarkable fidelity over relatively long periods of time (Magnussen,
2000). The one exception to this rule is contrast, which has been
shown to decay rapidly in VSTM (Lee & Harris, 1996). Converging
evidence, drawn both from experiments using memory masking (Magnussen
& Greenlee, 1992), and from a comparison of single and dual-discrimination
costs (Magnussen & Greenlee, 1997), suggests that information
is encoded in VSTM in the form of multiple independent channels,
each channel representing a different perceptual dimension. Further
evidence suggests that this information is encoded at a level
in the visual hierarchy later than V1 (e.g., Bennett & Cortese,
1996).
Set-size effects in VSTM
In a typical VSTM experiment, observers are presented with two arrays,
composed of a number of stimuli. The two arrays are separated
by a short temporal interval, and the task of observers is to
decide if the first and second arrays are composed of identical
stimuli, or whether one item differs across the two displays (e.g.,
Luck & Vogel, 1997). Increasing the number of stimuli present
within the two arrays leads to a monotonic decrease in the sensitivity
of observers to differences in stimuli across the two arrays (Luck
& Vogel, 1997; Pashler, 1988). This capacity limit has been
linked to the posterior parietal cortex, the activity of which
increases with the number of stimuli in the arrays, but only up
to the capacity limit of about four stimuli (Todd & Marois,
2004). There are a number of frameworks that attempt to explain
the effect of increasing set-size on performance in VSTM. These
can be broadly grouped under three categories: (1) psychophysical
frameworks (e.g., Magnussen & Greenlee, 1997); (2) sample
size models (e.g., Palmer, 1990); and (3) urn models (e.g., Pashler,
1988).
Problems with psychophysical explanations
Psychophysical experiments suggest that information is encoded in VSTM across
multiple parallel channels, each channel associated with a particular
perceptual attribute (Magnussen, 2000). Within this framework,
a decrease in an observer's ability to detect a change with increasing
set-size can be attributed to two different processes: (1) if
decisions are made across different channels, decreases in performance
are typically small, and consistent with decreases expected when
making multiple independent decisions (Greenlee & Thomas,
1993; Vincent & Regan, 1995); (2) if multiple decisions are
made within the same channel, the decrease in performance is much
greater than expected on the basis of increased decision-noise
alone, and is attributed to interference caused by multiple decisions
within the same perceptual channel (Magnussen & Greenlee,
1997).
However, the Greenlee-Thomas model (Greenlee & Thomas, 1993) suffers
from two failings as a model for the effects of set-size in VSTM.
First, it has only been empirically tested with displays composed
of one or two elements. It has been shown repeatedly in various
experimental paradigms that set-size effects differ for displays
composed of a relatively small number of elements (i.e., up to approximately
4 items), and those associated with larger displays (i.e., more than approximately
4 items). The Greenlee-Thomas (1993) model offers no explanation
for why this might be so. Second, while Magnussen, Greenlee, and
Thomas (1997) are able to use this model to predict that greater
interference will be found when dual decisions are made within
the same perceptual dimension, rather than across different perceptual
dimensions, this prediction lacks quantitative rigor, and is unable
to accurately anticipate the size of the threshold increase, or
give a detailed explanation of its underlying causes.
In addition to the Greenlee-Thomas model (Greenlee & Thomas,
1993), there are two other prominent approaches for describing
set-size effects in VSTM. These two approaches can be referred
to as sample size models (Palmer, 1990), and urn models (e.g.,
Pashler, 1988). They differ from the Greenlee-Thomas (1993) model
by: (1) ascribing the root cause of set-size effects to a stage
prior to decision making; and (2) making no theoretical distinction
between decisions made in the same, or across different, perceptual
dimensions.
Models of capacity limits in VSTM
If observers are asked to report on the quality (e.g., color) of
an item stored in memory, performance might be perfect when
only a few items are encoded (the number of items that can be
perfectly encoded varies depending on the attribute being encoded,
but is usually less than five), but beyond that point performance invariably
declines in a monotonic fashion as more items are added. Different
theoretical models have been put forward to explain this decline
in performance.
Slot models
A prominent class of model proposes that observers are limited by
the total number of items which can be encoded, either because
the capacity of VSTM itself is limited (e.g., Cowan, 2001; Luck
& Vogel, 1997; Pashler, 1988), or because of a bottleneck
in the number of items which can be attended to prior to encoding.
This type of model has obvious similarities to urn models used
in probability theory (see, for example, Mendenhall, 1967). In
essence, an urn model assumes that VSTM is restricted in storage
capacity to only a few items, k (often estimated to lie in the
range of three-to-five). The probability that a suprathreshold
change will be detected is simply the probability that the change
element is encoded in VSTM (i.e., k/N). Although urn models are
used commonly to describe performance limitations in VSTM (e.g.,
Luck & Vogel, 1997; Pashler, 1988; Sperling, 1960), it is
only recently that the actual structure of items stored has been
considered. Luck and colleagues have reported a series of experiments
designed specifically to elucidate the structure of information
held in VSTM (Luck & Vogel, 1997). This work provides evidence
that items stored in VSTM are coherent objects, and not the more
elementary features of which those objects are composed.
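The k/N rule of the urn model can be written as a short Python sketch. The cap at 1 for displays with no more than k items is an assumption added here so that the function returns a valid probability (it matches the claim that performance is perfect for small displays); the function name and the default k = 4 are illustrative.

```python
def p_detect(n_items, k=4):
    """Probability that a suprathreshold change is detected under a
    simple urn (slot) model: the changed item must be one of the k
    items held in VSTM out of n_items displayed."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    # Assumed cap: when the display fits in memory, detection is certain.
    return min(1.0, k / n_items)

# Predicted detection probability for displays of 2, 4, 8 and 16 items.
predictions = [p_detect(n) for n in (2, 4, 8, 16)]
```

Under these assumptions, detection is certain up to four items and then falls off as k/N, halving each time the display size doubles.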
Noise models
A much more controversial framework has more recently been put forward
by Wilken and Ma (2004) who suggest that apparent capacity limitations
in VSTM are caused by a monotonic decline in the quality of the
internal representations stored (i.e., monotonic increase in noise)
as a function of set size. In this conception capacity limitations
in memory are not caused by a limit on the number of things that
can be encoded, but by a decline in the quality of the representation
of each thing as more things are added to memory.
In their 2004 experiments, they varied color, spatial frequency,
and orientation of objects stored in VSTM using a signal detection
theory (SDT) approach. The participants were asked to report differences
between the visual stimuli presented to them in consecutive order.
The investigators found that different stimuli were encoded independently
and in parallel, and that the major factor limiting discrimination
performance was neuronal noise.
Sample size models
Sample size models (Palmer, 1990) propose that the monotonic decrease
in performance with increasing set-size in VSTM experiments is
a direct outcome of a limit in the amount of information observers
can extract from a visual display.
In the sample size model, each perceptual attribute of a stimulus
is associated with an internal, unidimensional percept, formed
by the collection of a finite number of discrete samples. It is
assumed that the total number of samples that can be collected
across the entire visual scene is fixed. Assuming that equal attention
is paid to each stimulus, it follows that the total number of
samples taken from each element in an array will be inversely
proportional to the number of stimuli present, N. The central limit
theorem implies that the mean of the samples taken, and therefore
the internal percept, will have a variance inversely
proportional to the number of samples per stimulus, and hence directly
proportional to N. Signal detection theory defines sensitivity
(i.e., d') as being inversely proportional to the standard deviation
of the underlying representation to be discriminated (Macmillan
& Creelman, 1991). Therefore, according to the sample size
model, in a VSTM experiment an observer's sensitivity to a stimulus
change, d', will be inversely proportional to the square root of N.
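The chain of reasoning in the sample size model can be traced step by step in a small Python sketch. A fixed pool of samples is divided equally among N items, the variance of each item's sample mean is the per-sample variance divided by the samples per item, and d' is the signal divided by the resulting standard deviation. The total number of samples, the per-sample standard deviation, and the signal size are hypothetical values chosen only to show the scaling.

```python
import math

def d_prime(n_items, signal=1.0, sample_sd=1.0, total_samples=100):
    """Sensitivity to a change in one item under the sample size model.

    signal, sample_sd and total_samples are hypothetical parameters.
    """
    # A fixed pool of samples is shared equally among the items.
    samples_per_item = total_samples / n_items
    # Variance of the mean of those samples grows as items are added.
    variance_of_mean = sample_sd ** 2 / samples_per_item
    # Sensitivity is the signal relative to the standard deviation.
    return signal / math.sqrt(variance_of_mean)

# Quadrupling the set size should halve sensitivity (1 / sqrt(4)).
ratio = d_prime(1) / d_prime(4)
```

Whatever values are chosen for the hypothetical parameters, the ratio of sensitivities depends only on the set sizes, which is the 1/sqrt(N) prediction.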
Unfortunately, few studies have directly tested this prediction of the sample
size model. Some evidence has been provided by Palmer (1990),
who performed a VSTM experiment using arrays composed of lines
of varying length, and set-sizes of one, two or four. The task
of observers was to determine whether there had been a change
in the length of one of the lines. It was found that observers'
thresholds increased in proportion to the square root of N, in accordance
with the predictions of the sample size model.