Information-Theoretic Analysis of Refractory
Effects in the P300 Speller
Vaishakhi Mayya∗, Boyla Mainsah∗, and Galen Reeves∗†
∗Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA
†Department of Statistical Science, Duke University, Durham, NC, USA
Abstract—The P300 speller is a brain-computer interface that
enables people with neuromuscular disorders to communicate
based on eliciting event-related potentials (ERP) in electroen-
cephalography (EEG) measurements. One challenge to reliable
communication is the presence of refractory effects in the P300 ERP that induce temporal dependence in the user’s EEG responses. We propose a model for the P300 speller as a communication channel with memory. By studying the maximum information rate on this channel, we gain insight into the fundamental constraints imposed by refractory effects. We construct codebooks based on the optimal input distribution, and compare them to existing codebooks in the literature.

arXiv:1701.03313v1 [cs.IT] 12 Jan 2017

I. INTRODUCTION

A brain-computer interface (BCI) is a system that monitors electrophysiological signals and translates the information encoded in these signals into commands that are relayed to a computer [1]. The P300 speller, developed by Farwell and Donchin [2], is a BCI that provides an alternative means of communication for individuals with severe neuromuscular diseases, such as amyotrophic lateral sclerosis [3], that impair neural pathways that control muscles. In the extreme case of locked-in syndrome, individuals lose all voluntary muscle control and are unable to communicate verbally or via gestures.

The P300 speller relies on eliciting and detecting event-related potentials (ERP) embedded in electroencephalography (EEG) data. These ERPs are elicited in response to specific stimulus events within the context of an oddball paradigm. The user is presented with a sequence of stimulus events that fall into one of two classes: a rare oddball, i.e. target stimulus, and a frequent non-target stimulus [4]. The presentation of the rare target stimulus event elicits an ERP response that includes a distinct positive deflection called the P300 signal.

The P300 speller allows a user to communicate one character at a time. In a visual P300 speller, the user is presented with an array of choices on a screen, such as the grid shown in Figure 1. While the user focuses on a target character, subsets of characters, called flash groups, are sequentially illuminated on the grid. In this context, the illumination of a flash group is a stimulus event. Under ideal conditions, a P300 ERP is elicited each time that the target character is presented. The elicited ERPs are embedded in noisy EEG data. Following each stimulus event, a time window of the EEG waveform is analyzed to determine the likelihood that the stimulus event contains the target character. After a sufficient amount of data has been collected, the target character is estimated based on the observed responses.

Fig. 1: Example of a P300 speller visual interface. The flash group is the set of characters that are illuminated.

The order in which the target and non-target stimulus events are presented plays a significant role in the ERP elicitation process. This is due, in part, to refractory effects [5], where the ability to elicit a strong ERP response to every target stimulus event presentation is affected by the target-to-target interval (TTI), which is the amount of time between successive target stimulus events. If a P300 ERP is elicited following the presentation of a target, the amplitude of successive P300 ERPs elicited in response to subsequent target events with low TTI may be attenuated or distorted [6]. The precise behavior of the refractory effects is not well understood and can vary depending on the user and the type of system [6].

In this paper, we develop and analyze a communication model for the P300 speller that allows us to understand some of the fundamental constraints imposed by refractory effects. Our channel model consists of a finite state machine (FSM) followed by a memoryless noisy channel; together they form a finite state channel (FSC). The FSM uses $L+1$ states to model a refractory effect that lasts for $L$ time steps. The memoryless noisy channel describes the mapping between an ERP and the response generated by processing the EEG measurements. We study properties of the distributions that maximize the mutual information rate on this channel. We provide an explicit characterization of the optimal input distribution in the noiseless case, and we use the generalized Blahut-Arimoto algorithm (GBAA) [7], [8] to compute the optimal input distributions numerically for the noisy case. We then use the optimal distribution to design codebooks (i.e. sequences of flash groups) that can be used for the P300 speller. Performance is assessed using numerical simulations.

The rest of this document is organized as follows. Section II reviews the previous approaches to designing flash groups for the P300 speller and provides relevant background concerning channels with memory. Section III introduces our channel model and a method to design flash groups using this model. The results are presented in Section IV.

II. BACKGROUND

A. Current codebooks used for P300 speller

Within the context of the P300 speller, a codebook $C$ is a $W \times N$ binary matrix that indicates which of the $W$ characters are flashed across $N$ trials. Each row of the matrix corresponds to the flash pattern (or codeword) of a specific character. The columns correspond to flash groups. Given that a user’s target character is $w$, the entry $C(w,n)$ indicates whether the target character is flashed in the $n$-th trial.

Within this setting, previous work has focused on the design of codebooks in order to increase the accuracy of the P300 speller [2], [9]–[13]. In the row-column paradigm (RCP) [2], the codebook is generated using a random permutation of the characters in rows and columns in a grid layout, such as the one shown in Figure 1. Due to the randomized order of presentation of the row and column flash groups in the RCP, characters are often flashed twice consecutively.

The checkerboard paradigm (CBP) [9] was developed to mitigate refractory effects by imposing a minimum time interval between target character presentations. Due to the method of construction of the codebook in the CBP, the duration of the minimum target interval depends on the specific geometry of the grid layout.

Other approaches have used ideas from coding theory to construct codebooks, such as maximizing the minimum Hamming distance [10], [11], e.g. the D10 codebook [10]. However, in online studies, these codebooks resulted in similar or worse performance when compared to the RCP. One possible explanation for this behavior is the fact that the design of these codebooks did not account for refractory effects.

The explicit connection between the BCI and an information channel has been considered previously [10], [14]. For example, Omar et al. [14] represented the BCI-based communication (based on motor imagery) as a memoryless binary symmetric channel. However, in a P300 speller context, a memoryless channel assumption does not account for system memory due to refractory effects.

B. Information rates for finite-state channels

The channel model we study is an indecomposable finite-state channel (IFSC) [15]. The channel input is a discrete-time process $\{X_n\}$ supported on a finite alphabet $\mathcal{X}$. The channel state at time $n$ is modeled by a random variable $S_n \in \{1, \dots, k\}$. The channel output at time $n$ is a random variable $Y_n \in \mathcal{Y}$ whose distribution is a function of the input $X_n$ and state $S_{n-1}$. The channel is defined by the conditional distribution

$$
P(Y_n, S_n \mid X_n, S_{n-1}) = P(Y_n \mid X_n, S_{n-1}) \, P(S_n \mid X_n, S_{n-1}).
$$

A channel is said to be an intersymbol interference (ISI) channel if the state transitions and the output of the channel depend on current and previous inputs. In this case, it is possible to describe a state sequence that is uniquely determined by the input sequence, i.e. $P(Y_n \mid X_n, S_{n-1}) = P(Y_n \mid S_n, S_{n-1})$.

We focus on the setting where the channel input $\{X_n\}$ is an $r$-th order, time-invariant Markov process. Following Kavčić [7], the distribution on $\{X_n\}$ is parametrized using an $|\mathcal{X}|^r \times |\mathcal{X}|^r$ transition matrix $P$ and the corresponding mutual information rate is defined according to

$$
R(P) = \lim_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n \mid S_0), \tag{1}
$$

where $X_1^n = [X_1, \cdots, X_n]$. For an IFSC, the limit exists and is independent of the distribution on $S_0$. The maximum information rate over $r$-th order Markov sources is defined according to

$$
R_r = \max_{P \in \mathcal{P}_r} R(P), \tag{2}
$$

where $\mathcal{P}_r$ is the set of all transition matrices for an $r$-th order Markov source. A distribution $P^* \in \mathcal{P}_r$ is said to be optimal if it achieves the maximum in (2). Note that $R_r$ provides a lower bound on the capacity of the channel.

Kavčić [7] provides a stochastic method for solving the optimization problem (2) for ISI IFSCs based on a generalization of the Blahut-Arimoto algorithm. This method is extended to a larger class of IFSCs by Vontobel et al. [8].

III. PROPOSED CHANNEL MODEL AND CODEBOOK DESIGN

A. P300 speller channel model

The P300 speller is modeled as a cascade of the FSM and a noisy memoryless channel, as shown in Figure 2. The states of the FSM model the memory in the channel induced by the refractory period. Throughout this paper, the time step refers to the duration between the presentation of successive flash groups. A refractory period that lasts $L$ time steps is modeled using $L+1$ states.

The channel input, $X_n$, is a binary variable that indicates whether the target character is present ($X_n = 1$) or not present ($X_n = 0$) in the flash groups in the $n$-th trial. The state, $S_n$, represents whether the channel is in the ground state, or one of $L$ possible refractory states. The transitions between states are determined according to

$$
S_n = \begin{cases}
G, & \text{if } X_n = 0,\ S_{n-1} = G \text{ or } S_{n-1} = R_L \\
R_l, & \text{if } X_n = 0,\ S_{n-1} = R_{l-1},\ l \in \{2, \cdots, L\} \\
R_1, & \text{if } X_n = 1,
\end{cases}
$$

and are illustrated in Figure 3. Note that the state transition is a deterministic function of the previous $L$ channel inputs.

The intermediate output, $Z_n$, is a binary variable that is equal to one if and only if the input is one and the channel is not in a refractory state, i.e.

$$
Z_n = \begin{cases}
1, & \text{if } X_n = 1,\ S_{n-1} = G \\
0, & \text{if } X_n = 1,\ S_{n-1} = R_l,\ l \in \{1, 2, \cdots, L\} \\
0, & \text{if } X_n = 0.
\end{cases}
$$
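The state transitions and the intermediate output $Z_n$ defined above can be traced with a few lines of code. The following is a minimal sketch rather than the authors' implementation: the function names `next_state` and `run_fsm`, and the integer encoding of the states ($G$ as 0, $R_l$ as $l$), are assumptions made for illustration.

```python
from typing import List, Tuple

GROUND = 0  # assumed encoding: G -> 0, refractory states R_1..R_L -> 1..L


def next_state(state: int, x: int, L: int) -> int:
    """Return S_n given S_{n-1} = state and the input X_n = x."""
    if x == 1:
        return 1                  # any target flash moves the channel to R_1
    if state == GROUND or state == L:
        return GROUND             # stay in G, or leave R_L and return to G
    return state + 1              # R_{l-1} -> R_l for l = 2, ..., L


def run_fsm(x_seq: List[int], L: int, s0: int = GROUND) -> Tuple[List[int], List[int]]:
    """Return the state sequence S_1..S_n and the intermediate output Z_1..Z_n."""
    states, z_seq = [], []
    prev = s0
    for x in x_seq:
        # Z_n = 1 only if the target is flashed while the channel is in G.
        z_seq.append(1 if (x == 1 and prev == GROUND) else 0)
        prev = next_state(prev, x, L)
        states.append(prev)
    return states, z_seq
```

For example, with $L = 2$ and the channel starting in the ground state, the input sequence 1, 0, 1, 0, 0, 1 gives $Z$ = 1, 0, 0, 0, 0, 1: the second target flash falls inside the refractory period and elicits no ERP, while the third arrives after two non-target steps and does.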
[Block diagram: $W$ → Encoder → $X_1^n$ → FSM → $Z_1^n$ → Memoryless Channel → $Y_1^n$ → Decoder → $\widehat{W}$; the FSM and the memoryless channel together form the finite state channel.]
Fig. 2: Model of the P300 speller communication channel. The target character, $W$, is encoded with a codeword, $X_1^n$, which is transmitted through a cascade of a finite state machine (FSM), with an intermediate output, $Z_1^n$, and a noisy memoryless channel. The output sequence, $Y_1^n$, is used to obtain an estimate of the target character, $\widehat{W}$.
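The cascade in Fig. 2 can be simulated end to end. The sketch below is illustrative only and rests on several assumptions not fixed by the model description: the memoryless channel is taken to be additive Gaussian noise with standard deviation `sigma` (the AWGN model used in the simulations of Section IV), the channel starts in the ground state, the target character is uniform over the rows of the codebook, and the decoder picks the codeword whose noiseless FSM output is closest to the observation in squared error, which is the maximum-likelihood rule under those assumptions. All function names are hypothetical.

```python
import random
from typing import List, Sequence


def fsm_output(x: Sequence[int], L: int) -> List[int]:
    """Noiseless FSM output Z: a 1 passes only if the previous L inputs were 0."""
    z, zeros_since_one = [], L        # assumption: start in the ground state
    for bit in x:
        z.append(1 if (bit == 1 and zeros_since_one >= L) else 0)
        zeros_since_one = 0 if bit == 1 else zeros_since_one + 1
    return z


def transmit(x: Sequence[int], L: int, sigma: float, rng: random.Random) -> List[float]:
    """Pass a codeword through the FSM, then add Gaussian noise to each Z_n."""
    return [z + rng.gauss(0.0, sigma) for z in fsm_output(x, L)]


def decode(y: Sequence[float], codebook: Sequence[Sequence[int]], L: int) -> int:
    """Return the row index whose noiseless FSM output is nearest to y."""
    def squared_error(w: int) -> float:
        z = fsm_output(codebook[w], L)
        return sum((yi - zi) ** 2 for yi, zi in zip(y, z))
    return min(range(len(codebook)), key=squared_error)


def estimate_accuracy(codebook: Sequence[Sequence[int]], L: int, sigma: float,
                      runs: int = 1000, seed: int = 0) -> float:
    """Monte Carlo estimate of the fraction of correctly decoded characters."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(runs):
        w = rng.randrange(len(codebook))          # uniform target character
        y = transmit(codebook[w], L, sigma, rng)
        correct += int(decode(y, codebook, L) == w)
    return correct / runs
```

Decoding against the FSM output rather than the raw codeword matters here: two codewords that differ only in flashes falling inside a refractory period produce the same $Z_1^n$ and cannot be distinguished at the output.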
When the channel is in one of the refractory states, $Z_n = 0$, independently of the input. The output, $Y_n$, is the observed response that depends only on the intermediate output $Z_n$.

In the context of the P300 speller, the distribution on $X_1^n$ is induced by the codebook, $C$, and the distribution over the target character. The intermediate output $Z_n$ indicates whether a P300 ERP was elicited for the $n$-th trial. The probabilistic mapping from $Z_n$ to $Y_n$ models the noise induced by the EEG measurement and classification process.

The case $L = 1$ was studied in our previous work [16]. In this paper, we study the behavior of channels with general $L$.

Fig. 3: Model of the P300 ERP elicitation process as an FSM. The input, $X_n$, which governs the state transitions is marked over the arrows. $G$ is the ground state and $R_l$, $l = 1, 2, \dots, L$ are refractory states.

B. Codebook design

Our approach to codebook design is to find distributions on the input sequence $\{X_n\}$ that maximize the mutual information rate in the channel, and then design codebooks that approximate these distributions. For the P300 speller, the channel input is a function of the user’s target character and the codebook, $C(w,n)$. For a fixed codebook of length $n$, the randomness in the input is due to the randomness in the target character. In order to design codebooks with good properties, we use a random construction in which the rows of the codebook are drawn i.i.d. from the distribution that maximizes the mutual information. We refer to this construction as a memory-based codebook for the model with refractory period of length $L$ (MBC($L$)).

IV. RESULTS

A. Analysis of the optimal input distribution

This section studies properties of the maximum information rate in the noiseless case, where the observed response $Y_n$ is equal to the intermediate output $Z_n$. In this setting, the optimization problem can be expressed in terms of maximizing entropy rates of run-length limited sequences [17], [18].

To facilitate our analysis, we find it useful to introduce a constrained class of Markov sources. Specifically, we define $\widetilde{\mathcal{P}}_r$ to be the set of all $r$-th order Markov processes of the form:

$$
P(X_n = 1 \mid X_{n-r}^{n-1}) = \begin{cases}
a, & \text{if } X_{n-r}^{n-1} = 0 \\
0, & \text{otherwise.}
\end{cases} \tag{3}
$$

Note that every sequence drawn according to this distribution has at least $r$ zeros between ones. In the rest of the paper, we will focus exclusively on the setting where the memory in the source is matched to the memory in the channel, i.e. $r = L$.

Observe that if the channel input is drawn according to a distribution $P$ in the constrained set $\widetilde{\mathcal{P}}_L$, then the mapping between the input and noiseless output is invertible, and thus the mutual information is equal to the entropy of the input:

$$
\begin{aligned}
I(X_1^n; Y_1^n \mid S_0) &= H(X_1^n \mid S_0) - H(X_1^n \mid Y_1^n, S_0) \\
&= H(X_1^n \mid S_0).
\end{aligned}
$$

Therefore, the maximum information rate in the constrained setting can be expressed as

$$
\max_{P \in \widetilde{\mathcal{P}}_L} R(P) = \max_{P \in \widetilde{\mathcal{P}}_L} \lim_{n \to \infty} \frac{1}{n} H(X_1^n \mid S_0) = \max_{a \in [0,1]} \frac{H_b(a)}{1 + La}, \tag{4}
$$

where $H_b(p) = -p \log p - (1-p) \log(1-p)$ is the binary entropy function. Moreover, by differentiation it can be verified that the maximum in (4) is achieved by the unique solution $a^* \in [0,1]$ to the equation

$$
a = (1-a)^{L+1}. \tag{5}
$$

In the case of $L = 1$, the optimal value can be computed explicitly as $a^* = \frac{3 - \sqrt{5}}{2}$. As pointed out in [19], $1 - a^*$ is the inverse of the golden ratio.

The next result shows that this rate also provides an upper bound on the mutual information rate.

Proposition 1. Consider the P300 speller channel model in Figure 2, with $L$ refractory states. For any distribution on the channel input $\{X_n\}$ and initial state $S_0$, the mutual information satisfies

$$
\limsup_{n \to \infty} \frac{1}{n} I(X_1^n; Y_1^n \mid S_0) \le R_L^{\mathrm{UB}}, \tag{6}
$$

where

$$
R_L^{\mathrm{UB}} = \max_{a \in [0,1]} \frac{H_b(a)}{1 + La}. \tag{7}
$$

Proof: Starting with the data processing inequality, we have

$$
\frac{1}{n} I(X_1^n; Y_1^n \mid S_0) \le \frac{1}{n} I(X_1^n; Z_1^n \mid S_0) = \frac{1}{n} H(Z_1^n \mid S_0),
$$

where the second step follows from the chain rule for mutual information and the fact that $Z_n$ is a deterministic function of the channel input. Next, we note that the IFSC must transition through all $L$ refractory states between successive ones, and thus $\{Z_n\}$ is an $(L, \infty)$ constrained binary sequence [17], [18]. Therefore, by [17, Theorem 1], it follows that

$$
\limsup_{n \to \infty} \frac{1}{n} H(Z_1^n \mid S_0) \le \max_{a \in [0,1]} \frac{H_b(a)}{1 + La}. \tag{8}
$$
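The fixed-point equation (5), the bound in (7), and the random codebook construction of Section III-B are straightforward to evaluate numerically in the noiseless case. The following sketch is one way to do so under stated assumptions: entropies are measured in bits (the text does not fix the logarithm base), (5) is solved by bisection, each codeword starts with the channel in the ground state, and the function names are hypothetical. It covers only the noiseless optimum; the noisy case in the paper relies on the GBAA, which is not implemented here.

```python
import math
import random
from typing import List


def binary_entropy(p: float) -> float:
    """H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def rate(a: float, L: int) -> float:
    """Objective H_b(a) / (1 + L a) appearing in (4) and (7)."""
    return binary_entropy(a) / (1 + L * a)


def optimal_a(L: int, tol: float = 1e-12) -> float:
    """Solve a = (1 - a)^(L + 1) on [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # a - (1 - a)^(L + 1) is increasing in a, negative at 0, positive at 1.
        if mid - (1 - mid) ** (L + 1) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_codeword(n: int, L: int, a: float, rng: random.Random) -> List[int]:
    """Draw one length-n sequence from the constrained source in (3) with r = L."""
    word, cooldown = [], 0            # assumption: previous L inputs were zero
    for _ in range(n):
        if cooldown == 0 and rng.random() < a:
            word.append(1)
            cooldown = L              # force the next L symbols to be zero
        else:
            word.append(0)
            cooldown = max(cooldown - 1, 0)
    return word


def memory_based_codebook(W: int, N: int, L: int, seed: int = 0) -> List[List[int]]:
    """W x N codebook with rows drawn i.i.d. from the noiseless optimum."""
    rng = random.Random(seed)
    a_star = optimal_a(L)
    return [sample_codeword(N, L, a_star, rng) for _ in range(W)]
```

Substituting (5) into the objective gives $H_b(a^*)/(1 + La^*) = -\log(1 - a^*)$, which offers a quick sanity check on the solver. Note that i.i.d. sampling does not guarantee distinct rows; a practical construction would likely need to reject or redraw duplicate codewords.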
In light of Proposition 1, we see that as far as the noiseless case is concerned, we can find an optimal distribution by restricting our attention to the constrained set $\widetilde{\mathcal{P}}_L$.

Proposition 2. Consider the P300 speller channel model in Figure 2, with no noise (i.e. $Y_n = Z_n$) and $L$ refractory states. Let $P^*$ be the optimal distribution in the constrained set $\widetilde{\mathcal{P}}_L$ with parameter $a^*$ defined by (5). Then, $P^*$ achieves the maximum information rate for the channel, i.e.

$$
R_L(P^*) = \max_{P \in \widetilde{\mathcal{P}}_L} R_L(P) = \max_{P \in \mathcal{P}_L} R_L(P). \tag{9}
$$

We remark that the distribution described in Proposition 2 might not be the only distribution that achieves the maximum. In [16], we showed that in a channel with $L = 1$, when the input is a first order Markov source, there are at least two input transition matrices for which the maximum information rate is achieved.

In the presence of noise, the problem of optimizing the information rate $R(P)$ over $r$-th order Markov sources $P \in \mathcal{P}_r$ can be solved numerically using the GBAA [8].

B. Simulation results

This section uses numerical simulations to compare the performance of our memory-based codebook design with the RCP, CBP, and D10 codebooks. We estimate accuracy as a function of the channel noise parameter and the number of states in the channel. Following the setup in [20], these simulations apply to the P300 speller layout shown in Figure 1. The noise is modeled using an additive white Gaussian noise (AWGN) channel with noise power $\sigma^2$.

Using the GBAA, we find the optimal input transition matrix for general $\sigma^2$. We then generate codebooks that are optimized for channel noise and memory, MBC($L$), as described in Section III.

In simulations, we select one of 36 characters uniformly as the target character. The codeword associated with that character is transmitted across the channel. The received sequence is used to estimate the target character using an optimal decoder. The accuracy is the percentage of characters that are correctly estimated over 1000 runs.

Figure 4 compares the performance of the codebooks for channels with $L = 1, 2$ refractory states, where the corresponding MBC($L$) is generated based on the number of channel states. In both cases, the MBC($L$) performs better than all the other codebooks for a given channel.

[Plots: accuracy versus noise power $\sigma^2$ for the MBC($L$), CBP, D10 and RCP codebooks; panel (a) FSC with one refractory state, panel (b) FSC with two refractory states.]
Fig. 4: Codebook performance as a function of channel noise parameter, $\sigma^2$, for the P300 speller channel with (a) $L = 1$ and (b) $L = 2$ refractory states. The MBC($L$) outperforms the other codebooks in both channel models.

Figure 5 shows the performance of MBC($L$) associated with a channel with $L = 1, 2, 3$ refractory states. The figure illustrates the decrease in the performance of MBC($L$) as $L$ increases.

[Plot "Accuracy with different refractory states": accuracy versus noise power $\sigma^2$ for $L = 1$, $L = 2$ and $L = 3$.]
Fig. 5: Performance of the MBC($L$) as $L$ increases.

These results illustrate the benefits that can be achieved when the codebooks are designed based on the process underlying the generation of refractory effects. The extent to which our model is representative of refractory effects in the P300 speller is an important direction for future work.

V. CONCLUSION

This paper develops a communication model to represent the ERP elicitation process in the P300 speller and then uses this model to design codebooks that are optimized as a function of the length of the refractory period and channel noise. Simulation results suggest that this flexible framework
for codebook design could lead to improved performance in settings where refractory effects have a significant impact on the accuracy of the P300 speller. The performance of the memory-based codebook needs to be verified with EEG data and validated with online implementation.

REFERENCES

[1] J. Wolpaw, N. Birbaumer, D. McFarland, G. Pfurtscheller, and T. Vaughan, “Brain-computer interfaces for communication and control,” Clinical Neurophysiology, vol. 113, no. 6, pp. 767–91, 2002.
[2] L. A. Farwell and E. Donchin, “Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials,” Electroencephalogr Clin Neurophysiol, vol. 70, no. 6, pp. 510–523, 1988.
[3] E. W. Sellers, T. M. Vaughan, and J. R. Wolpaw, “A brain-computer interface for long-term independent home use,” Amyotrophic Lateral Sclerosis, vol. 11, no. 5, pp. 449–455, 2010.
[4] S. Sutton, M. Braren, J. Zubin, and E. R. John, “Evoked-potential correlates of stimulus uncertainty,” Science, vol. 150, no. 3700, pp. 1187–8, 1965.
[5] J. Jin, E. W. Sellers, and X. Wang, “Targeting an efficient target-to-target interval for P300 speller brain–computer interfaces,” Medical & biological engineering & computing, vol. 50, no. 3, pp. 289–296, 2012.
[6] S. Martens, N. Hill, J. Farquhar et al., “Overlap and refractory effects in a brain computer interface speller based on the visual P300 event-related potential,” Journal of neural engineering, vol. 6, no. 2, p. 026003, 2009.
[7] A. Kavčić, “On the capacity of Markov sources over noisy channels,” in Global Telecommunications Conference, 2001. GLOBECOM ’01. IEEE, vol. 5, 2001, pp. 2997–3001.
[8] P. O. Vontobel, A. Kavčić, D. M. Arnold, and H.-A. Loeliger, “A generalization of the Blahut–Arimoto algorithm to finite-state channels,” IEEE Transactions on Information Theory, vol. 54, no. 5, pp. 1887–1918, 2008.
[9] G. Townsend, B. K. LaPallo, C. B. Boulay, D. J. Krusienski, G. E. Frye, C. K. Hauser, N. E. Schwartz, T. M. Vaughan, J. R. Wolpaw, and E. W. Sellers, “A novel P300-based brain-computer interface stimulus presentation paradigm: Moving beyond rows and columns,” Clin Neurophysiol, vol. 121, no. 7, pp. 1109–20, 2010.
[10] J. Hill, J. Farquhar, S. Martens, F. Biessmann, and B. Schölkopf, “Effects of stimulus type and of error-correcting code design on BCI speller performance,” in Advances in Neural Information Processing Systems 21. Curran Associates, Inc., 2009, pp. 665–672.
[11] J. Geuze, J. D. Farquhar, and P. Desain, “Dense codes at high speeds: Varying stimulus properties to improve visual speller performance,” Journal of neural engineering, vol. 9, no. 1, p. 016009, 2012.
[12] R. Ma, N. Aghasadeghi, J. Jarzebowski, T. Bretl, and T. P. Coleman, “A stochastic control approach to optimally designing hierarchical flash sets in P300 communication prostheses,” Neural Systems and Rehabilitation Engineering, IEEE Transactions on, vol. 20, no. 1, pp. 102–112, 2012.
[13] G. Cuntai, M. Thulasidas, and W. Jiankang, “High performance P300 speller for brain-computer interface,” in Biomedical Circuits and Systems, Dec 2004.
[14] C. Omar, A. Akce, M. Johnson, T. Bretl, R. Ma, E. Maclin, M. McCormick, and T. Coleman, “A feedback information-theoretic approach to the design of brain-computer interfaces,” International Journal of Human-computer Interaction, vol. 27, pp. 5–23, 2011.
[15] R. G. Gallager, Information Theory and Reliable Communication. New York, NY, USA: John Wiley & Sons, Inc., 1968.
[16] V. Mayya, B. Mainsah, and G. Reeves, “Modeling the P300-based brain-computer interface as a channel with memory,” September 2016, unpublished conference paper at: 54th Annual Allerton Conference on Communication, Control, and Computing, Allerton.
[17] E. Zehavi and J. K. Wolf, “On runlength codes,” IEEE Transactions on Information Theory, vol. 34, no. 1, pp. 45–54, January 1988.
[18] K. A. S. Immink, P. H. Siegel, and J. K. Wolf, “Codes for digital recorders,” IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2260–2299, October 1998.
[19] H. Permuter, P. Cuff, B. Van Roy, and T. Weissman, “Capacity of the trapdoor channel with feedback,” IEEE Transactions on Information Theory, vol. 54, no. 7, 2008.
[20] B. O. Mainsah, L. M. Collins, and C. S. Throckmorton, “Using the detectability index to predict p300 speller performance,” Journal of Neural Engineering, vol. 13, no. 6, p. 066007, 2016.