Table Of ContentLinköping University Post Print
Message classification as a basis for studying
command and control communication: an
evaluation of machine learning approaches
Ola Leifler and Henrik Eriksson
N.B.: When citing this work, cite the original article.
The original publication is available at www.springerlink.com:
Ola Leifler and Henrik Eriksson, Message classification as a basis for studying command and
control communication: an evaluation of machine learning approaches, 2011, Journal of
Intelligent Information Systems.
http://dx.doi.org/10.1007/s10844-011-0156-5
Copyright: Springer Science Business Media
http://www.springerlink.com/
Postprint available at: Linköping University Electronic Press
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-67227
JournalofIntelligentInformationSystemsmanuscriptNo.
(willbeinsertedbytheeditor)
Message Classification as a basis for studying command and
control communications - An evaluation of machine learning
approaches
OlaLeifler (cid:1) HenrikEriksson
thedateofreceiptandacceptanceshouldbeinsertedlater
Abstract In military command and control, success relies on being able to perform key
functionssuchascommunicatingintent.Moststafffunctionsarecarriedoutusingstandard
means of text communication. Exactly how members of staff perform their duties, who
they communicate with and how, and how they could perform better, is an area of active
research.Incommandandcontrolresearch,thereisnotyetasinglemodelwhichexplains
all actions undertaken by members of staff well enough to prescribe a set of procedures
for how to perform functions in command and control. In this context, we have studied
whether automated classification approaches can be applied to textual communication to
assistresearcherswhostudycommandteamsandanalyzetheiractions.
Specifically,wereporttheresultsfromevaluatingmachineleaningwithrespecttotwo
metrics of classification performance: (1) the precision of finding a known transition be-
tweentwoactivitiesinaworkprocess,and(2)theprecisionofclassifyingmessagessimi-
larlytohumanresearchersthatsearchforcriticalepisodesinaworkflow.
Theresultsindicatethatclassificationbasedontextonlyprovideshigherprecisionre-
sultswithrespecttobothmetricswhencomparedtoothermachinelearningapproaches,and
thattheprecisionofclassifyingmessagesusingtext-basedclassificationinalreadyclassi-
fieddatasetswasapproximately50%.Wepresenttheimplicationsthattheseresultshavefor
thedesignofsupportsystemsbasedonmachinelearning,andoutlinehowtopracticallyuse
textclassificationforanalyzingteamcommunicationsbydemonstratingaspecificprototype
supporttoolforworkflowanalysis.
Keywords Command and control; classification; exploratory sequential data analysis;
workflowmining;randomindexing;textclustering
O.Leifler(cid:1)H.Eriksson
Dept.ofComputerandInformationScience
LinköpingUniversity
SE-58183Linköping,Sweden
Tel.+4613281000
E-mail:{ola.leifler,henrik.eriksson}@liu.se
2
1 Introduction
Althoughsuccessfulcommandandcontrolisessentialtothesuccessofcrisismanagement
andmilitaryoperations,ourunderstandingofhowcommandandcontrolisperformedisstill
limited (Brehmer 2007). Studying command teams present commanders and researchers
with great challenges. First, commanders need to accomodate shifting circumstances and
uncertaininformationabouttheenvironmentintheirworkprocesswhichmakesthework
processinherentlydynamic(Kleinetal.1993).Second,theuseofelectroniccommunica-
tionsandnewmediaincommandteamsyieldslargeamountsofdata(e.g.textcommunica-
tions,audio,video,computerlogs)thataredifficultforresearcherstoprocess.
An important aspect of analyzing command and control is to find critical episodes in
theworkflowthatwarrantfurtherstudy.Currently,mostanalysesofelectroniccommunica-
tioninbothsituatedanddistributedteamworkareconductedmanuallythroughtheuseof
classificationschemes(Silverman2006).Aconsequenceofthesignificanteffortrequiredby
manuallyclassifyingcommunicationsisthatonlyapartofteams’communicationpatterns
canbeexplored.Theprospectofusingautomaticsupportforfindingrelationsincommand
andcontrolcommunicationsisthereforeappealing.
Thispaperpresentsanevaluationofautomatedapproachesforclassifyingtextmessages
intheworkflowsofcommandandcontrolteamsbycomparingaselectionofclassifierswith
respecttotheirprecisionofclassifyingmessagessimilarlytohumanexperts.Ourselection
ofclassificationapproachestocomparewasjustifiedbytherequirementsofawidelyused
methodforstudyingcommandandcontrol,ExploratorySequentialDataAnalysis(ESDA)
(SandersonandFisher1994).Theresultsfromourevaluationaretwofold:first,weidentifya
classificationapproachwhichissuitableforuseinanESDAapplication,andsecond,based
ontheprecisionresultsattained,weoutlinehowtheclassificationapproachcouldbeused
tosupportthestudyofcommandandcontrolworkflows.
In the following sections, we describe command and control research and the ratio-
naleforinvestigatingmachinelearningapproachesforsupportingitinSection2.Section3
presentsresearchonextractingpatternsrelatedtoworkflowsandrelatedconceptsfromtexts.
InSection4wepresentthespecificdatasetswehaveappliedourclassificationapproaches
on.WepresentourclassificationapproachesinSection5andtheresultsofclassifyingmes-
sages in Section 6. Based on these results, we discuss their implications on the design of
support tools for analyzing command and control communications and present an imple-
mentation that uses automatic classification of text messages in Section 7, and Section 8
concludesthispaper.
2 Background
Commandandcontrolresearchersinvestigatehowgroupsandgroupmembersperformtheir
tasks,identifyperformancemeasuresforthegroupandstudyhowtheycouldimprovetheir
performance (Brehmer 2007). There are several frameworks for understanding teams and
teamwork(e.g.(Argyle1972;Salasetal.2008)).Acommonrepresentationofteamwork-
flows is to use graphs, where nodes represent tasks and arcs denote transitions between
tasks.Suchgraph-basedworkflowmodelshavebeensuggestedfortheanalysisandsupport
thecoordinationofworkinvariousprofessionalsettings(Medina-Moraetal.1992;vander
AalstandvanHee2002).
One example of a workflow model that aims to describe how members of command
teams perform their tasks is the Dynamic Observe-Orient-Decide-Act model in Figure 1.
3
Fig.1 TheDynamicObserve-Orient-Decide-ActloopbyBrehmerasanabstractmodelofaworkflowin
commandandcontrolwithtasksandtransitionsbetweenthem(Brehmer2005).
DOODAdescribesasetoftaskswithtransitionsfromonetasktoanother.Thesetaskscan
beoverlappingoriterating,suchasthetasksofsensemaking(Weick1995)anddatacollec-
tioninDOODA.Atonepoint,however,thereisatransitionfromsensemakingtoplanning,
when the commander’s intent is formulated and communicated to subordinate units. Irre-
spective of whether this model accurately describes command and control at a sufficient
levelofdetailforcorrelatingtheactivitiesinthemodeltotheobservableactivitiesinacom-
mandstaff,themodelcouldbeusedasahypothesisforanalyzingstaffwork.Ifwebelieve,
accordingtoamodelsuchasDOODA,thatthestaffshouldbeginwithdatacollection,and
weknowthatmessagesofcertaintypesdenoteatransitiontothesensemakingstepinthe
DOODAprocess,thentheabsenceorpresenceofsuchtypesofmessageswouldbepartof
aresearcher’sworkofestablishingperformancemeasuresforacommandstaff.
Ingeneral,wecaninterpretthetaskofunderstandingcommandandcontrolasthreesep-
aratetasks.First,understandinghowcommandteamsandteammembersperformtheirtasks
meansconstructingageneralworkflowmodelsuchasDOODAfromcommandandcontrol
scenarios.Second,establishingdirectperformancemeasuresissynonymoustorelatingthe
workflowmodeltotheestimatedoutcomeofscenariosasdefinedbyindirectmeasurements
ofscenariooutcome(forexample,performancescoresincomputersimulations(Johansson
etal.2003)orevaluationsbyhumanexpertsofteamperformanceinrole-playingexercises
(Jensen2009)).Third,improvingperformanceisequalto,ineachparticularscenario,using
thoseperformancemeasurestorelatestaffactionstotheproposedworkflow.Theprocessby
whichresearchersestablishaworkflowandrelatestaffactionstoitfromrecordedscenario
dataisbasedontwoprincipalactivities:(1)labelingcommunicationactswithacategoriza-
tionscheme(Thorstenssonetal.2001),and(2)lookingforhigherlevelpatternsofepisodes
(tasks)withthelabelledcommunicationactstofocusthesearchforcriticalpointsthathave
affectedtheoutcomeofthescenario(SandersonandFisher1994;Albinssonetal.2004).
The work of labeling messages according to message categories is the most time-
consumingstep,withvastquantitiesofcommunicationdatatosiftthroughiteratively,first
searchingforcommonalitiesthatcanleadtoclassificationschemes,andlaterbyapplying
classificationschemestoallutterancesandreducingtheamountofdatatoasetofepisodes
basedontheclassification.Thisisalsotheactivityforwhichweevaluatetheuseofmachine
learningtechniques.
3 Relatedwork
Theproblemofinferringactivitiesfromtext-basedcommunicationshasbeenstudiedpre-
viouslybyKushmerickandLau(KushmerickandLau2005).Theirapproachwasbasedon
searchingforspecificsyntacticpatternsoriginatingfromtheuseofcomputersoftware(e-
4
commercesystems).Thosepatternswereinturnusedtoorganizemessagesintoworkflows.
Patternsoriginatingfromtheuseofcomputersystemshasalsobeenstudiedbythework-
flowmanagementcommunity(vanderAalstetal.2003)whereworkflowshavebeenelicited
frominteractionswithworkflowmanagementsystemsorothersoftwaresystems.Boththese
approachesconcerntheminingofmachine-generatedpatterns,notpatternsoriginatingfrom
humanactivities.
Regardingtherecognitionofhumanactivitiesfromtext,Scerrietal.(2008)havepro-
posed a model for human workflow management in a semantic desktop environment that
relies on the detection or tagging of speech acts in e-mail. Their approach is based on
SpeechActrecognitionperformedbyaspeechactextractionwebservicewhichusesgram-
marpatternsfordetectingspeechacts.Theirstatedapplicationistosupportindividualsby
monitoringunresolvedissuesine-mailconversationssuchasunansweredquestions.Other
researchershavedescribedanapproachtoworkflowminingfromunstructureddatawhich
reliesontheexistenceofafixed,knownnumberofactivitytypesornamedentitiesinmes-
sages for determining which activity a message pertains to (Wen et al. 2009; Geng et al.
2009). Mainly, however, the problem of extracting patterns from e-mail has been studied
for the purpose of filtering spam (Sahami et al. 1998) which is essentially equivalent to
consideringwhetheramessageisatallrelatedtoanykindofactivitytheuserisengagedin.
Severalprojectshaveattemptedtoelicitpatternsofadomain-specificdiscourse,mainly
from questions and responses sent between customers and company support lines for the
purpose of helping customer support identify previous,relevant answers to newquestions
(e.g.(LarssonandJönsson2009;Chalamallaetal.2008)).
Indocumentmanagement,researchershavestudiedapproachestorelatespecificdomain
knowledgeintheformofconcepts,objectsandrelationstotextualdocuments(McDowell
andCafarella2006;Eriksson2007)andbasedonsuchsemanticdocuments,someprojects
havestudiedhowtocreatesupportforinformationmanagementinteamworkflowsbyusing
domain-specificdocumentfeatures(Franzetal.2007;LeiflerandEriksson2009).
4 Material
Weusedthreedatasetstoestablishhowwellmachinelearningapproacheswouldclassify
messagescomparedtohumanclassification.Ourdatasetscamefromthreecommandand
controlscenarios(LabeledALFA-05,C3Fire-05andLKSfromtheprojectstheyoriginate
from) in which crisis management teams had used free text-based means of communica-
tionforcoordinatingtheirwork(fendingoffforestfiresinALFA-05andC3Fire-05,and
defendingagainstinformationwarfareinLKS).Inallsettings,theparticipantsengagedin
activitiestheywerelikelytoencounterintheirprofessionandthesettingsusedhadauthentic
chainsofcommandandscenariodescriptions.Thetasksineachscenariowereconductedas
simulatedexerciseswheretheparticipantscollaboratedinteamstosolveatask.Theirper-
formancehadbeenassessedbythestaffleadingtheexercises,whichinallcasesconsisted
ofresearchersstudyingteamperformances.
In Table 1 we list the attributes made available to the classifiers we studied. Some of
theattributeswerederivedfromotherattributesandreflectedwhatwebelievedwasrelevant
for human classification of the messages in each dataset. The Message direction attribute
iscalculatedusingthealgorithminFigure2,whichimplementsthecompareTomethod
availableinJavaandotherprogramminglanguages.
5
Table1 Non-textattributesusedformessageclassification.
Attribute Description ALFA-05 C3Fire-05 LKS
Sender Text (cid:2) (cid:2) (cid:2)
Recipient Text (cid:2) (cid:2) (cid:2)
Senderlevel1 f0:::4g, low values represent (cid:2) (cid:2)
highrankintheorganization
Recipientlevel1 f0:::4g, low values represent (cid:2) (cid:2)
highrankintheorganization
Questionmarkspresent1 ftrue;falseg (cid:2) (cid:2)
Messagedirection1 f(cid:0)1;0;1g, calculated as the (cid:2)
normalized difference between
the sender level and recipient
level
Messagetext Text (cid:2) (cid:2) (cid:2)
Messagetime Date (cid:2) (cid:2) (cid:2)
Messagetype Nominaldecisionattribute (cid:2) (cid:2) (cid:2)
def get_message_direction(instance):
direction = instance.sender_level - instance.recipient_level
if direction > 0:
return 1
elif direction < 0:
return -1
else:
return 0
Fig.2 AlgorithmforcalculatingtheMessagedirectiondatasetattribute.
4.1 ALFA-05
TheALFA-05datasetconsistedof849textmessagesexchangedbetweensevencomman-
dersinasimulatedcrisisresponsescenario(Trnkaetal.2006).Duringthescenario,com-
manders operated at three levels of command, in two administrative areas (approximately
county-sizedareas)andplayedarole-playingsimulationexercise(TrnkaandJenvald2006)
inwhichtherewasinitiallyaforestfirebutsubsequentlyalsoanevacuationfromazooas
wellasasearchandrescueoperation.Participantscommunicatedwithoneanotherthrough
atext-basedmessagingsystemdesignedforuseinmicro-worldsimulationswiththeC3Fire
simulationenvironment(Johanssonetal.2003),althoughitsharedthebasicfeaturesofan
e-mailmessagingsystemwithouttheuseofsubjectlinesorotherauxiliarye-mailheaders.
Thescenariowasplayedoverthecourseofoneday.
Eachmessagehadbeenassignedoneof19differentclassesbyhand.Theseclassesfall
intofourspeech-act-relatedcategoriestiedtothefunctionsofcommandandcontrol(Trnka
etal.2006).Thefourcategorieswerequestions,information,commandsandothermessages
(theMessagetypeinTable1).WhenresearchershadlookedforpatternsintheALFA-05
dataset,theyhadstudiedboththegeneralproportionsofmessagesofeachclasssenttoand
fromtheparticipantsinthescenario,buttheyhadalsostudiedspecificsequencesofspeech
acts, such whether as a set of information and question-labelled message exchanges had
precededacommand.
6
4.2 C3Fire-05
TheC3Fire-05datasetwassimilartoALFA-05withregardtothescenarioplayedandthe
categorizationused.Itconsistedof619messages.Oneofthemaindifferenceswasthatit
wascategorizedbytwoindependentresearcherswitha77.86%agreementbetweenthetwo
on which category to assign each message (the agreement was 87.02% when considering
onlythefourmaincategoriesdescribedinthesectionabove).Onlythosemessageswhich
hadbeenclassifiedsimilarlybythetworesearcherswereselectedforclassifiercomparison.
The other main difference compared to ALFA -05 messages was the participants of the
study,whoweredomainexpertsintheALFA-05scenarioandstudentsintheC3Fire-05
scenario.
4.3 LKS
The LKS dataset consisted primarily of 113 e-mail messages exchanged during a training
exercise concerning information warfare at the Swedish Defense Research Institute. All
participantswereexpertsinthedomainandtheexerciseservedthedualpurposebeingof
exercise for them as well as a study of performance indicators in command and control.
The scenario was role-played over the course of two days and the participants received
instructionsfromtheirhighercommandtoengageinintelligenceoperationsforthefirstday
tofindinformationabout,locate,andmonitorpotentialterrorists,andrepelthreatsduring
anevacuationofaVIPduringtheseconddayoftheiroperation.
Due to these instructions, we categorized the e-mail exchanges pertaining to the first
dayasintelligenceandthosefromthesecondasevacuation,whichwasconsistentwiththe
expectedoutcomeoftheexercise.Themanualclassificationsofbothdatasetswereusedas
validationoftheautomaticclassificationapproacheswereportinthispaper.
5 Method
Toverifythattheinformationinmessagescouldbeusedfordistinguishingcontextuallysig-
nificantclassesofmessagesfromoneanother1 consistentwithhowcommandandcontrol
researcherswouldclassifymessages,weaddedmeta-datatoourdatasetsthatwebelievedto
berelevanttoclassification.Withthesedatasets,weconductedacomparisonbetweensev-
eralclassificationapproachesbyusingstandardmethodsforevaluatingMachineLearning
algorithms.
Messagesinamilitarycommandandcontrolworkflowusuallycontaindomain-specific
attributessuchastherankandroleofparticipants.Also,researchersmayclassifyaccording
to the appearance of question marks and the grammatical structure of messages. To un-
derstand how these attributes affect automated classification, we compared the impact on
classificationresultsofencodingtheseattributesaspartofthemessageinstances.Theap-
pearanceofquestionmarksbecameabinaryattributeavailabletonon-textclassifierswhile
thegrammaticalstructurewasmadeavailabletoaStringSubsequenceKernel-basedclas-
sifier (see Section 5.2). We also evaluated the relative significance of non-text attributes
inrelationtothetextbyusingacombinedclassifierthatwoulduseatext-basedclassifier
and a non-text classifier in combination for classification. The combined classifier would
1 suchasidentifyingthetwotasksintheLKSdatasetorthemessageclassesrelatedtospeechactsinthe
ALFA-05andC3Fire-05datasets
7
Table2 FrequencyofmessagesineachofthefourmessagecategoriesoftheALFA-05dataset.
Category Proportion
Questions 23%
Information 39%
Orders 17%
Othermessages 24%
alsoprovideinformationontherelativecontributionsofanon-textclassifiercomparedtoa
text-basedone.
Apartfromdomain-specificmessageattributeswhicharelikelytoinfluencehumanclas-
sificationsofmessages,weconsideredtheinfluenceofanumericalattributewithstatistically
significantdifferencesofattributevaluesacrossthecategoriesofmessages:messagelength.
Toestablishwhetherasignificantdifferenceinmessagelengthswouldbeusedbyaclassi-
fierwhenbuildingaclassifiermodel,westudiedwhetherastandarddiscretizationapproach
(FayyadandIrani1992)(asrequiredbytheclassifiersweevaluated)wouldgeneratemean-
ingfulnominalintervalvaluesandifso,whatprecisionresultstheclassifierswouldattain.
We also considered the precision of a random classifier and used that as a baseline
for comparing the results of using our selected classification algorithms. If our classifier
wouldnotfindameaningfuldistancemeasureforthepurposeofclassifyingwithrespectto
messagecategories(intheALFA-05dataset)orbelongingtodifferentstagesinthescenario
workflow(intheLKSdataset),theclassifierwouldbasicallychooseaclassatrandom.The
precisionitcouldattainforeachdecisionclasscouldthenbedescribedasafunctionofthe
proportionofinstancesofeachdecisionclassinthetrainingdata.
Theprecisionofthealgorithmisexpressedasthenumberoftimesthealgorithmanswers
correctly,dividedbythetotalnumberofquestionsasked.Thus,itisthesumofthenumber
ofcorrectclassificationswithrespecttoeachoftheclasses.Acompletelyrandomclassifier,
givenadatasetU andafunctiondformappingmessagestothedomainofdecisionclasses
fc ;c ; :::;c gwherethesizesofeachclassisjc j = jfx 2 U : d(x) = c gjwouldattain
1 2 l i i
precisionofEquation1.
( )
jc j 2
(cid:6)l i (1)
i=1 jUj
The LKS dataset consisted of two classes, evenly distributed with 61 messages from
day one and 52 from day 2. The random precision would be (61=113)2 +(52=113)2 =
0:5032,closeto50%.GiventhedistributionofdecisionclassesinTable2,randomprecision
attainableintheALFA-05datasetwas0:232+0:392+0:172+0:242 =0:29.Classification
resultsofapproximately29%inALFA-05wouldthereforebeattributedtothedistribution
ofmessagesandnottothemessagecontents.FortheC3Fire-05dataset,thedistributionsof
classeswasmoreevenforbothsetsofclassificationsfromthetworesearchers,resultingin
randomprecisionof24.04%and25.07%respectively.
When evaluating the different approaches to classify messages, we used a stratified
cross-validation (Witten and Frank 2005) on each dataset. To accomodate the execution
times of text-based classification, we decided to use a 3-times 3-fold stratified cross-
validation on our datasets for evaluation. The results were stable when confirmed with a
train-and-testprocedureoneachdataset.
8
Table3 Messagelengthsinallcategories
Information Commands Questions Other
Mean 110(cid:6)78:94 76(cid:6)93:18 90(cid:6)68:56 55(cid:6)60:18
Median 92 58 68 32
5.1 Messagelengths
ThemessagesfromthefourmaincategoriesoftheALFA-05datasetwerecomparedwith
oneanotherwithrespecttothelengthsofthemessagesineachcategory.Sincethediffer-
entmessagecategoriescontainedadifferentnumberofmessagesandthemessagelengths
couldnotbeassumedtobenormallydistributed,wecomparedthedifferenceswithanon-
parametricMann-WhitneyU-test.Allcategoriesofmessageswerecomparedtooneanother
pairwise. All pairs of categories displayed significant differences in message lengths (p <
0.002)andthemeanandmedianvaluesdifferedasoutlinedinTable3,alongwithstandard
deviationsfromthemeans.
5.2 Classifierselection
Theclassificationschemesweusedforbothtextclassificationandnon-textclassificationon
ourdatasetswereselectedbasedontwoprimarycriteria:
1. themodelsbuiltaspartoflearningpatternsindatashouldbeaccessibletohumanin-
spection,and
2. theyshouldbecomputationallytractableforinteractiveuseinbothscenarios.
The first criterion, accessibility, was considered important because of the prospect of
usingthe resulting classifier model as abasis for a support tool for commandand control
researchers.InESDAanalysis,explorationmeansusingvariousdatasourcesincombination
todetectpatternsofteamactivity.Foracomputer-basedsupporttoolinthisprocess,estab-
lishing trust is critical, and understanding the basis for making classifications could even
be more important than high precision for classification, depending on the role of a clas-
sifier. The second criterion, computational tractability for interactive use, was considered
importantforthepracticaluseofautomaticclassification.Indataexplorationtoolssuchas
MacSHAPA(Sandersonetal.1994)andMIND(Thorstenssonetal.2001),researchersnav-
igatescenariodatalookingforcriticalepisodesbyscanningatimelineaccordingtowhich
allscenariodataisloggedtofindincidentsthatareimportantforfurtherstudy.Whenusing
suchtools,researchersexpectinteractionwithdatatobesmoothandallowfastmanipula-
tionsduetothelabor-intensivetaskoffindingcriticalepisodes.Foranautomaticclassifier
tocontributeinsuchexploration,itwouldhavetobuildaclassifiermodelfastenoughnot
tointerruptthecloserstudyofdata.
Basedonthesecriteria,weselectedatext-basedclassificationschemethatwouldcon-
nectimportanttermsaswellastherelationshipsbetweentermsduringtheprocessofclas-
sification,withtheintentionofusingthosetermsaspartofaworkflowanalysistool.Also,
it would have to handle the datasets we had with little computational overhead. Based on
these criteria, we decided to use the Random Indexing (RI) (Kanerva et al. 2000) vector
space model as the primary method of text classification. RI assigns random vectors of a
fixeddimensionalitytowordsandtextstocreatethevectormodelformeasuringsimilarity
betweentexts(Kanervaetal.2000).PriortobuildingtheRImodel,wefilteredthemessages
9
Questionmark present = true: 1
Questionmark present = false
| Sender = RL Nkpng: 2
| Sender = LKC E-ln
| | Recipient_level <= 3
| | | Time <= 1133433043000
| | | | Time <= 1133432780000: 2
| | | | Time > 1133432780000: 3
| | | Time > 1133433043000
| | | | Recipient = Ambu E-ln: 2
| | | | Recipient = Patruller E-ln: 2
| | | | Recipient = RL Nkpng: 2
| | | | Recipient = LKC D-ln
| | | | | Time <= 1133436580000: 4
| | | | | Time > 1133436580000: 2
Fig.3 PartofthedecisiontreegeneratedbytheJ48classifierontheALFA-05dataset
sothatcommonlyused,domain-independentwords(stopwords)wouldnottaintourresults.
InadditiontotheRI-basedtextclassificationmethod,aStringSubsequenceKernelwasalso
usedforanalyzingthegrammaticalstructureofmessages(seeSection5.4)andforcompar-
isonoftheRItextclassificationresults.Fornon-textclassification,weusedfourdifferent
classifiers,representingfourclassesofinferencemechanisms:
1. J48,aclassifierbasedondecision-trees(Quinlan1993)
2. aDecisionTableclassifier(Kohavi1995)
3. PART,arule-basedclassifier(FrankandWitten1998)
4. aNaveBayesclassifier(JohnandLangley1995)
The first three classifiers were selected based on the accessibility of the models they
construct, and the fourth, the Bayesian classifier, was selected due to previously reported
resultsonclassifyingmessageswithrespecttoworkflow-relatedactivitytypes(Gengetal.
2009)withaBayesianclassifier.WeconductedtheevaluationwithintheWEKAknowledge
analysisframework(Halletal.2009),withinwhichwealsoimplementedanRI-basedtext
classifier.
In deserves to be noted that, due to the relatively small datasets available for training
the classifiers, we assumed that there would be differences in precision which could be
attributedtoclassifierselection,apartfromthedifferencesinwhatmodelstheybuild.For
largerdatasets,ithasbeenarguedthatclassifierselectionmaybecomelessimportant(Banko
andBrill2001),whichiswhytheissueofevaluatingclassifiersmaymakemoresensewith
smallerdatasets.
Whenstudyinghowaccessiblethemodelsproducedbytheclassifierswere,wetriedto
elicit the heuristics that the classifier models expressed in order to establish whether they
were sound compared to how human experts would reason. The decision tree in Figure 3
shows,inASCIIformat,anumberofdecisionbrancheswheretheshortestbranchindicates
thatthepresenceofaquestionmarkshouldclassifythemessageasbeingaquestion(Mes-
sage type 1 in the ALFA -05 dataset). Then, there are a number of conditions in the tree
which correspond to combinations of sender, the organizational level of the recipient and
time, which can be explained by the change in interactions between the command center
and the field units during the scenario. Early in the scenario, most exchanges concerned
informationexchanges(Messagetype2),whereaslaterexchanges,initiatedbyhighercom-
mand,concernedorders(Messagetype3).
Description:Although successful command and control is essential to the success of crisis management and military operations, our understanding of how command and