
2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)
978-1-6654-9598-1/22/$31.00 ©2022 IEEE | DOI: 10.1109/ICSE-Companion55297.2022.9793778

Is GitHub Copilot a Substitute for Human Pair-programming? An Empirical Study

Saki Imai
[email protected]
Colby College
Waterville, Maine, USA

ABSTRACT

This empirical study investigates the effectiveness of pair programming with GitHub Copilot in comparison to human pair-programming. Through an experiment with 21 participants, we focus on code productivity and code quality. In the experimental design, each participant was given a project to code under three conditions presented in a randomized order. The conditions are pair-programming with Copilot, human pair-programming as a driver, and human pair-programming as a navigator. The code generated in the three trials was analyzed to determine how many lines of code on average were added in each condition and how many lines of code on average were removed in the subsequent stage. The former measures the productivity of each condition, while the latter measures the quality of the produced code. The results suggest that although Copilot increases productivity as measured by lines of code added, the quality of the code produced is inferior, as more lines of code are deleted in the subsequent trial.

CCS CONCEPTS

• Software and its engineering → Development frameworks and environments; • Human-centered computing → Collaborative and social computing systems and tools.

KEYWORDS

GitHub, Copilot, Software Development, AI

ACM Reference Format:
Saki Imai. 2022. Is GitHub Copilot a Substitute for Human Pair-programming? An Empirical Study. In 44th International Conference on Software Engineering Companion (ICSE '22 Companion), May 21–29, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3510454.3522684

1 RESEARCH PROBLEM AND MOTIVATION

GitHub Copilot is a software development tool that generates lines of code, code chunks, or even entire programs based on existing code and comments [10]. Copilot is marketed as a substitute for pair-programming, a software development practice in which two programmers collaboratively write a single piece of code.

If GitHub Copilot could produce advantages equivalent to those found in human pair-programming, then adopting the practice of pair-programming with GitHub Copilot would lead to more productive and higher-quality software development without incurring the additional cost of a second programmer. While Vinit Shahdeo, a software engineer at Postman, said Copilot is “going to increase developer’s efficiency by reducing development time and suggesting better alternatives”, technical blogger Ray Villalobos states that it is hard to get a useful result and that he needs to retype comments to get a productive piece of code [2]. Although there are claims that these AI tools make software development more productive and that they could even substitute for human pair-programmers, we have not seen an empirical study that verifies whether AI tools in software development are more productive and yield higher-quality code. In this paper, we focus on the issue of productivity and code quality when using GitHub Copilot in software development. We designed a dedicated empirical experiment to compare AI with human participants in a natural software development environment. Through code analysis, we aim to answer our two central research questions, which focus on measuring productivity and code quality with GitHub Copilot.

2 BACKGROUND AND RELATED WORK

We recognize two major themes in previous work in this field. The first is the use of AI in software development. Many studies have shown that AI assists with software development. For instance, one study of a transformer-based model reported accuracy of up to 69% in predicting tokens when code tokens were masked [4]. Another study using large language models reported that AI could repair 100% of handcrafted security bugs in addition to 58% of historical bugs in open-source projects [8]. Moreover, a trained GPT language model has been shown to solve 70.2% of problems with 100 samples per problem [3], and is also capable of repairing bugs in code [9]. One study predicted defects with 87% accuracy, decreased inspection effort by 72%, and reduced post-release defects by 44% [11].

The second theme focuses on the study of software development environments, where empirical experimentation on how people write code gives us insights into how to enhance these tools and possibly discover best practices for software development [6]. There have been studies on how professional developers comprehend software, aimed at understanding how software development should be done, such as how programmers refactor while validating the programmers [7], and how an implementation of task context for the Eclipse development environment improved the productivity of programmers [5]. We recognize that the use of AI tools in software development has not been studied empirically with a dedicated experiment.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ICSE '22 Companion, May 21–29, 2022, Pittsburgh, PA, USA
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-6654-9598-1/22/05...$15.00
https://doi.org/10.1145/3510454.3522684

Authorized licensed use limited to: UNIVERSIDADE DE SAO PAULO. Downloaded on July 01, 2022 at 01:08:33 UTC from IEEE Xplore. Restrictions apply.
3 APPROACH AND UNIQUENESS

In this research, we aim to study GitHub Copilot empirically in a natural software development environment (VSCode IDE). Hence, the research questions to be addressed in this study are as follows: (RQ1) Is there an advantage in productivity while using GitHub Copilot as compared to a human pair programmer? (RQ2) What is the quality of code written with Copilot in comparison to human pair programmers?

In pair programming, two programmers collaboratively work on the same code (typically on the same computer). Each programmer periodically switches between two roles, driver and navigator. The driver controls the mouse and keyboard and writes code, while the navigator observes the driver's work and thinks critically about defects, structural issues, and alternative solutions while looking at the larger picture [1].

Using GitHub Copilot as a second programmer, we compare the code produced when a participant is pair programming with a human programmer versus with Copilot. Twenty-one participants who had taken at least one programming course worked on developing a text-based Minesweeper game in Python. None of the participants had implemented this game before, and the participants familiarized themselves with the rules by playing the game prior to the development task. The development task was done under three conditions: pair programming with Copilot, pair programming with a human experimenter as a driver, and pair programming with a human experimenter as a navigator. The time allocated was 20 minutes for Copilot, 10 minutes as a driver, and 10 minutes as a navigator (20 minutes total with a human pair). The order of these conditions was randomized to prevent the experiment effect. During the experiment, eye movement was recorded to measure the difference between having Copilot as a collaborator and having a human programmer as a collaborator.

The produced code was analyzed using the ndiff function from difflib¹. This function was used to compare the number of lines added to the code and the number of lines deleted from the code after each trial, normalized by the duration of the trial.

4 RESULTS

To answer our research questions, code productivity in RQ1 is assessed by comparing the number of lines added to the code, and code quality in RQ2 is analyzed by comparing the number of lines deleted in the subsequent trial. Deletion is taken as an indication of low-quality code.

The result for RQ1 is shown in Figure 1, where we can see that the Copilot condition produced the highest maximum and mean additions of lines of code. The maximum number of lines written in the trial with Copilot was 43, while the maximums for code written as a driver and as a navigator were 27 and 33, respectively. The minimum number of lines of code added was 9.5 for Copilot and 6 for both driver and navigator. These results suggest higher productivity during pair-programming with Copilot than with human pair-programmers.

To answer RQ2, we counted the number of deleted lines in the following trial and normalized the count by the trial duration. For this, the line counts for the last condition were excluded, since there was no subsequent trial in which low-quality code could be removed. The maximum number of lines of code deleted after the Copilot trial was 42, while the numbers deleted after the driver and navigator trials were lower, at 31 and 10, respectively. Figure 2 also shows that the deleted line count in the following trial was higher for Copilot than for the other two conditions. Hence, our result suggests that the code generated with Copilot has, on average, lower quality than that produced by human pair-programmers.

Figure 2: Number of lines of code deleted in a trial subsequent to three different conditions.
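As an illustration of the line-count analysis described in Section 3, the following minimal sketch counts added and deleted lines between two snapshots of a participant's code using difflib.ndiff, then normalizes a count by trial duration. It is not the study's actual analysis script: the function names count_changes and per_minute, the toy snapshot strings, and the choice to express the rate per minute are assumptions made for this example.

```python
import difflib


def count_changes(before: str, after: str) -> tuple[int, int]:
    """Return (added, deleted) line counts between two code snapshots.

    ndiff prefixes each output line with a two-character code:
    "+ " for lines only in `after`, "- " for lines only in `before`,
    "  " for common lines, and "? " for intraline hints (ignored here).
    """
    added = deleted = 0
    for line in difflib.ndiff(before.splitlines(), after.splitlines()):
        if line.startswith("+ "):
            added += 1
        elif line.startswith("- "):
            deleted += 1
    return added, deleted


def per_minute(count: int, minutes: float) -> float:
    """Normalize a line count by trial duration (assumed unit: minutes)."""
    return count / minutes


# Hypothetical snapshots taken before and after one trial.
snapshot_before = "import random\nboard = []\nprint(board)"
snapshot_after = "import random\nprint(board)\nrows = 5\ncols = 5"

added, deleted = count_changes(snapshot_before, snapshot_after)
print(added, deleted)
```

Under this sketch, the deleted count for a condition would be computed on the snapshots surrounding the trial that follows it, which is why the last condition in each session has no deletion count.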
Figure 1: Number of lines added to a code under three different conditions.

5 CONTRIBUTIONS

Our results suggest that although programming with Copilot helps generate more lines of code than human pair-programming in the same period of time, the quality of the code generated by Copilot appears to be lower. This result seems to suggest that pair-programming with Copilot does not match the profile of human pair-programming.

We are still in the process of collecting experiment data and analyzing the eye-tracking data that have been recorded throughout the experiment. With the eye-tracking data, we are trying to compare how programmers inspect code generated by AI with how they inspect code written by a human pair-programmer. Our hypothesis is that overconfidence in AI tools leads to less inspection of code generated by Copilot.

¹ https://docs.python.org/3/library/difflib.html

REFERENCES

[1] Ritchie Schacher, Adam Archer, and Scott Will. [n.d.]. Program in Pairs. Retrieved December 31, 2021 from https://www.ibm.com/garage/method/practices/code/practice_pair_programming/
[2] Scott Carey. 2021. Developers react to GitHub Copilot. Retrieved December 31, 2021 from https://www.infoworld.com/article/3624688/developers-react-to-github-copilot.html
[3] Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, et al. 2021. Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374 (2021).
[4] Matteo Ciniselli, Nathan Cooper, Luca Pascarella, Antonio Mastropaolo, Emad Aghajani, Denys Poshyvanyk, Massimiliano Di Penta, and Gabriele Bavota. 2021. An Empirical Study on the Usage of Transformer Models for Code Completion. IEEE Transactions on Software Engineering (2021).
[5] Mik Kersten and Gail C Murphy. 2006. Using task context to improve programmer productivity. In Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering. 1–11.
[6] Gail C Murphy, Mik Kersten, and Leah Findlater. 2006. How are Java software developers using the Eclipse IDE? IEEE Software 23, 4 (2006), 76–83.
[7] Emerson Murphy-Hill, Chris Parnin, and Andrew P Black. 2011. How we refactor, and how we know it. IEEE Transactions on Software Engineering 38, 1 (2011), 5–18.
[8] Hammond Pearce, Benjamin Tan, Baleegh Ahmad, Ramesh Karri, and Brendan Dolan-Gavitt. 2021. Can OpenAI Codex and Other Large Language Models Help Us Fix Security Bugs? arXiv preprint arXiv:2112.02125 (2021).
[9] Julian Aron Prenner and Romain Robbes. 2021. Automatic Program Repair with OpenAI's Codex: Evaluating QuixBugs. arXiv preprint arXiv:2111.03922 (2021).
[10] Dominik Sobania, Martin Briesch, and Franz Rothlauf. 2021. Choose Your Programming Copilot: A Comparison of the Program Synthesis Performance of GitHub Copilot and Genetic Programming. arXiv preprint arXiv:2111.07875 (2021).
[11] Ayse Tosun, Ayse Bener, and Resat Kale. 2010. AI-based software defect predictors: Applications and benefits in a case study. In Twenty-Second IAAI Conference.
