FIND THE ATTACHED DOCUMENT FOR FULL DETAILSExploring OOP and its Data StructuresComplete and submit Research Problem #3, on page 104, Chapter 3 (ATTACHED)( last part in the attachment.Write a 2-3 page

CHAPTER 3 D A T A R E P R E S E N T A T IO N C H A PTER G O A LS Describenumberingsystemsandtheiruseindatarepresentation Comparedifferentdatarepresentationmethods SummarizetheCPUdatatypesandexplainhownonnumericdata isrepresented Describecommondatastructuresandtheiruses Computersmanipulateandstoreavarietyofdata,suchasnumbers,text,sound,andpictures.This chapterdescribeshowdataisrepresentedandstoredincomputerhardware.Italsoexplainshow simpledatatypesareusedasbuildingblockstocreatemorecomplexdatastructures,suchasarrays andrecords.Understandingdatarepresentationiskeytounderstandinghardwareandsoftware technologies. DATA REPRESENTATION AND PROCESSING Peoplecanunderstandandmanipulatedatarepresentedinavarietyofforms.Forexample, theycanunderstandnumbersrepresentedsymbolicallyasArabicnumerals(suchas8714), Romannumerals(suchasXVII),andsimplelinesortickmarksonpaper(forexample,|||to representthevalue3).Theycanunderstandwordsandconceptsrepresentedwithpictorial characters( )oralphabeticcharacters( computer and ,Cyrillictextof theRussianwordforcomputer)andintheform ofsoundwaves(spokenwords).People alsoextractdatafrom visualimages(photosandmovies)andfrom thesensesoftaste, smell,andtouch.Thehumanbrain sprocessingpowerandflexibilityareevidentinthe richvarietyofdatarepresentationsitcanrecognizeandunderstand. Tobemanipulatedorprocessedbythebrain,externaldatarepresentations,suchas printedtext,mustbeconvertedtoaninternalformatandtransportedtothebrain spro- cessing circuitry. Sensoryorgansconvertinputs,suchassight,smell,taste,sound,and skinsensations,intoelectricalimpulsesthataretransportedthroughthenervoussystem tothebrain.Processinginthebrainoccursasnetworksofneuronsexchangedata electrically. Anydataandinformationprocessor,whetherorganic,mechanical,electrical,oropti- cal,mustbecapableofthefollowing: Recognizingexternaldataandconvertingittoaninternalformat Storingandretrievingdatainternally Transportingdatabetweeninternalstorageandprocessingcomponents Manipulatingdatatoproduceresultsordecisions Notethatthesecapabilitiescorrespondroughlytocomputersystem components describedinChapter2 I/O units,primaryandsecondarystorage,thesystem bus,and theCPU. Automated DataProcessing Computersystemsrepresentdataelectricallyandprocessitwithelectricalswitches.Two- state(onandoff)electricalswitchesarewellsuitedforrepresentingdatathatcanbe expressedinbinary(1or0)format,asyouseelaterinChapter4.Electricalswitchesare combinedtoform processingcircuits,whicharethencombinedtoform processingsub- systemsandentireCPUs.Youcanseethisprocessingasanequation: A B C Inthisequation,datainputsAandB,representedaselectricalcurrents,aretrans- portedthroughprocessingcircuits(seeFigure3.1).Theelectricalcurrentemergingfrom thecircuitrepresentsadataoutput,C.Automateddataprocessing,therefore,combines physics(electronics)andmathematics. Thephysicallawsofelectricity,optics,andquantum mechanicsaredescribedby mathematicalformulas.Ifadevice sbehaviorisbasedonwell-defined,mathematically describedlawsofphysics,thedevice,intheory,canimplementaprocessortoperform the equivalentmathematicalfunction.Thisrelationshipbetweenmathematicsandphysics underliesallautomatedcomputationdevices,from mechanicalclocks(usingthe FIGURE3.1 Twoelectricalinputsontheleftflowthroughprocessingcircuitrythatgeneratestheir sumontheright CourtesyofCourseTechnology/CengageLearning 62 Chapter3 mathematicalratiosofgears)toelectronicmicroprocessors(usingthemathematicsof electricalvoltageandresistance).AsyoulearnedinChapter2,inquantum mechanics,the mathematicallawsareunderstoodbutnothowtobuildreliableandcost-effectivecom- putingdevicesbasedontheselaws. Basingcomputerprocessingonmathematicsandphysicshaslimits,however.Proces- singoperationsmustbebasedonmathematicalfunctions,suchasadditionandequality comparison;usenumericaldatainputs;andgeneratenumericaloutputs.Theseprocessing functionsaresufficientwhenacomputerperformsnumerictasks,suchasaccountingor statisticalanalysis.Whenyouaskacomputertoperform taskssuchassearchingtext documentsandeditingsound,pictures,andvideo,numeric-processingfunctionsdohave limitations,butonesthatmodernsoftwarehaslargelyovercome.However,whenyouwant touseacomputertomanipulatedatawithnoobviousnumericequivalent forexample, literaryorphilosophicalanalysisofconceptssuchas mother, friend, love, and hate numeric-processingfunctionshavemajorshortcomings.Asthedatayouwantto processmovesfurtherawayfrom numbers,applyingcomputertechnologytoprocessing thedatabecomesincreasinglydifficult andlesssuccessful. BinaryDataRepresentation Inadecimal(base10)number,eachdigitcanhave1of10possiblevalues:0,1,2,3,4,5, 6,7,8,or9.Ina binarynumber,eachdigitcanhaveonlyoneoftwopossiblevalues:0or 1.Computersrepresentdatawithbinarynumbersfortworeasons: Binarynumbersrepresentedasbinaryelectricalsignalscanbetransported reliablybetweencomputersystemsandtheircomponents(discussedindetail inChapter8). Binarynumbersrepresentedaselectricalsignalscanbeprocessedbytwo- stateelectricaldevicesthatareeasytodesignandfabricate(discussedin detailinChapter4). Forcomputerapplicationstoproduceaccurateoutputs,reliabledatatransportis important.Givencurrenttechnology,binarysignalsandprocessingdevicesrepresentthe mostcost-efficienttradeoffsbetweencapacity,accuracy,reliability,andcost. Binarynumbersarealsowellsuitedtocomputerprocessingbecausetheycorrespond directlywithvaluesin Booleanlogic.Thisform oflogicisnamedfor19th-centurymathe- maticianGeorgeBoole,whodevelopedmethodsofreasoningandlogicalproofthatuse sequencesofstatementsthatcanbeevaluatedonlyastrueorfalse.Similarly,acomputer canperform logicalcomparisonsoftwobinarydatavaluestodeterminewhetheronedata valueisgreaterthan,equalto,lessthan,lessthanorequalto,notequalto,orgreaterthan orequaltoanothervalue.AsdiscussedinChapter2,acomputerusesthisprimitive logicalcapabilitytoexhibitintelligentbehavior. Bothcomputersandhumanscancombinedigitstorepresentandmanipulate largernumbers.Decimalandbinarynotationsarealternativeformsofapositional numberingsystem,inwhichnumericvaluesarerepresentedasgroups,orstrings,of digits.Thesymbolusedtorepresentadigitandthedigit spositioninastring determineitsvalue.Thevalueoftheentirestringisthesum ofthevaluesofalldigits inthestring. 63 DataRepresentationandProcessing Forexample,inthedecimalnumberingsystem,thenumber5689isinterpretedas follows: 5 1000 6 100 8 10 9 5000 600 80 9 5689 Thesameseriesofoperationscanberepresentedincolumnarform,withpositionsof thesamevaluealignedincolumns: 5000 600 80 9 5689 Forwholenumbers,valuesareaccumulatedfrom righttoleft.Intheprecedingexam- ple,thedigit9isinthefirstposition,8isinthesecondposition,6isinthethird,and5is inthefourth. Themaximum value,orweight,ofeachpositionisamultipleoftheweightofthe positiontoitsright.Inthedecimalnumberingsystem,thefirst(rightmost)positionisthe ones(100),andthesecondpositionis10timesthefirstposition(101).Thethirdposition is10timesthesecondposition(102,or100),thefourthis10timesthethirdposition(103, or1000),andsoon.Inthebinarynumberingsystem,eachpositionis2timestheprevious position,sopositionvaluesforwholenumbersare1,2,4,8,andsoforth.Themultiplier thatdescribesthedifferencebetweenonepositionandthenextisthe base,orradix,of thenumberingsystem.Inthedecimalnumberingsystem,it s10,andinthebinary numberingsystem,it s2. Thefractionalpartofanumericvalueisseparatedfrom thewholepartbyaperiod, althoughinsomecountries,acommaisusedinsteadofaperiod.Inthedecimalnumber- ingsystem,theperiodorcommaiscalleda decimalpoint.Inothernumberingsystems, theterm radixpointisusedfortheperiodorcomma.Heresanexampleofadecimal valuewitharadixpoint: 5689 368 Thefractionalportionofthisrealnumberis.368,anditsvalueisinterpretedas follows: 3 10-1 6 10-2 8 10-3 3 1 6 01 8 001 0 3 006 0008 0368 Proceedingtowardtherightfrom theradixpoint,theweightofeachpositionis afractionofthepositiontoitsleft.Inthedecimal(base10)numberingsystem,the firstpositiontotherightofthedecimalpointrepresentstenths(10-1),thesecond positionrepresentshundredths(10-2),thethirdrepresentsthousandths(10-3),and soforth. 64 Chapter3 Inthebinarynumberingsystem,thefirstpositiontotherightoftheradixpoint representshalves(2-1),thesecondpositionrepresentsquarters(2-2),thethirdrepresents eighths(2-3),andsoforth.Aswithwholenumbers,eachpositionhasaweight10(or2) timesthepositiontoitsright.Table3.1comparesdecimalandbinarynotationsforthe values0through10. Thenumberofdigitsneededtorepresentavaluedependsonthenumberingsystem s base:Thenumberofdigitsincreasesasthenumberingsystem sbasedecreases.Therefore, valuesthatcanberepresentedinacompactformatindecimalnotationmightrequire lengthysequencesofbinarydigits.Forexample,thedecimalvalue99requirestwo decimaldigitsbutsevenbinarydigits.Table3.2summarizesthenumberofbinarydigits neededtorepresentdecimalvaluesupto16positions. TABLE3.1 Binaryanddecimalnotationsforthevalues0through10 Binarysystem(base2) Decimalsystem(base10) Place 23 22 21 20 103 102 101 100 Values 8 4 2 1 1000 100 10 1 0 0 0 0 5 0 0 0 0 0 0 0 1 5 0 0 0 1 0 0 1 0 5 0 0 0 2 0 0 1 1 5 0 0 0 3 0 1 0 0 5 0 0 0 4 0 1 0 1 5 0 0 0 5 0 1 1 0 5 0 0 0 6 0 1 1 1 5 0 0 0 7 1 0 0 0 5 0 0 0 8 1 0 0 1 5 0 0 0 9 1 0 1 0 5 0 0 1 0 65 DataRepresentationandProcessing Toconvertabinaryvaluetoitsdecimalequivalent,usethefollowingprocedure: 1. Determineeachpositionweightbyraising2tothenumberofpositionsleft ( )orright(-)oftheradixpoint. 2. Multiplyeachdigitbyitspositionweight. 3. Sum allthevaluescalculatedinStep2. N O TE ThestandardWindowscalculatorcanconvertbetweenbinary,octal,decimal,andhexadecimal.Toopen thecalculatorinWindows7,clickStart,AllPrograms,Accessories,Calculator.Toconvertabinarynum- bertodecimal,clickView,Programmerfromthemenu(View,ScientificforVistaandearlierWindows versions).ClicktheBin(forbinary)optionbuttonattheupperleft,enterthebinarynumberinthetext box,andthenclicktheDec(fordecimal)optionbutton. TABLE3.2 Binarynotationsfordecimalvaluesupto16positions Numberofbits(n) Numberofvalues(2n) Numericrange(decimal) 1 2 0 1 2 4 0 3 3 8 0 7 4 16 0 15 5 32 0 31 6 64 0 63 7 128 0 127 8 256 0 255 9 512 0 511 10 1024 0 1023 11 2048 0 2047 12 4096 0 4095 13 8192 0 8191 14 16,384 0 16,383 15 32,768 0 32,767 16 65,536 0 65,535 66 Chapter3 Figure3.2showshowthebinarynumber101101.101isconvertedtoitsdecimal equivalent,45.625. Incomputerterminology,eachdigitofabinarynumberiscalleda bit.Agroupofbits thatdescribeasingledatavalueiscalleda bitstring.Theleftmostdigit,whichhasthe greatestweight,iscalledthe mostsignificantdigit,orhigh-orderbit.Conversely,the rightmostdigitisthe leastsignificantdigit,orlow-orderbit.Astringof8bitsiscalleda byte .Generally,abyteisthesmallestunitofdatathatcanbereadfrom orwrittentoa storagedevice. Thefollowingmathematicalrulesdefineadditionofpositivebinarydigits: 0 0 0 1 0 1 0 1 1 1 1 10 Toaddtwopositivebinarybitstrings,youfirstmustaligntheirradixpointsasfollows: 101101 101 101000010 FIGURE3.2 Computingthedecimalequivalentofabinarynumber CourtesyofCourseTechnology/CengageLearning 67 DataRepresentationandProcessing Thevaluesineachcolumnareaddedseparately,startingwiththeleastsignificant,or rightmost,digit.Ifacolumnresultexceeds1,theexcessvaluemustbecarriedtothenext columnandaddedtothevaluesinthatcolumn. N O TE ThestandardWindowscalculatorcanaddandsubtractbinaryintegers.Tousethisfeature,clickView, Programmerfromthemenu(View,ScientificforVistaandearlierWindowsversions),andclicktheBin optionbutton.YoucanalsoclickView,Digitgroupingfromthemenutoplacedigitsingroupsoffourfor easierreadability. Thisistheresultofaddingthetwoprecedingnumbers: 111 1 101101 101 101000010 10000011100 Theresultisthesameaswhenaddingthevaluesinbase-10notation: Real Real Binary fractions decimal 101101 101 4558 45625 101000010 2018 20125 1000001 1100 6534 65750 Binarynumbersusuallycontainmanydigitsandaredifficultforpeopletoremember andmanipulatewithouterror.Compilersandinterpretersforhigh-levelprogramminglan- guages,suchasCandJava,convertdecimalnumbersintobinarynumbersautomatically whengeneratingCPUinstructionsanddatavalues.However,sometimesprogrammers mustdealwithbinarynumbersdirectly,suchaswhentheyprogram inmachinelanguage orforsomeoperatingsystem (OS)utilities.Tominimizeerrorsandmakedealingwith binarynumberseasier,numberingsystemsbasedonevenmultiplesof2aresometimes used.Thesenumberingsystemsincludehexadecimalandoctal,discussedinthefollowing sections. HexadecimalNotation Hexadecimalnumberinguses16asitsbaseorradix( hex 6and decimal 10). Therearen tenoughnumericsymbols(Arabicnumerals)torepresent16differentvalues, soEnglishlettersrepresentthelargervalues(seeTable3.3). 68 Chapter3 Theprimaryadvantageofhexadecimalnotation,comparedwithbinarynotation,isits compactness.Largenumericvaluesexpressedinbinarynotationrequirefourtimesas manydigitsasthoseexpressedinhexadecimalnotation.Forexample,thedatacontentof abyterequireseightbinarydigits(suchas11110000)butonlytwohexadecimaldigits (suchasF0).Thiscompactrepresentationhelpsreduceprogrammererror. Hexadecimalnumbersoftendesignatememoryaddresses.Forexample,a64KB memoryregioncontains65,536bytes(64 1024bytes/KB).Eachbyteisidentifiedbya sequentialnumericaddress.Thefirstbyteisalwaysaddress0.Therefore,therangeof possiblememoryaddressesis0to65,535indecimalnumbers,0to1111111111111111in binarynumbers,and0toFFFFinhexadecimalnumbers.Asyoucanseefrom thisexam- ple,hexadecimaladdressesaremorecompactthandecimalorbinaryaddressesbecause ofthenumberingsystem shigherradix. Whenreadinganumericvalueinwrittentext,thenumber sbasemightnotbe obvious.Forexample,whenreadinganOSerrormessageorahardwareinstallation manual,shouldthenumber1000beinterpre tedinbase2,10,16,orsomethingelse? Inmathematicalexpressions,thebaseisusuallyspecifiedwithasubscript,asinthis example: 10012 Thesubscript2indicatesthat1001shouldbeinterpretedasabinarynumber. Similarly,inthefollowingexample,thesubscript16indicatesthat6044shouldbe interpretedasahexadecimalnumber: 604416 Thebaseofawrittennumbercanbemadeexplicitbyplacingaletterattheend.For example,theletterBinthisexampleindicatesabinarynumber: 1001B TheletterHinthisexampleindicatesahexadecimalnumber: 6044H TABLE3.3 Hexadecimalanddecimalvalues Base-16digit Decimalvalue Base-16digit Decimalvalue 0 0 8 8 1 1 9 9 2 2 A 10 3 3 B 11 4 4 C 12 5 5 D 13 6 6 E 14 7 7 F 15 69 DataRepresentationandProcessing Normally,noletterisusedtoindicateadecimal(base10)number.Someprogram- minglanguages,suchasJavaandC ,usetheprefix0xtoindicateahexadecimal number.Forexample,0x1001isequivalentto100116. Unfortunately,theseconventionsaren tobservedconsistently.Oftenitslefttothe readertoguessthecorrectbasebythenumber scontentorthecontextinwhichit appears.Avaluecontaininganumeralotherthan0or1can tbebinary,forinstance. Similarly,theuseoflettersAthroughFindicatesthatthecontentsareexpressedin hexadecimal.Bitstringsareusuallyexpressedinbinary,andmemoryaddressesare usuallyexpressedinhexadecimal. OctalNotation SomeOSsandmachineprogramminglanguagesuse octalnotation.Octalnotationuses thebase-8numberingsystem andhasarangeofdigitsfrom 0to7.Largenumericvalues expressedinoctalnotationareone-thirdthelengthofcorrespondingbinarynotationand doublethelengthofcorrespondinghexadecimalnotation. GOALS OF COMPUTER DATA REPRESENTATION Althoughallmoderncomputersrepresentdatainternallywithbinarydigits,theydon t necessarilyrepresentlargernumericvalueswithpositionalbitstrings.Positionalnumber- ingsystemsareconvenientforpeopletointerpretandmanipulatebecausethesequential processingofdigitsinastringparallelsthewaythehumanbrainfunctionsandbecause peoplearetaughttoperform computationsinalinearfashion.Forexample,positional numberingsystemsarewellsuitedtoaddingandsubtractingnumbersincolumnsby usingpencilandpaper.Computerprocessors,however,operatedifferentlyfrom ahuman brain.Datarepresentationtailoredtohumancapabilitiesandlimitationsmightnotbebest suitedtocomputers. Anyrepresentationformatfornumericdatarepresentsabalanceamongseveral factors,includingthefollowing: Compactness Range Accuracy Easeofmanipulation Standardization Aswithmanycomputerdesigndecisions,alternativesthatperform wellinonefactor oftenperform poorlyinothers.Forexample,adataformatwithahighdegreeofaccuracy andalargerangeofrepresentablevaluesisusuallydifficultandexpensivetomanipulate becauseit snotcompact. Compactnessand Range Theterm compactness (orsize)describesthenumberofbitsusedtorepresenta numericvalue.Compactrepresentationformatsusefewerbitstorepresentavalue,but they relimitedintherangeofvaluestheycanrepresent.Forexample,thelargestbinary integerthatcanbestoredin32bitsis232,or4,294,967,29610.Halvingthenumberofbits 70 Chapter3 tomaketherepresentationmorecompactdecreasesthelargestpossiblevalueto216,or 65,53510. Computerusersandprogrammersusuallypreferalargenumericrange.Forexample, wouldyoubehappyifyourbank scomputerlimitedyourmaximum checkingaccount balanceto65,535pennies?Theextrabitpositionsrequiredtoincreasethenumericrange haveacost,however.Primaryandsecondarystoragedevicesmustbelargerand,there- fore,moreexpensive.Additionalandmoreexpensivecapacityisrequiredfordatatrans- missionbetweendevicesinacomputersystem oracrosscomputernetworks.CPU processingcircuitrybecomesmorecomplexandexpensiveasmorebitpositionsare added.Themorecompactadatarepresentationformat,thelessexpensiveitisto implementincomputerhardware. Accuracy Althoughcompactdataformatscanminimizehardware scomplexityandcost,theydoso attheexpenseofaccuratedatarepresentation.Theaccuracy,orprecision,ofrepresenta- tionincreaseswiththenumberofdatabitsused. It spossibleforroutinecalculationstogeneratequantitiestoolargeortoosmalltobe containedinamachine sfinitecircuitry(thatis,inafixednumberofbits).Forexample, thefraction1/3can tberepresentedaccuratelyinafixednumberofbitsbecauseitsa nonterminatingfractionalquantity(0.333333333 ,withaninfinitenumberof3s).In thesecases,thequantitiesmustbemanipulatedandstoredasapproximations,andeach approximationintroducesadegreeoferror.Ifapproximateresultsareusedasinputsfor othercomputations,errorscanbecompoundedandevenresultinmajorerrors.Forthis reason,aprogram canhavenoapparentlogicalflawsyetstillproduceinaccurateresults. Ifalldatatypeswererepresentedinthemostcompactform possible,approximations wouldintroduceunacceptablemarginsoferror.Ifalargenumberofbitswereallocatedto eachdatavalueinstead,machineefficiencyandperformancewouldbesacrificed,and hardwarecostwouldbeincreased.Thebestbalanceinperformanceandcostcanbe achievedbyusinganoptimum codingmethodforeachtypeofdataoreachtypeofoper- ationtobeperformed.Strivingforthisbalanceisthemainreasonforthevarietyofdata representationformatsusedinmodernCPUs. EaseofManipulation Whendiscussingcomputerprocessing, manipulationreferstoexecutingprocessor instructions,suchasaddition,subtraction,andequalitycomparisons,and ease refersto machineefficiency.Aprocessor sefficiencydependsonitscomplexity(thenumberofits primitivecomponentsandthecomplexityofthewiringthatbindsthem together).Effi- cientprocessorcircuitsperform theirfunctionsquicklybecauseofthesmallnumberof componentsandtheshortdistanceelectricitymusttravel.Morecomplexdevicesneed moretimetoperform theirfunctions. Datarepresentationformatsvaryintheircapabilitytosupportefficientprocessing. Forexample,mostpeoplehavemoredifficultyperformingcomputationswithfractions thanwithdecimalnumbers.Peopleprocessthedecimalformatmoreefficientlythanthe fractionalformat. 71 GoalsofComputerDataRepresentation Unfortunately,theresnobestrepresentationformatforalltypesofcomputation operations.Forexample,representinglargenumericvaluesaslogarithmssimplifiesmulti- plicationanddivisionforpeopleandcomputersbecauselogA logB log(A B)and logA-logB log(A B).Logarithmscomplicateotheroperations,suchasaddition andsubtraction,andtheycanincreaseanumber slengthsubstantially(forexample,log 99 1.9956351945975499153402557777533). Standardization Datamustbecommunicatedbetweendevicesinasinglecomputerandtoothercomputers vianetworks.Toensurecorrectandefficientdatatransmission,dataformatsmustbe suitableforawidevarietyofdevicesandcomputers.Forthisreason,severalorganizations havecreatedstandarddata-encodingmethods(discussedlaterinthe CharacterData section).Adheringtothesestandardsgivescomputeruserstheflexibilitytocombine hardwarefrom differentvendorswithminimaldatacommunicationproblems. CPU DATA TYPES TheCPUsofmostmoderncomputerscanrepresentandprocessatleastthefollowing primitivedatatypes: Integer Realnumber Character Boolean Memoryaddress Thearrangementandinterpretationofbitsinabitstringareusuallydifferentfor eachdatatype.Therepresentationformatforeachdatatypebalancescompactness, range,accuracy,easeofmanipulation,andstandardization.ACPUcanalsoimplement multipleversionsofeachtypetosupportdifferenttypesofprocessingoperations. Integers An integerisawholenumber avaluethatdoesnthaveafractionalpart.Forexample, thevalues2,3,9,and129areintegers,butthevalue12.34isnot.Integerdataformats canbesignedorunsigned.MostCPUsprovidean unsignedintegerdatatype,whichstores positiveintegervaluesasordinarybinarynumbers.Anunsignedinteger svalueisalways assumedtobepositive. A signedintegerusesonebittorepresentwhetherthevalueispositiveornegative.The choiceofbitvalue(0or1)torepresentthesign(positiveornegative)isarbitrary.Thesign bitisnormallythehigh-orderbitinanumericdataformat.Inmostdataformats,it s1 foranegativenumberand0foranonnegativenumber.(Notethat0isanonnegative number.) Thesignbitoccupiesabitpositionthatwouldotherwisebeavailabletostorepartof thedatavalue.Therefore,usingasignbitreducesthelargestpositivevaluethatcanbe storedinanyfixednumberofbitpositions.Forexample,thelargestpositivevaluethat 72 Chapter3 canbestoredinan8-bitunsignedintegeris255,or28-1.Ifabitisusedforthesign,the largestpositivevaluethatcanbestoredis127,or27-1. Withunsignedintegers,thelowestvaluethatcanberepresentedisalways0.With signedintegers,thelowestvaluethatcanberepresentedisthenegativeofthehighest valuethatcanbestored(forexample,-127for8-bitsignedbinary). ExcessNotation Oneformatthatcanbeusedtorepresentsignedintegersis excessnotation,whichalways usesafixednumberofbits,withtheleftmostbitrepresentingthesign.Forexample,the value0isrepresentedbyabitstringwith1intheleftmostdigitand0sinalltheother digits.AsshowninTable3.4,allnonnegativevalueshave1asthehigh-orderbit,andneg- ativevalueshave0inthisposition.Inessence,excessnotationdividesarangeofordinary binarynumbersinhalfandusesthelowerhalffornegativevaluesandtheupperhalffor nonnegativevalues. Torepresentaspecificintegervalueinexcessnotation,youmustknowhowmany storagebitsaretobeused,whetherthevaluefitswithinthenumericrangeofexcess notationforthatnumberofbits,andwhetherthevaluetobestoredispositiveornegative. Foranynumberofbits,thelargestandsmallestvaluesinexcessnotationare2( n-1)-1and -2( n-1),wherenisthenumberofavailablestoragebits. TABLE3.4 Excessnotation Bitstring Decimalvalue 1111 7 Nonnegative numbers 1110 6 1101 5 1100 4 1011 3 1010 2 1001 1 1000 0 0111 -1 Negative numbers 0110 -2 0101 -3 0100 -4 0011 -5 0010 -6 0001 -7 0000 -8 73 CPUDataTypes Forexample,considerstoringasignedintegerin8bitswithexcessnotation.Because theleftmostbitisasignbit,thelargestpositivevaluethatcanbestoredis27-1,or12710, andthesmallestnegativevaluethatcanbestoredis-27,or-12810.Therangeofpositive valuesappearstobesmallerthantherangeofnegativevaluesbecause0isconsidereda positive(nonnegative)numberinexcessnotation.Attemptingtorepresentlargerpositive orsmallernegativevaluesresultsinerrorsbecausetheleftmost(sign)bitmightbeover- writtenwithanincorrectvalue. Nowconsiderhow 9and-9arerepresentedin8-bitexcessnotation.Bothvaluesare wellwithinthenumericrangelimitsfor8-bitexcessnotation.Theordinarybinaryrepre- sentationof 910in8bitsis00001001.Recallthattheexcessnotationrepresentationof0 isalwaysaleading1bitfollowedbyall0bits 10000000for8-bitexcessnotation. Because 9isnineintegervaluesgreaterthan0,youcancalculatetherepresentationof 9byaddingitsordinarybinaryrepresentationtotheexcessnotationrepresentation of0asfollows: 10000000 00001001 10001001 Torepresentnegativevalues,youuseasimilarmethodbasedonsubtraction.Because -9isnineintegervalueslessthan0,youcancalculatetherepresentationof-9bysub- tractingitsordinarybinaryrepresentationfrom theexcessnotationrepresentationof0 asfollows: 10000000 -00001001 01110111 TwosComplementNotation Inthebinarynumberingsystem,thecomplementof0is1,andthecomplementof1is0. Thecomplementofabitstringisformedbysubstituting0forallvaluesof1and1forall valuesof0.Forexample,thecomplementof1010is0101.Thistransformationisthebasis of twoscomplementnotation.Inthisnotation,nonnegativeintegervaluesarerepresented asordinarybinaryvalues.Forexample,atwoscomplementrepresentationof710using 4bitsis0111. Bitstringsfornegativeintegervaluesaredeterminedbythefollowingtransformation: complementofpositivevalue 1 negativerepresentation Parenthesesareacommonmathematicalnotationforshowingavalue scomplement; forexample,ifAisanumericvalue,(A)representsitscomplement.In4-bittwoscomple- mentrepresentation,-710iscalculatedasfollows: 0111 0001 1000 0001 1001 -710 Asanotherexample,takealookatthetwoscomplementrepresentationof 3510and -3510in8bits.Theordinarybinaryequivalentof 3510is00100011,whichisalsothe 74 Chapter3 twoscomplementrepresentation.Todeterminethetwoscomplementrepresentationof -3510,usethepreviousformulawith8-bitnumbers: 00100011 00000001 11011100 00000001 11011101 -3510 Twoscomplementnotationisawkwardformostpeople,butitshighlycompatible withdigitalelectroniccircuitryforthefollowingreasons: Theleftmostbitrepresentsthesign. Afixednumberofbitpositionsareused. Onlytwologiccircuitsarerequiredtoperform additiononsingle-bitvalues. Subtractioncanbeperformedasadditionofanegativevalue. ThelattertworeasonsenableCPUmanufacturerstobuildprocessorswithfewer componentsthanareneededforotherintegerdataformats,whichsavesmoneyand increasescomputationalspeed.Forthesereasons,allmodernCPUsrepresentandmani- pulatesignedintegersbyusingtwoscomplementformat. RangeandOverflow MostmodernCPUsuse64bitstorepresentatwoscomplementvalueandsupport32-bit formatsforbackwardcompatibilitywitholdersoftware.A32-bitformatisusedinthe remainderofthisbooktosimplifythediscussionandexamples.Asmallpositivevalue, suchas1,occupies32bitseventhough,intheory,only2bitsarerequired(oneforthe valueandoneforthesign).Althoughpeoplecandealwithnumericvaluesofvarying lengths,computercircuitryisn tnearlyasflexible.Fixed-widthformatsenablemore efficientprocessoranddatacommunicationcircuitry.TheadditionalCPUcomplexity requiredtoprocessvariable-lengthdataformatsresultsinunacceptablyslowperformance. Therefore,whensmallnumericvaluesarestored,the extra bitpositionsarefilledwith leading0s. The numericrangeofatwoscomplementvalueis-(2n-1)to(2n-1-1),wherenisthe numberofbitsusedtostorethevalue.Theexponentis n-1because1bitisusedforthe sign.For32-bittwoscomplementformat,thenumericrangeis-2,147,483,64810to 2,147,483,64710.Withanyfixed-widthdatastorageformat,it spossiblethattheresultofa computationwillbetoolargetofitintheformat.Forexample,theGrossDomesticProd- uctofeachU.S.statewaslessthan$2billionin2005.Therefore,thesevaluescanbe representedas32-bittwoscomplementintegers.AddingthesenumberstocalculateGross NationalProduct(GNP),however,yieldsasum largerthan$2billion.Therefore,apro- gram thatcomputesGNPbyusing32-bittwoscomplementvalueswillgenerateavalue thatexceedstheformat snumericrange.Thiscondition,referredtoasoverflow,istreated asanerrorbytheCPU.Executingasubtractioninstructioncanalsoresultinoverflow forexample,-(231)-1.Overflowoccurswhentheabsolutevalueofacomputationalresult containstoomanybitstofitintoafixed-widthdataformat. AswithmostotheraspectsofCPUdesign,dataformatlengthisonedesignfactorthat needstobebalancedwithothers.Largeformatsreducethechanceofoverflow by increasingthemaximum absolutevaluethatcanberepresented,butmanybitsarewasted 75 CPUDataTypes (paddedwithleading0s)whensmallervaluesarestored.Ifbitswerefree,therewouldbe notradeoff.However,extrabitsincreaseprocessorcomplexityandstoragerequirements, whichincreasecomputersystem cost.ACPUdesignerchoosesadataformatwidthby balancingnumericrange,thechanceofoverflowduringprogram execution,andthe complexity,cost,andspeedofprocessingandstoragedevices. Toavoidoverflow andincreaseaccuracy,somecomputersandprogramminglan- guagesdefineadditionalnumericdatatypescalled double-precisiondataformats.A double-precisiondataformatcombinestwoadjacentfixed-lengthdataitemstoholda singlevalue.Double-precisionintegersaresometimescalled longintegers. Overflowcanalsobeavoidedbycarefulprogramming.Ifaprogrammeranticipates thatoverflow ispossible,theunitsofmeasureforprogram variablescanbemadelarger. Forexample,calculationsoncentimeterscouldbeconvertedtometersorkilometers,as appropriate. RealNumbers A realnumbercancontainbothwholeandfractionalcomponents.Thefractionalportion isrepresentedbydigitstotherightoftheradixpoint.Forexample,thefollowingcompu- tationusesrealnumberdatainputsandgeneratesarealnumberoutput: 18 0 40 45 Thisistheequivalentcomputationinbinarynotation: 10010 100 1001 Representingarealnumberincomputercircuitryrequiressomewaytoseparatethe value swholeandfractionalcomponents(thatis,thecomputerequivalentofawritten radixpoint).Asimplewaytoaccomplishthisistodefineastorageformatinwhicha fixed-lengthportionofthebitstringholdsthewholeportionandtheremainderofthebit stringholdsthefractionalportion.Figure3.3showsthisformatwithasignbitandfixed radixpoint. TheformatinFigure3.3isstructurallysimplebecauseoftheradixpoint sfixedloca- tion.TheadvantageofthissimplicityissimplerandfasterCPUprocessingcircuitry. FIGURE3.3 A32-bitstorageformatforrealnumbersusingafixedradixpoint CourtesyofCourseTechnology/CengageLearning 76 Chapter3 Unfortunately,processingefficiencyisgainedbylimitingnumericrange.Althoughthe sampleformatuses32bits,itsnumericrangeissubstantiallylessthan32-bittwoscom- plement.Only16bitsareallocatedtothewholeportionofthevalue.Therefore,thelarg- estpossiblewholevalueis216-1,or65,535.Theremainingbitsstorethefractional portionofthevalue,whichcanneverbegreaterthanorequalto1. Youcouldincreasetheformat snumericrangebyallocatingmorebitstothewhole portion(shiftingtheradixpointinFigure3.3totheright).Iftheformat stotalsizeisfixed at32bits,however,thereallocationwouldreducethenumberofbitsusedtostorethe fractionalportionofthevalue.Reallocatingbitsfrom thefractionalportiontothewhole portionreducestheprecisionoffractionalquantities,whichreducescomputational accuracy. Floating-PointNotation Onewayofdealingwiththetradeoffbetweenrangeandprecisionistoabandonthecon- ceptofafixedradixpoint.Torepresentextremelysmall(precise)values,movetheradix pointfartotheleft.Forexample,thefollowingvaluehasonlyasingledigittotheleftof theradixpoint: 0 0000000013526473 Similarly,verylargevaluescanberepresentedbymovingtheradixpointfartothe right,asinthisexample: 1352647300000000 0 Notethatbothexampleshavethesamenumberofdigits.Byfloatingtheradixpoint leftorright,thefirstexampletradesrangeofthewholeportionforincreasedfractional precision,andthesecondexampletradesfractionalprecisionforincreasedwholerange. Valuescanbeverylargeorverysmall(precise)butnotbothatthesametime. Peopletendtocommiterrorswhenmanipulatinglongstringsofdigits.Tominimize errors,theyoftenwritelargenumbersinamorecompactformatcalledscientificnotation. Inscientificnotation,thetwoprecedingnumbersshownarerepresentedas13,526,473 10-16and13,526,473 108.Notethatthenumberingsystem sbase(10)ispartofthe multiplier.Theexponentcanbeinterpretedasthenumberanddirectionofpositional movesoftheradixpoint,asshowninFigure3.4.Negativeexponentsindicatemovement totheleft,andpositiveexponentsindicatemovementtotheright. FIGURE3.4 Conversionofscientificnotationtodecimalnotation CourtesyofCourseTechnology/CengageLearning 77 CPUDataTypes Realnumbersarerepresentedincomputersbyusingfloating-pointnotation,whichis similartoscientificnotationexceptthat2(ratherthan10)isthebase.Anumericvalueis derivedfrom afloating-pointbitstringaccordingtothefollowingformula: value mantissa 2exponent Themantissaholdsthebitsthatareinterpretedtoderivetherealnumbersdigits.By convention,themantissaisassumedtobeprecededbyaradixpoint.Theexponentvalue indicatestheradixpoint sposition. ManyCPU-specificimplementationsoffloating-pointnotationarepossible.Differences intheseimplementationscanincludethelengthandcodingformatsofthemantissaand exponentandtheradixpoint slocationinthemantissa.Althoughtwoscomplementcan beusedtocodetheexponent,mantissa,orboth,othercodingformatsmightofferbetter designtradeoffs.Beforethe1980s,therewaslittlecompatibilityinfloating-pointformat betweendifferentCPUs,whichmadetransportingfloating-pointdatabetweendifferent computersdifficultorimpossible. TheInstituteofElectricalandElectronicsEngineers(IEEE)addressedthisproblem in standard754,whichdefinesthefollowingformatsforfloating-pointdata: binary32 32-bitformatforbase2values binary64 64-bitformatforbase2values binary128 128-bitformatforbase2values decimal64 64-bitformatforbase10values decimal128 128-bitformatforbase10values Thebinary32andbinary64formatswerespecifiedinthestandard s1985versionand havebeenadoptedbyallcomputerandmicroprocessormanufacturers.Theotherthree formatsweredefinedinthe2008version.Computerandmicroprocessormanufacturers arecurrentlyintheprocessofincorporatingtheseformatsintotheirproducts,andsome products(suchastheIBM POWER6processor)alreadyincludesomenewerformats.For theremainderofthischapter,allreferencestofloating-pointrepresentationrefertothe binary32format,unlessotherwisespecified. Figure3.5showsthebinary32format.Theleadingsignbitappliestothemantissa,not theexponent,andis1ifthemantissaisnegative.The8-bitexponentiscodedinexcess notation(meaningitsfirstbitisasignbit).The23-bitmantissaiscodedasanordinary binarynumber.It sassumedtobeprecededbyabinary1andtheradixpoint.Thisformat extendsthemantissa sprecisionto24bits,althoughonly23areactuallystored. 78 Chapter3 Range,Overflow,andUnderflow Thenumberofbitsinafloating-pointstringandtheformatsofthemantissaandexponent imposelimitsontherangeofvaluesthatcanberepresented.Thenumberofdigitsinthe mantissadeterminesthenumberofsignificant(nonzero)digitsinthelargestandsmallest valuesthatcanberepresented.Thenumberofdigitsintheexponentdeterminesthe numberofpossiblebitpositionstotherightorleftoftheradixpoint. Usingthenumberofbitsassignedtomantissaandexponent,thelargestabsolutevalue ofafloating-pointvalueappearstobethefollowing: 1 11111111111111111111111 211111111 Exponentscontainingall0sandall1s,however,representspecialdatavaluesinthe IEEEstandards.Therefore,theusableexponentrangeisreduced,andthedecimalrange fortheentirefloating-pointvalueisapproximately10-45to1038. Floating-pointnumberswithlargeabsolutevalueshavelargepositiveexponents. Whenoverflow occurs,italwaysoccursintheexponent.Floating-pointrepresentationis alsosubjecttoarelatederrorconditioncalled underflow.Verysmallnumbersarerepre- sentedbynegativeexponents.Underflowoccurswhentheabsolutevalueofanegative exponentistoolargetofitinthebitsallocatedtostoreit. PrecisionandTruncation Recallthatscientificnotation,includingfloating-pointnotation,tradesnumericrangefor accuracy.Accuracyisreducedasthenumberofdigitsavailabletostorethemantissais reduced.The23-bitmantissausedinthebinary32formatrepresentsapproximatelyseven decimaldigitsofprecision.However,manyusefulnumberscontainmorethanseven nonzerodecimaldigits,suchasthedecimalequivalentofthefraction1/3: 1 3 033333333 Thenumberofdigitstotherightofthedecimalpointisinfinite.Onlyalimited numberofmantissadigitsareavailable,however. Numberssuchas1/3arestoredinfloating-pointformatby truncation.Thenumeric valueisstoredinthemantissa,startingwithitsmostsignificantbit,untilallavailablebits areused.Theremainingbitsarediscarded.Anerrororapproximationoccursanytimea floating-pointvalueistruncated.However,thetruncateddigitsareinsignificantcompared FIGURE3.5 IEEEbinary32floating-pointformat CourtesyofCourseTechnology/CengageLearning 79 CPUDataTypes withthesignificant,orlarge,valuethatsstored.Problemscanresultwhentruncated valuesareusedasinputtocomputations.Theerrorintroducedbytruncationcanbe magnifiedwhentruncatedvaluesareusedandgenerateinaccurateresults.Theerror resultingfrom alongseriesofcomputationsstartingwithtruncatedinputscanbelarge. Anaddeddifficultyisthatmorevalueshavenonterminatingrepresentationsinthe binarysystem thaninthedecimalsystem.Forexample,thefraction1/10isnonterminat- inginbinarynotation.Therepresentationofthisvalueinfloating-pointnotationisa truncatedvalue,buttheseproblemscanusuallybeavoidedwithcarefulprogramming.In general,programmersreservebinaryfloating-pointcalculationsforquantitiesthatcan varycontinuouslyoverwideranges,suchasmeasurementsmadebyscientific instruments. Whenpossible,programmersusedatatypesotherthanbinary32toavoidorminimize theimpactoftruncation.Mostcurrentmicroprocessorscanstoreandmanipulate binary64values,andsupportforbinary128isgraduallybeingadded.Inaddition,most programminglanguagescanemulatebinary128,decimal64,anddecimal128values, althoughprocessingthesevaluesisconsiderablyslowerthanwhenthemicroprocessor supportsthem ashardwaredatatypes.Programmersseekingtominimizerepresentation andcomputationerrorsshouldchoosethelargestfloating-pointformatsupportedby hardwareorsoftware. Monetaryvaluesareparticularlysensitivetotruncationerrors.Mostmonetarysys- temshaveatleastonefractionalmonetaryunit,suchaspennies fractionsofaU.S.dol- lar.Noviceprogrammerssometimesassumethatmonetaryamountsshouldbestoredand manipulatedasbinaryfloating-pointnumbers.Inevitably,truncationerrorscausedby nonterminatingrepresentationsoftenthsandotherfractionsoccur.Cumulativeerrors mountwhentruncatednumbers,orapproximations,areinputinsubsequentcalculations. Onewaytoaddresstheproblem istouseintegerarithmeticforaccountingandfinan- cialapplications.Todoso,aprogrammerstoresandmanipulatesmonetaryamountsin thesmallestpossiblemonetaryunit forexample,U.S.penniesorMexicanpesos.Small denominationsareconvertedtolargeronesonlywhenneededforoutput,suchasprinting adollaramountonacheckoraccountstatement. Althoughrepresentingandmanipulatingmonetaryvaluesasintegersprovidescomputa- tionalaccuracy,thismethodhaslimitations.Forexample,complexformulasforcomputing interestonloansorinvestmentbalancesincludeexponentsanddivision.Intermediatecal- culationresultsforprogramsusingtheseformulascanproducefractionalquantitiesunless monetaryamountsarescaledtoverysmallunits(forexample,millionthsofapenny). Thedecimal64anddecimal128bitformatsdefinedinthe2008versionofIEEEstan- dard754areintendedtoaddresstheshortcomingsofbothbinaryfloating-pointandinte- gerrepresentationofmonetaryunits.Theseformatsprovideaccuraterepresentationof decimalvalues,andthestandardspecifiesroundingmethodsthatcanbeusedinsteadof truncationtoimprovecomputationalaccuracy.Bothformatsusethesamebasicapproach asinbinaryformats amantissaandanexponent buttheyencodethreedecimaldigits ineach10-bitgroup. ProcessingComplexity Thedifficultyoflearningtousescientificandfloating-pointnotationisunderstandable. Theseformatsarefarmorecomplexthanintegerdataformats,andthecomplexityaffects 80 Chapter3 bothpeopleandcomputers.Althoughfloating-pointformatsareoptimizedforprocessing efficiency,theystillrequirecomplexprocessingcircuitry.Thesimplertwoscomplement formatusedforintegersrequiresmuchlesscomplexcircuitry. Thedifferenceinprocessingcircuitrycomplexitytranslatestoadifferenceinspeedof performingcalculations.Themagnitudeofthedifferencedependsonseveralfactors, includingthecomputationandtheexactdetailsoftheprocessingcircuitry.Asageneral rule,simplecomputationaloperations,suchasadditionandsubtraction,takeatleast twiceaslongwithfloating-pointnumbersthanwithintegers.Thedifferenceiseven greaterforoperationssuchasdivisionandexponentiation.Forthisreasonandforreasons ofaccuracy,carefulprogrammersneverusearealnumberwhenanintegercanbeused, particularlyforfrequentlyupdateddataitems. CharacterData Intheirwrittenform,Englishandmanyotherlanguagesusealphabeticletters,numerals, punctuationmarks,andavarietyofotherspecial-purposesymbols,suchas$and&.Each symbolisa character.Asequenceofcharactersthatformsameaningfulword,phrase,or otherusefulgroupisa string.Inmostprogramminglanguages,singlecharactersaresur- roundedbysinglequotationmarks( 'c'),andstringsaresurroundedbydoublequotation marks( "computer"). Characterdatacan tberepresentedorprocesseddirectlyinacomputerbecause computersaredesignedtoprocessonlydigitaldata(bits).Itcanberepresentedindirectly bydefiningatablethatassignsanumericvaluetoeachcharacter.Forexample,theinte- gervalues0through9canbeusedtorepresentthecharacters(numerals) '0'through'9', theuppercaseletters 'A'through'Z'canberepresentedastheintegervalues10through 36,andsoforth. Atable-basedsubstitutionofonesetofsymbolsorvaluesforanotherisoneexample ofacodingmethod.Allcodingmethodsshareseveralimportantcharacteristics,including thefollowing:

Allusersmustusethesamecodinganddecodingmethods. Thecodedvaluesmustbecapableofbeingstoredortransmitted. Acodingmethodrepresentsatradeoffamongcompactness,range,easeof manipulation,accuracy,andstandardization. Thefollowingsectionsdescribesomecommoncodingmethodsforcharacterdata. EBCDIC ExtendedBinaryCodedDecimalInterchangeCode(EBCDIC) isacharacter-coding methoddevelopedbyIBM inthe1960sandusedinallIBM mainframeswellintothe 2000s.RecentIBM mainframesandmainframeOSssupportmorerecentcharacter-coding methods,butsupportforEBCDICisstillmaintainedforbackwardcompatibility.EBCDIC charactersareencodedasstringsof8bits. ASCII The AmericanStandardCodeforInformationInterchange(ASCII),adoptedintheUnited Statesinthe1970s,isawidelyusedcodingmethodindatacommunication.Theinternational 81 CPUDataTypes equivalentofthiscodingmethodisInternationalAlphabet5(IA5),anInternationalOrgani- zationforStandardization(ISO) standard.AlmostallcomputersandOSssupportASCII, althoughagradualmigrationisinprogresstoitsnewerrelative,Unicode. ASCIIisa7-bitformatbecausemostcomputersandperipheraldevicestransmitdata inbytesandbecauseparitycheckingwasusedwidelyinthe1960sto1980sfordetecting transmissionerrors.Chapter8discussesparitycheckingandothererrordetectionand correctionmethods.Fornow,theimportantcharacteristicofparitycheckingisthatit requires1extrabitpercharacter.Therefore,1ofevery8bitsisn tpartofthedatavalue, leavingonly7bitsfordatarepresentation. ThestandardASCIIversionusedfordatatransferissometimescalledASCII-7to emphasizeits7-bitformat.Thiscodingtablehas128,or27,definedcharacters.Computers thatuse8-bitbytesarecapableofrepresenting256,or28,differentcharacters.Inmost computers,theASCII-7charactersareincludedinan8-bitcharactercodingtableasthe first,orlower,128tableentries.Theadditional,orupper,128entriesaredefinedbythe computermanufacturerandtypicallyusedforgraphicalcharacters,suchasline-drawing charactersandmultinationalcharacters forexample,á,ñ,Ö,and .Thisencoding methodissometimescalledASCII-8.Theterm isamisnomer,asitimpliesthattheentire table(all256entries)isstandardized.Infact,onlythefirst128entriesaredefinedbythe ASCIIstandard.Table3.5showsportionsoftheASCIIandEBCDICcodingtables. TABLE3.5 PartiallistingofASCIIandEBCDICcodes Symbol ASCII EBCDIC 0 0110000 11110000 1 0110001 11110001 2 0110010 11110010 3 0110011 11110011 4 0110100 11110100 5 0110101 11110101 6 0110110 11110110 7 0110111 11110111 8 0111000 11111000 9 0111001 11111001 A 1000001 11000001 B 1000010 11000010 C 1000011 11000011 a 1100001 10000001 b 1100010 10000010 c 1100011 10000011 82 Chapter3 DeviceControl Whentextisprintedordisplayedonanoutputdevice,oftenitsformat- tedinaparticularway.Forexample,textoutputtoaprinterisnormallyformattedin linesandparagraphs,andacustomerrecordcanbedisplayedonscreensothatitlooks likeaprintedform.Certaintextcanbehighlightedwhenprintedordisplayedbyusing methodssuchasunderlining,boldfont,orreversedbackgroundandforegroundcolors. ASCIIdefinesseveraldevicecontrolcodes(seeTable3.6)usedfortextformattingby sendingthem immediatelybeforeorafterthecharacterstheymodify.Amongthesimpler codesarecarriagereturn,whichmovestheprintheadorinsertionpointtothebeginning ofaline;linefeed,whichmovestheprintheadorinsertionpointdownoneline;andbell, whichgeneratesashortsound,suchasabeeporbellring.InASCII,eachofthesefunc- tionsisassignedanumericcodeandashortcharactername,suchasCRforcarriage return,LFforlinefeed,andBELforbell.Inaddition,someASCIIdevicecontrolcodesare usedtocontroldatatransfer.Forexample,ACKissenttoacknowledgecorrectreceiptof data,andNAKissenttoindicatethatanerrorhasbeendetected. N O TE The33devicecontrolcodesintheASCIItableoccupythefirst32entries(numbered0through31)and thelastentry(number127). TABLE3.6 ASCIIcontrolcodes Decimalcode Controlcharacter Description 000 NUL Null 001 SOH Startofheading 002 STX Startoftext 003 ETX Endoftext 004 EOT Endoftransmission 005 ENQ Enquiry 006 ACK Acknowledge 007 BEL Bell 008 BS Backspace 009 HT Horizontaltabulation 010 LF Linefeed 011 VT Verticaltabulation 012 FF Formfeed 013 CR Carriagereturn 014 SO Shiftout 015 SI Shiftin 83 CPUDataTypes SoftwareandHardwareSupport Becausecharactersareusuallyrepresentedinthe CPUasunsignedintegers,there slittleornoneedforspecialcharacter-processing instructions.Instructionsthatmoveandcopyunsignedintegersbehavethesamewhether thecontentbeingmanipulatedisanactualnumericvalueoranASCII-encodedcharacter. Similarly,anequalityorinequalitycomparisoninstructionthatworksforunsignedinte- gersalsoworksforvaluesrepresentingcharacters. Theresultsofnonequalitycomparisonsarelessstraightforward.Theassignmentof numericcodestocharactersfollowsaspecificordercalleda collatingsequence.Agreater- thancomparisonwithtwocharacterinputs(forexample, 'a'lessthan'z')returnsaresult basedonthenumericcomparisonofthecorrespondingASCIIcodes thatis,whetherthe numericcodefor 'a'islessthanthenumericcodefor'z'.Ifthecharactersethasanorder andthecodingmethodfollowstheorder,less-thanandgreater-thancomparisonsusually produceexpectedresults. However,usingnumericvaluestorepresentcharacterscanproducesomeunexpected orunplannedresults.Forexample,thecollatingsequenceoflettersandnumeralsinASCII followsthestandardalphabeticorderforlettersandnumericorderfornumerals,but uppercaseandlowercaselettersarerepresentedbydifferentcodes.Asaresult,anequality comparisonbetweenuppercaseandlowercaseversionsofthesameletterreturnsfalse becausethenumericcodesaren tidentical.Forexample,'a'doesntequal'A',asshown TABLE3.6 ASCIIcontrolcodes(continued) Decimalcode Controlcharacter Description 016 DLE Datalinkescape 017 DC1 Devicecontrol1 018 DC2 Devicecontrol2 019 DC3 Devicecontrol3 020 DC4 Devicecontrol4 021 NAK Negativeacknowledge 022 SYN Synchronousidle 023 ETB Endoftransmissionblock 024 CAN Cancel 025 EM Endofmedium 026 SUB Substitute 027 ESC Escape 028 FS Fileseparator 029 GS Groupseparator 030 RS Recordseparator 031 US Unitseparator 127 DEL Delete 84 Chapter3 previouslyinTable3.5.Punctuationsymbolsalsohaveaspecificorderinthecollating sequence,althoughthere snowidelyacceptedorderingforthem. ASCIILimitations ASCIIsdesignerscouldntforeseethecodeslonglifetime(almost50 years)ortherevolutionsinI/O devicetechnologiesthatwouldtakeplace.Theynever envisionedmodernI/O devicecharacteristics,suchascolor,bitmappedgraphics,and selectablefonts.Unfortunately,ASCIIdoesn thavetherangetodefineenoughcontrol codestoaccountforalltheformattinganddisplaycapabilitiesinmodernI/O devices. ASCIIisalsoanEnglish-basedcodingmethod.Thisisn tsurprising,giventhatwhenit wasdefined,theUnitedStatesaccountedformostcomputeruseandalmostallcomputer production.ASCIIhasaheavybiastowardWesternlanguagesingeneralandAmerican Englishinparticular,whichbecameamajorlimitationascomputeruseandproduction proliferatedworldwide. Recallthat7-bitASCIIhasonly128tableentries,33ofwhichareusedfordevicecontrol. Only95printablecharacterscanberepresented,whichareenoughforausablesubsetofthe characterscommonlyusedinAmericanEnglishtext.Thissubset,however,doesn tinclude anymodifiedLatincharacters,suchasçandá,orthosefrom otheralphabets,suchas . TheISO partiallyaddressedthisproblem bydefiningmanydifferent256-entrytables basedontheASCIImodel.One,called Latin-1,containstheASCII-7charactersinthe lower128tableentriesandmostofthecharactersusedbyWesternEuropeanlanguagesin theupper128tableentries.Theupper128entriesaresometimescalled multinational characters .Thenumberofavailablecharactercodesina256-entrytable,however,isstill muchtoosmalltorepresentthefullrangeofprintablecharactersinworldlanguages. Furthercomplicatingmattersisthatsomeprintedlanguagesaren tbasedoncharac- tersintheWesternsense.Chinese,Japanese,andKoreanwrittentextconsistsofideo- graphs,whicharepictorialrepresentationsofwordsorconcepts.Ideographsarecomposed ofgraphicalelements,sometimescalledstrokes,thatnumberinthethousands.Other writtenlanguages,suchasArabic,presentsimilar,althoughlesssevere,codingproblems. Unicode TheUnicodeConsortium ( www.unicode.org)wasfoundedin1991todevelopamultilin- gualcharacter-encodingstandardencompassingallwrittenlanguages.Theoriginalmem- berswereAppleComputerCorporationandXeroxCorporation,butmanycomputer companiessoonjoined.Thisefforthasenabledsoftwareanddatatocrossinternational boundaries.MajorUnicodestandardreleasesarecoordinatedwithISO standard10646. Asofthiswriting,thelateststandardisUnicode5.2,publishedinOctober2009. LikeASCII, Unicodeisacodingtablethatassignsnonnegativeintegerstorepresent printablecharacters.TheISO Latin-1standard,whichincludesASCII-7,isincorporated intoUnicodeasthefirst256tableentries.Therefore,ASCIIisasubsetofUnicode.An importantdifferencebetweenASCIIandUnicodeisthesizeofthecodingtable.Early versionsofUnicodeused16-bitcode,whichprovided65,536tableentriesnumbered 0through65,535.Asdevelopmenteffortsproceeded,thenumberofcharactersexceeded thecapacityofa16-bitcode.LaterUnicodeversionsuse16-bitor32-bitcodes,andthe standardcurrentlyencompassesmorethan100,000characters. Theadditionaltableentriesareusedprimarilyforcharacters,strokes,andideographs oflanguagesotherthanEnglishanditsWesternEuropeansiblings.Unicodeincludesmany otheralphabets,suchasArabic,Cyrillic,andHebrew,andthousandsofChinese, 85 CPUDataTypes Japanese,andKoreanideographsandcharacters.SomeextensionstoASCIIdevicecontrol codesarealsoprovided.Ascurrentlydefined,Unicodecanrepresentwrittentextfrom all modernlanguages.Approximately6000charactersarereservedforprivateuse. Unicodeiswidelysupportedinmodernsoftware,includingmostOSsandword- processingapplications.BecausemostCP Ussupportcharactersencodedin32-bit unsignedintegers,there snoneedtoupgradeprocessorhardwaretosupportUnicode. ThemainimpactofUnicodeonhardwareisinstorageandI/O devices.Becausethe numericcode ssizeisquadruplethatofASCII,puretextfilesarefourtimesaslarge. Thisincreasedsizeseemstobeaproblem atfirstglance,buttheimpactisreduced withcustom word-processingfileformatsandtheever-decliningcostofsecondary storage.Inaddition,mostdocumentsaren tstoredasASCIIorUnicodetextfiles. Instead,they restoredinaformatthatintermixestextandformattingcommands. Becauseformattingcommandsalsooccupyfilespace,thefilesizeincreasecausedby switchingfrom ASCIItoUnicodeisgenera llylessthanimpliedbyquadruplingper- characterstorage. BeforeUnicode,devicesdesignedforcharacterI/O usedASCIIbydefaultandvendor- specificmethodsorolderISO standardstoprocesscharactersetsotherthanLatin-1.The typicalmethodwastomaintainaninternalsetofalternativecodingtables,witheachtable containingadifferentalphabetorcharacterset.Device-specificcommandsswitchedfrom onetabletoanother.Unicodeunifiesandstandardizesthecontentofthesetablestopro- videastandardizedmethodforprocessinginternationalcharactersets.Thisstandardhas beenwidelyadoptedbyI/O devicemanufacturers.Inaddition,backwardcompatibility withASCIIisensuredbecauseUnicodeincludesASCII. Boolean Data The Booleandatatypehasonlytwodatavalues trueandfalse.Mostpeopledontthink ofBooleanvaluesasdataitems,buttheprimitivenatureofCPUprocessingrequiresthe capabilitytostoreandmanipulateBooleanvalues.Recallthatprocessingcircuitryphysi- callytransformsinputsignalsintooutputsignals.Iftheinputsignalsrepresentnumbers andtheprocessorisperformingacomputationalfunction,suchasaddition,theoutput signalrepresentsthenumericalresult. Whentheprocessingfunctionisacomparisonoperation,suchasgreaterthanorequal to,theoutputsignalrepresentsaBooleanresultoftrueorfalse.Thisresultisstoredina register(asisanyotherprocessingresult)andcanbeusedbyotherinstructionsasinput (forexample,aconditionaloranunconditionalbranchinaprogram).TheBooleandata typeispotentiallythemostconcisecodingformatbecauseonlyasinglebitisrequired. Forexample,binary1canrepresenttrue,and0canrepresentfalse.Tosimplifyprocessor designandimplementation,mostCPUdesignersseektominimizethenumberofdifferent codingformatsused.CPUsgenerallyuseanintegercodingformatforintegerstorepresent Booleanvalues.Whencodedinthismanner,theintegervaluezerocorrespondstofalse, andallnonzerovaluescorrespondtotrue. Toconservememoryandstorage,sometimesprogrammers pack manyBoolean valuesintoasingleintegerbyusingeachbittorepresentaseparateBooleanvalue. Althoughthismethodconservesmemory,generallyitslowsprogram executionbecause ofthecomplicatedinstructionsrequiredtoextractandinterpretseparatebitsfrom an integerdataitem. 86 Chapter3 MemoryAddresses AsdescribedinChapter2,primarystorageisaseriesofcontiguousbytesofstorage.Con- ceptually,eachmemorybytehasauniqueidentifyingorserialnumber.Theidentifying numbersarecalledaddresses,andtheirvaluesstartwithzeroandcontinueinsequence (withnogaps)untilthelastbyteofmemoryisaddressed. InsomeCPUs,thisconceptualviewisalsothephysicalmethodofidentifyingand accessingmemorylocations.Thatis,memorybytesareidentifiedbyaseriesofnonnega- tiveintegers.Thisapproachtoassigningmemoryaddressesiscalleda flatmemorymodel. InCPUsusingaflatmemorymodel,usingtwoscomplementorunsignedbinaryasthe codingformatformemoryaddressesislogicalandtypical.Theadvantageofthisapproach isthatitminimizesthenumberofdifferentdatatypesandthecomplexityofprocessor circuitry. TomaintainbackwardcompatibilitywithearliergenerationsofCPUsthatdon tusea flatmemorymodel,someCPUdesignerschoosenottousesimpleintegersasmemory addresses.Forexample,IntelCoreCPUsmaintainbackwardcompatibilitywiththe8086 CPUusedinthefirstIBM PC.Earliergenerationsofprocessorstypicallyusedadifferent approachtomemoryaddressingcalleda segmentedmemorymodel,inwhichprimary storageisdividedintoequal-sized(forexample,64KB)segmentscalledpages.Pagesare identifiedbysequentialnonnegativeintegers.Eachbyteinapageisalsoidentifiedbya sequentiallyassignednonnegativeinteger.Eachbyteofmemoryhasatwo-partaddress: Thefirstpartidentifiesthepage,andthesecondpartidentifiesthebyteinthepage. Becauseasegmentedmemoryaddresscontainstwointegervalues,thedata-coding methodforsinglesignedorunsignedintegerscan tbeused.Instead,aspecificcodingfor- matformemoryaddressesmustbedefinedandused.Forthisreason,CPUswithseg- mentedmemorymodelshaveanextradatatype,orcodingformat,forstoringmemory addresses. T E C H N O L O G Y F O C U S IntelMemoryAddressFormats IntelmicroprocessorshavebeenusedinPCssince1981.TheoriginalIBM PCusedan Intel8086microprocessor.AlllatergenerationsofIntelprocessors(the80x86,Pentium, andCoreprocessorfamilies)havemaintainedbackwardcompatibilitywiththe8086and canexecuteCPUinstructionsdesignedforthe8086.Backwardcompatibilityhasbeen consideredanessentialingredientinthesuccessofPCsbasedonIntelCPUs. EarlyIntelmicroprocessorsweredesignedandimplementedwithspeedofprocessor operationasaprimarygoal.Becauseageneral-purposeCPUusesmemoryaddressesin nearlyeveryinstructionexecutioncycle,makingmemoryaccessasefficientaspossible isimportant.Processorcomplexityriseswiththenumberofbitsthatmustbeprocessed simultaneously.Largecodingformatsformemoryaddressesandotherdatatypesrequire complicatedandslowprocessingcircuitry.Inteldesignerswantedtominimizethecom- plexityofprocessorcircuitryassociatedwithmemoryaddresses.Theyalsoneededto balancethedesireforefficientmemoryprocessingwiththeneedtomaintainalarge ( continued) 87 CPUDataTypes rangeofpossiblememoryaddresses.Toachievethisbalance,theymadetwomajor designdecisionsforthe8086: A20-bitsizeformemoryaddresses Asegmentedmemorymodel The20-bitaddressformatlimitedusable(addressable)memoryto1MB(220 addressablebytes).Thislimitationdidn tseem likeamajoronebecausefewcomputers (includingmostmainframes)hadthatmuchmemoryatthetime.The20-bitformatis dividedintotwoparts:a4-bitsegmentidentifieranda16-bitsegmentoffset.The16-bit segmentoffsetidentifiesaspecificbyteina64(216)KBmemorysegment.Because4bits areusedtorepresentthesegment,thereare16,or24,possiblesegments. Inteldesignersanticipatedthatmostprogramswouldfitinasingle64KBmemory segment.Further,theyknewthat16-bitmemoryaddressescouldbeprocessedmore efficientlythanlargeraddresses.Therefore,theydefinedtwotypesofaddress-processing functions thoseusingthe4-bitsegmentportionoftheaddressandthoseignoringit. Whenthesegmentidentifierwasused,memoryaccesswasslow,butitwasmuchfaster whentheprocessorignoredthesegmentidentifier.Inteldesignersassumedthatmost memoryaccesseswouldn trequirethesegmentidentifier. Boththe64KBsegmentsizeandthe1MBmemorylimitsoonbecamesignificant constraints.AlthoughearlyprogramsfortheIBM PCweregenerallysmallerthan64KB, laterversionsoftheseprogramswerelarger.Becausetheprocessordesignimposeda performancepenaltywhenthesegmentidentifierwasused,laterversionsranmore slowlythanearlierversionsdid.Inaddition,boththeOSandthecomputerhardware werebecomingmorecomplexandconsumingmorememoryresources.Asaresult, computerdesigners,OSdevelopers,anduserschafedunderthe1MBmemorylimit. Inteldesignersfirstaddressedtheseconstraintsinthe80286byincreasingtheseg- mentidentifierto8bits,thusincreasingtotaladdressablememoryto16MB.The80386 wentastepfurtherbyprovidingtwomethodsofaddressing.Thefirst,calledrealmode, wascompatiblewiththe8086 saddressingmethod.Thesecond,calledprotectedmode, wasanewmethodbasedon32-bitmemoryaddresses.Protected-modeaddressingelimi- natedtheperformancepenaltyforprogramslargerthan64KB.Pentium andCore microprocessorscontinuetousebothaddressingmethods. The32-bitprotected-modeaddressesareadequateforphysicalmemoryupto4GB. Bythelate1990sand2000s,largercomputerclasseswerereachingthislimit,andPCs andworkstationswereexpectedtofollowinadecade.Intelexpandedprotected-mode memoryaddressesto64bitsbeginninginthemid-2000s.Thechangewasfarlessdis- ruptivethanearlierchangesbecauseaddressesofbothsizesusetheflatmemorymodel. Nonetheless,usershaveencounteredsomebumpsintheroadassoftwaremanufacturers shiftedfrom 32-bitto64-bitaddressing. Thisdiscussion,andtheearlierdiscussionofASCII,illustratesafewpitfallsfor computerdesignersinchoosingdatarepresentationmethodsandformats.Afundamental characteristicofanydatarepresentationmethodisthatitinvolvesbalancingseveral variables.ForIntelmemoryaddresses,designershadtobalanceprocessorcost,speedof program execution,andmemoryrequirementsoftypicalPCsoftware.Thebalancemight havebeenoptimal,oratleastreasonable,giventhestateofthesevariablesin1981,but neithertimenortechnologystandsstill.TheoutcomeofanyCPUdesigndecision, includingdatarepresentationformats,isaffectedquicklyastechnologychanges. AttemptstomaintainbackwardcompatibilitywithpreviousCPUdesignscanbedifficult andexpensiveandmightseverelylimitfuturedesignchoices. 88 Chapter3 DATA STRUCTURES TheprevioussectionsoutlinedthedatatypesthataCPUcanmanipulatedirectly:integer, realnumber,character,Boolean,andmemoryaddress.ThedatatypesaCPUsupportsare sometimescalled primitivedatatypesormachinedatatypes.Computerscanalsoprocess morecomplexdata,suchasstrings,arrays,textfiles,databases,anddigitalrepresenta- tionsofaudioandimagedata,suchasMP3,JPEG,andMPEG files.Chapter7discusses audioandimagedatarepresentationsandrelatedhardwaredevices.Theremainderofthis chapterconcentratesonothercomplexdataformatscommonlymanipulatedbysystem andapplicationsoftware. Ifonlyprimitivedatatypeswereavailable,developingprogramsofanykindwouldbe difficult.Mostapplicationprogramsneedtocombineprimitivedataitemstoform useful aggregations.Acommonexampleisacharacterortextstring.Mostapplicationprograms defineandusecharacterstrings,butfew CPUsprovideinstructionsthatmanipulatethem directly.Applicationdevelopmentissimplifiedifcharacterstringscanbedefinedand manipulated(thatis,read,written,andcompared)asasingleunitinsteadofonecharac- teratatime. A datastructureisarelatedgroupofprimitivedataelementsorganizedforsometype ofcommonprocessingandisdefinedandmanipulatedinsoftware.Computerhardware can tmanipulatedatastructuresdirectly;itmustdealwiththem intermsoftheirprimi- tivecomponents,suchasintegers,floating-pointnumbers,singlecharacters,andsoon. Softwaremusttranslateoperationsondatastructuresintoequivalentmachineinstruc- tionsthatoperateoneachprimitivedataelement.Forexample,Figure3.6showsacom- parisonoftwostringsdecomposedintocomparisonoperationsoneachcharacter. 89 DataStructures Thecomplexityofdatastructuresislimitedonlybyprogrammersimaginationand skill.Asapracticalmatter,certaindatastructuresareusefulinawidevarietyofsitua- tionsandareusedcommonly,suchascharacterstringsorarrays,records,andfiles.Sys- tem softwareoftenprovidesapplicationservicestomanipulatethesecommonlyuseddata structures.Forexample,anOSnormallyprovidesservicesforreadingfrom andwritingto files.

Otherdatastructuresarelesscommonlysupportedbysystem software.Examples includenumericarrays,indexedfiles,andcomplexdatabasestructures.Indexedfilesare supportedinsome,butnotall,OSs.Numericarraysaresupportedinmostprogramming languagesbutnotinmostOSs.Databasestructuresarenormallysupportedbyadatabase managementsystem.Mostprogramminglanguagessupportdirectmanipulationofcharac- terstrings. Datastructureshaveanimportantroleinsystem softwaredevelopment.Forexample, OSscommonlyuselinkedliststokeeptrackofmemoryblocksallocatedtoprograms anddiskblocksallocatedtofilesanddirectories.Indexesarewidelyusedindatabase FIGURE3.6 Softwaredecomposesoperationsondatastructuresintooperationsonprimitivedata elements CourtesyofCourseTechnology/CengageLearning 90 Chapter3 managementsystemstospeedsearchandretrievaloperations.Programmerswhodevelop system softwaregenerallytakeanentirecourseindatastructuresandefficientmethods formanipulatingthem. Pointersand Addresses Whetherimplementedinsystem orapplicationsoftware,almostalldatastructuresmake extensiveuseofpointersandaddresses.A pointerisadataelementcontainingthe addressofanotherdataelement.An addressisthelocationofadataelementinastorage device.Addressesvaryincontentandrepresentation,dependingonthestoragedevice beingaddressed.Secondarystoragedevicesarenormallyorganizedasasequenceofdata blocks.Ablockisagroupofbytesreadorwrittenasaunit.Forexample,diskdrivesusu- allyreadandwritedatain512-byteblocks.Forblock-orientedstoragedevices,anaddress canberepresentedasanintegercontainingtheblock ssequentialposition.Integerscan alsobeusedtorepresenttheaddressofasinglebyteinablock. AsdiscussedinthepreviousTechnologyFocus,memoryaddressescanbecomplexif asegmentedmemorymodelisused.Forthepurposeofdiscussingdatastructures,aflat memorymodelisused,andmemoryaddressesarerepresentedbynonnegativeintegers. Arraysand Lists Manytypesofdatacanbegroupedintolists.Alistisasetofrelateddatavalues.Inmath- ematics,alistisconsideredunordered,sonospecificlistelementisdesignatedasfirst, second,last,andsoon.Whenwritingsoftware,aprogrammerusuallypreferstoimpose someorderingonthelist.Forexample,alistofdaysoftheweekcanbeorderedsequen- tially,startingwithMonday. An arrayisanorderedlistinwhicheachelementcanbereferencedbyanindextoits position.Figure3.7showsanexampleofanarrayforthefirstfivelettersoftheEnglish alphabet.Notethattheindexvaluesarenumberedstartingat0,acommon(althoughnot universal)practiceinprogramming.AlthoughtheindexvaluesareshowninFigure3.7, theyaren tstored.Instead,theyreinferredfrom thedatavalueslocationinthestorage allocatedtothearray.Inahigh-levelprogramminglanguage,arrayelementsarenormally referencedbythearraynameandtheindexvalue.Forexample,thethirdletterofthe alphabetstoredinanarraymightbereferencedasfollows: alphabet 2 Inthisexample, alphabet isthearrayname,and2istheindexvalue(withnumber- ingstartingfrom 0). 91 DataStructures Figure3.8showsacharacterarray,orstring,storedsequentiallyincontiguousmem- orylocations.Inthisexample,eachcharacterofthenameJohnDoeisstoredinasingle byteofmemory,andthecharactersareorderedinsequentialbytelocationsstartingat byte1000.Anequivalentorganizationcanbeusedtostorethenameonasecondarystor- agedevice.Theaddressofanarrayelementcanbecalculatedwiththestartingaddressof thearrayandtheelement sindex.Forexample,ifyouwanttoretrievethethirdcharacter inthearray,youcancomputeitsaddressasthesum ofthefirstelement saddressplusthe indexvalue,assumingindexvaluesstartat0. Usingcontiguousstoragelocations,especiallyinsecondarystoragedevices,compli- catestheallocationofstoragelocations.Forexample,addingnewelementstotheendof anarraymightbedifficultifthesestoragelocationsarealreadyallocatedtootherdata items.Forthisreason,contiguousstorageallocationisgenerallyusedonlyforfixed-length arrays.

Foravarietyofreasons,youmightneedtostorearrayelementsinwidelydispersed storagelocations.A linkedlistisadatastructurethatusespointerssothatlistelements canbescatteredamongnonsequentialstoragelocations.Figure3.9showsagenericexam- pleofa singlylinkedlist.Eachlistelementoccupiestwostoragelocations:Thefirstholds thelistelement sdatavalue,andthesecondholdstheaddressofthenextlistelementasa pointer. FIGURE3.7 Arrayelementsincontiguousstoragelocations CourtesyofCourseTechnology/CengageLearning FIGURE3.8 Acharacterarrayincontiguousstoragelocations CourtesyofCourseTechnology/CengageLearning 92 Chapter3 Figure3.10showsacharacterstringstoredinalinkedlist.Notethatcharactersare scatteredamongnonsequential,ornoncontiguous,storagelocations. Morestoragelocationsarerequiredforalinkedlistthanforanarraywithequivalent contentbecausebothdataandpointersmustbestored.Usingpointersalsocomplicates thetaskoflocatingarrayelements.Referencestospecificarrayelementsmustberesolved byfollowingthechainofpointers,startingwiththefirstarrayelement.Thismethodcan beinefficientifthearraycontainsalargenumberofelements.Forexample,accessingthe 1000thelementmeansreadingandfollowingpointersinthefirst999elements. FIGURE3.9 Valueandpointerfieldsofasinglylinkedlist CourtesyofCourseTechnology/CengageLearning FIGURE3.10 Acharacterarraystoredinnoncontiguousmemorylocations,withpointers connectingthearrayelements CourtesyofCourseTechnology/CengageLearning 93 DataStructures Linkedlistsareeasiertoexpandorshrinkthanarraysare.Theprocedureforaddinga newelementisasfollows(seeFigure3.11): 1. Allocatestorageforthenew element. 2. Copythepointerfrom theelementprecedingthenew elementtothepointer fieldofthenew element. 3. Writetheaddressofthenew elementintheprecedingelement spointer field. Incontrast,insertinganelementinaliststoredincontiguousmemorycanbetime consuming.Theprocedureisasfollows(seeFigure3.12): 1. Allocateanew storagelocationattheendofthelist. 2. Foreachelementpasttheinsertionpoint,copyitsvaluetothenextstorage location,startingwiththelastelementandworkingbackwardtotheinsertion point. 3. Writethenew element svalueinthestoragelocationattheinsertionpoint. Insertinganelementnearthebeginningofthearrayishighlyinefficientbecauseof themanyrequiredcopyoperations. FIGURE3.11 Insertinganewelementinasinglylinkedlist CourtesyofCourseTechnology/CengageLearning 94 Chapter3 Figure3.13depictsamorecomplicatedlinkedlistcalledadoublylinkedlist.Each elementofadoublylinkedlisthastwopointers:onepointingtothenextelementinthe listandonepointingtothepreviouselementinthelist.Themainadvantageofdoubly linkedlistsisthattheycanbetraversedineitherdirectionwithequalefficiency.Thedis- advantagesarethatmorepointersmustbeupdatedeachtimeanelementisinsertedinto ordeletedfrom thelist,andmorestoragelocationsarerequiredtoholdtheextrapointers. FIGURE3.12 Insertinganewelementinanarraystoredincontiguousmemorylocations CourtesyofCourseTechnology/CengageLearning FIGURE3.13 Adoublylinkedlist CourtesyofCourseTechnology/CengageLearning 95 DataStructures Recordsand Files Arecordisadatastructurecomposedofotherdatastructuresorprimitivedataelements. Recordsarecommonlyusedasaunitofinputandoutputtoandfrom filesordatabases. Forexample,thefollowingdataitemsmightbethecontentsofthedatastructurefora customerrecord(seeFigure3.14): Account-Number Street-Address Last-Name City First-Name State Middle-Initial Zip-Code Eachcomponentoftherecordisabasicdataelement(forexample,Middle-Initial)or anotherdatastructure(forexample,acharacterarrayforStreet-Address).Tospeedinput andoutput,recordsareusuallystoredincontiguousstoragelocations,whichrestrictsthe record sarraycomponentstoafixedlength. Asequenceofrecordsonsecondarystorageiscalleda file.Asequenceofrecords storedinmainmemoryisnormallycalledatable,althoughitsstructureisessentiallythe sameasafile.Filescanbeorganizedinmanydifferentways,themostcommonbeing sequentialandindexed. Inasequentialfile,recordsarestoredincontiguousstoragelocations.Aswitharrays storedincontiguousstorage,accessingaspecificrecordissimple.Theaddressofthe nth recordinafilecanbecomputedasfollows: address-of-first-record n-1 record-size Ifthefirstbyteofthefirstrecordisataddress1andtherecordsizeis200bytes,the addressofthefourthrecordis601. Sequentialfilessufferthesameproblemsascontiguousarrayswhenrecordsarebeing insertedanddeleted.AcopyproceduresimilartotheoneinFigure3.12mustbeusedto addarecordtoafile.Theprocedureisevenlessefficientforfilesthanforarraysbecause ofthelargesizeoftherecordsthatmustbecopied. Onemethodofsolvingthisproblem istouselinkedlists.Withfiles,thedataelements ofalinkedlistareentirerecordsinsteadofbasicdataelements.Themethodsforsearch- ing,insertingrecords,anddeletingrecordsareessentiallythesameasdescribed previously. Anothermethodoforganizingfilesusesan index,anarrayofpointerstorecords.The pointerscanbeorderedinanysequenceyouneed.Forexample,afileofcustomerrecords canbeorderedbyascendingaccountnumber,asshowninFigure3.15. FIGURE3.14 Arecorddatastructure CourtesyofCourseTechnology/CengageLearning 96 Chapter3 Theadvantageofusinganindexliesintheefficiencyofrecordinsertion,deletion,and retrieval.Whenarecordisaddedtoafile,storageisallocatedfortherecord,anddatais placedinthestoragelocation.Theindexisthenupdatedbyinsertingthenewrecord s address.Anindexupdatefollowsthesameprocedureasanarrayupdate.Becausethe arraycontainsonlypointers,it ssmallandfasttoupdate. Classesand Objects Uptothispoint,dataandprogramshavebeendiscussedastwofundamentallydifferent things.Programscontaininstructions,whichtransform datainputsintodataoutputswhen executed.Dataitemsareheldinstoragedevices,movedtoCPUregisterswhenneeded, transformedintodataoutputsbyexecutinginstructions,andsenttooutputorstorage devicesagain.Inthisviewofcomputersystemsandsoftwarebehavior,dataitemsare mainlystatic,andprogramsaretheactivemeansfortransformingandupdatingdata items.

Inthe1980sand1990s,computerresearchersdevelopedadifferentviewofcomputer andsoftwarebehavior:combiningprogram instructionsanddataintoasingledatastruc- ture.A classisadatastructurecontainingbothtraditional(static)dataelementsandpro- gramsthatmanipulatethedata.Theprogramsinaclassarecalled methods.Aclass combinesrelateddataitemsinmuchthesamewayarecorddoes,butitextendsthe recordtoincludemethodsformanipulatingdataitems. ThinkofthecustomerrecordinFigure3.14asastartingpointfordefiningaCus- tomerclass.Therecordcontainssomeprimitivedataitemsanddatastructuresdescribing featuresofacustomer.(Others,suchasaccountbalance,couldbeaddedtocreateamore completerepresentationofacustomer.) ToturnthecustomerrecordintoaCustomerclass,methodsformanipulatingor modifyingdataelementsintherecordmustbeadded.Forexample,youcanaddasimple FIGURE3.15 Anindexedfile CourtesyofCourseTechnology/CengageLearning 97 DataStructures methodforupdatingadataelementscontentcalledUpdate-First-Name.Youcancheckfor legalvalues,suchasmakingsurethevaluesuppliedforZip-CodeisavalidU.S.zipcodein thecustomer sstate.Functions,suchasPrint-Mailing-Label,ormethodsformodifyingthe customer saccountbalancetoreflecttransactions,suchasApply-Payment,canbeadded. Figure3.16showssomepossibledataelementsandmethodsfortheCustomerclass. An objectisoneinstance,orvariable,ofaclass.Eachpersonwhosacustomeris representedbyonevariableorobjectoftheCustomerclass.Eachobjectcanbestoredin astoragedevice,andeachobject sdataelementsoccupyaseparatepartofthestorage device.

Viewedasjustanotherdatastructure,anobjectdifferslittlefrom arecord.Muchas thedataelementStreet-Addresscanberepresentedwithacharacterarray,amethodcan berepresentedasanarrayofCPUinstructions.Methodscanberepresentedinotherways, includinglinkedlistsofinstructionsorpointerstofilescontainingsequentialsetsof instructions.Inessence,theonlynewprimitivedatatypeneededtorepresentamethodin anobjectisaninstruction.AsyouseeinChapter4,instructions,likedata,havespecific representationandstorageformats. FIGURE3.16 ACustomerclasscontainingtraditionaldataelements(darkerboxes)andmethods (lighterboxes) CourtesyofCourseTechnology/CengageLearning 98 Chapter3 Sum m ary Datacanberepresentedinmanyways.Tobeprocessedbyanydevice,datamustbe convertedfrom itsnativeformatintoaform suitablefortheprocessingdevice.Modern computersrepresentdataaselectronicsignalsandimplementprocessingdevicesbyusing electroniccircuitry.Electronicprocessingdevicesexploitthephysicallawsofelectricity. Becausethelawsofelectricitycanbestatedasmathematicalequations,electronicdevices canperform thecomputationalfunctionsembeddedintheseequations.Otherwaysof implementingprocessors,suchasmechanicsandoptics,usesimilarmathematicallystated physicallaws. Alldata,includingnonnumericdata,arerepresentedinmoderncomputersasstringsof binarydigits,orbits.Bitsareusedbecause0and1canbeencodedinclearlyrecognizable electricalstates,suchashighandlowvoltage.Binaryelectricalstatescanbeprocessed andstoredbyreliable,cheapelectricaldevices.Ahandfulofprimitivedatatypesarerepre- sentedandprocessedbyaCPU,includingintegers,realnumbers,characters,Boolean values,andmemoryaddresses. Numericvaluesotherthan0and1arerepresentedbycombiningmultiplebitsintolarger groups,calledbitstrings,muchasthedecimaldigits2,5,and9canbecombinedtoform thelargervalue592.Eachbitstringhasaspecificdataformatandcodingmethod.There aremanydifferentdataformatsandcodingmethods.CPUdesignerschooseformatsand methodsthatrepresentthebestbalanceamongcompactness,range,easeofmanipulation, accuracy,andstandardization.Theoptimalbalancecanvary,dependingonthetypeof dataanditsintendeduses. Integershavenofractionalcomponentandcanusesimpledataformats.Twoscomplement isthemostcommonintegerformat,althoughexcessnotationisusedsometimes.Real numbershavebothwholeandfractionalcomponents.Theircomplexstructurerequiresa complexdataformatcalledfloating-pointnotationthatrepresentsanumericvalueasa mantissamultipliedbyapositiveornegativepowerof2.Avaluecanhavemanydigitsof precisioninfloating-pointnotationforverylargeorverysmallmagnitudesbutnotbothlarge andsmallmagnitudesatthesametime.Floating-pointformatsusedinmoderncomputers typicallyfollowIEEEstandards,whichincludebinaryanddecimalrepresentationsranging from 32to128bitsinlength.Floating-pointformatsarelessaccurateandmoredifficultto processthantwoscomplementformat. Characterdataisn tinherentlynumeric.Charactersareconvertedtonumbersbymeansof acodingtable,inwhicheachcharacteroccupiesasequentialposition.Acharacteris representedbyconvertingittoitsintegertableposition.Charactersareextractedfrom inte- gersbythereverseprocess.Manydifferenttablesarepossible.Mostcomputersusethe ASCIIorUnicodecodingtables.ASCIIisanolderstandardgearedtowardtheEnglishlan- guage.Unicodeisamorerecentstandardwithacharactersetlargeenoughtoencompass ASCIIandallwrittenlanguages. ABooleandatavaluemustbetrueorfalse.Memoryaddressescanbesimpleorcomplex numericvalues,dependingonwhethertheCPUusesaflatorsegmentedmemorymodel. Flatmemoryaddressescanberepresentedasasingleinteger.Segmentedmemory 99 Summary addressesrequiremultipleintegers.ManyCPUsalsoprovidedouble-precisionnumeric datatypes,whichdoublethenumberofbitsusedtostoreavalue. Programsoftendefineandmanipulatedatainlargerandmorecomplexunitsthanprimitive CPUdatatypes.Adatastructureisarelatedgroupofprimitivedataelementsorganizedfor sometypeofcommonprocessingandisdefinedinsoftware.ToenableaCPUtomanipu- lateadatastructure,softwaredecomposesoperationsonthedatastructureintoequivalent operationsonitsprimitivecomponents.Commonlyuseddatastructuresincludearrays, linkedlists,records,tables,files,indexes,andobjects.Manydatastructuresusepointers, whicharestoredmemoryaddresses,tolinkprimitivedatacomponents. Inthischapter,you velearnedthedifferentwaysdataisrepresentedinmoderncomputers, focusingparticularlyontherelationshipbetweendatarepresentationandtheCPU.Thischapter andtheprecedingchaptershaveintroducedbasicconceptsandterminologyofsystemsarchi- tecture.Theunderstandingyou vegainedinthefirstthreechapterslaysthefoundationforthe moredetaileddiscussionofthehardwareimplementationsofdataprocessing,storage,and communicationinupcomingchapters. Key Term s address AmericanStandardCodeforInformation Interchange(ASCII) array base binarynumber bit bitstring Booleandatatype Booleanlogic byte character class collatingsequence datastructure decimalpoint double-precision doublylinkedlist excessnotation ExtendedBinaryCodedDecimalInterchange Code(EBCDIC) file flatmemorymodel floating-pointnotation hexadecimalnotation high-orderbit index integer InternationalAlphabet5(IA5) InternationalOrganization forStandardization(ISO) Latin-1 leastsignificantdigit linkedlist longinteger low-orderbit machinedatatype manipulation method mostsignificantdigit multinationalcharacter numericrange object octalnotation overflow pointer primitivedatatype radix 100 Chapter3 radixpoint realnumber record segmentedmemorymodel signedinteger singlylinkedlist string truncation twoscomplementnotation underflow Unicode unsignedinteger Vocabulary Exercises 1.Anelementina(n) containspointerstoboththenextandpreviouslistelements. 2. notationencodesarealnumberasamantissamultipliedbyapower(exponent) of2. 3.A(n) isanintegerstoredindoublethenormalnumberofbitpositions. 4.Increasinganumericrepresentationformat ssize(numberofbits)increasesthe ofvaluesthatcanberepresented. 5.Assembly(machine)languageprogramsformostcomputersuse notationto representmemoryaddressvalues. 6.A(n) isadataitem composedofmultipleprimitivedataitems. 7.InolderIBM mainframecomputers,characterswereencodedaccordingtothe codingscheme. 8.A(n) istheaddressofanotherdataitem orstructure. 9.Inapositionalnumberingsystem,the separatesdigitsrepresentingwholenumber quantitiesfrom digitsrepresentingfractionalquantities. 10.A(n) isanarrayofcharacters. 11.MostIntelCPUsusethe ,inwhicheachmemoryaddressisrepresentedbytwo integers. 12.Asetofdataitemsthatcanbeaccessedinaspecifiedorderbyusingpointersiscalled a(n) . 13.A(n) contains8 . 14.A(n) liststoresonepointerwitheachlistelement. 15.Theresultofadding,subtracting,ormultiplyingtwointegersmightresultinoverflowbut never or . 16.A(n) isasequenceofprimitivedataelementsstoredinsequentialstorage locations. 17.A(n) isadatastructurecomposedofotherdatastructuresorprimitivedataele- ments,commonlyusedasaunitofinputandoutputtoandfrom filesordatabases. 18.A(n) dataitem cancontainonlythevaluestrueorfalse. 19.A(n) isanarrayofdataitems,eachofwhichcontainsakeyvalueandapointer toanotherdataitem. 101 VocabularyExercises 20.Manycomputersimplement numericdatatypestoincreaseaccuracyandprevent overflowandunderflow. 21.UnlikeASCIIandEBCDIC, isa16-bitor32-bitcharactercodingtable. 22.The isthebitoflowestmagnitudeinabyteorbitstring. 23. occurswhentheresultofanarithmeticoperationexceedsthenumberofbits availabletostoreit. 24.InaCPU, arithmeticgenerallyiseasiertoimplementthan arithmetic becauseofasimplerdatacodingschemeanddatamanipulationcircuitry. 25.Inthe ,memoryaddressesconsistofasingleinteger. 26.The hasdefinedacharacter-codingtablecalled ,whichcombinesthe ASCII-7codingtablewithanadditional128WesternEuropeanmultinationalcharacters. 27.Datarepresentedin istransmittedaccuratelybetweencomputerequipmentfrom differentmanufacturersifeachcomputer sCPUrepresentsrealnumbersbyusinganIEEE standardnotation. 28.Theorderingofcharactersinacodingtableiscalleda(n) . 29.A(n) isadatastructurecontainingbothstaticdataandmethods. 30.A(n) isoneinstanceorvariableofaclass. Review Questions 1.Whatarethebinary,octal,andhexadecimalrepresentationsofthedecimalnumber10? 2.Whyisbinarydatarepresentationandsignalingthepreferredmethodofcomputerhard- wareimplementation? 3.Whatisexcessnotation?Whatistwoscomplementnotation?Whyaretheyneeded?In otherwords,whycan tintegervaluesberepresentedbyordinarybinarynumbers? 4.Whatisthenumericrangeofa16-bittwoscomplementvalue?A16-bitexcessnotation value?A16-bitunsignedbinaryvalue? 5.Whatisoverflow?Whatisunderflow?Howcantheprobabilityoftheiroccurrencebe minimized? 6.Whyarerealnumbersmoredifficulttorepresentandprocessthanintegers? 7.Whymightaprogrammerchoosetorepresentadataitem inIEEEbinary128floating-point formatinsteadofIEEEbinary64floating-pointformat?Whatadditionalcostsmightbe incurredatruntime(whentheapplicationprogram executes)asaresultofusingthe128-bit insteadofthe64-bitformat? 8.Whydoesn taCPUevaluatetheexpression'A'='a'astrue? 9.WhatarethedifferencesbetweenASCIIandUnicode? 10.WhatprimitivedatatypescannormallyberepresentedandprocessedbyaCPU? 11.Whatisadatastructure?Listseveraltypesofcommondatastructures. 12.Whatisanaddress?Whatisapointer?Whatpurposearetheyusedfor? 102 Chapter3 13.Howisanarraystoredinmainmemory?Howisalinkedliststoredinmainmemory?What aretheircomparativeadvantagesanddisadvantages?Giveexamplesofdatathatwouldbe beststoredasanarrayandasalinkedlist. 14.Howdoesaclassdifferfrom otherdatastructures? Problem s and Exercises 1.Developanalgorithm orprogram toimplementthefollowingfunction: insert_in_linked_list(element,after_pointer) Theelementparameterisadataitemtobeaddedtoalinkedlist,andtheafter_pointer parameteristheaddressoftheelementafterwhichthenewelementwillbeinserted. 2.Developanalgorithm orprogram toimplementthefollowingfunction: insert_in_array(element,position) Theelementparameterisadataitem tobeaddedtothearray,andtheposition parameteristhearrayindexatwhichthenewelementwillbeinserted.Makesureto accountforelementsthatmustbemovedovertomakeroom forthenewelement. 3.Considerthefollowingbinaryvalue: 10000000001001100000011011011001 Whatnumber(base10)isrepresentedifthevalueisassumedtorepresentanumber storedintwoscomplementnotation?Excessnotation?IEEEbinary32floating-point notation? 4.Howarethevalues51510and-51510representedasordinarybinarynumbers?Howare theyrepresentedasoctalandhexadecimalnumbers?Howaretheyrepresentedin16-bit excessnotationandin16-bittwoscomplementnotation? 5.Howisthebinaryvalue101101 2-101101representedinIEEEbinary32floating-point notation? 6.Convertthefollowingvaluesrepresentedinbase12totheirequivalentrepresentationsin base2,base5,andbase10: 1A7812 -90B212 Research Problem s 1.Chooseacommonlyusedmicroprocessor,suchastheIntelCore(www.intel.com)orIBM POWER6(www.ibm.com).Whatdatatypesaresupported?Howmanybitsareusedto storeeachdatatype?Howiseachdatatyperepresentedinternally? 2.Mostpersonalandofficeproductivityapplications,suchasword-processingapplications, aremarketedinternationally.Tominimizedevelopmentcosts,softwareproducersdevelopa singleversionoftheprogram buthaveseparateconfigurationsforitemssuchasmenus anderrormessagesthatvaryfrom languagetolanguage.Investigateacommonlyused developmenttoolforapplications(calledan integrateddevelopmentenvironment),such 103 ResearchProblems asMicrosoftVisualStudio(www.microsoft.com).Whattoolsandtechniques(forexample, Unicodedatatypesandstringtables)aresupportedfordevelopingmultilingualprograms? 3.Object-orientedprogramminghasbeenadoptedwidelybecauseofitscapabilitytoreuse code.Mostapplicationdevelopmentsoftwareprovidesclasslibrariesandextensivesupport forcomplexdatastructures,includinglinkedlists.Investigateoneoftheselibraries,suchas theMicrosoftFoundationClass(www.microsoft.com)ortheJava2Platform (http://java. sun.com)applicationprogramminginterface(API).Whatdatastructuresaresupportedby thelibrary?Whattypesofdataarerecommendedforusewitheachdatastructureobject? Whichclassescontainwhichdatastructures,andwhatmethodsdoesthelibraryprovide? 104 Chapter3