July 2, 2014
Title Population Genetics
Version 1.3.8.1
Date 2012-11-26
Author Gregory Warnes, with contributions from Gregor Gorjanc, Friedrich Leisch, and Michael Man.
Maintainer Gregory Warnes
Description Classes and methods for handling genetic data. Includes classes to represent genotypes and haplotypes at single markers up to multiple markers on multiple chromosomes. Functions include allele frequencies, flagging homo/heterozygotes, flagging carriers of certain alleles, estimating and testing for Hardy-Weinberg disequilibrium, estimating and testing for linkage disequilibrium, ...
biocViews Genetics
License GPL
Repository CRAN
Date/Publication 2013-09-03 12:06:28
NeedsCompilation no

R topics documented:
    ci.balance
    Depreciated
    diseq
    expectedGenotypes
    genotype
    gregorius
    groupGenotype
    homozygote
    HWE.chisq
    HWE.exact
    HWE.test
    LD
    locus
    makeGenotypes
    order.genotype
    plot.genotype
    print.LD
    summary.genotype
    undocumented
    write.pop.file
    Index


ci.balance    Experimental Function to Correct Confidence Intervals At or Near Boundaries of the Parameter Space by 'Sliding' the Interval on the Quantile Scale.

Description
Experimental function to correct confidence intervals at or near boundaries of the parameter space by 'sliding' the interval on the quantile scale.

Usage
ci.balance(x, est, confidence=0.95, alpha=1-confidence, minval, maxval,
           na.rm=TRUE)

Arguments
x             Bootstrap parameter estimates.
est           Observed value of the parameter.
confidence    Confidence level for the interval. Defaults to 0.95.
alpha         Type I error rate (size) for the interval. Defaults to 1-confidence.
minval        A numeric value specifying the lower bound of the parameter space. Leave unspecified (the default) if there is no lower bound.
maxval        A numeric value specifying the upper bound of the parameter space. Leave unspecified (the default) if there is no upper bound.
na.rm         logical. Should missing values be removed?

Details
EXPERIMENTAL FUNCTION:
This function attempts to compute a proper confidence*100% confidence interval for parameters at or near the boundary of the parameter space using bootstrapped parameter estimates by 'sliding' the confidence interval on the quantile scale.
This is accomplished by attempting to place a confidence*100% interval symmetrically *on the quantile scale* about the observed value. If a symmetric interval would exceed the observed data at the upper (lower) end, a one-sided interval is computed with the upper (lower) boundary fixed at the upper (lower) boundary of the parameter space.

Value
A list containing:
ci
    A 2-element vector containing the lower and upper confidence limits. The names of the elements of the vector give the actual quantile values used for the interval or one of the character strings "Upper Boundary" or "Lower Boundary".
overflow.upper, overflow.lower
    The number of elements beyond those observed that would be needed to compute a symmetric (on the quantile scale) confidence interval.
n.above, n.below
    The number of bootstrap values which are above (below) the observed value.
lower.n, upper.n
    The index of the value used for the endpoint of the confidence interval or the character string "Upper Boundary" ("Lower Boundary").

Author(s)
Gregory R. Warnes

See Also
boot, bootstrap. Used by diseq.ci.

Examples
# These are nonsensical examples which simply exercise the
# computation. See the code to diseq.ci for a real example.
#
# FIXME: Add real example using boot or bootstrap.
set.seed(7981357)
x <- abs(rnorm(100, 1))
ci.balance(x, 1, minval=0)
ci.balance(x, 1)

x <- rnorm(100, 1)
x <- ifelse(x > 1, 1, x)
ci.balance(x, 1, maxval=1)
ci.balance(x, 1)


Depreciated    Depreciated functions

Description
These functions are depreciated.

Usage
power.casectrl(...)

Arguments
...    All arguments are ignored

Details
The power.casectrl function contained serious errors and has been replaced by GPC, GeneticPower.Quantitative.Factor, or GeneticPower.Quantitative.Numeric in the BioConductor GeneticsDesign package.
Specifically, the power.casectrl function used an expected contingency table to create the test statistic that was erroneously based on the underlying null, rather than on the marginal totals of the observed table. In addition, the modeling of dominant and recessive modes of inheritance had assumed a "perfect" genotype with no disease, whereas in reality a dominant or recessive mode of inheritance simply means that two of the genotypes will have an identical odds ratio compared to the third genotype (the other homozygote).


diseq    Estimate or Compute Confidence Interval for the Single-Marker Disequilibrium

Description
Estimate or compute confidence interval for single-marker disequilibrium.

Usage
diseq(x, ...)
## S3 method for class diseq
print(x, show=c("D", …), ...)
diseq.ci(x, R=1000, conf=0.95, correct=TRUE, na.rm=TRUE, ...)

Arguments
x          genotype or haplotype object.
show       a character value or vector indicating which disequilibrium measures should be displayed. The default is to show all of the available measures. show="table" will display a table of observed, expected, and observed-expected frequencies.
conf       Confidence level to use when computing the confidence interval for D-hat. Defaults to 0.95, should be in (0,1).
R          Number of bootstrap iterations to use when computing the confidence interval. Defaults to 1000.
correct    See details.
na.rm      logical. Should missing values be removed?
...        optional parameters passed to boot.ci (diseq.ci) or ignored.

Details
For a single-gene marker, diseq computes the Hardy-Weinberg (dis)equilibrium statistics D, D', r (the correlation coefficient), and r^2 for each pair of allele values, as well as an overall summary value for each measure across all alleles. print.diseq displays the contents of a diseq object. diseq.ci computes a bootstrap confidence interval for this estimate.

For consistency, I have applied the standard definitions for D, D', and r from the Linkage Disequilibrium case, replacing all marker probabilities with the appropriate allele probabilities. Thus, for each allele pair,

• D is defined as half of the raw difference in frequency between the observed number of heterozygotes and the expected number:
    $D = \frac{1}{2}(p_{ij} + p_{ji}) - p_i p_j$
• D' rescales D to span the range [-1,1]:
    $D' = \frac{D}{D_{max}}$
  where, if D > 0:
    $D_{max} = \min(p_i p_j,\; p_j p_i) = p_i p_j$
  or if D < 0:
    $D_{max} = \min(p_i (1 - p_j),\; p_j (1 - p_i))$
• r is the correlation coefficient between two alleles, and can be computed by
    $r = \frac{-D}{\sqrt{p_i (1 - p_i)\, p_j (1 - p_j)}}$
where
• $p_i$ is defined as the observed probability of allele 'i',
• $p_j$ is defined as the observed probability of allele 'j', and
• $p_{ij}$ is defined as the observed probability of the allele pair 'ij'.

When there are more than two alleles, the summary values for these statistics are obtained by computing a weighted average of the absolute value of each allele pair, where the weight is determined by the expected frequency. For example:
    $D_{overall} = \sum_{i \ne j} |D_{ij}| \, p_{ij}$

Bootstrapping is used to generate confidence intervals in order to avoid reliance on parametric assumptions, which will not hold for alleles with low frequencies (e.g. D following a Chi-square distribution).
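The definitions above can be checked by hand for a single allele pair. The following base-R sketch does this for made-up genotype counts; it is not the code used by diseq, and it assumes that the two ordered pairs 'ij' and 'ji' share the observed heterozygote frequency equally. All variable names are hypothetical.

```r
# Hypothetical genotype counts for a marker with alleles "D" and "I".
counts <- c("D/D" = 40, "D/I" = 50, "I/I" = 10)
n <- sum(counts)

# Observed allele probabilities p_i ("D") and p_j ("I").
p_i <- (2 * counts[["D/D"]] + counts[["D/I"]]) / (2 * n)
p_j <- (2 * counts[["I/I"]] + counts[["D/I"]]) / (2 * n)

# Assumption: p_ij = p_ji, each half of the heterozygote frequency.
p_ij <- counts[["D/I"]] / n / 2
p_ji <- p_ij

# D = (p_ij + p_ji) / 2 - p_i * p_j
D <- (p_ij + p_ji) / 2 - p_i * p_j

# D' = D / Dmax, with Dmax chosen by the sign of D as in the text.
Dmax <- if (D > 0) p_i * p_j else min(p_i * (1 - p_j), p_j * (1 - p_i))
Dprime <- D / Dmax

# r = -D / sqrt(p_i (1 - p_i) p_j (1 - p_j))
r <- -D / sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))

print(c(D = D, Dprime = Dprime, r = r))
```

With these counts the heterozygote frequency (0.5) exceeds its Hardy-Weinberg expectation (2 * 0.65 * 0.35 = 0.455), so D is positive and the D > 0 branch of D_max applies.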
See the function HWE.test for testing Hardy-Weinberg Equilibrium, D=0.

Value
diseq returns an object of class diseq with components
• call    function call used to create this object
• data    2-way table of allele pair counts
• D.hat   matrix giving the observed count, expected count, observed - expected difference, and estimate of disequilibrium for each pair of alleles as well as an overall disequilibrium value.
• TODO    more slots to be documented
diseq.ci returns an object of class boot.ci

Author(s)
Gregory R. Warnes

See Also
genotype, HWE.test, boot, boot.ci

Examples
example.data <- c("D/D", "D/D", …
g1 <- genotype(example.data)
g1
diseq(g1)
diseq.ci(g1)
HWE.test(g1)  # does the same, plus tests D-hat=0

three.data <- c(rep("A/A", …), rep("C/A", …), rep("C/T", …), rep("C/C", …), rep("T/T", …))
g3 <- genotype(three.data)
g3
diseq(g3)
diseq.ci(g3, ci.B=10000, ci.type="bca")
# only show observed vs expected table
print(diseq(g3), show="table")


expectedGenotypes    Construct expected genotypes/haplotypes according to known allele variants

Description
expectedGenotypes constructs expected genotypes according to known allele variants, which can be quite tedious with a large number of allele variants. It can handle different levels of ploidy.

Usage
expectedGenotypes(x, alleles=allele.names(x), ploidy=2, sort=TRUE,
                  haplotype=FALSE)
expectedHaplotypes(x, alleles=allele.names(x), ploidy=2, sort=TRUE,
                   haplotype=TRUE)

Arguments
x            genotype or haplotype
alleles      character, vector of allele names
ploidy       numeric, number of chromosome sets, i.e. 2 for human autosomal genes
sort         logical, sort genotypes according to order of alleles in the alleles argument
haplotype    logical, construct haplotypes, i.e. ordered genotypes
At least one of x or alleles must be given.

Details
expectedHaplotypes() just calls expectedGenotypes() with argument haplotype=TRUE.

Value
A character vector with genotype names as "allele1/allele2" for the diploid example. Length of output is (n*(n+1))/2 for genotype (unordered genotype) and n*n for haplotype (ordered genotype) for n allele variants.
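The two output lengths can be confirmed by enumerating allele pairs directly. This base-R sketch is only an illustration of the counting argument for the diploid case; it does not call expectedGenotypes, and the helper variables (idx, ordered, unordered) are hypothetical.

```r
alleles <- c("ARR", "AHQ", "ARH", "ARQ", "VRR", "VRQ")
n <- length(alleles)

# All ordered index pairs; a2 varies fastest, a1 slowest.
idx <- expand.grid(a2 = seq_len(n), a1 = seq_len(n))

# Ordered pairs (haplotype-like): every (a1, a2) combination.
ordered <- paste(alleles[idx$a1], alleles[idx$a2], sep = "/")

# Unordered pairs (genotype-like): keep one representative per pair.
unordered <- ordered[idx$a1 <= idx$a2]

length(ordered)    # n * n
length(unordered)  # n * (n + 1) / 2
```

For the six alleles used here this gives 36 ordered and 21 unordered pairs.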
Author(s)
Gregor Gorjanc

See Also
allele.names, genotype

Examples
## On genotype
prp <- c("ARQ/ARQ", "ARQ/ARQ", "ARR/ARQ", "AHQ/ARQ", "ARQ/ARQ")
alleles <- c("ARR", "AHQ", "ARH", "ARQ", "VRR", "VRQ")
expectedGenotypes(as.genotype(prp))
expectedGenotypes(as.genotype(prp, alleles=alleles))
expectedGenotypes(as.genotype(prp, alleles=alleles, reorder="yes"))
## Only allele names
expectedGenotypes(alleles=alleles)
expectedGenotypes(alleles=alleles, ploidy=4)
## Haplotype
expectedHaplotypes(alleles=alleles)
expectedHaplotypes(alleles=alleles, ploidy=4)[1:20]


genotype    Genotype or Haplotype Objects.

Description
genotype creates a genotype object.
haplotype creates a haplotype object.
is.genotype returns TRUE if x is of class genotype.
is.haplotype returns TRUE if x is of class haplotype.
as.genotype attempts to coerce its argument into an object of class genotype.
as.genotype.allele.count converts allele counts (0,1,2) into genotype pairs ("A/A", "A/B", "B/B").
as.haplotype attempts to coerce its argument into an object of class haplotype.
nallele returns the number of alleles in an object of class genotype.

Usage
genotype(a1, a2=NULL, alleles=NULL, sep="/", remove.spaces=TRUE,
         reorder=c("yes", "no", "default", "ascii", "freq"),
         allow.partial.missing=FALSE, locus=NULL, genotypeOrder=NULL)
haplotype(a1, a2=NULL, alleles=NULL, sep="/", remove.spaces=TRUE,
          reorder="no", allow.partial.missing=FALSE, locus=NULL,
          genotypeOrder=NULL)
is.genotype(x)
is.haplotype(x)
as.genotype(x, ...)
## S3 method for class allele.count
as.genotype(x, alleles=c("A", …), ...)
as.haplotype(x, ...)
## S3 method for class genotype
print(x, ...)
nallele(x)

Arguments
x                either an object of class genotype or haplotype or an object to be converted to class genotype or haplotype.
a1, a2           vector(s) or matrix containing two alleles for each individual. See details, below.
alleles          names (and order if reorder="yes") of possible alleles.
sep              character separator or column number used to divide alleles when a1 is a vector of strings where each string holds both alleles. See below for details.
remove.spaces    logical indicating whether spaces and tabs will be removed from a1 and a2 before processing.
reorder
                 how should alleles within an individual be reordered. If reorder="no", use the order specified by the alleles parameter. If reorder="freq" or reorder="yes", sort alleles within each individual by observed frequency. If reorder="ascii", reorder alleles in ASCII order (alphabetical, with all uppercase before lowercase). The default value for genotype is "freq". The default value for haplotype is "no".
allow.partial.missing
                 logical indicating whether one allele is permitted to be missing. When set to FALSE both alleles are set to NA when either is missing.
locus            object of class locus, gene, or marker, holding information about the source of this genotype.
genotypeOrder    character, vector of genotype/haplotype names so that further functions can sort genotypes/haplotypes in the wanted order
...              optional arguments

Details
Genotype objects hold information on which gene or marker alleles were observed for different individuals. For each individual, two alleles are recorded.

The genotype class considers the stored alleles to be unordered, i.e., "C/T" is equivalent to "T/C". The haplotype class considers the order of the alleles to be significant so that "C/T" is distinct from "T/C".

When calling genotype or haplotype:
• If only a1 is provided and is a character vector, it is assumed that each element encodes both alleles. In this case, if sep is a character string, a1 is assumed to be coded as "Allele1 …
• If a1 is a matrix, it is assumed that column 1 contains allele 1 and column 2 contains allele 2.
• If a1 and a2 are both provided, each is assumed to contain one allele value so that the genotype for an individual is obtained by paste(a1, a2, sep="/").

If remove.spaces is TRUE (the default), any whitespace contained in a1 and a2 is removed when the genotypes are created. If whitespace is used as the separator (e.g. "C C", "C T", ...), be sure to set remove.spaces to FALSE.

When the alleles are explicitly specified using the alleles argument, all potential alleles not present in the list will be converted to NA.

NOTE: genotype assumes that the order of the alleles is not important (e.g., "A/C" == "C/A"). Use class haplotype if order is significant.
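The difference between unordered and ordered comparison can be illustrated without the package. The sketch below uses a hypothetical helper, normalize_unordered, which puts the two alleles of each string into alphabetical order; this is only a stand-in for the idea that "C/T" and "T/C" are the same genotype, and it is not how the genotype class stores or reorders alleles.

```r
# Hypothetical helper: sort the alleles inside each "a/b" string so that
# "T/C" and "C/T" compare equal (unordered, genotype-like comparison).
normalize_unordered <- function(g, sep = "/") {
  parts <- strsplit(g, sep, fixed = TRUE)
  vapply(parts, function(a) paste(sort(a), collapse = sep), character(1))
}

data1 <- c("C/C", "C/T", "T/C")
data2 <- c("C/C", "T/C", "T/C")

# Unordered comparison: allele order is ignored.
unordered_equal <- normalize_unordered(data1) == normalize_unordered(data2)

# Ordered comparison (haplotype-like): plain string equality.
ordered_equal <- data1 == data2

print(unordered_equal)
print(ordered_equal)
```

Only the second element differs between the two comparisons, because "C/T" and "T/C" match once allele order is ignored.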
If genotypeOrder=NULL (the default setting), then expectedGenotypes is used to get the standard sorting order. Only unique values in genotypeOrder are used, which in turn means that the first occurrence prevails. When genotypeOrder is given some genotype names, but not all that appear in the data, the rest (those in the data and possible combinations based on allele variants) are automatically added at the end of genotypeOrder. This puts "missing" genotype names at the end of the sort order. This feature is especially useful when there are a lot of allele variants, and especially in haplotypes. See examples.

Value
The genotype class extends "factor" and haplotype extends genotype. Both classes have the following attributes:
levels           character vector of possible genotype/haplotype values stored coded by paste(allele1, "/", allele2 …
allele.names     character vector of possible alleles. For a SNP, these might be c("A", … For a variable length dinucleotide repeat this might be c("136", …
allele.map       matrix encoding how the factor levels correspond to alleles. See the source code to allele.genotype() for how to extract allele values using this matrix. Better yet, just use allele.genotype().
genotypeOrder    character, genotype/haplotype names in defined order that can be used for sorting in various functions. Note that this slot stores both ordered and unordered genotypes, i.e. "A/B" and "B/A".
Author(s)
Gregory R. Warnes

See Also
HWE.test, allele, homozygote, heterozygote, carrier, summary.genotype, allele.count, sort.genotype, genotypeOrder, locus, gene, marker, and %in% for default %in% method

Examples
# several examples of genotype data in different formats
example.data <- c("D/D", "D/D", …
g1 <- genotype(example.data)
g1

example.data2 <- c("C-C", "C-C", …
g2 <- genotype(example.data2, sep="-")
g2

example.nosep <- c("DD", …, "")
g3 <- genotype(example.nosep, sep="")
g3

example.a1 <- c("D", …
example.a2 <- c("D", …
g4 <- genotype(example.a1, example.a2)
g4

example.mat <- cbind(a1=example.a1, a1=example.a2)
g5 <- genotype(example.mat)
g5

example.data5 <- c("D / D", …
g5 <- genotype(example.data5, rem=TRUE)
g5

# show how genotype and haplotype differ
data1 <- c("C/C", "C/T", "T/C")
data2 <- c("C/C", "T/C", "T/C")
test1 <- genotype(data1)
test2 <- genotype(data2)
test3 <- haplotype(data1)
test4 <- haplotype(data2)
test1 == test2
test3 == test4
test1 == "C/T"
test1 == "T/C"
test3 == "C/T"
test3 == "T/C"
## also
test1 …
test1 …
test3 …
test1 …
test1 …
test3 …
test3 …

## "Messy" example
m3 <- c("DD/ …
genotype(m3)
summary(genotype(m3))
m4 <- c("DD …
genotype(m4, sep=1)
genotype(m4, sep=" …
summary(genotype(m4, sep=" …
m5 <- c("DD", "DD", "II", … "I")
genotype(m5, sep=1)
haplotype(m5, sep=1, remove.spaces=FALSE)
g5 <- genotype(m5, sep="")
h5 <- haplotype(m5, sep="")
heterozygote(g5)
homozygote(g5)
carrier(g5, "D")
g5[9:10]
g5 <- haplotype(m4, sep=" …
g5[9:10]
allele(g5[9:10], 1)
allele(g5, 1)[9:10]
# drop unused alleles
g5[9:10, drop=TRUE]
h5[9:10, drop=TRUE]

# Convert allele.counts into genotype
x <- c(0, 1, 2, 1, 1, 2, NA, 1, 2, 1, 2, 2, 2)
g <- as.genotype.allele.count(x, alleles=c("C", …))
g

# Use of genotypeOrder
example.data <- c("D/D", "D/D", …
summary(genotype(example.data))
genotypeOrder(genotype(example.data))
summary(genotype(example.data, genotypeOrder=c("D/D", "I/I", "D/I")))
summary(genotype(example.data, genotypeOrder=c("D/I")))
summary(haplotype(example.data, genotypeOrder=c("I/D", "D/I")))
example.data <- genotype(example.data)
genotypeOrder(example.data) <- c("D/D", "I/I", "D/I")
genotypeOrder(example.data)


gregorius    Probability of Observing All Alleles with a Given Frequency in a Sample of a Specified Size.

Description
Probability of observing all alleles with a given frequency in a sample of a specified size.

Usage
gregorius(freq, N, missprob, tol=1e-10, maxN=10000, maxiter=100, showiter=FALSE)

Arguments
freq        (Minimum) Allele frequency (required)
N           Number of sampled genotypes
missprob    Desired maximum probability of failing to observe an allele.
tol         Omit computation for terms which contribute less than this value.
maxN        Largest value to consider when searching for N.
maxiter     Maximum number of iterations to use when searching for N.
showiter    Boolean flag indicating whether to show the iterations performed when searching for N.

Details
If freq and N are provided, but missprob is omitted, this function computes the probability of failing to observe all alleles with true underlying frequency freq when N diploid genotypes are sampled. This is accomplished using the sum provided in Corollary 2 of Gregorius (1980), omitting terms which contribute less than tol to the result.

When freq and missprob are provided, but N is omitted, a binary search on the range [1, maxN] is performed to locate the smallest sample size, N, for which the probability of failing to observe all alleles with true underlying frequency freq is at most missprob. In this case, maxiter specifies the largest number of iterations to use in the binary search, and showiter controls whether the iterations of the search are displayed.

Value
A list containing the following values:
call           Function call used to generate this object.
method         One of the strings, "Compute missprob given N and freq", or "Determine minimal N given missprob and freq", indicating which type of computation was performed.
retval$freq    Specified allele frequency.
retval$N           Specified or computed sample size.
retval$missprob    Computed probability of failing to observe all of the alleles with frequency freq.

Note
This code produces sample sizes that are slightly larger than those given in table 1 of Gregorius (1980). This appears to be due to rounding of the computed missprobs by the authors of that paper.

Author(s)
Code submitted by David Duffy

References
Gregorius, H.R. 1980. The probability of losing an allele when diploid genotypes are sampled. Biometrics 36, 3-652.

Examples
# Compute the probability of missing an allele with frequency 0.15 when
# 20 genotypes are sampled:
gregorius(freq=0.15, N=20)

# Determine what sample size is required to observe all alleles with true
# frequency 0.15 with probability 0.95
gregorius(freq=0.15, missprob=1-0.95)


groupGenotype    Group genotype values

Description
groupGenotype groups genotype or haplotype values according to given "grouping/mapping" information

Usage
groupGenotype(x, map, haplotype=FALSE, factor=TRUE, levels=NULL,
              verbose=FALSE)

Arguments
x            genotype or haplotype
map          list, mapping information, see details and examples
haplotype    logical, should values in a map be treated as haplotypes or genotypes, see details
factor       logical, should output be a factor or a character
levels       character, optional vector of level names if factor is produced (factor=TRUE); the default is to use the sort order of the group names in map
verbose      logical, print genotype names that match entries in the map - mainly used for debugging

Details
Examples show how map can be constructed. These are the main points to be aware of:
• names of list components are used as new group names
• list components hold genotype names for each group
• genotype names can be specified directly, i.e. "A/B", or abbreviated such as "A/*" or even "*/*", where "*" matches any possible allele, but read also further on
• all genotype names that are not specified can be captured with ".else" (note the dot!)
• genotype names that were not specified (and ".else" was not used) are changed to NA

map is inspected before grouping of genotypes is done. The following steps are done during inspection:
• ".else" must be at the end (if not, it is moved) to match everything that has not yet been defined
• any specifications like "A/*", "*/A", or "*/*" are extended to all possible genotypes based on alleles in argument alleles - in case of haplotype=FALSE, "A/*" and "*/A" match the same genotypes
• since use of "*" and ".else" can cause duplicates along the whole map, duplicates are removed sequentially (first occurrence is kept)

Using ".else" or "*/*" at the end of the map produces the same result, due to removing duplicates sequentially.

Value
A factor or character vector with genotypes grouped

Author(s)
Gregor Gorjanc

See Also
genotype, haplotype, factor, and levels

Examples
## --- Setup ---
x <- c("A/A", "A/B", "B/A", "A/C", "C/A", "A/D", "D/A",
       "B/B", "B/C", "C/B", "B/D", "D/B",
       "C/C", "C/D", "D/C",
       "D/D")
g <- genotype(x, reorder="yes")
## "A/A" "A/B" "A/B" "A/C" "A/C" "A/D" "A/D" "B/B" "B/C" "B/C" "B/D" "B/D"
## "C/C" "C/D" "C/D" "D/D"
h <- haplotype(x)
## "A/A" "A/B" "B/A" "A/C" "C/A" "A/D" "D/A" "B/B" "B/C" "C/B" "B/D" "D/B"
## "C/C" "C/D" "D/C" "D/D"

## --- Use of "A/A", "A/*" and ".else" ---
map <- list("homoG"=c("A/A", "B/B", "C/C", "D/D"),
            "heteroA*"=c("A/B", "A/C", "A/D"),
            "heteroB*"=c("B/*"),
            "heteroRest"=".else")
(tmpG <- groupGenotype(x=g, map=map, factor=FALSE))
(tmpH <- groupGenotype(x=h, map=map, factor=FALSE, haplotype=TRUE))
## Show difference between genotype and haplotype treatment
cbind(as.character(h), gen=tmpG, hap=tmpH, diff=!(tmpG == tmpH))
##             gen          hap          diff
##  [1,] "A/A" "homoG"      "homoG"      "FALSE"
##  [2,] "A/B" "heteroA*"   "heteroA*"   "FALSE"
##  [3,] "B/A" "heteroA*"   "heteroB*"   "TRUE"
##  [4,] "A/C" "heteroA*"   "heteroA*"   "FALSE"
##  [5,] "C/A" "heteroA*"   "heteroRest" "TRUE"
##  [6,] "A/D" "heteroA*"   "heteroA*"   "FALSE"
##  [7,] "D/A" "heteroA*"   "heteroRest" "TRUE"
##  [8,] "B/B" "homoG"      "homoG"      "FALSE"
##  [9,] "B/C" "heteroB*"   "heteroB*"   "FALSE"
## [10,] "C/B" "heteroB*"   "heteroRest" "TRUE"
## [11,] "B/D" "heteroB*"   "heteroB*"   "FALSE"
## [12,] "D/B" "heteroB*"   "heteroRest" "TRUE"
## [13,] "C/C" "homoG"      "homoG"      "FALSE"
## [14,] "C/D" "heteroRest" "heteroRest" "FALSE"
## [15,] "D/C" "heteroRest" "heteroRest" "FALSE"
## [16,] "D/D" "homoG"      "homoG"      "FALSE"

map <- list("withA"="A/*", "rest"=".else")
groupGenotype(x=g, map=map, factor=FALSE)
##  [1] "withA" "withA" "withA" "withA" "withA" "withA" "withA" "rest"  "rest"
## [10] "rest"  "rest"  "rest"  "rest"  "rest"  "rest"  "rest"
groupGenotype(x=h, map=map, factor=FALSE, haplotype=TRUE)
##  [1] "withA" "withA" "rest"  "withA" "rest"  "withA" "rest"  "rest"  "rest"
## [10] "rest"  "rest"  "rest"  "rest"  "rest"  "rest"  "rest"

## --- Use of "*/*" ---
map <- list("withA"="A/*", withB="*/*")
groupGenotype(x=g, map=map, factor=FALSE)
##  [1] "withA" "withA" "withA" "withA" "withA" "withA" "withA" "withB" "withB"
## [10] "withB" "withB" "withB" "withB" "withB" "withB" "withB"

## --- Missing genotype specifications produces NAs ---
map <- list("withA"="A/*", withB="B/*")
groupGenotype(x=g, map=map, factor=FALSE)
##  [1] "withA" "withA" "withA" "withA" "withA" "withA" "withA" "withB" "withB"
## [10] "withB" "withB" "withB" NA      NA      NA      NA
groupGenotype(x=h, map=map, factor=FALSE, haplotype=TRUE)
##  [1] "withA" "withA" "withB" "withA" NA      "withA" NA      "withB" "withB"
## [10] NA      "withB" NA      NA      NA      NA      NA


homozygote    Extract Features of Genotype objects

Description
homozygote creates a vector of logicals that are true when the alleles of the corresponding observation are identical.
heterozygote creates a vector of logicals that are true when the alleles of the corresponding observation differ.
carrier creates a logical vector or matrix of logicals indicating whether the specified alleles are present.
allele.count returns the number of copies of the specified alleles carried by each observation.
allele extracts the specified allele(s) as a character vector or a 2 column matrix.
allele.names extracts the set of allele names.

Usage
homozygote(x, allele.name, ...)
heterozygote(x, allele.name, ...)
carrier(x, allele.name, ...)
## S3 method for class genotype
carrier(x, allele.name=allele.names(x),
        any=!missing(allele.name), na.rm=FALSE, ...)
allele.count(x, allele.name=allele.names(x), any=!missing(allele.name),
             na.rm=FALSE)
allele(x, which=c(1,2))
allele.names(x)

Arguments
x              genotype object
...            optional parameters (ignored)
allele.name    character value or vector of allele names
any            logical value. When TRUE, a single count or indicator is returned by combining the results for all of the elements of allele.name. If FALSE, separate counts or indicators are returned for each element of allele.name. Defaults to FALSE if allele.name is missing. Otherwise defaults to TRUE.
na.rm          logical value indicating whether to remove missing values. When true, any NA values will be replaced by 0 or FALSE as appropriate. Defaults to FALSE.
which          selects which allele to return. For first allele use 1. For second allele use 2. For both (the default) use c(1,2).

Details
When the allele.name argument is given, heterozygote and homozygote return TRUE if exactly one or both alleles, respectively, match the specified allele.name.

Value
homozygote and heterozygote return a vector of logicals.
carrier returns a logical vector if only one allele is specified, or if any is TRUE. Otherwise, it returns a matrix of logicals with one row for each element of allele.name.
allele.count returns a vector of counts if only one allele is specified, or if any is TRUE. Otherwise, it returns a matrix of counts with one row for each element of allele.name.
allele returns a character vector when one allele is specified. When 2 alleles are specified, it returns a 2 column character matrix.
allele.names returns a character vector containing the set of allele names.

Author(s)
Gregory R. Warnes

See Also
genotype, HWE.test, summary.genotype, locus, gene, marker

Examples
example.data <- c("D/D", …
g1 <- genotype(example.data)
g1
heterozygote(g1)
homozygote(g1)
carrier(g1, "D")
carrier(g1, "D", …
# get count of one allele
allele.count(g1, "D")
# get count of each allele
allele.count(g1)  # equivalent to
allele.count(g1, c("D", … any=FALSE)
# get combined count for both alleles
allele.count(g1, c("I", …
# get second allele
allele(g1, 2)
# get both alleles
allele(g1)


HWE.chisq    Perform Chi-Square Test for Hardy-Weinberg Equilibrium

Description
Test the null hypothesis that Hardy-Weinberg equilibrium holds using the Chi-Square method.

Usage
HWE.chisq(x, ...)
## S3 method for class genotype
HWE.chisq(x, simulate.p.value=TRUE, B=10000, ...)

Arguments
x                   genotype or haplotype object.
simulate.p.value    a logical value indicating whether the p-value should be computed using simulation instead of using the χ2 approximation. Defaults to TRUE.
B                   Number of simulation iterations to use when simulate.p.value=TRUE. Defaults to 10000.

Details
This function generates a 2-way table of allele counts, then calls chisq.test to compute a p-value for Hardy-Weinberg Equilibrium. By default, it uses an unadjusted Chi-Square test statistic and computes the p-value using a simulation/permutation method. When simulate.p.value=FALSE, it computes the test statistic using the Yates continuity correction and tests it against the asymptotic Chi-Square distribution with the appropriate degrees of freedom.

Note: The Yates continuity correction is applied *only* when simulate.p.value=FALSE, so that the reported test statistics when simulate.p.value=FALSE and simulate.p.value=TRUE will differ.

Value
An object of class htest.

See Also
HWE.exact, HWE.test, diseq, diseq.ci, allele, chisq.test, boot, boot.ci
...                 optional parameters passed to chisq.test

Examples
example.data <- c("D/D", "D/D", …
g1 <- genotype(example.data)
g1
HWE.chisq(g1)
# compare with
HWE.exact(g1)
# and
HWE.test(g1)

three.data <- c(rep("A/A", …), rep("C/A", …), rep("C/T", …), rep("C/C", …), rep("T/T", …))
g3 <- genotype(three.data)
g3
HWE.chisq(g3, B=10000)


HWE.exact    Exact Test of Hardy-Weinberg Equilibrium for 2-Allele Markers

Description
Exact test of Hardy-Weinberg Equilibrium for 2 Allele Markers.

Usage
HWE.exact(x)

Arguments
x    Genotype object

Value
Object of class 'htest'.

Note
This function only works for genotypes with exactly 2 alleles.

Author(s)
David Duffy

References
Emigh TH. (1980) "Comparison of tests for Hardy-Weinberg Equilibrium", Biometrics, 36, 627-2.

See Also
HWE.chisq, HWE.test, diseq, diseq.ci

Examples
example.data <- c("D/D", "D/D", …
g1 <- genotype(example.data)
g1
HWE.exact(g1)
# compare with
HWE.chisq(g1)

g2 <- genotype(sample(c("A", …), 100, p=c(100,10), rep=TRUE),
               sample(c("A", …), 100, p=c(100,10), rep=TRUE))
HWE.exact(g2)


HWE.test    Estimate Disequilibrium and Test for Hardy-Weinberg Equilibrium

Description
Estimate disequilibrium parameter and test the null hypothesis that Hardy-Weinberg equilibrium holds.

Usage
HWE.test(x, ...)
## S3 method for class genotype
HWE.test(x, exact=nallele(x)==2, simulate.p.value=!exact,
         B=10000, conf=0.95, ci.B=1000, ...)
## S3 method for class data.frame
HWE.test(x, ..., do.Allele.Freq=TRUE, do.HWE.test=TRUE)
## S3 method for class HWE.test
print(x, show=c("D", …), ...)

Arguments
x                   genotype or haplotype object.
exact               a logical value indicating whether the p-value should be computed using the exact method, which is only available for 2 allele genotypes.
simulate.p.value    a logical value indicating whether the p-value should be computed using simulation instead of using the χ2 approximation. Defaults to TRUE.
B                   Number of simulation iterations to use when simulate.p.value=TRUE. Defaults to 10000.
conf                Confidence level to use when computing the confidence interval for D-hat. Defaults to 0.95, should be in (0,1).
ci.B                Number of bootstrap iterations to use when computing the confidence interval. Defaults to 1000.
show                a character vector containing the names of HWE test statistics to display from the set of "D", "D'", "r", and "table".
...                 optional parameters passed to HWE.test (data.frame method) or chisq.test (base method).
do.Allele.Freq      logical indicating whether to summarize allele frequencies.
do.HWE.test         logical indicating whether to perform HWE tests.

Details
HWE.test calls diseq to compute the Hardy-Weinberg (dis)equilibrium statistics D, D', and r (correlation coefficient). Next it calls diseq.ci to compute a bootstrap confidence interval for these estimates. Finally, it calls chisq.test to compute a p-value for Hardy-Weinberg Equilibrium using a simulation/permutation method.

Using bootstrapping for the confidence interval and simulation for the p-value avoids reliance on the assumptions underlying the Chi-square approximation. This is particularly important when some allele pairs have small counts.

For details on the definition of D, D', and r, see the help page for diseq.

Value
An object of class HWE.test with components
diseq    A diseq object providing details on the disequilibrium estimates.
ci       A diseq.ci object providing details on the bootstrap confidence intervals for the disequilibrium estimates.
test     A htest object providing details on the permutation based Chi-square test.
call     function call used to create this object.
conf, B, ci.B, simulate.p.value
         values used for these arguments.

Author(s)
Gregory R. Warnes

See Also
genotype, diseq, diseq.ci, HWE.chisq, HWE.exact, chisq.test

Examples
## Marker with two alleles:
example.data <- c("D/D", "D/D", …
g1 <- genotype(example.data)
g1
HWE.test(g1)

## Compare with individual calculations:
diseq(g1)
diseq.ci(g1)
HWE.chisq(g1)
HWE.exact(g1)

## Marker with three alleles: A, C, and T
three.data <- c(rep("A/A", …), rep("C/A", …), rep("C/T", …), rep("C/C", …), rep("T/T", …))
g3 <- genotype(three.data)
g3
HWE.test(g3, ci.B=10000)


LD    Pairwise linkage disequilibrium between genetic markers.

Description
Compute pairwise linkage disequilibrium between genetic markers

Usage
LD(g1, ...)
## S3 method for class genotype
LD(g1, g2, ...)
## S3 method for class data.frame
LD(g1, ...)

Arguments
g1     genotype object or dataframe containing genotype objects
g2     genotype object (ignored if g1 is a dataframe)
...    optional arguments (ignored)

Details
Linkage disequilibrium (LD) is the non-random association of marker alleles and can arise from marker proximity or from selection bias.

LD.genotype estimates the extent of LD for a single pair of genotypes. LD.data.frame computes LD for all pairs of genotypes contained in a data frame. Before starting, LD.data.frame checks the class and number of alleles of each variable in the dataframe. If the data frame contains non-genotype objects or genotypes with more or less than 2 alleles, these will be omitted from the computation and a warning will be generated.

Three estimators of LD are computed:
• D: raw difference in frequency between the observed number of AB pairs and the expected number:
    $D = p_{AB} - p_A p_B$
• D': scaled D spanning the range [-1,1]:
    $D' = \frac{D}{D_{max}}$
  where, if D > 0:
    $D_{max} = \min(p_A p_b,\; p_a p_B)$
  or if D < 0:
    $D_{max} = \max(-p_A p_B,\; -p_a p_b)$
• r: correlation coefficient between the markers:
    $r = \frac{-D}{\sqrt{p_A \, p_a \, p_B \, p_b}}$
where
• $p_A$ is defined as the observed probability of allele 'A' for marker 1,
• $p_a = 1 - p_A$ is defined as the observed probability of allele 'a' for marker 1,
• $p_B$ is defined as the observed probability of allele 'B' for marker 2,
• $p_b = 1 - p_B$ is defined as the observed probability of allele 'b' for marker 2, and
• $p_{AB}$ is defined as the probability of the marker allele pair 'AB'.

For genotype data, AB/ab cannot be distinguished from aB/Ab. Consequently, we estimate $p_{AB}$ using maximum likelihood and use this value in the computations.
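To make the three estimators concrete, the base-R sketch below transcribes the formulas as written above for a single pair of markers. It is not the code behind LD.genotype: the function name ld_stats and the input values are hypothetical, and pAB is simply supplied rather than estimated by maximum likelihood.

```r
# Hypothetical helper: D, D' and r from allele probabilities pA, pB and
# the probability pAB of the "AB" allele pair (assumed already estimated).
ld_stats <- function(pA, pB, pAB) {
  pa <- 1 - pA
  pb <- 1 - pB
  # D = pAB - pA * pB
  D <- pAB - pA * pB
  # Dmax depends on the sign of D, following the text above.
  Dmax <- if (D > 0) min(pA * pb, pa * pB) else max(-pA * pB, -pa * pb)
  list(D = D,
       Dprime = D / Dmax,
       r = -D / sqrt(pA * pa * pB * pb))
}

res <- ld_stats(pA = 0.6, pB = 0.7, pAB = 0.5)
print(unlist(res))
```

In this example pAB exceeds the product pA * pB = 0.42, so D is positive and D_max is taken from the D > 0 branch.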
Value
LD.genotype returns a list with the following elements:
call       the matched call
D          Linkage disequilibrium estimate
Dprime     Scaled linkage disequilibrium estimate
corr       Correlation coefficient
nobs       Number of observations
chisq      Chi-square statistic for linkage equilibrium (i.e., D=D'=corr=0)
p.value    Chi-square p-value for marker independence

LD.data.frame returns a list with the same elements, but each element is a matrix where the upper off-diagonal elements contain the estimate for the corresponding pair of markers. The other matrix elements are NA.

Author(s)
Gregory R. Warnes

See Also
genotype, HWE.test

Examples
g1 <- genotype(c("T/A", NA, "T/T", NA, "T/A", NA, "T/T", "T/A",
                 "T/T", "T/T", "T/A", "A/A", "T/T", "T/A", "T/A", "T/T",
                 NA, "T/A", "T/A", NA))
g2 <- genotype(c("C/A", "C/A", "C/C", "C/A", "C/C", "C/A", "C/A", "C/A",
                 "C/A", "C/C", "C/A", "A/A", "C/A", "A/A", "C/A", "C/C",
                 "C/A", "C/A", "C/A", "A/A"))
g3 <- genotype(c("T/A", "T/A", "T/T", "T/A", "T/T", "T/A", "T/A", "T/A",
                 "T/A", "T/T", "T/A", "T/T", "T/A", "T/A", "T/A", "T/T",
                 "T/A", "T/A", "T/A", "T/T"))
# Compute LD on a single pair
LD(g1, g2)
# Compute LD table for all 3 genotypes
data <- makeGenotypes(data.frame(g1, g2, g3))
LD(data)


locus    Create and Manipulate Locus, Gene, and Marker Objects

Description
locus, gene, and marker create objects to store information, respectively, about genetic loci, genes, and markers.
is.locus, is.gene, and is.marker test whether an object is a member of the respective class.
as.character.locus, as.character.gene, as.character.marker return a character string containing a compact encoding of the object.
getlocus, getgene, getmarker extract locus data (if present) from another object.
locus<-, marker<-, and gene<- add locus data to an object.

Usage
locus(name, chromosome, arm=c("p", "q", "long", "short", NA),
      index.start, index.end=NULL)
gene(name, chromosome, arm=c("p", "q", "long", "short"),
     index.start, index.end=NULL)
marker(name, type, locus.name, bp.start, bp.end=NULL,
       relative.to=NULL, ...)
is.locus(x)
is.gene(x)
is.marker(x)
## S3 method for class locus
as.character(x, ...)
## S3 method for class gene
as.character(x, ...)
    ## S3 method for class 'marker'
    as.character(x, ...)

    getlocus(x, ...)
    locus(x) <- value
    marker(x) <- value
    gene(x) <- value

Arguments

    name          character string giving locus, gene, or marker name
    chromosome    integer specifying chromosome number (1:23 for humans).
    arm           character indicating long or short arm of the chromosome.
                  Long is specified by "long" or "p". Short is specified by
                  "short" or "q".
    index.start   integer specifying location of start of locus or gene on
                  the chromosome.
    index.end     optional integer specifying location of end of locus or
                  gene on the chromosome.
    type          character string indicating marker type, e.g. "SNP"
    locus.name    either a character string giving the name of the locus or
                  gene (other details may be specified using ...) or a locus
                  or gene object.
    bp.start      start location of marker, in base pairs
    bp.end        end location of marker, in base pairs (optional)
    relative.to   location (optional) from which bp.start and bp.end are
                  calculated.
    ...           parameters for locus used to fill in additional details on
                  the locus or gene within which the marker is located.
    x             an object of class locus, gene, or marker, or (for
                  getlocus, locus<-, marker<-, and gene<-) an object that may
                  contain a locus attribute or field, notably a genotype
                  object.
    value         locus, marker, or gene object

Value

Objects of class locus and gene are lists with the elements:

    name          character string giving locus, gene, or marker name
    chromosome    integer specifying chromosome number (1:23 for humans).
    arm           character indicating long or short arm of the chromosome.
                  Long is specified by "long" or "p". Short is specified by
                  "short" or "q".
    index.start   integer specifying location of start of locus or gene on
                  the chromosome.
    index.end     optional integer specifying location of end of locus or
                  gene on the chromosome.

Objects of class marker add the additional fields:

    marker.name   character string giving the name of the marker
    bp.start      start location of marker, in base pairs
    bp.end        end location of marker, in base pairs (optional)
    relative.to   location (optional) from which bp.start and bp.end are
                  calculated.

Author(s)

Gregory R. Warnes
See Also

genotype

Examples

    ar2 <- gene("AR2", ...)
    ar2

    par <- locus(name="AR2 Psedogene",
                 chromosome=1,
                 arm="q",
                 index.start=32,
                 index.end=42)
    par

    c109t <- marker(name="C-109T",
                    type="SNP",
                    locus.name="AR2",
                    chromosome=7,
                    arm="q",
                    index.start=35,
                    bp.start=-109,
                    relative.to="start of coding region")
    c109t

    c109t <- marker(name="C-109T",
                    type="SNP",
                    locus=ar2,
                    bp.start=-109,
                    relative.to="start of coding region")
    c109t

    example.data <- c("D/D", ...,
                      "D/D", ...)
    g1 <- genotype(example.data, locus=ar2)
    g1

    getlocus(g1)
    summary(g1)
    HWE.test(g1)

    g2 <- genotype(example.data, locus=c109t)
    summary(g2)
    getlocus(g2)

    heterozygote(g2)
    homozygote(g1)
    allele(g1, 1)

    carrier(g1, "I")
    heterozygote(g2)


makeGenotypes   Convert columns in a dataframe to genotypes or haplotypes

Description

Convert columns in a dataframe to genotypes or haplotypes.

Usage

    makeGenotypes(data, convert, sep="/", tol=0.5, ..., method=as.genotype)
    makeHaplotypes(data, convert, sep="/", tol=0.9, ...)

Arguments

    data      Dataframe containing columns to be converted
    convert   Vector or list of pairs specifying which columns contain
              genotype/haplotype data. See below for details.
    sep       Genotype separator
    tol       See below.
    ...       Optional arguments to as.genotype function
    method    Function used to perform the conversion.

Details

The functions makeGenotypes and makeHaplotypes allow the conversion of all of the genetic variables in a dataset to genotypes or haplotypes in a single step.

The parameter convert may be missing, a vector of column names, indexes or true/false indicators, or a list of column name or index pairs.

When the argument convert is not provided, the function will look for columns where at least tol*100% of the records contain the separator character sep ('/' by default). These columns will then be assumed to contain both of the genotype/haplotype alleles and will be converted in-place to genotype variables.

When the argument convert is a vector of column names, indexes or true/false indicators, the corresponding columns will be assumed to contain both of the genotype/haplotype alleles and will be converted in-place to genotype variables.

When the argument convert is a list containing column name or index pairs, the two elements of each pair will be assumed to contain the individual alleles of a genotype/haplotype. The first column specified in each pair will be replaced with the new genotype/haplotype variable named name1 + sep + name2. The second column will be removed.

Note that the method argument may be used to supply a non-standard conversion function, such as as.genotype.allele.count, which converts from [0,1,2] to ['A/A','A/B','A/C'] (or the specified allele names). See the example below.

Value

Dataframe containing converted genotype/haplotype variables. All other variables will be unchanged.

Author(s)

Gregory R. Warnes

See Also

genotype

Examples

    ## Not run:
    # common case
    data <- read.csv(file="genotype_data.csv")
    data <- makeGenotypes(data)
    ## End(Not run)

    # Create a test data set where there are several genotypes in columns
    # of the form "A/T".
    test1 <- data.frame(Tmt=sample(c("Control", ...), 20, replace=TRUE),
                        G1=sample(c("A/T", ...), 20, replace=TRUE),
                        N1=rnorm(20),
                        I1=sample(1:100, 20, replace=TRUE),
                        G2=paste(sample(c("134", ...), 20, replace=TRUE),
                                 sample(c("134", ...), 20, replace=TRUE),
                                 sep="/"),
                        G3=sample(c("A/T", ...), 20, replace=TRUE),
                        comment=sample(c("Possible Bad Data/Lab Error", ...),
                                       20, rep=TRUE)
                        )
    test1

    # now automatically convert genotype columns
    geno1 <- makeGenotypes(test1)
    geno1

    # Create a test data set where there are several haplotypes with alleles
    # in adjacent columns.
    test2 <- data.frame(Tmt=sample(c("Control", ...), 20, replace=TRUE),
                        G1.1=sample(c("A", ...), 20, replace=TRUE),
                        G1.2=sample(c("A", ...), 20, replace=TRUE),
                        N1=rnorm(20),
                        I1=sample(1:100, 20, replace=TRUE),
                        G2.1=sample(c("134", ...), 20, replace=TRUE),
                        G2.2=sample(c("134", ...), 20, replace=TRUE),
                        G3.1=sample(c("A", ...), 20, replace=TRUE),
                        G3.2=sample(c("A", ...), 20, replace=TRUE),
                        comment=sample(c("Possible Bad Data/Lab Error", ...),
                                       20, rep=TRUE)
                        )
    test2

    # specify the locations of the columns to be paired for haplotypes
    makeHaplotypes(test2, convert=list(c("G1.1", ...), ...))

    # Create a test data set where the data is coded as numeric allele
    # counts (0-2).
    test3 <- data.frame(Tmt=sample(c("Control", ...), 20, replace=TRUE),
                        G1=sample(c(0:2, NA), 20, replace=TRUE),
                        N1=rnorm(20),
                        I1=sample(1:100, 20, replace=TRUE),
                        G2=sample(0:2, 20, replace=TRUE),
                        comment=sample(c("Possible Bad Data/Lab Error", ...),
                                       20, rep=TRUE)
                        )
    test3

    # specify the locations of the columns, and a non-standard conversion
    makeGenotypes(test3, convert=c("G1", "G2"), method=as.genotype.allele.count)


order.genotype  Order/sort genotype/haplotype object

Description

Order/sort genotype or haplotype object according to order of allele names or genotypes.

Usage

    ## S3 method for class 'genotype'
    order(..., na.last=TRUE, decreasing=FALSE,
          alleleOrder=allele.names(x), genotypeOrder=NULL)

    ## S3 method for class 'genotype'
    sort(x, decreasing=FALSE, na.last=NA, ...,
         alleleOrder=allele.names(x), genotypeOrder=NULL)

    genotypeOrder(x)
    genotypeOrder(x) <- value

Arguments

    ...             genotype or haplotype in order method; not used for sort
                    method
    x               genotype or haplotype in sort method
    na.last         as in default order or sort
    decreasing      as in default order or sort
    alleleOrder     character, vector of allele names in wanted order
    genotypeOrder   character, vector of genotype/haplotype names in wanted
                    order
    value           the same as in argument order.genotype

Details

Argument genotypeOrder can be useful when you want some genotypes to appear "together" even though they are not "together" by allele order.

Both methods (order and sort) work with genotype and haplotype classes. If alleleOrder is given, genotypeOrder has no effect.

Genotypes/haplotypes with alleles missing from alleleOrder are treated as NA and ordered according to the order arguments related to NA values. In such cases a warning is issued ("Found data values not matching specified alleles. Converting to NA.") and can be safely ignored. Genotypes present in x, but not specified in genotypeOrder, are also treated as NA.

For genotypes, a value of genotypeOrder such as "B/A" also matches "A/B".

Only unique values in argument alleleOrder or genotypeOrder are used, i.e. the first occurrence prevails.

Value

The same as in order or sort

Author(s)

Gregor Gorjanc

See Also

genotype, allele.names, order, and sort

Examples

    x <- c("C/C", "A/C", "A/A", NA, "C/B", "B/A", "B/B", "B/C", "A/C")
    alleles <- c("A", "B", "C")

    g <- genotype(x, alleles=alleles, reorder="yes")
    ## "C/C" "A/C" "A/A" NA    "B/C" "A/B" "B/B" "B/C" "A/C"

    h <- haplotype(x, alleles=alleles)
    ## "C/C" "A/C" "A/A" NA    "C/B" "B/A" "B/B" "B/C" "A/C"

    ## --- Standard usage ---

    sort(g)
    ## "A/A" "A/B" "A/C" "A/C" "B/B" "B/C" "B/C" "C/C" NA

    sort(h)
    ## "A/A" "A/C" "A/C" "B/A" "B/B" "B/C" "C/B" "C/C" NA

    ## --- Reversed order of alleles ---

    sort(g, alleleOrder=c("B", "C", "A"))
    ## "B/B" "B/C" "B/C" "A/B" "C/C" "A/C" "A/C" "A/A" NA
    ## note that A/B comes after B/C since it is treated as B/A;
    ## order of alleles (not in alleleOrder!) does not matter for a genotype

    sort(h, alleleOrder=c("B", "C", "A"))
    ## "B/B" "B/C" "B/A" "C/B" "C/C" "A/C" "A/C" "A/A" NA

    ## --- Missing allele(s) in alleleOrder ---

    sort(g, alleleOrder=c("B", "C"))
    ## "B/B" "B/C" "B/C" "C/C" "A/C" "A/A" NA    "A/B" "A/C"

    sort(g, alleleOrder=c("B"))
    ## "B/B" "C/C" "A/C" "A/A" NA    "B/C" "A/B" "B/C" "A/C"
    ## genotypes with missing allele are treated as NA

    sort(h, alleleOrder=c("B", "C"))
    ## "B/B" "B/C" "C/B" "C/C" "A/C" "A/A" NA    "B/A" "A/C"

    sort(h, alleleOrder=c("B"))
    ## "B/B" "C/C" "A/C" "A/A" NA    "C/B" "B/A" "B/C" "A/C"

    ## --- Use of genotypeOrder ---

    sort(g, genotypeOrder=c("A/A", "C/C", "B/B", "A/B", "A/C", "B/C"))
    ## "A/A" "C/C" "B/B" "A/B" "A/C" "A/C" "B/C" "B/C" NA

    sort(h, genotypeOrder=c("A/A", "C/C", "B/B",
                            "A/C", "C/B", "B/A", "B/C"))
    ## "A/A" "C/C" "B/B" "A/C" "A/C" "C/B" "B/A" "B/C" NA
    ## --- Missing genotype(s) in genotypeOrder ---

    sort(g, genotypeOrder=c("C/C", "A/B", "A/C", "B/C"))
    ## "C/C" "A/B" "A/C" "A/C" "B/C" "B/C" "A/A" NA    "B/B"

    sort(h, genotypeOrder=c("C/C", "A/B", "A/C", "B/C"))
    ## "C/C" "A/C" "A/C" "B/C" "A/A" NA    "C/B" "B/A" "B/B"


plot.genotype   Plot genotype object

Description

plot.genotype can plot genotype or allele frequency of a genotype object.

Usage

    ## S3 method for class 'genotype'
    plot(x, type=c("genotype", "allele"),
         what=c("percentage", "number"), ...)

Arguments

    x       genotype object, as genotype.
    type    plot "genotype" or "allele" frequency, as character.
    what    show "percentage" or "number", as character
    ...     Optional arguments for barplot.

Value

The same as in barplot.

Author(s)

Gregor Gorjanc

See Also

genotype, barplot

Examples

    set <- c("A/A", "A/B", "A/B", "B/B", "B/B", "B/B",
             "B/B", "B/C", "C/C", "C/C")
    set <- genotype(set, alleles=c("A", "B", "C"), reorder="yes")
    plot(set)
    plot(set, type="allele", what="number")


print.LD        Textual and graphical display of linkage disequilibrium (LD) objects

Description

Textual and graphical display of linkage disequilibrium (LD) objects

Usage

    ## S3 method for class 'LD'
    print(x, digits=getOption("digits"), ...)

    ## S3 method for class 'LD.data.frame'
    print(x, ...)

    ## S3 method for class 'data.frame'
    summary.LD(object, digits=getOption("digits"),
               which=c("D", "D'", "r", "X^2", "P-value", "n", " "),
               rowsep, show.all=FALSE, ...)

    ## S3 method for class 'summary.LD.data.frame'
    print(x, digits=getOption("digits"), ...)
    ## S3 method for class 'LD.data.frame'
    plot(x, digits=3, colorcut=c(0, 0.01, 0.025, 0.5, 0.1, 1),
         colors=heat.colors(length(colorcut)), textcol="black",
         marker, which="D'", distance, ...)

    LDtable(x, colorcut=c(0, 0.01, 0.025, 0.5, 0.1, 1),
            colors=heat.colors(length(colorcut)), textcol="black",
            digits=3, show.all=FALSE,
            which=c("D", "D'", "r", "X^2", "P-value", "n"),
            colorize="P-value", cex, ...)

    LDplot(x, digits=3, marker, distance,
           which=c("D", "D'", "r", "X^2", "P-value", "n", " "), ...)

Arguments

    x, object   LD or LD.data.frame object
    digits      Number of significant digits to display
    which       Name(s) of LD information items to be displayed
    rowsep      Separator between rows of data, use NULL for no separator.
    colorcut    P-value cutoff points for colorizing LDtable
    colors      Colors for each P-value cutoff given in colorcut for LDtable
    textcol     Color for text labels for LDtable
    marker      Marker used as 'comparator' on LDplot. If omitted, separate
                lines for each marker will be displayed
    distance    Marker location, used for locating markers on LDplot.
    show.all    If TRUE, show all rows/columns of matrix. Otherwise omit
                completely blank rows/columns.
    colorize    LD parameter used for determining table cell colors
    cex         Scaling factor for table text. If absent, text will be scaled
                to fit within the table cells.
    ...         Optional arguments (plot.LD.data.frame passes these to
                LDtable and LDplot)

Value

None.

Author(s)

Gregory R. Warnes

See Also

LD, genotype, HWE.test

Examples

    g1 <- genotype(c("T/A", NA, "T/T", NA, "T/A", NA, "T/T", "T/A",
                     "T/T", "T/T", "T/A", "A/A", "T/T", "T/A", "T/A", "T/T",
                     NA, "T/A", "T/A", NA))

    g2 <- genotype(c("C/A", "C/A", "C/C", "C/A", "C/C", "C/A", "C/A", "C/A",
                     "C/A", "C/C", "C/A", "A/A", "C/A", "A/A", "C/A", "C/C",
                     "C/A", "C/A", "C/A", "A/A"))

    g3 <- genotype(c("T/A", "T/A", "T/T", "T/A", "T/T", "T/A", "T/A", "T/A",
                     "T/A", "T/T", "T/A", "T/T", "T/A", "T/A", "T/A", "T/T",
                     "T/A", "T/A", "T/A", "T/T"))

    data <- makeGenotypes(data.frame(g1, g2, g3))

    # Compute & display LD for one marker pair
    ld <- LD(g1, g2)
    print(ld)

    # Compute LD table for all 3 genotypes
    ldt <- LD(data)

    # display the results
    print(ldt)                               # textual display
    LDtable(ldt)                             # graphical color-coded table
    LDplot(ldt, distance=c(124, 834, 927))   # LD plot vs distance

    # more markers makes prettier plots!
    data <- list()
    nobs <- 1000
    ngene <- 20
    s <- seq(0, 1, length=ngene)
    a1 <- a2 <- matrix("", nrow=nobs, ncol=ngene)
    for (i in 1:length(s)) {

      rallele <- function(p) sample(c("A", ...), 1, p=c(p, 1-p))

      if (i == 1) {
        a1[,i] <- sample(c("A", ...), 1000, p=c(0.5, 0.5), replace=TRUE)
        a2[,i] <- sample(c("A", ...), 1000, p=c(0.5, 0.5), replace=TRUE)
      } else {
        p1 <- pmax(pmin(0.25 + s[i] * as.numeric(a1[,i-1] == "A"), 1), 0)
        p2 <- pmax(pmin(0.25 + s[i] * as.numeric(a2[,i-1] == "A"), 1), 0)
        a1[,i] <- sapply(p1, rallele)
        a2[,i] <- sapply(p2, rallele)
      }

      # store one genotype per simulated marker
      data[[paste("G", i, sep="")]] <- genotype(a1[,i], a2[,i])
    }
    data <- data.frame(data)
    data <- makeGenotypes(data)

    ldt <- LD(data)
    plot(ldt, digits=2, marker=19)   # do LDtable & LDplot in a single
                                     # graphics window


summary.genotype
                Allele and Genotype Frequency from a Genotype or Haplotype Object

Description

summary.genotype creates an object containing allele and genotype frequency from a genotype or haplotype object. print.summary.genotype displays a summary.genotype object.

Usage

    ## S3 method for class 'genotype'
    summary(object, ..., maxsum)

    ## S3 method for class 'summary.genotype'
    print(x, ..., round=2)

Arguments

    object, x   an object of class genotype or haplotype (for
                summary.genotype) or an object of class summary.genotype
                (for print.summary.genotype)
    ...         optional parameters. Ignored by summary.genotype, passed to
                print.matrix by print.summary.genotype.
    maxsum      specifying any value for the parameter maxsum will cause
                summary.genotype to fall back to summary.factor.
    round       number of digits to use when displaying proportions.

Details

Specifying any value for the parameter maxsum will cause fallback to summary.factor. This is so that the function summary.data.frame will give reasonable output when it contains a genotype column. (Hopefully we can figure out something better to do in this case.)

Value

The returned value of summary.genotype is an object of class summary.genotype which is a list with the following components:

    locus           locus information field (if present) from x
    allele.names    vector of allele names
    allele.freq     A two column matrix with one row for each allele, plus
                    one row for NA values (if present). The first column,
                    Count, contains the frequency of the corresponding allele
                    value. The second column, Proportion, contains the
                    fraction of alleles with the corresponding allele value.
                    Note each observation contains two alleles, thus the
                    Count field sums to twice the number of observations.
    genotype.freq   A two column matrix with one row for each genotype, plus
                    one row for NA values (if present). The first column,
                    Count, contains the frequency of the corresponding
                    genotype. The second column, Proportion, contains the
                    fraction of genotypes with the corresponding value.

print.summary.genotype silently returns the object x.

Author(s)

Gregory R. Warnes

See Also

genotype, HWE.test, allele, homozygote, heterozygote, carrier, allele.count, locus, gene, marker

Examples

    example.data <- c("D/D", ...,
                      "D/D", ...)
    g1 <- genotype(example.data)
    g1

    summary(g1)


undocumented    Undocumented functions

Description

These functions are undocumented. Some are internal and not intended for direct use. Some are not yet ready for end users. Others simply haven't been documented yet.

Author(s)

Gregory R. Warnes


write.pop.file  Create genetics data files

Description

write.pop.file creates a 'pop' data file, as used by the GenePop (http://wbiomed.curtin.edu.au/genepop/) and LinkDos (http://wbiomed.curtin.edu.au/genepop/linkdos.html) software packages.

write.pedigree.file creates a 'pedigree' data file, as used by the QTDT software package (http://www.sph.umich.edu/statgen/abecasis/QTDT/).
write.marker.file creates a 'marker' data file, as used by the QTDT software package (http://www.sph.umich.edu/statgen/abecasis/QTDT/).

Usage

    write.pop.file(data, file="", digits=2, description="Data from R")
    write.pedigree.file(data, family, pid, father, mother, sex,
                        file="pedigree.txt")
    write.marker.file(data, location, file="marker.txt")

Arguments

    data          Data frame containing genotype objects to be exported
    file          Output filename
    digits        Number of digits to use in numbering genotypes, either 2
                  or 3.
    description   Description to use as the first line of the 'pop' file.
    family, pid, father, mother
                  Vector of family, individual, father, and mother id's,
                  respectively.
    sex           Vector giving the sex of the individual (1=Male, 2=Female)
    location      Location of the marker relative to the gene of interest,
                  in base pairs.

Details

The format of 'Pop' files is documented at http://wbiomed.curtin.edu.au/genepop/help_input.html, the format of 'pedigree' files is documented at http://www.sph.umich.edu/csg/abecasis/GOLD/docs/pedigree.html and the format of 'marker' files is documented at http://www.sph.umich.edu/csg/abecasis/GOLD/docs/map.html.

Value

No return value.

Author(s)

Gregory R. Warnes

See Also

write.table

Examples

    # TBA