© Swets & Zeitlinger
A Multifactorial Analysis of Syntactic Variation: Particle Movement Revisited*
Stefan T. Gries
Southern Denmark University, Denmark
ABSTRACT
The present paper investigates the word order alternation of English transitive phrasal verbs such as, e.g., to pick up the book versus to pick the book up. It builds on traditional monofactorial analyses, but argues that previously used methods of analysis are grossly inadequate to describe, explain and predict the word order choice by native speakers. A hypothesis integrating virtually all relevant variables ever postulated is proposed and investigated from a multifactorial perspective (using GLM, linear discriminant analysis and CART). As a result, more than 84% of native speakers' choices can be predicted. Further implications (linguistic and methodological) are discussed.
INTRODUCTION
A notoriously difficult problem for syntactic research is the existence of syntactic variation, i.e. closely related syntactic variants with truth-conditionally equivalent meanings. Examples in English include the well-known word order alternations Dative Movement, Preposition Stranding and Particle Movement in (1), (2) and (3) respectively.1

(1) (a) John [VP gave [NP the book] [PP to [NP Bill]]].
    (b) John [VP gave [NP Bill] [NP the book]].
(2) (a) Who did you [VP see Bill [PP with ti]]?
    (b) [PP With whom]i did you [see Bill ti]?
(3) (a) John picked up [NP the book].
    (b) John picked [NP the book] up.

* Address correspondence to: Stefan T. Gries, Southern Denmark University, Institut for Erhvervssproglig Informatik og Kommunikation, Grundtvigs Allé 150, DK-00 Sønderborg. E-mail: stgries@sb.sdu.dk
1 The grammatical notation is not committed to any particular grammatical framework and serves expository reasons only. Likewise, the choice of terminology in terms of movement processes is not meant to truly imply any such processes; it merely reflects that these phenomena have most frequently been dealt with within the transformational-generative paradigm.
Several interrelated questions arise with respect to these constructional alternations:

• how do the two constructional variants of each pair differ from each other?
• why and to what extent do different variables influence the subconscious choice of construction by native speakers in a natural setting?
• how can native speaker choices of constructions in a particular discourse situation be predicted? More specifically, which techniques are most suitable for the prediction of native speakers' choices given the complexity of natural discourse settings?
This study investigates these questions for the last of the above-mentioned word order alternations, that of transitive phrasal verbs; the construction where no element intervenes between verb and particle will be referred to as construction0; the construction where a direct object NP intervenes between verb and particle is referred to as construction1:2

(4) (a) John picked up the book.    construction0
    (b) John picked the book up.    construction1

Section 2 is concerned with a brief summary of previous analyses of Particle Movement. At the same time, several methodologically motivated points of critique are raised. Section 3 outlines the methods by means of which the above three objectives are pursued. Section 4 deals with the results of the study: monofactorial as well as multifactorial results will be presented in some detail. Finally, Section 5 concludes by situating the study within a broader psycholinguistic framework and by briefly discussing further implications going beyond the immediate scope of Particle Movement.

2 The reason for this seemingly arbitrary and counter-intuitive labelling will be addressed later; as a mnemonic help, consider the index to name the number of constituents intervening between verb and particle.
PREVIOUS ANALYSES
Variables that purportedly govern the alternation
The position of particles in transitive phrasal verbs has been investigated time and again within the last 100 years. The approaches come from widely diverging linguistic schools of thought such as Chomskyan transformational-generative grammar (cf., e.g., Fraser, 1974, 1976; Den Dikken, 1992, 1995; Rohrbacher, 1994, to name but a few), traditional grammar (Sweet, 12; Jespersen, 1928; Kruisinga & Erades, 1953), cognitive grammar (Yeagle, 1983), discourse-functional approaches (Chen, 1986), psycholinguistically oriented approaches (Hawkins, 1994) etc. Over time, various variables have been proposed in order to account for both the optimal structural configuration of the two constructions and the question of which construction is chosen by native speakers. Table 1 provides an overview of
Table 1. Variables that allegedly govern the alternation.

Value for construction0 | Variable | Value for construction1
Long DO | Length of the DO in words (LengthW) |
Long DO | Length of the DO in syllables (LengthS) |
Complex | Complexity of the DO (Complex) |
 | NP-Type of the DO (Type) | semi-pronominal, pronominal
Indefinite | Determiner of the DO (Det) | definite
No | Previous mention of the DO (Lm) | yes
Low | Times of preceding mention of the DO (Topm) | high
High | Distance to last mention of the DO (Dtlm/ActPC) | low
High | News Value of the DO | low
Yes | (Contrastive) Stress of the DO | no
Yes | Subsequent mention of the DO (NM) |
High | Times of subsequent mention of the DO (Tosm) | low
Low | Distance to next mention of the DO (Dtnm/ClusSC) | high
 | Overall frequency of the DO (OM) |
 | Following directional adverbial (PP) | yes
Yes | Prep of the following PP is identical to the particle (PartPrep) |
 | Register |
Idiomatic | Meaning of the VP (Idiomaticity) | literal
Low | Cognitive Entrenchment of the DO | high
Inanimate | Animacy of the DO (Animacy) | animate
Abstract | Concreteness of the DO (Concreteness) | concrete
the multitude of variables proposed so far. The central column names the variable proposed whereas the left and right columns name the values/levels of each variable associated with a particular preference for a construction.

Comments and points of critique
This list of variables may seem quite impressive at first. It is especially interesting to note that what looks at first glance like a quite simple word order alternation is in fact influenced by variables from many different sub-branches of linguistics: phonology, syntax, semantics, pragmatics and others. Unfortunately, however, there are also several shortcomings that have hindered progress considerably.

First of all, most variables are based on introspective analysis (i.e., acceptability judgements) and non-authentic example sentences. While there are some linguistic frameworks which consider this to be a rewarding way of gathering data, I would contend that (i) acceptability judgements very often do not constitute objective, reliable and valid data (cf., e.g., Labov, 1975; Schütze, 1996); (ii) it is questionable that an analysis solely based on dreamt-up sentences can in fact obtain representative results; and (iii) 'no one has ever presented even a hint of evidence that any part of the human's linguistic competence is the ability to evaluate sentences produced artificially, out of context' (Prince, 1991, p. 80).
Second, most analyses only consider monofactorial results (i.e., the effect one variable has on the alternation in isolation) although for the speaker all variables are given simultaneously. For instance, Fraser (1974) argued that verbs without initial stress prefer construction1, offering the following sentences as what he claims to be supporting evidence.

(5) (a) ?I will insult back the man.
    (b) I will insult the man back.
(6) (a) ?We converted over the heating to steam.
    (b) We converted the heating over to steam.
(7) (a) ?They attached up the tag on the wall.
    (b) They attached the tag up on the wall.

But what is problematic about this approach? Is this not an example of one of the most traditional and well-established methods in linguistics, namely the minimal-pair test? The problem lies in the fact that the examples do not warrant this claim at all: the preference for construction1 in these examples (if there is one at all; recall the scepticism expressed above concerning such isolated acceptability judgements) need not be related to Fraser's claim and might as well derive from the fact that short and simple direct objects already favour construction1, as do definite determiners and literal VP meanings (cf. Table 1). If we generalize from this phenomenon to other analyses (which we can do: nearly all previous analyses are monofactorial) we find that, given the complexity of 20 or so interacting variables, we cannot rely on monofactorial analyses to describe Particle Movement adequately.
Finally, it is generally accepted that science tries to describe, explain and predict phenomena. With Particle Movement (and many other cases of syntactic variation), however, the most rigorous test of one's theory, namely the actual prediction of speakers' behaviour, has never been attempted. Every analysis has aimed at describing particle placement at least to some extent; some analyses have aimed at explaining particle placement, but only a few aim at subsuming the variables under a common (set of) factor(s); no analysis has aimed at predicting particle placement in natural discourse situations. This is of course a consequence of the previously mentioned shortcomings: if, for instance, one is not able to quantify the importance of the individually proposed variables, then traditional accounts already fail to predict constructional choices when only two variables have conflicting preferences. Consider John picked up a book. The short direct object prefers construction1 whereas the indefinite determiner prefers construction0. Evidently, without a way to weigh individual variables' preferences, traditional accounts cannot even predict speakers' preferences in the simple cases where only two variables are concerned, which is why so far nobody has managed to predict speakers' choices while simultaneously accounting for more than a dozen variables.
As is evident from the three above-mentioned research questions (cf. Section 1), I intend to overcome these shortcomings. The following section is concerned with the methods I use to that end.
METHODS
The Processing Hypothesis (PH)
In order to explain why speakers choose the construction they do, I propose the following hypothesis: the multitude of variables (most of which are concerned with the direct object NP) that seem to be related to Particle Movement can all be related to the processing effort of the utterance.3 My notion of processing effort is a fairly broad one: it encompasses not only purely syntactic determinants, but also factors from other linguistic levels. More specifically, I assume that virtually all levels of linguistic description mentioned above can contribute to processing effort:
• phonologically indicated processing cost: contrastive stress on a linguistic expression increases the amount of processing effort because the speaker focuses (which is itself not an effortless task) the hearer's attention on the referent of the contrastively-stressed expression;
• morphosyntactically determined processing cost: the longer and the more complex the direct object NP is, the more effort (and working memory) is needed to process the utterance correctly;
• semantically conditioned processing cost: if the meaning of the VP is idiomatic, then the meaning of the whole of the transitive phrasal verb is not computable from the meaning of the individual parts so that the parts of the idiomatic phrasal verbs rely more on one another than with totally literal phrasal verbs (which mostly designate caused motion). Following, say, Hawkins (2000), we may assume that there is a tendency to minimize what he refers to as lexical dependency domains (LPDs), i.e. (slightly simplified) the distance between expressions (at times mutually) dependent on one another for their interpretation. With idiomatic phrasal verbs (e.g., to eke out), the semantic dependency between verb and particle is higher than the dependency for literal verbs (e.g., to bring back), so we would accordingly expect construction0, minimizing the distance between the component parts of the phrasal verb, whereas with literal phrasal verbs, no preference for a construction is to be expected because the low degree of interdependence does not require a particularly small distance between the component parts and, thus, licenses both word orders. In connection with that, concrete DOs are more likely to correlate with a literal interpretation of the verb-particle construction since these referents can undergo the caused motion that verb-particle constructions commonly denote; abstract DOs, on the other hand, give rise to less literal interpretations (to bring back peace is not a case of literal caused motion). Thus, concreteness/abstractness of the DO's referents correlates with the literalness/idiomaticity of the verb-particle constructions and yields identical predictions;
• discourse-functionally determined processing cost: if the referent of the direct object NP is discourse-given or can be inferred from the preceding context, then its activation and production incurs less processing cost than the activation and production of some discourse-new or even completely unknown referent (cf., e.g., Givón, 1992). A morphosyntactic phenomenon strongly correlating with degree of givenness of NP referents is the choice of determiner. It is widely acknowledged that indefinite determiners are typically used for new referents while definite determiners are more often found with given referents.

3 The question arises as to whether I refer to the processing effort of the speaker or the hearer. My own focus is on the speaker's processing effort; with Particle Movement, however, we find that what makes processing efficient for speakers is also beneficial to hearers. Thus, no strict differentiation between the two interlocutors is necessary here.
Since (i) speakers strive to communicate whatever they intend to communicate with as little effort as possible and (ii) construction0 is inherently easier to process (cf. Hawkins, 1994; Rohdenburg, 1996), they will tend to use construction0 in situations where the processing effort associated with the utterance is already high. In other words, if the VP does not require a lot of processing effort (due to its limited length, the degree of activation of the DO's referent, etc.) then construction1 is chosen; if the VP requires a lot of processing effort (due to the processing cost for the DO's referent) then construction0 is chosen in order to facilitate communication by minimizing the structurally determined processing effort. Note, however, that this hypothesis implies that some of the variables mentioned above will not be relevant for the choice of construction since there is no reason why variables concerned with the discourse following the verb-particle construction should play a role, just as there is no reason why the animacy of the direct object's referent should be important. Finally, contrary to a previous analysis (cf. Gries, 1999), the variable of entrenchment is not considered to be relevant since the variables that, taken together, constitute the entrenchment hierarchy used previously are investigated here separately and thus much more precisely (cf. Gries, 2000 for a more detailed statistical analysis of these variables).
The data

In order not to rely on made-up sentences and their (at times) doubtful acceptability judgements, I advocate the use of naturally-occurring data. I have, therefore, compiled a sample of 403 utterances with verb-particle constructions from the British National Corpus. The verb-particle constructions chosen mainly consist of combinations of the most frequent verbs and particles entering into transitive phrasal verbs.4 Table 2 shows the distribution of my data.

Table 2. Data from the British National Corpus.

                 Construction0   Construction1   Row totals
Spoken                      67             133          200
Written                    127              76          203
Column totals              194             209          403
To each of these sentences the 10 preceding and subsequent clauses (without false starts or discourse markers such as You know or I mean) were added. Then, each sentence was investigated with respect to the variables listed above in Table 1.5 The table resulting from these processes was the basis for the analysis to follow.
Statistical techniques

First of all, for each variable a monofactorial correlation was computed. Depending on the measurement scale of the independent variables, the coefficients given in Table 3 were determined.6 It should be noted, however, that the monofactorial correlations are only determined in order to test previous monofactorial analyses empirically, most of which have not been empirically tested before. The primary purpose of this paper, i.e., the prediction of speakers' choices, can only be achieved with more elaborate techniques. The multifactorial techniques that will be used are the general linear model (GLM), linear discriminant analysis (LDA) and classification and regression trees (CART).

4 I have determined the most frequent verbs and particles in transitive phrasal verbs on the basis of my own collection of 1,357 transitive phrasal verbs from several dictionaries.
5 The degrees of complexity and idiomaticity were measured on ordinal scales: simple NPs, intermediate NPs (NPs with modification by adjectives and/or genitives) and complex NPs (containing embedded clauses) for complexity, and simple/literal, metaphorical/figurative and idiomatic/opaque VPs for idiomaticity.
6 The ranking is roughly according to the size of the correlation (but cf. below). While the two values of φ and λ do not always coincide, the sizes of all correlation coefficients (once with the φ coefficient, once with λ) correlate highly significantly (γ = 0.85; z = 6.923; p < 0.001***), which is why these minor ranking differences will not be dealt with.

Table 3. Monofactorial correlations computed for the corpus data.

Measurement scale of the independent variable | Correlation coefficient
Nominal/categorical | Phi/Cramer's V and Lambda
Ordinal | γ (equalling Kendall's τ with correction for ties)
Interval | Pearson product-moment correlation
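As an illustration of the nominal-scale coefficient, the following minimal Python sketch computes phi for a 2×2 table. It uses the spoken/written counts as read from Table 2; the function name `phi_coefficient` and the layout of the table are assumptions made for this example and are not taken from the original analysis.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]] (hypothetical helper)."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Counts as read from Table 2 (assumed layout):
# rows = register, columns = (construction0, construction1)
spoken = (67, 133)
written = (127, 76)

phi = phi_coefficient(spoken[0], spoken[1], written[0], written[1])

# The sign only reflects how the categories are coded, so the
# magnitude is what is compared across variables.
print(round(abs(phi), 3))
```

Under these assumptions the magnitude comes out at about 0.29, which appears to agree with the phi value listed for Register in Table 4.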
RESULTS
Monofactorial results

Table 4 summarizes the monofactorial correlation coefficients of every independent variable; for nominal and ordinal variables, the correlations between individual levels and the choice of construction are also provided.7 The PH is strongly supported:

• those variables/values that indicate a low degree of processing cost (e.g., literal VPs with pronominal DOs or short lexical DOs with definite determiners where the DO has been mentioned before frequently) favour construction1;
• those variables/values that indicate a high degree of processing cost (e.g., idiomatic VPs with discourse-new lexical referents of long DO NPs with indefinite determiners) favour construction0.

On the whole, the morphosyntactic variables are most influential; semantic and discourse-functional variables are less powerful in determining the choice of construction. However, three comments are necessary. First, given the mathematically different ways of calculating these coefficients, it is not possible to compare the variables' power by simply comparing the absolute values of the correlation coefficients. Second, some variables that were predicted not to correlate significantly with choice of construction do in fact display a significant correlation, so further investigation is called for. Lastly, as was argued above, a monofactorial perspective (i) does not do justice to the complexity of the phenomenon and (ii) defies any cognitively realistic account of the alternation.

Table 4. Monofactorial correlations between variables and the choice of construction.

Variable / Variable: Value | Correlation coefficient
Complexity of the DO | γ = −0.85***
Idiomaticity of the VP | γ = −0.6***
Complex: simple NP | φ = 0.522*** (λ = 0.49)
NP Type of the DO | φ = 0.492*** (λ = 0.366)
Length of the direct object in syllables | r_pbis = −0.481***
Type: lexical NP | φ = 0.47*** (λ = 0.366)
Type: pronominal NP | φ = 0.468*** (λ = 0.32)
Complex: intermediate NP | φ = 0.455*** (λ = 0.412)
Distance to last mention of the DO | r_pbis = 0.452***
Cohesiveness of the DO to the preceding discourse | r_pbis = 0.429***
Length of the DO in words | r_pbis = −0.423***
Times of preceding mention of the DO | r_pbis = 0.414***
Last mention of the DO | φ = 0.411*** (λ = 0.387)
Overall mention of the DO | r_pbis = 0.357***
Concreteness of the DO | φ = 0.339*** (λ = 0.314)
Idiomaticity: idiomatic VP | φ = −0.328*** (λ = 0.253)
Determiner of the DO | φ = 0.319*** (λ = 0.206)
Idiomaticity: literal VP | φ = 0.314*** (λ = 0.268)
Register | φ = 0.291*** (λ = 0.263)
DET: indefinite determiner | φ = −0.288*** (λ = 0.206)
Directional adverbial following the DO | φ = 0.284*** (λ = 0.16)
DET: no determiner | φ = 0.232*** (λ = 0.191)
Complex: complex NP | φ = −0.193*** (λ = 0.077)
Times of subsequent mention of the DO | r_pbis = 0.191***
Animacy of the DO | φ = 0.166*** (λ = 0.057)
Cohesiveness of the DO to the subsequent discourse | r_pbis = 0.142**
Next mention of the DO | φ = 0.104* (λ = 0.072)
Distance to next mention of the DO | r_pbis = 0.1*
Type: semi-pronominal NP | φ = 0.092*** (λ = 0)
Idiomaticity: metaphorical VP | φ = −0.047 ns (λ = 0.016)
Type: proper name | φ = 0.023 ns (λ = 0)
DET: definite determiner | φ = −0.018 ns (λ = 0)
Particle equals the preposition of the following PP | φ = 0.003 ns (λ = 0)

7 The ranking is roughly according to the size of the correlation (but cf. below). While the two values of φ and λ do not always coincide, the sizes of all correlation coefficients (once with the φ coefficient, once with λ) correlate highly significantly (γ = 0.85; z = 6.923; p < 0.001***), which is why these minor ranking differences will not be dealt with.
Multifactorial results: GLM and Linear Discriminant Analysis (LDA)

The multiple correlation between all variables included by the PH and the choice of construction as determined by the GLM is highly significant: R = 0.786; F(71, 331) = 7.512; p < 0.001***. Given the large number of intercorrelations between the predictor variables, however, this correlation coefficient needs to be adjusted for shrinkage using Wherry's formula; R adjusted = 0.732. This value is still quite high and strongly supports the PH. It does so especially when we consider the following two points:

• R obtained on the basis of the PH is even larger than R obtained when we include all variables mentioned in Table 1 (namely, R adjusted = 0.718; F(126, 276) = 4.4; p < 0.001***), which shows that the variables I have chosen to eliminate only add random noise to the analysis anyway;
• multiple correlations that are obtained in other behavioural sciences are often much smaller, so the amount of variance accounted for in the present study is comparatively large.
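The shrinkage adjustment mentioned above can be checked with a short Python sketch. It assumes the commonly cited form of Wherry's formula, R²adj = 1 − (1 − R²)(n − 1)/(n − k − 1), and takes the number of predictors k from the numerator degrees of freedom of the reported F value; both the helper name and these readings are assumptions of this example rather than details stated in the text.

```python
import math

def wherry_adjusted_r(r, n, k):
    """Shrinkage-adjusted multiple correlation (assumed form of Wherry's formula).

    r: unadjusted multiple correlation
    n: number of cases
    k: number of predictor terms
    """
    r2_adjusted = 1.0 - (1.0 - r ** 2) * (n - 1) / (n - k - 1)
    # Guard against a negative adjusted R^2 for very weak models.
    return math.sqrt(max(r2_adjusted, 0.0))

n_cases = 403       # size of the corpus sample
k_predictors = 71   # assumed from F(71, 331): 71 + 331 + 1 = 403

r_adjusted = wherry_adjusted_r(0.786, n_cases, k_predictors)
print(round(r_adjusted, 3))
```

With these inputs the sketch yields roughly 0.732, in line with the adjusted value reported for the PH model.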
But what about the predictive power of my hypothesis? An LDA shows that the variables included in the PH make it possible to predict which construction a speaker will choose in a particular discourse situation. The discriminant function is highly significant (canonical r = 0.73; χ² = 297.37; d.f. = 20, p = 0***). Moreover, the discriminant function can classify 86.1% of the constructional choices within the sample. However, it is more important to also cross-validate this result in order to avoid circularity of reasoning by using cases for their own 'prediction'. Two measures were therefore taken to improve the analysis:

(a) the leave-one-out method for cross-validation, yielding a prediction accuracy of 84.1%, a result that is virtually impossible to obtain by pure chance: according to the exact binomial test, the chance for 339 correct hits in 403 trials approaches zero;
(b) the split-sample technique, where I divided the corpus data into a learning sample and a prediction sample in order to derive a discriminant function from the learning sample which was subsequently applied to the prediction sample. In order not to be accused of a possibly biased choice of samples, this was done three times with randomly chosen sentences from the different registers. Table 5 gives an overview of the results.

Obviously, the results are quite robust: the prediction accuracies are all significant, as determined by testing the correct hit rate against the one expected by pure chance using the exact binomial test.8 The best prediction results are achieved for written data, the worst for oral data, which is to be expected, given the more spontaneous and interactive nature of natural discourse as opposed to planned writing.

Table 5. Cross-validated prediction accuracy of LDA for split samples.

Learning sample | Prediction sample | Correct predictions for prediction sample
200 spoken sentences and 150 written sentences | 53 written sentences | 84.9%; p (binomial test) ≈ 1.184 × 10^-7 ***
150 spoken sentences and 200 written sentences | 53 spoken sentences | 64.2%; p (binomial test) ≈ 0.027 *
174 spoken sentences and 176 written sentences | 26 spoken sentences and 27 written sentences | 75.5%; p (binomial test) ≈ 1.343 × 10^-4 ***
Average | | 74.8%
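The leave-one-out logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it implements a two-class Fisher discriminant for two numeric predictors with a midpoint threshold (i.e., equal priors), and the toy data (a length measure and a count of previous mentions) are invented for the example; none of it reproduces the software or the variable set used in the study.

```python
def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    """Within-class scatter terms (sxx, sxy, syy) around the class mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fit_lda(class0, class1):
    """Fisher discriminant: w = Sw^-1 (m1 - m0), threshold at the midpoint."""
    m0, m1 = mean(class0), mean(class1)
    a0, b0, d0 = scatter(class0, m0)
    a1, b1, d1 = scatter(class1, m1)
    a, b, d = a0 + a1, b0 + b1, d0 + d1      # pooled scatter [[a, b], [b, d]]
    det = a * d - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    w = ((d * dx - b * dy) / det, (a * dy - b * dx) / det)
    mid = ((m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2)
    return w, w[0] * mid[0] + w[1] * mid[1]

def predict(model, x):
    w, threshold = model
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0

def leave_one_out_accuracy(data):
    """Each case is predicted by a function derived from all other cases."""
    hits = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        class0 = [p for p, y in train if y == 0]
        class1 = [p for p, y in train if y == 1]
        hits += predict(fit_lda(class0, class1), x) == label
    return hits / len(data)

# Hypothetical toy cases: ((DO length, previous mentions), construction)
toy = [((6, 0), 0), ((7, 1), 0), ((8, 0), 0), ((9, 1), 0),
       ((1, 3), 1), ((2, 4), 1), ((3, 3), 1), ((2, 5), 1)]

accuracy = leave_one_out_accuracy(toy)
print(accuracy)
```

Because no case contributes to the function that classifies it, the resulting accuracy is not inflated by the circularity that affects the within-sample classification rate.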
Let us now try to find out which variables are responsible for the good discrimination between the two constructions. The previous results were concerned with an LDA where, for theoretical reasons, only the variables of the PH were included. But we also need to find out whether it is empirically plausible to exclude some variables from further consideration, be it only to support the results obtained by the GLM. Thus, a second LDA was computed where all variables were included; Table 6 summarizes its results: the higher the absolute value of a factor loading for a variable, the more important it is for the choice of construction in the only cognitively realistic analysis of the situation, namely when all of the variables are considered simultaneously.
8 One might wonder why the split-sample technique yields slightly worse results than the leave-one-out method. This is due to the fact that the learning samples for the leave-one-out method contain 52 sentences more than those of the split-sample technique.
Table 6. Factor loadings of the discriminant analysis.

Variable | Factor loading | Kind of variable

High variable values → construction0; low variable values → construction1:
LengthS | 0.522 | Morphosyntactic
Type: lexical | 0.498 | Morphosyntactic
Complex: intermediate | 0.479 | Morphosyntactic
LengthW | 0.447 | Morphosyntactic
Idiom: idiomatic | 0.325 | Semantic
DET: indefinite | 0.281 | Morphosyntactic

Due to the low factor loadings (< 0.22) these variables do not discriminate well between the two constructions:
Complex: complex | 0.184 | Morphosyntactic
Idiom: metaphorical | 0.044 | Semantic
DET: definite | 0.016 | Morphosyntactic
Disfluency | 0.006 | Other
PartPrep | 0.002 | Other
Type: proper name | −0.021 | Morphosyntactic
Type: semipronominal | −0.086 | Morphosyntactic
ClusSC | −0.094 | Discourse-functional (subsequent context)
NM | −0.098 | Discourse-functional (subsequent context)
COHSC | −0.135 | Discourse-functional (subsequent context)
Animacy | −0.157 | Semantic
TOSM | −0.183 | Discourse-functional (subsequent context)

High variable values → construction1; low variable values → construction0:
DET: no determiner | −0.223 | Morphosyntactic
PP | −0.278 | Other
Idiom: literal | −0.309 | Semantic
Concrete | −0.337 | Semantic
OM | −0.358 | Discourse-functional
LM | −0.422 | Discourse-functional (preceding context)
TOPM | −0.427 | Discourse-functional (preceding context)
COHPC | −0.445 | Discourse-functional (preceding context)
ACTPC | −0.474 | Discourse-functional (preceding context)
Type: pronominal | −0.496 | Morphosyntactic
Complex: simple | −0.573 | Morphosyntactic

Obviously, the PH is again strongly supported. Not only do we find that all variables included in the PH correlate with the choice of construction as predicted; in the multifactorial analysis, we also see that the variables concerned with the subsequent context are indeed irrelevant, as was predicted by the PH. On the whole, we find the following ranking of strength of variable groups: discourse-functional variables (preceding context), syntactic variables, semantic variables and other variables.9

Multifactorial results: Classification and Regression Trees (CART)
While the results of the LDA are quite convincing, there is one objection concerning the application of an LDA that might be raised. It is concerned with the standard assumption that an LDA requires a multivariate normal distribution of the data, and one might (correctly) claim that it is doubtful that my data do indeed meet this demand and, thus, that the above results are to be taken with a grain of salt. However, there are several arguments supporting the above analysis, results and interpretation even though the distributional assumptions of LDAs are not met.

First, while many researchers tend to emphasize the importance of distributional assumptions (such as normality, homogeneity of variances and the like), a number of scholars argue that, in practice, these assumptions are not as essential as they might seem on a purely mathematical basis (cf. Winer et al., 1991, p. 5). Second, it has even been claimed that there is no test that reliably identifies multivariate normal distributions (cf. Bortz, 1999, p. 435). Third, the difference between LDA and CART is of course not just a statistical/mathematical one; rather, there is also a conceptual difference: while an LDA includes all variables simultaneously in the calculation to compute a prediction for one of the two constructional choices, the tree resulting from CART analyses includes variables sequentially. For a native speaker, however, I believe that the model underlying LDA is more realistic. It is intuitively more plausible to assume that all the variables' values/levels I have discussed are somehow set at the point of time the speaker chooses the word order rather than that the values/levels are included one by one in a sequential fashion. Moreover, while there is still considerable debate whether psycholinguistic theories of speech production should incorporate parallel or serial models of processing, I, following Berg (1998), consider parallel processing theories more rewarding from both a theoretical and a practical point of view. I have decided, for these reasons, to predict native speakers' choices with an LDA which, as opposed to CART, comes closest to predicting choices on the basis of a simultaneous/parallel inclusion of the relevant data.

9 This was determined by calculating (i) the arithmetic means (AMs) of the absolute values of the factor loadings for each variable group and (ii) the medians of the ranks of all variables in a group when the variables are ordered according to their factor loadings. Both results were identical.
Table 7. Parameters and settings of the CART analysis.

Parameter | Setting
Method | CART-style exhaustive search for univariate splits
Stop rule | FACT-style direct stopping: fraction of objects 0.5
Prior probabilities | identical: construction0: p ≈ 0.5; construction1: p ≈ 0.5
Goodness-of-fit index | Gini measure
Nevertheless, it might very well be the case that these reasons do not satisfy truly mathematically-oriented researchers. I have, therefore, also analyzed my data using the CART module of Statistica 5.5; the algorithms used therein are based on CART by Breiman et al. (1984), where CART and QUEST algorithms are used to classify and predict data in the absence of distributional assumptions. My CART analysis of the data was based on the parameters and settings given in Table 7.
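To give an idea of what the Gini measure in Table 7 evaluates, here is a minimal Python sketch of the Gini impurity of a node and of the impurity decrease achieved by one candidate split. It is an illustration only: the counts are the register figures as read from Table 2, the split by register is chosen purely as an example, observed class proportions are used instead of the equal priors set in the actual analysis, and the function names are hypothetical.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def gini_decrease(parent, left, right):
    """Impurity decrease when a parent node is split into two children."""
    n = sum(parent)
    weighted_children = (sum(left) * gini(left) + sum(right) * gini(right)) / n
    return gini(parent) - weighted_children

# Class counts as (construction0, construction1), read from Table 2.
root = (194, 209)      # all 403 sentences
spoken = (67, 133)     # hypothetical left child
written = (127, 76)    # hypothetical right child

root_impurity = gini(root)
improvement = gini_decrease(root, spoken, written)
print(round(root_impurity, 4), round(improvement, 4))
```

An exhaustive search for univariate splits, in this simplified picture, evaluates such a decrease for every candidate split of every variable and keeps the best one at each node.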
The result of the analysis can be summarized as follows: out of all 403 sentences, 349 (86.6%) were classified correctly while 54 (13.4%) were classified wrongly, again a result that is extremely unlikely to be obtained randomly. Again, however, we must also determine the prediction accuracy by a cross-validation technique. Since the leave-one-out method for CART is not available in Statistica 5.5, I used only the split-sample technique analogous to the LDA; the samples and the results are listed in Table 8.

Table 8. Cross-validated prediction accuracy of CART for split samples.

Learning sample | Prediction sample | Correct predictions for prediction sample
200 spoken sentences and 150 written sentences | 53 written sentences | 81.1%; p (binomial test) ≈ 2.775 × 10^-6 ***
150 spoken sentences and 200 written sentences | 53 spoken sentences | 56.6%; p (binomial test) ≈ 0.205 ns
174 spoken sentences and 176 written sentences | 26 spoken sentences and 27 written sentences | 77.4%; p (binomial test) ≈ 4.086 × 10^-5 ***
Average | | 71.7%

Admittedly, the cross-validated prediction accuracy of CART is not as high as the LDA results, but, apart from the prediction sample for oral data alone, the accuracies are still far better than what might be expected by pure chance. Moreover, there is a reason for these minor differences. Given the above parameter settings, the CART technique does not utilize all variables for the prediction of a choice of construction but only the most important ones as determined by the analysis. Thus, for cases where variables of an overall minor importance are decisive, false predictions are more likely.
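The comparison with pure chance can be made concrete with a small Python sketch of a one-sided exact binomial test. It assumes a chance level of 0.5 for the two constructions and infers the hit counts from the reported percentages (43 of 53 for 81.1%, 30 of 53 for 56.6%); the function name and these readings are assumptions of this example.

```python
from math import comb

def binomial_tail(hits, trials, p=0.5):
    """P(X >= hits) for X ~ Binomial(trials, p): one-sided exact binomial test."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(hits, trials + 1))

# Hit counts inferred from the percentages for the 53-sentence prediction samples.
p_written = binomial_tail(43, 53)   # 81.1% correct
p_spoken = binomial_tail(30, 53)    # 56.6% correct

print(f"{p_written:.3e}", f"{p_spoken:.3f}")
```

Under these assumptions the two tail probabilities come out at about 2.8 × 10^-6 and 0.2, close to the values given for the written and the spoken prediction sample; the same function applied to 45 of 53 hits gives a value near the 1.184 × 10^-7 reported for the LDA.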
As far as the importance of the individual variables is concerned, the overall picture does not differ strongly from the results of the LDA. The overall ranking of the variable groups from CART is identical to that of the LDA; for the sake of completeness, Figure 1 shows the results for the individual variables.
SUMMARY
For each relevant variable ever investigated, it was shown how it contributes to particle placement in isolation. Moreover, it was shown how all of these variables together yield a preference for a construction in particular discourse situations. It is now possible to predict with quite a high prediction accuracy what a speaker will say if the discourse situation he/she is engaging in is known to the analyst.

A hypothesis was proposed and supported that includes all relevant variables and that correctly predicted some variables not to be relevant. It could be shown that a cognitively realistic approach to language usage made it possible to summarize and extend the previous knowledge on particle placement.

On a methodological level, we have seen that the analysis of syntactic variation can benefit considerably from the use of multifactorial techniques just as the analysis of register variation has profited from Biber's (1988) groundbreaking work. Personally, I would go as far as to say that only by such techniques can we start to really detect patterns that are not already known from early traditional grammarians' works (as was, unfortunately, the case with many works on Particle Movement). We have also seen that different multifactorial techniques, although quite different from one another with respect to their distributional assumptions, yield comparable results. Both LDA and CART achieve convincing classification and prediction accuracies, which also strongly support the PH. Moreover, the individual variables' importance ratings of the two procedures are strikingly similar, so, at least for the case at hand, the different mathematical requirements of both kinds of analyses do not seem to play a vital role.

Lastly, the wealth of information that can be obtained on the linguistic data and on the way speakers presumably organize their knowledge in order to arrive at constructional choices was at least briefly hinted at. This is not to say that speakers actually perform LDA or CART analyses, but it is meant to imply that we can learn something about the importance of (groups of) variables in the process of online production, and any model of language production should be able to accommodate these facts in cognitively/psychologically real ways. One possible model that can accommodate the findings reported above naturally is the Competition model by Bates and MacWhinney (1982, 19), where different variables' cue strengths compete in order to get their preferences recognized. Moreover, the present findings can be readily integrated into activation models where variable weightings (be it in terms of factor loadings or importance values) correspond to association strengths or similar concepts. In this respect, the investigation of further cases of syntactic variation can probably shed light on the nature of activation networks.
REFERENCES
Bates, E., & MacWhinney, B. (1982). Functionalist approaches to grammar. In E. Wanner & L. R. Gleitman (Eds.), Language Acquisition: The State of the Art. Cambridge: Cambridge University Press.
Bates, E., & MacWhinney, B. (19). Functionalism and the competition model. In E. Bates & B. MacWhinney (Eds.), The crosslinguistic study of sentence processing. Cambridge: Cambridge University Press.
Berg, T. (1998). Linguistic structure and change: An explanation from language processing. Oxford: Clarendon Press.
Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press.
Bortz, J. (1999). Statistik für Sozialwissenschaftler. 5. Ausgabe. Berlin, Heidelberg, New York: Springer.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software.
Chen, P. (1986). Discourse and particle movement in English. Studies in Language, 10, 79–95.
Den Dikken, M. (1992). Particles. Holland Institute of Linguistics Dissertations. The Hague: Holland Acad. Graphics.
Den Dikken, M. (1995). Particles: On the syntax of verb-particle, triadic, and causative constructions. Oxford: Oxford University Press.
Fraser, B. (1974). The phrasal verb in English, by Dwight Bolinger. Language, 50, 568–575.
Fraser, B. (1976). The verb-particle combination in English. New York: Academic Press.
Givón, T. (1992). The grammar of referential coherence as mental processing instructions. Linguistics, 30, 5–55.
Gries, S. T. (1999). Particle placement: A cognitive and functional account. Cognitive Linguistics, 10, 105–146.
Gries, S. T. (2000). Towards multifactorial analyses of syntactic variation: The case of particle placement. Doctoral Dissertation, University of Hamburg, Faculty of Language Sciences.
Hawkins, J. A. (1994). A performance theory of order and constituency. Cambridge: Cambridge University Press.
Hawkins, J. A. (2000). The relative ordering of prepositional phrases in English: Going beyond manner–place–time. Language Variation and Change, 11, 231–266.
Jespersen, O. (1928). A modern English grammar on historical principles. London: George Allen and Unwin Ltd.
Kruisinga, E., & Erades, P. A. (1953). An English Grammar. Vol. I. Groningen: P. Noordhoff.
Labov, W. (1975). Empirical foundations of linguistic theory. In R. Austerlitz (Ed.), The Scope of American Linguistics (pp. 77–133). Lisse: The Peter de Ridder Press.
Prince, E. F. (1991). On functional explanation in linguistics and the origins of language. Language and Communication, 11, 79–82.
Rohdenburg, G. (1996). Cognitive complexity and increased grammatical explicitness in English. Cognitive Linguistics, 7, 149–182.
Rohrbacher, B. (1994). English main verbs move never. The Penn Review of Linguistics, 18, 145–159.
Schütze, C. T. (1996). The empirical base of linguistics: Grammaticality judgements and linguistic methodology. Chicago: University of Chicago Press.
Sweet, H. A. (12). A new English grammar. Oxford: Clarendon Press.
Winer, B. J., Brown, D. R., & Michels, K. M. (1991). Statistical principles in experimental design. 3rd ed. New York: McGraw-Hill.
Yeagle, R. (1983). The syntax and semantics of English verb-particle constructions with off: A space grammar analysis. Unpublished M.A. Thesis, Southern Illinois University at Carbondale.