您好,欢迎来到飒榕旅游知识分享网。
搜索
您的当前位置:首页sift

sift

来源:飒榕旅游知识分享网
AnimplementationofSIFTdetectoranddescriptor

AndreaVedaldi

UniversityofCaliforniaatLosAngeles

Contents

1Introduction

2Userreference:thesiftfunction

2.1Scalespaceparameters.......2.2Detectorparameters........2.3Descriptorparameters.......2.4DirectaccesstoSIFTcomponents

........................................................................................................................

1123344456

AInternals

A.1Scalespaces...........................................A.2Thedetector..........................................A.3Thedescriptor.........................................

1Introduction

ThesenotesdescribeanimplementationoftheScale-InvariantTransformFeature(SIFT)detectoranddescriptor[1].TheimplementationisdesignedtoproduceresultscompatibletoLowe’sversion.1DesignedfortheMATLABenvironment,itisbrokendownintoseveralMandMEXfilesthatenablerunningonlyportionofthealgorithm.

TheSIFTdetectorextractsfromanimageacollectionofframesorkeypoints.Theseareorienteddisksattachedtoblob-alikestructuresoftheimage.Astheimagetranslates,rotatesandscales,theframestracktheseblobsandthusthedeformation.Bycanonization,i.e.bymappingtheframestoareference(acanonicaldisk),theeffectofsuchdeformationonthefeatureappearanceisremoved.

TheSIFTdescriptorisacoarsedescriptionoftheedgefoundintheframe.Duetocanonization,descriptorsareinvarianttotranslations,rotationsandscalingsandaredesignedtoberobusttoresidualsmalldistortions.

TheSIFTdetectoranddescriptorarediscussedindepthin[1].Hereweonlydescribetheinterfacetoourimplementationand,intheAppendix,sometechnicaldetails.

2Userreference:thesiftfunction

TheSIFTdetectorandtheSIFTdescriptorareinvokedbymeansofthefunctionsift,whichprovidesaunifiedinterfacetoboth.

Example1(Invocation).ThefollowinglinesruntheSIFTdetectoranddescriptorontheimagedata/test.jpg.

1See

http://www.cs.ubc.ca/~lowe/keypoints/

1

I=imread(’data/test.png’);I=double(rgb2gray(I)/256);

[frames,descriptors]=sift(I,’Verbosity’,1);

Thepairoption-value’Verbosity’,1causesthefunctiontoprintadetailedprogressreport.Thesiftfunctionreturnsa4×KmatrixframescontainingtheSIFTframesanda128×Kmatrixdescriptorscontainingtheirdescriptors.Eachframeischaracterizedbyfournumberswhichareinorder(x1,x2)forthecenteroftheframe,σforitsscaleandθforitsorientation.Thecoordinates(x1,x2)arerelativetotheupper-leftcorneroftheimage,whichisassignedcoordinates(0,0),andmaybefractionalnumbers(sub-pixelprecision).Thescaleσisthesmoothinglevelatwhichtheframehasbeendetected.Thisnumbercanalsobeinterpretedassizeoftheframe,whichisusuallyvisualizedasadiskofradius6σ.Eachdescriptorisavectordescribingcoarselytheappearanceoftheimagepatchcorrespondingtotheframe(furtherdetailsarediscussedinAppendixA.3).Typicallythisvectorhasdimension128,butthisnumbercanbechangedbytheuserasdescribedlater.

OnceframesanddescriptorsoftwoimagesI1andI2havebeencomputed,siftmatchcanbeusedtoestimatethepairsofmatchingfeatures.ThisfunctionusesLowe’smethodtodiscardambiguousmatches[1].Theresultisa2×Mmatrix,eachcolumnofwhichisapair(k1,k2)ofindicesofcorrespondingSIFTframes.

Example2(Matching).LetusassumethattheimagesI1andI2havebeenloadedandprocessedasinthepreviousexample.Thecode

matches=siftmatch(descriptors1,descriptors2);storesinmatchesthematchingpairs,onepercolumn.

Thepackageprovidessomeancillaryfunctions;youcan•useplotsiftframetoplotSIFTframes;

•useplotsiftdescriptortoplotSIFTdescriptors;•useplotmatchestoplotfeaturematches;

•usesiftreadtoreadfilesproducedbyLowe’simplementation.

Example3(Visualization).LetI1,I2andmatchesbeasinthepreviousexample.Tovisualizethematchesissue

plotsiftmatches(I1,I2,frames1,frames2,matches)

Thesiftfunctionhasmanyparameters.ThedefaultvalueshavebeenchosentoemulateLowe’soriginalimplementation.Althoughourcodedoesnotresultinframesanddescriptorsthatare100%equivalent,ingeneraltheyarequitesimilar.

2.1Scalespaceparameters

TheSIFTdetectoranddescriptorareconstructedfromtheGaussianscalespaceofthesourceimageI(x).TheGaussianscalespaceisthefunction

G(x;σ)=(gσ∗I)(x)

wheregσisanisotropicGaussiankernelofvarianceσ2I,xisthespatialcoordinateandσisthescalecoordinate.Thealgorithmmakeuseofanotherscalespacetoo,calleddifferenceofGaussian(DOG),whichis,coarselyspeaking,thescalederivativeoftheGaussianscalespace.

SincethescalespaceG(x;σ)representsthesameinformation(theimageI(x))atdifferentlevelsofscale,itissampledinaparticularwaytoreduceredundancy.Thedomainofthevariableσis

2

discretizedinlogarithmicstepsarrangedinOoctaves.EachoctaveisfurthersubdividedinSsub-levels.Thedistinctionbetweenoctaveandsub-levelisimportantbecauseateachsuccessiveoctavethedataisspatiallydownsampledbyhalf.Octavesandsub-levelsareidentifiedbyadiscreteoctaveindexoandsub-levelindexsrespectively.Theoctaveindexoandthesub-levelindexsaremappedtothecorrespondingscaleσbytheformula

σ(o,s)=σ02o+s/S,

o∈omin+[0,...,O−1],

s∈[0,...,S−1]

(1)

whereσ0isthebasescalelevel.

ThesiftfunctionacceptsthefollowingparametersdescribingtheGaussianscalespacebeingused:

•NumOctaves.ThisisthenumberofoctavesOin(1).

•FirstOctave.Indexofthefirstoctaveomin:theoctaveindexovariesinomin,...,omin+O−1.Itisusuallyeither0or−1.Settingominto−1hastheeffectofdoublingtheimagebeforecomputingtheGaussianscalespace.•NumLevels.Thisisthenumberofsub-levelsSin(1).•Sigma0.Basesmoothing:Thisistheparameterσ0in(1).

•SigmaN.Nominalpre-smoothing:Thisisthenominalsmoothingleveloftheinputimage.Thealgorithmassumesthattheinputimageisactually(gσn∗I)(x)asopposedtoI(x)andadjuststhecomputationsaccording.Usuallyσnisassumedtobehalfpixel(0.5).

2.2Detectorparameters

TheSIFTframes(x,σ)arepointsoflocalextremumoftheDOGscalespace.Theselectionofsuchpointsiscontrolledbythefollowingparameters:

•Threshold.Localextremathreshold.Localextremawhosevalue|G(x,;σ)|isbelowthisnumberarerejected.•EdgeThreshold.Localextremalocalizationthreshold.Ifthelocalextremumisonavalley,thealgorithmdiscardsitasitistoounstable.Extremaareassociatedwithascoreproportionaltotheirsharpnessandrejectedifthescoreisbelowthisthreshold.•RemoveBoundaryPoints.Boundarypointsremoval.Ifthisparameterissetto1(true),frameswhicharetooclosetotheboundaryoftheimagearerejected.

2.3Descriptorparameters

TheSIFTdescriptorisaweightedandinterpolatedhistogramofthegradientorientationsandlocationsinapatchsurroundingthekeypoint.Thedescriptorhasthefollowingparameters:

•Magnif.Magnificationfactorm.Eachspatialbinofthehistogramhassupportofsizemσ,whereσisthescaleoftheframe.•NumSpatialBins.Numberofspatialbins.Togetherwiththenextparameter,thisnumberdefinestheextensionanddimensionofthedescriptor.Thedimensionofthedescriptor(thetotalnumberofbins)isequaltoNumSpatialBins2×NumOrientBinsanditsextension(thepatchwherethegradientstatisticiscollected)hasradiusNumSpatialBins×mσ/2.•NumOrientBins.Numberoforientationbins.

3

Symbolo∈[omin,omin+O−1]s∈[smin,smax]σ(o,s)=σ02o+s/Sσ0

M0,N0

M00No=󰀇N2o󰀈,Mo=󰀇2o󰀈xo∈[0,...,No]×[0,...,Mo]x=2oxoF(·,σ(o,·))G(x,σ)D(x,σ)

DescriptionOctaveindexandrangeScaleindexandrangeScalecoordinateformulaBasescaleoffset

Basespatialresolution(octaveo=0)OctavelatticesizeformulasSpatialindexesandragesSpatialcoordinateformulaOctavedataGaussianscalespaceDOGscalespace

Inthecodeo,O,omins,smin,smaxsigma0

octavegssdogss

Figure1:Scalespaceparameters.TheSIFTdescriptorsusestwoscalespaces:aGaussianscalespaceandaDifferenceofGaussianscalespace.Botharedescribedbytheseparameters.

2.4DirectaccesstoSIFTcomponents

TheSIFTcodeisdecomposedinseveralMandMEXfiles,eachimplementingaportionofthealgo-rithm.Theseprogramscanberunontheirownorreplaced.AppendixAcontainsinformationusefultodothis.

Example4(ComputingtheSIFTdescriptordirectly).Sometimesitisusefultorunthedescriptorcodealone.Thiscanbedonebycallingthefunctionsiftdescriptor(whichisactuallyaMEXfile.)Seethefunctionhelpforfurtherdetails.

References

[1]D.G.Lowe.Distinctiveimagefeaturesfromscale-invariantkeypoints.InternationalJournalof

ComputerVision,2(60):91–110,2004.

A

A.1

Internals

Scalespaces

HereascalespaceisafunctionF(x,σ)∈Rofaspatialcoordinatex∈R2andascalecoordinateσ∈R+.SinceascalespaceF(·,σ)typicallyrepresentsthesameinformationatvariousscalesσ∈R,itsdomainissampledinaparticularwayinordertoreducetheredundancy.

Thescalecoordinateσisdiscretizedinlogarithmicstepsaccordingto

σ(s,o)=σ02o+s/S,

o∈Z,

s=0,...,S−1

whereoistheoctaveindex,sisthescaleindex,S∈Nisthescaleresolutionandσ0∈R+isthebasescaleoffset.Notethatitispossibletohaveoctavesofnegativeindex.

Thespatialcoordinatexissampledonalatticewitharesolutionwhichisafunctionoftheoctave.Wedenotexothespatialindexforoctaveo;thisindexismappedtothecoordinatexby

x=2oxo,

o∈Z,

xo∈[0,...,No−1]×[0,...,Mo−1].

4

where(No,Mo)isthespatialresolutionofoctaveo.If(M0,N0)isthetheresolutionofthebaseoctaveo=0,theresolutionoftheotheroctavesisobtainedas

No=󰀇

N0

󰀈,2oMo=󰀇

M0

󰀈.2oItwillbeusefultostoresomescalelevelstwice,acrossdifferentoctaves.WedothisbyallowingtheparameterstobenegativeorgreaterthanS.Formally,wedenotetherangeofsas[smin,smax].Wealsodenotetherangeoftheoctaveindexoas[omin,omin+O−1],whereO∈Nisthetotalnumberofoctaves.SeeFigure1forasummaryofthesesymbols.

TheSIFTdetectormakesuseofthetwoscalespacesdescribednext.GaussianScaleSpace.TheGaussianscalespaceofanimageI(x)isthefunction

G(x,σ)=(gσ∗I)(x)

wherethescaleσ=σ02o+s/Sissampledasexplainedintheprevioussection.Thisscalespaceiscomputedbythefunctiongaussianss.Inpractice,itisassumedthattheimagepassedtothefunctiongaussianssisalreadypre-smoothedatanominallevelσn,sothatG(x,σ)=(g√σ2−σ2∗I)(x).Assuggestedin[1],thepyramidiscomputedincrementallyfromthebottombysuccessiveconvolutionswithsmallkernels.

DifferenceofGaussiansscalespace.ThedifferenceofGaussians(DOG)scalespaceisthescale“derivative”oftheGaussianscalespaceG(x,σ)alongthescalecoordinateσ.Itisgivenby

D(x,σ(s,o))=G(x,σ(s+1,o))−G(x,σ(s,o)).

ItisobtainedfromtheGaussianscalespacebythediffssfunction.

Remark1(Lowe’sparameters).Lowe’simplementationusesthefollowingparameters:

σn=0.5,

σ0=1.6·21/S,

omin=−1,

S=3

n

Inordertocomputetheoctaveo=−1,theimageisdoubledbybilinearinterpolation(fortheenlargedimageσn=1).Inordertodetectextremaatallscales,theDifferenceofGaussianscalespacehass∈[smin,smax]=[−1,S].SincetheDifferenceofGaussianscalespaceisobtainedbydifferentiatingtheGaussianscalespace,thelatterhass∈[smin,smax]=[−1,S+1].TheparameterOissettocoveralloctaves(i.e.asbigaspossible.)

A.2Thedetector

TheSIFTframes(or“keypoints”)areaselectionof(sub-pixelinterpolated)points(x,σ)oflocalextremumoftheDOGscale-spaceD(x,σ),togetherwithanorientationθderivedfromthespatialderivativeoftheGaussianscale-spaceG(x,σ).Forwhatconcernsthedetector(andbeingingeneraldifferentforthedescriptor),the“support”ofakeypoint(x,σ)isaGaussianwindowH(x)ofdeviationσw=1.5σ.Inpractice,thewindowistruncatedat|x|≤4σw.

TheGaussianandDOGscalespacesarederivedasinSectionA.1.InthisSection,theparametersS,O,smin,smax,omin,σ0refertotheDOGscalespace.TheGaussianscalespacehasexactlythesameparametersoftheDOGscalespaceexceptforsDOGmaxwhichisequaltosmax−1.

Theextractionofthekeypointsiscarriedoneoctavepertimeandarticulatedinthefollowingsteps:

•Detection.KeypointsaredetectedaspointsoflocalextremumofD(x,σ)(SectionA.1).Intheimplementationthefunctionsiftlocalmaxextractssuchextremabylookingat9×9×9neighborhoodsofsamples.

5

x2

Npx1

-2-1.5-1-0.50.511.52mσFigure2:SIFTdescriptorlayout.Theactualsizeofaspatialbinismσwhereσisthescaleofthekeypointandm=3.0isanominalfactor.

Astheoctaveisrepresentedbya3Darray,thefunctionsiftlocalmaxreturnsindexesk(inMatlabconvetion)thataretobemappedtoscalespaceindexes(x1,x2,s)by

k−1=x2+x1Mo+(s−smin)MoNo.

Alternatively,onecanuseind2subtomaptheindexktoasubscript(i,j,l)andthenuse

x1=j−1,

x2=i−1,

s=l−1+smin.

Becauseofthewaysuchmaximaaredetected,onehasalways1≤x2≤Mo−2,1≤x1≤No−2andsmin+1≤s≤smax−1.

Sinceweareinterestedbothinlocalmaximaandminima,theprocessisrepeatedfor−G(x,σ).(Ifonlypositivemaximaandnegativeminimaareofinterest,anotheroptionistotakethelocalmaximaof|G(x,σ)|directly,whichisquicker.)

•Sub-pixelrefinement.Afterbeingextractedbysiftlocalmax,theindex(x1,x2,s)isfittedtothelocalextremumbyquadraticinterpolation.Atthesametime,athresholdonthe“intensity”D(x,σ)andatestonthe“peakedness”oftheextremumisappliedinordertorejectweakpointsorpointsonedges.Theseoperationsareperformedbythesiftrefinemxfunction.Theedgerejectionstepisexplainedindetailinthepaper[1].Thesub-pixelrefinementisaninstanceofNewton’salgorithm.

•Orientation.Theorientationθofakeypoint(x,σ)isobtainedasthepredominantorien-tationofthegradientinawindowaroundthekeypoint.Thepredominantorientationisob-tainedasthe(quadraticallyinterpolated)maximumofthehistogramofthegradientorientations∠∇G(x1,x2,σ)withinawindowaroundthekeypoint.Thehistogramisweightedbothbythemagnitudeofthegradient|∇G(x1,x2,σ)|andaGaussianwindowcenteredonthekeypointandofdeviation1.5σ(theGaussianwindowdefinestheregionofinterestaswell).Aftercollectingthedatainthebinsandbeforecomputingthemaximum,thehistogramissmoothedbyamovingaveragefilter.Inadditiontotheglobalmaximum,eachlocalmaximumwithavalueabove0.8%ofthemaxiumisretainedaswell.ThusforeachlocationandscalemultipleSIFTframesmightbegenerated.Thesecomputationsarecarriedbythefunctionsiftormx.

A.3Thedescriptor

TheSIFTdescriptorofakeypoint(x,σ)isalocalstatisticoftheorientationsofthegradientoftheGaussianscalespaceG(·,σ).

6

Histogramlayout.TheSIFTdescriptor(Figure2)isanhistogramofthegradientorientationsaugmentedand2-DlocationsinthesupportoftheSIFTframe.Formally,thedomainofthehistogramarethetuples(x,θ)∈R2×R/Z.ThebinsformathreedimensionallatticewithNp=4binsfor

2

eachspatialdirectionandNo=8binsfortheorientationforatotalofNpNo=128components(thesenumberscanbechangedbysettingtheappropriateparameters).Eachspatialbinissquarewithunitaryedge.ThewindowH(x)isGaussianwithdeviationequaltohalftheextensionofthespatialbinrange,thatisNp/2.

Keypointnormalization.Inordertoachieveinvariance,thehistogramlayoutisprojectedontheimagedomainaccordingtotheframeofreferenceofthekeypoint.Thespatialdimensionsaremultipliedbymσwhereσisthescaleofthekeypointandmisanominalfactor(equalto3.0bydefault).Thelayoutisalsorotatedsothattheaxisx1isalignedtothedirectionθofthekeypoint.Weighting.ThehistogramisweightedbythegradientmodulusandaGaussianwindowedandtri-linearlyinterpolated.Moreindetail,eachsample(x1,x2,∠∇G(x,σ))is

•weightedby|∇G(x,σ)|;

•weightedbythegaussianwindowH(x);

•projectedonthecentersoftheeightsurroundingbins;

•summedtoeachofthisbinsproportionallytoitsdistancefromtherespectivecenter.

Remark2.(Lowe’simpelmentation)InordertoachievefullcompatibilitywithLowe’soriginalimple-mentation,oneneedstopayattentiontomanylittledetailsasthememorylayoutofthedescriptorandtheconventionforthegradientorientations.Thesearedetailedinthesourcecode.

7

因篇幅问题不能全部显示,请点此查看更多更全内容

Copyright © 2019- sarr.cn 版权所有

违法及侵权请联系:TEL:199 1889 7713 E-MAIL:2724546146@qq.com

本站由北京市万商天勤律师事务所王兴未律师提供法律服务