Toboggan- Based Intelligent Scissors with a Four- Parameter Edge Model
Eric N. Mortensen ( enm@ cs. byu. edu) William A. Barrett ( barrett@ cs. byu. edu)
Brigham Young University
Abstract
Intelligent Scissors is an interactive image segmentation
tool that allows a user to select piece- wise globally optimal
contour segments that correspond to a desired object bound-ary.
We present a new and faster method of computing the
optimal path by over- segmenting the image using toboggan-ing
and then imposing a weighted planar graph on top of the
resulting region boundaries. The resulting region- based
graph is many times smaller than the previous pixel- based
graph, thus providing faster graph searches and immediate
user interaction. Further; tobogganing provides an new sys-tematic
and predictable framework for computing edge
model parameters, allowing subpixel localization as well as
a measure of edge blu,:
1. Introduction
All general purpose image segmentation techniques
require some amount of human guidance due to the large
variety of image sources, content and complexity. Conse-quently,
a goal of any general image segmentation algorithm
should be to accurately define the desired object boundary
or region with minimal user input. A variety of segmenta-tion
tools exist ranging from user intensive lassoing to semi-automatic
initialization schemes such as magic wands and
active contour models. Unfortunately, lassoing or other
primitive manual tracing tools are still widely used when a
non- homogeneous image component must be extracted
from a complex background. For this reason intelligent seg-mentation
tools which exploit high level visual expertise but
require minimal user interaction become appealing.
Recently, we presented a unique segmentation algorithm
called Intelligent Scissors [ 12,13] which allows a human
operator to interactively select an optimal boundary seg-ment,
called a “ live- wire” [ 11,7], corresponding to a portion
of the desired object edge. When the mouse position comes
in proximity to an object edge, the live- wire “ snaps” to, and
wraps around the object of interest. Thus. Intelligent Scis-sors
allows objects to be extracted quickly and accurately
using simple gesture motions with a mouse.
This paper presents improvements to our Intelligent
Scissors tool based on an oversegmentation method known
as tobogganing that allows for faster optimal path computa-tion
in the interactive live- wire environment. Further, tobog-ganing
provides a new framework for computing edge
model parameters based on well- defined foregroundhack-ground
regions and edge position. This provides sub- pixel
localization, blur, and foregroundhackground color.
2. Previous Work
Among the boundary based segmentation strategies,
active contours or snakes have received considerable atten-tion
over the last few years [ 1,3,8,9,18]. Active contours are
initialized with an approximate boundary and then itera-tively
minimize the contour’s energy functional to achieve
an optimal boundary. The functional combines external and
internal forces in an attempt to yield a final smooth contour
that conforms to the desired object boundary. Since snakes
are typically initialized with an approximate contour, the
human operator is often not quite sure what the final bound-ary
will be until the active contour settles into a minimum.
Some implementations allow the user to interactively niod-ify
the energy landscape and thus nudge a snake [ 3,8], but
still the user does not typically know exactly what the final
boundary will be during interaction.
Intelligent Scissors presents a different approach tot the
Segmentation problem. Rather than optimizing a user- initial-ized
approximate contour, we allow the user to interactively
select a boundary from a collection of optimal solutions.
Thus the user knows exactly what the resulting contour will
be during interaction. Intelligent Scissors achieves this goal
by interactively computing the optimal path from a user
selected “ seed” point to all other points in the image. A!; the
cursor moves, the optimal path from the pointer position to
the seed point is displayed, allowing the user to select an
optimal contour segment which visually corresponds to a
portion of the desired object boundary.
3. Toboggan- Based Intelligent Scissors
Intelligent Scissors, as presented in [ 131, formulates the
boundary detection problem as a weighted graph search
where each pixel is a node and edges are created between
each pixel and its 8 neighbors. In this paper, we first parti-tion
the image into a collection of regions and then impose a
weighted planar graph onto the resulting region boundaries.
This greatly reduces the graph size which speeds up the
graph search and improves interactive responsiveness.
3.1. Tobogganing
Tobogganing [ 6,19] oversegments an image into small
regions by sliding in the derivative terrain. Given the gradi-ent
magnitude of an image, each pixel determines a slide
direction by finding the pixel in a neighborhood with the
lowest gradient magnitude. Generally, several pixels will
represent local minima in the gradient terrain within their
own neighborhood due to either image features or noise.
Pixels that “ slide” to the same local minimum are grouped
together, thus segmenting the image into a collection of
small regions ( Fig. 1).
The regions produced by tobogganing are effectively
identical to the catchment basins produced by applying the
popular watershed algorithm to the gradient image[ 2,14,17].
However, tobogganing is much more computationally effi-cient
than the watershed algorithm.
452
0- 7695- 0149- 4/ 99 $ 10.00 0 1999 IEEE
( d
Figure 1: ( a) Original image ( b) Expanded section of ( a) showing
toboggan region boundaries, slide directions, and local minima. ( E) Numer-ical
illustration of ( b) showing gradient magnitudes, region boundaries
(: hick lines). slide directions ( arrow heads), and local minima ( circled).
Nodes ( shaded circles) are created where 3 or 4 regions meet at a pixel cor-ner
while region boundary segments between two nodes become edges. The
shadedptxels show an example gradient index vector ( Section 3 2 2).
3.1.1. Multi- Scale Gradient
puted using multi- scale derivatives of Gaussian kernels. If
The gradient magnitude used for tobogganing is com-is
a 2- D normal distribution with a standard deviation of a,
the multi- scale gradient magnitude is given by
where
G( xj, Y- 4) = ( 2)
( 3)
3.1.2. Sliding and Grouping
Given a gradient magnitude image, we next segment the
image into regions by sliding in the gradient terrain. Pixels
that slide into the same local minimum are efficiently
grouped into regions by assigning them a unique label. The
image is scanned in row major order. If a pixel is not
labelled, its four- connected neighborhood is checked to
determine which neighboring pixel has the smallest gradient
magnitude. The pixel then “ slides” to the minimum gradient
neighbor by setting a slide direction and moving to that
neighbor. The process is repeated until it slides into a pixel
that is already labelled or until it slides into a local mini-mum.
If sliding reaches an unlabeled local minimum, that
minimum is assigned a unique label. Now that a region label
is known, unlabeled pixels encountered during sliding are
labelled. Thus, the tobogganing algorithm is as follows:
Algorithm 1: Toboggan Segmentation.
Input:
G ( p )
N ( p )
“( 9)
L ( 9)
regionsco; ( Initialize # of regions.)
for each p do begin
( Gradient magnitude at pixel position vector p.)
( 4 connected neighborhood of p.)
( Toboggan direction vector at p.)
( Region label at p ( initialized to nil for all p).)
Data Structures:
OUtpUC:
Algorithm:
( Scan image in row major order.)
[ Slide to labelled pixel or local minimum.)
mintG( u) ; u’cu;
for each r c N ( q ) do begin ( Find lowest gradient neighbor.)
U+ P ;
repeat
if G( r) Smin then begin
mintG( r) ; u‘ cr;
end
end
T ( u ) t u ‘ - U ; W- U‘; ( Set slide direction and slide.)
until L( q)# nil or T( u)= O
if L( p)= nil then begin ( If local minimum is unlabeled.)
L ( 9) c regions; { assign a unique label.)
reg i onscregi ons + 1 ;
end
XcR ;
repeat
until L( r)# nil
( Repeat slide to label unlabeled pixels.)
L ( r ) t L ( u ) ; r e r + T ( r ) ;
end
Note that we use a 4- connected neighborhood as opposed to
the 8- connected neighborhood of [ 6,19]. It has been our
experience that using a 4- connected neighborhood, while
producing more regions, isolates fine detail better than
tobogganing with an 8- connected neighborhood. Further, 8-
connected regions pose more topological problems when
mapping region boundaries to a planar graph.
3.2. Weighted Graph
Having partitioned the image into a collection of small
regions, we now convert the tobogganed regions into a
weighted planar graph. The resulting graph has weighted
edges and is used to compute an optimal path corresponding
to a lowest cumulative cost boundary segment in the image.
is the squared gradient magnitude summed over each color
band ( Ib) of the original image 1.
453
3.2.1. Graph Creation
Since tobogganing produces regions that partition the
image, there are no pixel- based boundaries between neigh-boring
regions. Rather, two adjacent regions share a con-nected
sequence of one or more “ cracks” between two 4-
connected pixels. As such, edges map to the boundary seg-ment
between two neighboring regions and nodes are cre-ated
where three or four regions meet at a pixel corner.
Since nodes occur only at pixel corners, every node is
uniquely identified by the pixel whose upper- left corner is
the node position. Thus, nodes are specified with a pixel
position vector’, q, and edges are represented as an ordered
quadruple, ( qsrcq, d s,, I,, I,), indicating the source and desti-nation
nodes as well as the region labels on its left and right
side, respectively.
The graph is created by tracking the boundary of each
region using an “ over- the- shoulder’’ contour following algo-rithm.
Given the definitions of L@) and N@) in Algorithm 1
and pixels p and q such that q E N ( p ) and L ( p ) # L( q) -
i. e., neighboring pixels p and q span the boundary between
adjacent regions- then the crack betweenp and q is defined
as the ordered pair @, q) and
( 4)
is the crack direction vector pointing clockwise relative top.
Thus, if 1 = L@), q d = q + dPvqa nd Pd = p + dp, qt hen
( p,,, qd); if L( qd) # 1 and L( pd) = I
( 4@ 4 ) ; if L ( Q ) = I
t? exr( p, 4 ) = 1 ( 5) ( p, p d ) ; otherwise
is the next clockwise pixel crack on the boundary of region I
relative to the crack ( p, 4).
Defnirion: B( I) is the ordered n- tuple, (( pl, qI), ( p2, q2),
. . ., @,,, q,)), of pixel cracks that represent, in clockwise
order, the boundary of a 4- connected region with label I
such that all of the following properties hold:
I. L@$= lforall 1 S i l n
2. L( q;) # I for all 1 I i l n
3. ( pi+] q, ;+,) = nexf@ i, q i) for all 1 I i < n
4. @ 1. q 1 ) = next@,, qn)
5. L( q,,) = L( q1) iff L( qi) = L( q,) for all 1 I i, j 5 n
Processing each region, I , we note that there is a node on
the boundary whenever L( q;) # L( q;+,), corresponding to the
node position
Further, if L( qj) # L( qj+,) fori > i such that L( qk) = L( qi+ l)
for i e k then there is an edge, e = ( vi, q p L( q,), I), defined
between nodes q; and q, that separates regions L( qj) and I
where ecrack = (@;+ 17q;+ ih @;+ 24i+ 2) 9 . . .. @ pqj)) is the
ordered sequence of pixel cracks defining the edge.
I . We use q to represent a position vector corresponding to a node in order to
avoid confusion with a regular position vector such asp or q.
Fig. I( c) illustrates how the tobogganed regions form a
planar graph. Graph nodes are indicated with shaded circles,
0, while edges, shown as thick lines, correspond to the por-tion
of a region boundary connecting two nodes.
3.2.2. Edge Weights
Since region boundaries already localize object edges,
there is no need for the Laplacian zero- crossing cost usedl in
[ 131. Rather, edge costs are currently computed using only
the gradient magnitude in a two step process. The first step
creates a gradient index vector for each edge and the second
step computes an edge’s gradient cost by remapping and
summing the index vector. The gradient index, given by
( 7)
scales the gradient magnitude to integer values between 0
and nc- 1 ( which is the domain of the remapping function
used in the second step).
Given an edge e such that ecrack = (@ l, qI), ( p2, q2), . . .,
@,,, q,,)), the gradient index vector of e, = ( idxc( vl),
idx&), . . ., idxC( r,)), is an ordered m- tuple of gradient
indices where m In and ( r l , r 2, . . ., r,) is the 8- connected
pixel sequence defined algorithmically as follows:
I Init. previous pixel position & j.)
if i%( pi) > idx,( ui) then rtmptpi; lSetr,,,,,, tocrackpir. el)
else I with largest gradient.)
if rtmp # rpm then begin ( If new pixel position, )
1 add it to path.)
end
r p r v t ( - l , - l ) ; j t l ;
for i = 1 to n do begin
rtmptq; i
rj+ rtmp; j + j + l ; rprv+ rtmp;
end
m c j - 1 ; 1 Set path length. )
Fig l( c) highlights, in the upper- left, the gradient index
vector for an example edge. Notice that even though the
edge is composed of 5 pixel cracks, its gradient index vector
is only 4 gradient values ( since the maximum gradient posi-tions
for two of the cracks are identical). Consequently, the
gradient index vector for diagonal edges is typically shorter
than the crack count, thus providing a mechanism to scale
the edge cost based on estimated Euclidean length.
As mentioned, an edge’s final cost function is computed
by remapping and summing its gradient index vector. ’ The
remapping function is initially defined as
which is an inverse linear ramp ranging from M down to 0
where M is the maximum cost per boundary unit. mupc is
implemented as a lookup table in order to adapt dynamically
and thereby accommodate on- the- fly training ( as described
in [ 131).
The final edge cost function is given by
rn
( 9)
j = l
and is simply the summation of the edge’s remapped gradi-ent
index vector.
454
3.3. Optimal Graph Search
Computation of optimal paths corresponding to mini-mum
cost contour segments is accomplished using Dijk-stra’s
[ 4] optimal graph search in much that same manner as
in [ 131. However, graph edges in [ 13] had unit length costs,
thereby bounding the cost range on the “ active” list and
allowing for an efficient bin sort algorithm. Since graph
edges in this paper can be several units long, the active list
now employs an efficient two- level bin sort algorithm to
allow for a much wider cost range.
Given a user selected “ seed” node, qs, the optimal graph
search algorithm is as follows:
Algorithm 2: Optimal Graph Search.
Input:
Data Structures:
? S
L
eage( q)
done( q)
g( q)
Output:
opt ( U)
Algorithm:
[ Seed node. 1
1 List of active nodes sorted by total cost ( initially empty). ]
1 Edge set containing edges emanaling from q. 1
{ Boolean function indicating if q has been expanded.)
[ Total cost function from qs lo q. 1
[ Edge from q indicating optimal path back 10 qs)
g ( qs) C O; ~ c q ~ ;
while L # 0 do begin ( While still nodes Io expand:]
( Initialize active list with zero cost seed node.]
[ Remove minimum cost node q from active list.]
I Mark q as expanded ( i. e.. processed). 1
qcminlL) ;
done ( q) + TRUE;
for each e t edge( q) do begin
q n t e d s t ( q) ;
if not done( qn) then begin
[ Get neighbor connected to edge.)
g ’ t g ( q ) + f( e); [ Compute total cos[ to neighbor.)
if qnc L and g’ < g( qn) then [ Removehighercost 1
q ” t L ; [ neighbor from list.)
if qn e L then begin
g ( qn) c g ’ ;
L c q , ;
( If neighbor not on list, ]
{ assign total cost, set opt. edge]
[ and place on ( or return to) ]
opt ( qn) te;
end [ active list.)
end
end
end
While the optimal graph search algorithm presented here is
similar to our previous work in [ 131, the reduced graph size
allows this technique to compute optimal paths anywhere
from two to 10 or moretimes faster, depending on the aver-age
size of the tobogganed regions, despite the fact that we
now use a more complex two- level bin sort.
3.4. Four- Parameter Edge Model
’ One advantage of using toboggan region boundaries as a
basis for the Intelligent Scissors graph search is the ability
for subpixel localization of an object edge by applying an
edge model to the region boundaries. Previous work [ 5]
assumes that an edge can be modeled as a Gaussian blurred
step edge ( Eq. IO). If 0: is the step amplitude and p is the
base ( pedestal offset), then the ideal edge model is given by
where ( T is the standard deviation of the Gaussian blur
and xo is the step edge location. However, because of the
complexity of the partials of Eq. 10, especially when
embedded in a summed squared residuals function ( i. e., a X2
function), we choose to approximate Eq. 10 with a sigmoid
( Eq. 11)- since its partials are much more tractable and it
very closely approximates the error function. Thus, the edge
model we use is
+ P ( 1 1) a
+ - x ) / s
where s is the spread or blur ( similar to ( T in Eq. ( IO)). These
parameters are extracted from profile data for each edge
crack from adjacent tobogganed regions ( Fig. 2). The profile
is then fit to Eq. 11 using a modified 4- parameter Marquardt
minimization [ 15,161.
Note that Eq. ( 1 1) only models the transition between
two gray levels ( or colors). As such, we would like to avoid
including transition information from edges other than the
one being modeled. Since tobogganing tends to slide away
from edges and stops sliding before climbing another edge,
the slide information provides a reasonable mechanism for
extracting transition data.
We extract the edge profile following the slide path from
both sides of an edge crack and projecting the position and
color data onto a domain and range vector respectively.
Given an edge crack @, q) we define a slide path forp as
( 12)
where p1 = p , pi+{ = p, + T@,) for 1 I j < n, pn is the local
minima for the tobogganed region, and T@,) is the toboggan
slide direction defined in Algorithm 1 . The slide path for q is
defined in like manner and the two slide paths are concate-nated
into a profile sample vector
s l W p ) = ( pl, p2, .... P,)
sarnple( p, q) = ( sl, s2,. .., s,,+,,,)
( 13)
= ( Pn> ”.? PZ’PI. 41, q. 2, ... 9 qm)
by reversing slide@) and appending slide( q).
We define the domain projection vector, vx, with origin,
ox, as the normalized gradient direction vector originating at
the crack @, q)-- the normalized sum of the gradient direc-tion
vectors at p and q. Each domain value, xi, is computed
by projecting the corresponding relative sample vector,
s i - o x , onto vx. Thus, x; = ( si - ox) . vx. Figure 2 illustrates
the edge model computation for a sample edge crack.
For grayscale images, the obvious yi’s are simply the
pixel gray- levels for each corresponding xi ( see Fig. 2( b)).
However, no such mapping is obvious for color images. We
thus define the range vector, vy = I( qm) - oy where I( q, J is
the image color vector at the pixel position qm and
o = I@, J Each range value, yi, is then computed in a simi-lar
fashion as the x;’ s. Specifically, y; = ( I( s;) - op) . v,, / llv,. ll.
One advantage of the Marquardt method 1s that upon
reaching an acceptable minimum, the curvature matrix can
be recalculated and used to estimate the covariance matrix
of standard errors in the fitted parameters. The standard
errors can be used to evaluate the “ reliability” of each
parameter. For example, we use o( xo)- the estimated stan-dard
deviation in the fit of xo- to smooth the displayed
object path, as noted below.
J
455
Figure 2: ( a) Expanded
6 view of rhe upper- lefr quad-runr
of Fig. I( c) showing
each path project onto rhe
domain vector v . p~ r oduc-ing
rhe domain values. x;.
each corresponding parh
posirion. ( b) The resulting
profile dura points und 4-
uirh the paramerers indi-
P
4.7 \ I, 4s , paramerer f i t of Eq. ( 12)
1
V p11 x cared.
3.5. Live- WGe Boundary Selection
Once the optimal path to every node is computed, the
live- wire allows interactive selection of the optimal path
corresponding to a segment of the desired object boundary.
As a pointing device moves, the cursor position is used to
select a node and display the optimal path from that node
back to the seed node. The optimal path is displayed as a
polyline fit to the subpixel midpoints of the path edges. The
midpoint for each edge crack is defined as ox + xovx where
ox and vx are the domain origin and vector ( as defined previ-ously)
and xo is the subpixel edge position given in Eq. ( I 1).
The displayed polyline is fit to the sequence of midpoints by
incrementally determining the maximum k such that a line
between midpoints pi and is at most c standard devia-tions
from all intermediate midpoints. That is, having deter-
, mined the polyline fit up top;, the next line segment finds
the maximum k such that for all midpointspi, i < j < i+ k, the
line connectingp; andpi+ k is at most a distance of c. O( xoj)
away from each pj where xoj is the subpixel edge position
for pi. The smoothing constant, c, can be interactively
adjusted to allow a tighter or looser fit of the polyline.
Due to the underlying optimal nature of the displa. yed
live- wire path, as the cursor position moves away from the
seed node in proximity to an object edge, the live- wire snaps
to the object edge. When further movement of the cuirsor
causes the live- wire path to depart from the desired object
boundary, placement of a new seed node prior to the point of
departure reinitiates the optimal graph search. This allows
the user to continue defining the object boundary, thereby
creating a piece- wise optimal boundary.
Since it is impossible to exactly specify a node located
on the interpixel grid using a pixel- based pointing device
( and since it would be impractical even if it were possible),
edge snapping is provided to select the edge that is “ nearest”
to the current pointer position [ 10,12,13]. The nearest edge
is determined by computing the weighted Euclidean dis-tance
to each edge around the region containing the cursor.
An edge’s weight is proportional to the average of the
remapped gradient index vector. The edge “ nearest” to the
cursor is selected and used to select the nearest node.
Since there is only a single optimal path from any given
node back to a seed node, pixel- based Intelligent Scissors
requires a minimum of two seed points to define a closed
boundary. However, the region- based version has the capa-bility
of selecting an edge and then displaying the path back
to a seed node from both nodes connected by that edge. If
the two optimal paths from each node do not overlap, then
the edge and the two optimal paths define a closed bound-ary.
Consequently, in some cases a single seed node i. s all
that is needed to define the desired closed boundary. For
example, Figure 3( a) shows the block from Fig. ]( a) and the
resulting closed boundary defined with a single seed point.
If an image is free of noise, then it is possible to have a sin-gle
graph edge that completely encloses an object. In such
cases, the object boundary can be selected without any : seed
nodes. This is demonstrated in Fig. 3( b) where all of the
boundaries in this synthetic image are specified without any
seed points since each boundary is completely defined with
a single graph edge. The edge that is “ nearest” to the cursor
position is displayed even when there is no optimal , path
information. Thus, any given boundary in Fig. 3( b) can be
defined by moving the cursor closer to it than any other
boundary and specifying that the boundary is complete.
Figure 3: ( a) Live- wire boundary ( in white) defined wirh only a single
seed poinr. ( b) Boundaries defined without any seed points since each
boundary corresponds ro a single graph edge.
456
drawn with a zero error tolerance, meaning that the smooth-ing
factor c is set to zero and no smoothing occurs. Of note
is that the boundaries exhibit reasonable consistency even
though the edge models are computed independently for
each boundary element. For example, the transition cutoff
lines indicate that the horizontal point spread of the imaging
device is larger than the vertical point spread. This is in fact
the case with the video camera used to acquire this image.
Also note that there are a couple of spots, such as on the ser-ifs
of the ‘ U’ or near the top corner of the block, where, due
to the proximity of another edge, there was insufficient or
unreliable data points for the parameter fit to reliably com-pute
xo andor s. However, since these values are not reli-able,
they will be smoothed out when drawn with a nonzero
error tolerance ( such as c = 1).
Figures 5 and 6 illustrate some simple object boundaries
defined within a couple of seconds2 and requiring only one
or two seed nodes while Fig. 7 shows a more complex
boundary requiring several seed nodes. In general, the
boundaries correspond well with the desired object edge.
Preprocessing requires approximately IO seconds for a
51 2x5 12 grayscale image on a 99 MHz HP 735 workstation
and involves computing the image gradient using a multi-
Figure 4: Block in Fig. ]( a) pixel replicated by a facror cf8 10 show the
zero rolerance subpixel localization of rhe block and ‘ U’ boundaries ( black)
as well as the transition cutoff ( whire) of rhe block boundary
4. Results and Discussion
We have not yet performed any comprehensive quantita-tive
analysis comparing the accuracy, reproducibility, and
boundary definition times between toboggan- based Intelli-gent
Scissors and that in [ 13]. However, as the authors of
[ 131, we have extensive experience and familiarity with both
techniques and submit the following observations:
The seed point boundary definition range ( i. e., the
amount of an object boundary defined with a single
seed point) appears to be nearly identical for both tech-niques.
Hence, both methods typically require close to
the same number of seed points ( though in some cases,
this technique has the advantage of edge selection to
define a boundary with only a single seed point, or in
some cases no seed points).
In general, boundary definition times depend on the
time required to manually position seed points. The
region- based graph search permits faster optimal path
computation and interactive responsiveness within the
live- wire environment. This allows quicker mouse
movements and faster seed point positioning. Further,
the improved efficiency of this technique permits other
computations during interaction, such as computing
edge models and fitting midpoints with a polyline.
Reproducibility seems to be at least as good for this
technique since errors in reproducibility occur mostly
in the vicinity of seed points.
We feel, due to the subpixel boundary positioning and
based on visual results, such as those shown here ( i. e.,
Figures 4, 5( b), and 6( b)), that this technique is poten-tially
more accurate than the pixel- based approach [ 131.
Figure 4 shows the estimated subpixel boundary loca-tions
for the block of Fig. l( a) and the “ U” on top of the
block. Also shown are the transition cutoff on either side of
the block boundary. The cutoff is set at 4s where s is the
spread or blur parameter in Eq. ( 1 I). These boundaries are
scale kernel, tobogganing in the gradient terrain to define
the regions, and transforming the region boundaries into a
weighted graph. The edge model computation is performed
during live- wire interaction on an “ as need” basis. Also,
since we only store cost information for the edges of a
region based graph instead of every pixel, the memory
requirements are smaller than that of [ 133.
Finally, the region- based graph provides a framework for
a variety of future refinements that would be difficult to
achieve with the pixel- based tool. Some of the improve-ments
include incorporating reliable region information into
the cost function such as foregroundhackground color and
multiple path exclusion with contour untangling to facilitate
definition of long, thin image features ( such as hair, sticks,
etc.) by allowing the live- wire to snap to the strong side of
such objects only once. We are in the process of adding
these refinements and believe they will improve the effec-tiveness
of Intelligent Scissors even more.
5. Conclusions
Compared to the original Intelligent Scissors tool, tobog-gan-
based Intelligent Scissors reduces computational
requirements during the critical interactive optimal bound-ary
selection process. Further, the region- based graph pro-vides
a well- defined framework to compute an edge model,
allowing for subpixel boundary localization and edge blur.
In addition to the improvements mentioned previously, other
extensions could include hierarchical graph construction
with dynamic detail adjustment for improved computational
performance and texture based cost functions.
2. Times staled are for the boundaries shown and are based on the best time
achieved by an experienced user measured from the lime the first seed node is placed
to completion of boundary definition.
457
Figure 5: Grayscale image of ha). Size: 256x256. ( a) Boundary defini-tion:
1.4 seconds with I seed node. ( b) Magnified section of hat showing
zero tolerance ( i. e., no smoothing) subpixel boundary lr~ calizotion.
( b)
Figure 6: Full color image ( printed grayscale) of a tulip. Size: 256x256.
( a) Boundary definition: 1.6 seconds with 2 seed nodes. ( b) Magnified sec-riorr
of tulip showing zero tolerance subpixel boundary localization.
References
A. A. Amini, T. E. Weymouth, and R. C. Jain, “ Using Dynamic Pro-gramming
for Solving Variational Problems in Vision,” IEEE PAMI,
12( 2): 855- 866, Sept. 1990.
S. Beucher and C. Lantu6joul. “ Use of Watersheds in Contour Detec-tion;‘
in Pruc. In[. Workchop on Image Processing. Real- Time Edge and
Motion Derectio~ E. crimarion, Sept. 1979.
D. Daneels. et al.. “ Interactive Outlining: An Improved Approach
Using Active Contours,” in SPlE Pruc. Srorage and Retrieval for Image
and Video Databases, 1908: 226- 233, Feb. 1993.
E. W. Dijkstra, “ A Note on Two Problems in Connexion with Graphs,’’
Numerische Mathematik. 1: 269- 270, 1959.
Figure 7: Full color image ( printed grayscale) of a family. Size:
340x560. Boundary definition: 18.5 seconds with several seed nodes.
[ SI J. H. Elder and S. W. Zucker, “ Scale Space Localization. Blur, and Con-tour-
Based Image Coding,” in CVPR, 27- 34. June, 1996.
161 J. Fairfield, “ Toboggan Contrast Enhancement for Contrast Segmenta-tion,”
inlEEEProc. 10‘ hInr. Conf: on PatternRecog., 1: 712- 716, 1990.
[ 7] A. X. FalcBo, et al., “ User- Steered Image Segmentation Paradigms:
Live Wire and Live Lane,” GMIP, 60( 4): 233- 260, July 1998.
( 81 M. Kass, A. Witkin, and D. Terzopoulos, “ Snakes: Active Contour
Models,” Int. Journal of Computer Msion, l( 4): 321 - 331. Jan. 1’ 388.
[ 9] D. Geiger, A. Gupta, L. A. Costa, and J. Vlontzos, “ Dynamic Program-ming
for Detecting. Tracking. and Matching Deformable Contours,”
IEEE PAMI, 17( 3): 294- 302, Mar. 1995 ( Correction in PAMI, 18( 5):
575, May 1996)
[ 101 M. Gleicher, “ Image Snapping,” in Proc. ACM SICGRAPH 95: Com-puter
Graphics and Interactive Techniques. 183- 190, Aug. 1995.
[ Ill E. N. Monensen, et al., “ Adaptive Boundary Detection Using
‘ Live- Wire’ Two- Dimensional Dynamic Programming,” in IEEE Proc.
Computers in Cardiology, 635- 638, Oct. 1992.
[ 121 E. N. Monensen and W. A. Barren, “ Intelligent Scissors for Image
Composition,” in Pruc. of the ACM SICGRAPH 95: Computer Graph-ics
and Interactive Techniques, pp. 191- 198, Aug. 1995.
[ 131 E. N. Monensen and W. A. Barren. ” Interactive Segmentation with
Intelligent Scissors,” GMIP, 60( 5): 349- 384, Sept. 1998.
[ 141 L. Najman and M. Schmitt, “ Geodesic Saliency of Watershed Contours
and Hierarchical Segmentation,” IEEE PAMI, 18( 12): 1163- 1773, Dec.
1996.
[ 151 J. C. Nash, Compact Numerical Methods for Computers, ch. 17: 207-
217, Adam Hilger, 1990.
I 161 W. H. Press, et al., NumericalRecipesin C, ch. 15: 681- 688, Cambridge
University Press, 2” d ed., 1992.
[ 171 L. Vincent and P. Soille, “ Watersheds in Digital Spaces: An Efficient
Algorithm Based on Immersion Simulations,” IEEE PAMI, 13( 6:,: 583-
598, June 1991.
[ I81 D. J. Williams and M. Shah, “ A Fast Algorithm for Active Contours and
Curvature Estimation,” CVGIP: Image Understanding, 55( I ) : 14- 26,
Jan. 1992.
1191 X. Yao and Y. P. Hung, “ Fast Image Segmentation by Sliding in the
Derivative Terrain,” in SPlE Proc. Intelligent Robots and Computer
Vision X: Algorithms and Techniques. 1607: 369- 379, Nov. 1991.
458