Terrain Classification and Classifier Fusion
for Planetary Exploration Rovers
Ibrahim Halatci, Christopher A. Brooks, Karl Iagnemma
Massachusetts Institute of Technology
Department of Mechanical Engineering
77 Massachusetts Avenue, Room 3-472m
Cambridge MA 02139
617-253-2334
ihalatci@alum.mit.edu, {cabrooks; kdi}@mit.edu
Abstract—Knowledge of the physical properties of terrain
surrounding a planetary exploration rover can be used to
allow a rover system to fully exploit its mobility
capabilities. Here a study of multi-sensor terrain
classification for planetary rovers in Mars and Mars-like
environments is presented. Two classification algorithms
for color, texture, and range features are presented based on
maximum likelihood estimation and support vector
machines. In addition, a classification method based on
vibration features derived from rover wheel-terrain
interaction is briefly described. Two techniques for merging
the results of these “low-level” classifiers are presented that
rely on Bayesian fusion and meta-classifier fusion. The
performance of these algorithms is studied using images
from NASA’s Mars Exploration Rover mission and through
experiments on a four-wheeled test-bed rover operating in
Mars-analog terrain. It is shown that accurate terrain
classification can be achieved via classifier fusion from
visual and tactile features.
TABLE OF CONTENTS

1. INTRODUCTION
2. DESCRIPTION OF LOW-LEVEL CLASSIFIERS
3. DESCRIPTION OF HIGH-LEVEL CLASSIFIERS
4. EXPERIMENTAL RESULTS
5. CONCLUSION
REFERENCES
BIOGRAPHY

1. INTRODUCTION
Near-term scientific goals for Mars surface exploration are
expected to focus on understanding the planet’s climate
history, surface geology, and potential for past or present
life. To accomplish these goals, rovers will be required to
safely access rough terrain with a significant degree of
autonomy. Terrain areas of interest might include impact
craters, rifted basins, and water-carved features such as
gullies and outflow channels [1]. Such regions are in
general highly uneven and sloped, and may be covered with
loose drift material that causes rover wheel slippage and
sinkage.

1-4244-0525-4/07/$20.00 ©2007 IEEE.
IEEEAC paper #1166, Version 2, Updated December 8, 2006.
Terrain physical properties can strongly influence rover
mobility, particularly on sloped, natural terrain [2]. For
example, a rover might easily traverse a region of packed
soil, but become entrenched in loose drift material. The
effect of terrain properties on rover mobility was
exemplified in April–June, 2005 and again in May–June,
2006 when NASA's Mars Exploration Rover (MER)
Opportunity became entrenched in loose drift material and
was immobilized for several weeks. Knowledge of terrain
properties could allow a system to adapt its control and
planning strategies to enhance performance, by maximizing
wheel traction or minimizing power consumption.
Related Work
Terrain classification methods provide semantic
descriptions of the physical nature of a given terrain region.
These descriptions can be associated with nominal
numerical physical parameters, and/or nominal
traversability estimates, to improve traversability prediction
accuracy. Numerous researchers have proposed terrain
classification methods based on features derived from
remote sensor data such as color, image texture, and range
(i.e. surface geometry). Most of these algorithms have been
developed in the context of terrestrial unmanned ground
vehicles where the visual features have wide variance. It
should be noted that a planetary surface presents a difficult
challenge for classification since scenes are often near-
monochromatic, terrain surface cover consists mainly of
sands of varying composition and rocks of diverse shapes,
and sandy “crusts” can form on (and therefore obscure)
rocks.
Color-based methods for classification and segmentation of
natural terrain have been developed that are accurate and
computationally inexpensive. For these methods,
researchers have utilized multi-spectral imaging [3],
different color spaces and their distribution statistics [4]
along with mixture of Gaussians modeling for classifying
outdoor scenes [5] because many major terrain types such
as soil, vegetation, and rock possess distinct color
signatures. Color-based classification is also attractive for
planetary exploration rover applications since most past,
current, and planned rovers have included multi-spectral
imagers as part of their sensor suites [6].
Texture is also an extensively used feature in this domain.
Gabor filters [7], Fast Fourier Transform [4] and histogram-
based methods [8] demonstrated effective results at
segmenting natural scenes although they are generally
computationally expensive.
A standard approach for detecting obstacles relies on stereo
cameras or range finders. Algorithms that use such sensors
generally exploit elevation points [5], [9]; statistical
distributions of 3D data points [10]; or disparity maps [11].
Note that such methods allow for detection of “geometric”
hazards or terrain features such as rocks, however they
cannot easily detect “non-geometric” hazards or terrain
classes that are not characterized by geometric variation.
Although nearly all terrain classification methods rely on
features derived from remote sensor data, recently methods
have been proposed to classify terrain based on “tactile”
features. A method for terrain classification based on
analysis of vibrations arising from robot wheel-terrain
interaction was first proposed in [2] and developed by [12].
Similar work was presented in [13] and [14]. It was shown
that data from various sensor modalities can be fused to
produce reliable class estimates.
Classifier fusion methods attempt to combine the results
from “low-level” classifiers into class assignments that are
(ideally) of higher accuracy than those attainable from any
individual classifier. Recent work in classifier fusion
includes algorithms that fuse intensity and elevation data to
identify scientifically interesting targets [15], [16]; color,
texture, spatial dependence, and elevation data for rock
detection [17]; and color and texture histograms for
geological target detection [18]. Note that several methods
exist that employ a larger set of visual features such as
texture and infrared imaging in addition to range data;
however, their focus is detecting relatively structured roads
and obstacle detection rather than terrain classification [7],
[19].
This paper presents a study of multi-sensor terrain
classification for planetary rovers in Mars and Mars-like
environments. Two “low-level” classification algorithms for
color, texture, and range features are presented based on
maximum likelihood estimation and support vector
machines. In addition, classification of terrain based on
features derived from rover wheel-terrain interaction is
briefly described. Two techniques for merging the results of
these low level classifiers are presented that rely on
Bayesian fusion and meta-classifier fusion. The
performance of these algorithms is studied using images
from NASA’s Mars Exploration Rover mission and through
experiments on a four-wheeled test-bed rover operating in
Mars-analog terrain. It is shown that accurate terrain
classification can be achieved via classifier fusion from
visual and tactile features.
2. DESCRIPTION OF LOW-LEVEL CLASSIFIERS
Classifier Architectures
Two low-level classifiers are defined that rely solely on a
single feature type. As noted in Section 1, such classifiers
have been studied extensively for terrain classification. Here
we study the performance of two distinct classification
methods: a maximum likelihood classifier based on mixture
of Gaussians modeling (MoG), and a support vector
machine (SVM) classifier.
MoG Method—The MoG method models the distribution of
data points in the feature space as a mixture of Gaussians
(MoG) [20]. The likelihood of the observed feature y given
the terrain class x is computed as a weighted sum of k
Gaussian distributions:
(
)
∑
=
Σ
=
k
j
j
j
j
i
y
G
x
y
f
1
,
,
)
|
(
μ
α
(1)
Here, α is the weight of the Gaussian component whose
mean and variance is defined by µ and Σ, respectively.
Parameters of the model are learned through off-line
training using the Expectation Maximization algorithm [20],
[21]. Similar to [5] good results were obtained using three
to five Gaussian modes, with a greater number of modes
often leading to over-fitting.
SVM Method—The second classification method was based
on a Support Vector Machine (SVM) framework [22]. This
approach builds a binary classifier for each pair of classes,
constructed as a linear combination of similarity measures
between the point to be classified, y, and the training
points x_j:

    f(y) = Σ_{j=1}^{n} α_j K(y, x_j)    (2)

The similarity measure K is the kernel function. For this
work linear, polynomial, and Gaussian kernels were
evaluated. Values of the α_j are calculated during training by
minimizing a loss function over the training data set.
Complexity of the function f(y) is limited by restricting the
values of α_j to lie in the range [0, C], and, for the Gaussian
kernel, by controlling the width of the Gaussian with a
parameter γ. Cross-validation over a training data set was
used to determine an appropriate choice of kernel and
reasonable values for the regularization parameters C and γ.
The SVM algorithms used in this work were implemented
with the LIBSVM library with additional optimization for
linear classification [23]. Binary classifiers were combined
into multi-class classifiers using a voting scheme.
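The pairwise-binary-classifiers-with-voting scheme can be sketched as below. This sketch assumes scikit-learn's SVC for each binary classifier (the paper used LIBSVM); the kernel and C values are placeholders that would in practice come from cross-validation, as described above.

```python
# Sketch of one-vs-one SVM voting: one binary SVM per pair of classes,
# with the final label chosen by majority vote. Uses scikit-learn's SVC
# as the binary classifier (an assumption; the paper used LIBSVM).
from itertools import combinations
import numpy as np
from sklearn.svm import SVC

def train_pairwise_svms(X, y, kernel="linear", C=1.0):
    # train one binary classifier per unordered pair of class labels
    classifiers = {}
    for a, b in combinations(sorted(set(y)), 2):
        mask = (y == a) | (y == b)
        clf = SVC(kernel=kernel, C=C)
        clf.fit(X[mask], y[mask])
        classifiers[(a, b)] = clf
    return classifiers

def predict_by_voting(classifiers, X):
    # each binary classifier casts one vote per sample
    votes = [dict() for _ in range(len(X))]
    for clf in classifiers.values():
        for i, label in enumerate(clf.predict(X)):
            votes[i][label] = votes[i].get(label, 0) + 1
    return [max(v, key=v.get) for v in votes]
```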
Feature Selection
Color—Color is an obvious distinguishing characteristic of
many terrain types and color-based classification has
yielded accurate results in natural terrain [5], [9]. It should
be noted, however, that color variation is somewhat limited
for the surface of Mars. Mars’ lack of moisture (and,
therefore, vegetation) leads to a narrow distribution of
colors for distinct terrain types. In this work red, green and
blue channel intensity values were selected as the 3D color
feature vector for every image pixel. Construction of this
feature vector for MER imagery was slightly different due
to the nature of the rover imaging system, and is detailed in
Section 4.
Texture—Texture is a measure of the local spatial variation
in image intensity. For our present work, the texture length
scale of interest is on the order of tens of centimeters. This
scale allows us to observe textural appearances of surfaces
in the range of four to thirty meters, which corresponds to
the range of interest for local planetary rover navigation
[24]. In this work we employ a wavelet-based fractal
dimension signature method, which yields robust results in
natural texture segmentation as demonstrated by [25]. For
this work, three levels of transformation were applied using
the Haar wavelet kernel and neighborhood windows of 7, 9,
and 11 pixels. This feature extraction method yields a 3D
feature vector for every pixel.
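The multi-level Haar decomposition underlying this feature can be sketched as below. Note this simplified sketch computes only per-scale detail energies over an image region as a stand-in texture measure; the fractal-dimension signature of [25] additionally fits a scaling law across levels and is computed per pixel over the 7/9/11-pixel windows, both of which are omitted here.

```python
# Simplified texture sketch: per-scale 2-D Haar detail energies over an
# image region, standing in for the wavelet-based fractal-dimension
# signature of [25] (which fits a scaling law across levels, omitted here).
import numpy as np

def haar_detail(image):
    # one level of a 2-D Haar decomposition; returns the coarse
    # approximation and a detail-magnitude image
    h = (image.shape[0] // 2) * 2
    w = (image.shape[1] // 2) * 2
    img = image[:h, :w]  # crop to even dimensions
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    approx = (a + b + c + d) / 4.0
    detail = np.abs(a - b) + np.abs(a - c) + np.abs(a + d - b - c)
    return approx, detail

def texture_feature(image, levels=3):
    # mean detail energy at each of `levels` scales -> a 3-D feature
    feats = []
    img = image.astype(float)
    for _ in range(levels):
        img, detail = haar_detail(img)
        feats.append(detail.mean())
    return np.array(feats)
```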
Range—Surface geometry information can be used to
distinguish between terrain classes that possess inherent
geometric dissimilarity. An example of two such classes is
rock and cohesionless sand. Since cohesionless sand can
never attain a slope greater than its angle of repose (whereas
rock, of course, can), features related to terrain slope were
applied for range feature selection. In this work, range data
was acquired from stereo imaging techniques. To compute
range features in a scene, a 20 cm x 20 cm grid-based patch
representation of the terrain surface was constructed. This
patch size was selected to be similar to one rover wheel
diameter. Best-fit planes were found within every patch
using least-squares estimation, and the surface normal
vector was extracted. The 3D range feature vector was then
composed of the surface normal vector, along with the step
height within the patch.
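A sketch of the per-patch computation: a least-squares plane fit over the 3-D stereo points in a patch, yielding the surface normal and the step height. The slope-angle line reflects the angle-of-repose argument above; the assumption that each patch arrives as an (n, 3) point array is illustrative.

```python
# Sketch of the per-patch range features: least-squares plane fit over
# the 3-D points in one 20 cm x 20 cm patch, giving the surface normal
# and step height. Slope relates to the sand angle-of-repose argument.
import numpy as np

def patch_range_feature(points):
    # points: (n, 3) array of [x, y, z] stereo range points in one patch
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    A = np.column_stack([x, y, np.ones_like(x)])  # fit z = a*x + b*y + c
    (a, b, c), *_ = np.linalg.lstsq(A, z, rcond=None)
    normal = np.array([-a, -b, 1.0])
    normal /= np.linalg.norm(normal)
    step_height = z.max() - z.min()               # elevation spread in patch
    slope_deg = np.degrees(np.arccos(normal[2]))  # tilt from horizontal
    return normal, step_height, slope_deg
```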
Vibration—Analysis of vibrations propagating through a
rover’s wheel/suspension structure can be used to
distinguish between various types of terrain the rover is
traversing [12]. This classification mode is unique among
the low-level classifiers described here in that it relies on a
“tactile” sensor signal that is modulated by physical rover-
terrain interaction. The performance of such a classifier is
not degraded by illumination variation, making it a
potentially attractive complement to vision-based
classification techniques. The general classification
framework employed here is identical to that in [12].
Vibration signals were processed as the log power spectral
density for every one-second time step at 557 frequencies in
the frequency range 20.5 Hz to 12 kHz. For this work, a
support vector machine with a linear kernel was used as the
classifier.
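The vibration feature can be sketched as the log power spectral density of each one-second window, restricted to the 20.5 Hz to 12 kHz band named above. This uses a plain Hann-windowed periodogram; the binning that yields exactly 557 frequencies in [12] is not reproduced here.

```python
# Sketch of the vibration feature: log power spectral density of one
# second of wheel vibration, restricted to the 20.5 Hz - 12 kHz band.
# Plain Hann-windowed periodogram; [12] may differ in windowing/binning.
import numpy as np

def vibration_feature(segment, fs=44100, f_lo=20.5, f_hi=12000.0):
    # segment: one second of vibration samples at sampling rate fs
    spectrum = np.fft.rfft(segment * np.hanning(len(segment)))
    psd = (np.abs(spectrum) ** 2) / (fs * len(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return np.log(psd[band] + 1e-20)  # log PSD in the band of interest
```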
3. DESCRIPTION OF HIGH-LEVEL CLASSIFIERS
Low-level classifiers can yield poor results when applied
individually in certain problem domains: sensitivity to
environmental changes (e.g., illumination) and measurement
conditions (e.g., feature distance) can degrade their
classification performance in some scenarios. Classifier
fusion attempts to yield a robust class estimate despite the
shortcomings of individual low-level classifiers.
It should also be noted that certain class distinctions are
unobservable to any individual low-level classifier; classifier
fusion overcomes this limitation by combining different
sensing modes. Although this makes it more difficult to
directly compare classifier performance, the increase in the
number of detectable classes is itself a performance gain.
Bayesian Classifier Fusion
Bayesian fusion was applied to merge the results of low-
level classifiers. This technique has been proposed for
classification of natural scenes with promising results [26].
Here, the low level MoG classifiers’ outputs yield
conditional class likelihoods. Posterior distributions of
conditional class assignments are computed by Bayes’ Rule,
using the assumption that prior likelihoods are equal.
Assuming that the visual features are conditionally
independent, simple classifier fusion is applied as in
Equation 3. Here P(x_i | y_j) is the posterior probability of
terrain class x_i given the sensing mode y_j.

    P(x_i | y_1, …, y_n) = Π_{j=1}^{n} P(x_i | y_j)    (3)
However, this formulation implicitly requires that all
classifiers operate in the same class space (i.e., the set of
classes x_i is the same for all sensing modes). In the absence
of this assumption, the class space of the final fusion is
formed as the Cartesian product of the low-level class
spaces, which yields a large number of non-physical terrain
classes. Although previous researchers have addressed this
problem with an unsupervised dimensionality reduction
algorithm [26], this method did not exploit physical class
knowledge that could be inherited from supervised
classifiers. In this work the fusion class space was manually
grouped into a lower-dimensional space of physically
meaningful terrain classes, based on physical knowledge of
the Mars surface. Such a grouping explicitly encodes
physical knowledge in the final class decisions.
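The fusion rule of Equation 3 over the Cartesian-product class space, followed by a manual grouping into physical classes, can be sketched as follows. The grouping table in the example is a hypothetical illustration of the kind of mapping described, not the authors' exact table.

```python
# Sketch of Bayesian fusion (Equation 3) over a Cartesian-product class
# space, followed by a manual grouping into physical fusion classes.
from itertools import product
import numpy as np

def bayesian_fusion(mode_posteriors, grouping):
    # mode_posteriors: one {low-level class: posterior} dict per sensing mode
    # grouping: {tuple of low-level classes: physical fusion class}
    fused = {}
    for combo in product(*[p.keys() for p in mode_posteriors]):
        # conditional independence: multiply the per-mode posteriors
        prob = np.prod([p[c] for p, c in zip(mode_posteriors, combo)])
        label = grouping[combo]
        fused[label] = fused.get(label, 0.0) + prob
    total = sum(fused.values())
    return {k: v / total for k, v in fused.items()}
```

For example, if color and geometry disagree (one says rock, the other sand), the grouping can map that tuple to a "mixed" fusion class, which is one way the product space acquires physical meaning.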
Meta-classifier Fusion
A second approach to high-level classifier fusion is meta-
classifier fusion. Meta-classifier fusion is a patch-wise
classifier whose features are extracted from the outputs of
the low-level classifiers; specifically, it employs as features
the continuous class likelihood outputs of the low-level
classifiers.
Meta-classifier fusion is very similar to the stacked
generalization (SG) presented in [27] and applied to road
detection in [4]. In the method described here, the low-level
classifiers described in Section 2 correspond to the "level-0
generalizers," while the meta-classifier corresponds to the
"level-1 generalizer" of the SG architecture. However, in the
current work, the data points may not have the same
resolution for all low-level classifiers. As described in
Section 2, color- and texture-based classification was
performed on a pixel-wise basis while range-based
classification was performed on a patch-wise basis. This
data association problem is addressed by a simple pixel-to-
patch conversion, which computes the continuous class
likelihood of a patch by averaging the class likelihood
values of every pixel in that patch. This high-level classifier
is also a supervised classifier, and requires training with a
distinct set of training data from that employed by the low-
level classifiers.
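The pixel-to-patch conversion and the assembly of meta-classifier features can be sketched as below; the 20-pixel patch size and the stacking of modes along the feature axis are illustrative assumptions.

```python
# Sketch of the pixel-to-patch conversion plus meta-classifier feature
# assembly. Patch size and mode stacking are illustrative assumptions.
import numpy as np

def pixels_to_patch(pixel_likelihoods, patch=20):
    # pixel_likelihoods: (H, W, n_classes) per-pixel class likelihoods;
    # average over each patch x patch block of pixels
    h, w, n = pixel_likelihoods.shape
    ph, pw = h // patch, w // patch
    cropped = pixel_likelihoods[:ph * patch, :pw * patch]
    return cropped.reshape(ph, patch, pw, patch, n).mean(axis=(1, 3))

def build_meta_features(per_mode_patch_likelihoods):
    # stack each mode's patch-wise likelihoods into one feature vector
    # per patch, ready for the supervised level-1 classifier
    return np.concatenate(
        [m.reshape(-1, m.shape[-1]) for m in per_mode_patch_likelihoods],
        axis=1)
```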
Data Fusion
A simple data fusion method was employed as a baseline to
compare the performance of the Bayesian and meta-
classifier fusion techniques and as a method for combining
wheel vibration and vision data. Feature vectors from the
various visual sensing modes are combined to form a single
feature vector, which is then mapped to a probability
distribution function using a MoG model. An SVM
classifier was also applied within the data fusion framework.
Note that the class space for data fusion included all
observable classes, and the SVM was implemented as a
multi-class classifier.
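The data-fusion baseline reduces to concatenating per-mode feature vectors before classification, e.g. a 3-element color vector and a 557-element vibration vector into one 560-element vector:

```python
# Minimal sketch of the data-fusion baseline: per-mode feature vectors
# are concatenated into a single vector before classification.
import numpy as np

def fuse_features(*feature_vectors):
    # e.g. fuse_features(rgb_3, vibration_557) -> 560-element vector
    return np.concatenate(
        [np.asarray(v, dtype=float).ravel() for v in feature_vectors])
```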
Data fusion was also applied as an approach to combine
vibration and vision data for improved local terrain
classification accuracy. Here, images captured using a
camera pointed at a rover wheel provided visual data
corresponding to the terrain being sensed by a wheel-
mounted vibration sensor, as seen in Figure 1. Visual data
was represented as the mean RGB value of the pixels in a
small region below the wheel. This 3-element vector was
appended to the 557-element vibration vector using the data
fusion framework, producing a 560-element combined
vision/vibration vector. An SVM classifier was used to
identify the local terrain class.

Figure 1: Image of wheel and terrain from belly-mounted camera
4. EXPERIMENTAL RESULTS
The performance of the low- and high-level classifiers was
studied using images from NASA’s Mars Exploration
Rover mission and through experiments on a four-wheeled
test-bed rover operating in Mars-analog terrain. These
results are described below.
MER Imagery
Publicly available images from the MER mission’s Spirit
and Opportunity rovers were used to assess the performance
of the low-level and high-level classifiers. Fifty-five images
from the rovers’ panoramic camera stereo pairs were
selected from the Mars Analysts’ Notebook database [28].
Ten images were used for classifier training and identifying
meta-parameters. An additional five images, beyond this
training set, were used for meta-classifier fusion and data
fusion to overcome the data scaling problem. The
remaining forty images were used to evaluate algorithm
accuracy and computation time. For MER imagery, the
vibration-based classification approach was not employed
since only image data was available.
The MER panoramic camera pair has eight filters per
camera; the left camera's filters lie mostly in the visible
spectrum and the right camera's in the infrared region (with
the exception of filter R1 at 430 nm). For color feature
extraction, the intensities of the 4th filter at 601 nm, 5th
filter at 535 nm, and 6th filter at 482 nm were chosen, since
they are near the red, green, and blue wavelengths,
respectively. Texture feature extraction was performed on
the intensity image from the 2nd filter of the left camera at
753 nm. Range data was extracted by processing stereo pair
images using stereo libraries developed at JPL [29].
Figure 2: Class distinctions: color- and geometry based classes (left), texture-based classes (middle), fusion classes (right)
[ROC plots: % True Positive vs. % False Positive for the color-, texture-, and geometry-based classifiers]
Figure 3: ROC curves of the low-level classifiers, MoG (top row), SVM (bottom row)
For Mars surface scenes, three primary terrain types that are
believed to possess distinct traversability characteristics
were defined: rocky terrain, composed of outcrop or large
rocks; sandy terrain, composed of loose drift material and
possibly crusty material; and mixed regions, composed of
small loose rocks partially buried or lying atop a layer of
sand. Examples of these terrains are shown in Figure 2
(right). High-level classifiers are expected to distinguish
these three terrain classes; however, low-level classifiers
can distinguish only a subset of them (Figure 2 left, middle).
For instance, the color space of the mixed terrain class,
since it is composed of small rocks scattered on sand,
overlaps with the color spaces of the rock and sand terrain
classes, so a color-based classifier cannot identify a distinct
"mixed" terrain. Similarly, texture on rock surfaces is not
observable at the 4 to 20 meter observation range, so rock
and sand both fall into the "smooth" class.
Low-level Classifier Results—Quantitative results of the
low-level classifiers are presented in Table 1 as average
performance over the test set. The color-based classifiers
produced results close to the accuracy expected of a random
choice between two classes. This might be expected given
the near-monochromatic nature of the Martian surface. The
texture-based classifier performed better than color, since
the discrimination between mixed and sandy terrain is more
apparent. However, texture-based classification is still not
sufficiently robust, since its accuracy is sensitive to the
scaling of the image; poor performance was observed when
classifying terrain outside a 4 to 20 meter range. The range-
based classifier demonstrated the best performance, with
75% average classification accuracy, although its variance
was quite high. Failures in range-based classification were
observed when sand was steeply sloped, forming ridges and
dunes.
Table 1: Low-level classifier performance

                         Average       95% Confidence   Standard
                         Accuracy (%)  Interval         Deviation (%)
Color-based     MoG      57.2          [52.4, 62.1]     15.6
                SVM      68.1          [63.4, 72.7]     15.0
Texture-based   MoG      60.9          [56.1, 65.7]     15.6
                SVM      66.7          [61.4, 71.9]     16.8
Geometry-based  MoG      75.5          [69.0, 82.1]     21.2
                SVM      70.2          [63.0, 77.3]     23.0
Figure 3 shows ROC curves for each low-level classifier,
illustrating the accuracy of the MoG and SVM classifiers
across a range of confidence thresholds. These results
demonstrate the weaknesses of the low-level classifiers.
Besides being unable to distinguish between the three
terrain classes of interest, low classification accuracy is
exhibited due to the challenging nature of the classes. It
should be observed that SVM and MoG classifiers
demonstrated similar performance for each of the low-level
sensing modes.
High-level Classifier Results—As described in Section 3,
classifier fusion methods combine the data from multiple
sensing modes to compute a class label. By merging the
results of the color- and range-based classifiers, fusion
algorithms aim to compensate for the weaknesses of the
low-level classifiers (e.g., to decrease the false positives of
rock vs. sand detection). Moreover, the inclusion of texture
data enables the observation of roughness and allows the
definition of a "mixed" class.
Figure 4 shows ROC curves for the data fusion method
applied with SVM and MoG as multi-class classifiers. As
expected, data fusion performed poorly. This may be due to
the difficulty of modeling in a high-dimensional feature
space. In each case, it was observed that the classifier tends
to be biased towards a certain terrain class, which yields
poor average performance.
These results also demonstrate the need for high-level
classifier fusion for robust classification performance. Table
2 shows the comparison between the data fusion and
classifier fusion methods in terms of global performance
results.
Regarding the comparison between low- and high-level
classifiers, note that high-level classifiers distinguish
between three classes, whereas the low-level classifiers each
distinguish between only two. Therefore their performance
in terms of average accuracy is not directly comparable.
However, it should be remembered that the color- and
texture-based classifiers perform close to the random-choice
expectation, whereas classifier fusion performance is much
more robust.
Table 2: High-level classifier performance

                         Average       95% Confidence   Standard
                         Accuracy (%)  Interval         Deviation (%)
Data Fusion     MoG      38.0          [32.5, 43.5]     17.8
                SVM      47.0          [41.6, 52.3]     17.3
Bayesian Fusion          64.7          [59.9, 69.5]     15.5
Meta-classifier Fusion   59.6          [55.3, 63.7]     13.6
[ROC plots: % True Positive vs. % False Positive for rock, sand, and mixed classes]
Figure 4: Data fusion ROC curves using SVM classifier (upper) and MoG classifier (lower)
Comparing high-level classifiers based on the ROC curves
presented in Figure 5, it can be observed that Bayesian and
meta-classifier fusion were much more accurate than data
fusion. Although the scaling of data (from pixel to patch)
potentially affects both data fusion and meta-classifier
fusion, classifier fusion demonstrates better results than data
fusion given the same amount of training data. For this data
set, Bayesian fusion demonstrated similar accuracy to meta-
classifier fusion. However, meta-classifier fusion requires
additional training data for the second-level classifier,
beyond the training set of the low-level classifiers. Bayesian
fusion, by contrast, requires no extra training for the second
level, but the relationship between low-level classes and
high-level classes must be manually defined based on the
environment. In short, there is a trade-off between
predefining the class space and supplying additional
training data for these fusion methods.
Wingaersheek Beach Experiments
Experimental Setup—Additional experiments were
performed using a four wheeled mobile robot developed at
MIT, named TORTOISE (all-Terrain Outdoor Rover Test-
bed for Integrated Sensing Experiments), shown in Figure
6. TORTOISE is an 80-cm-long x 50-cm-wide x 90-cm-tall
robot with 20 cm diameter wheels. The TORTOISE sensor
suite includes the following: a forward looking mast-
mounted Videre Design “dual DCAM” stereo pair with 640
x 480 resolution; a belly-mounted color monocular camera
with 320 x 240 resolution to observe local terrain; and a
Signal Flex SF-20 contact microphone mounted on the
rover suspension near the front right wheel assembly to
sense vibrations. During experiments, TORTOISE traveled
at an average speed of 6 cm/sec. It captured monocular
images at 2 Hz and vibration data at 44.1 kHz. Stereo
images were captured every 1.5 seconds.
Experiments were performed at Wingaersheek Beach in
Gloucester, MA. This is an oceanfront environment
dominated by large (i.e. meter-scale) rock outcrops and
distributions of rover-sized and smaller rocks over sand.
Neighboring areas exhibit sloped sand dunes and sandy flats
mixed with beach grass. Figure 7 shows a typical scene
from the experiment site. This scene shows a large rock in
the foreground and scattered, partially buried rocks in the
middle range. Sand appears grayish in color while rock
features vary from gray to light brown and dark brown. This
test site was chosen because of its visual and topographical
similarities to Mars surface scenes.
For the following experiments, the terrain classes of interest
were "rock," "sand," and "beach grass." The "mixed" class
was not defined due to the lack of scattered small rocks at
the site; dry beach grass was instead used as a class with a
distinct texture signature, in an effort to maintain a
consistent number of classes with the MER results.
[ROC plots: % True Positive vs. % False Positive for rock, sand, and mixed classes]
Figure 5: ROC curves for Bayesian fusion (upper) and meta-classifier fusion (lower)
Figure 6: TORTOISE experimental rover (left), local sensing
suite (right)
Figure 7: Sample scene from Wingaersheek Beach
[ROC plots: % True Positive vs. % False Positive for rock, sand, and beach grass classes; panels: color-, texture-, and geometry-based MoG classifiers, data fusion, Bayesian fusion, meta-classifier fusion]
Figure 8: ROC curves: Low-level classifiers (top row), high-level classifiers (bottom row)
Low-level Classifier Results—Six days of experiments were
conducted with a total of approximately 50 traverses and a
total distance traveled of 500 meters. Every traverse
included approximately 250 images. Every 20th image was
included in the test set to minimize overlap. Data from the
first traverse of the final day was used for training data.
Classifier accuracy was assessed using images from the
remaining traverses on the final day. The performance of
the low-level classifiers is shown in Figure 8 as a series of
ROC curves.
It was observed that the performance of the color-based
classifier was improved over that observed in experiments
on MER imagery. This was likely due to the greater color
variation present in an average beach scene. Relatively poor
results were observed from the range-based classifier. The
reason for this decrease in performance may be related to
the poor accuracy and resolution of stereo-based range data
for these experiments relative to MER imagery data, which
used state-of-the-art JPL stereo processing software
operating on high-quality images. This performance decline
illustrates the sensitivity of range-based classification to
data quality, and strengthens the motivation for classifier
fusion.
High-level Classifier Results—High-level classifier
performance is shown in Figure 8. In keeping with the MER
results, the classifier fusion methods perform significantly
better than the data fusion approach. Data fusion exhibits a
bias towards the “rock” class yielding high false positives
and degrading the detection rate for other classes. In this
experiment setting, use of high-level classifiers does not
increase the number of observable terrain classes since the
color-based classifier is able to distinguish all terrain classes
present in the setting. However, the ROC curves show a
performance increase as a result of merging texture- and
range-based classifiers with the color-based results. In the
meta-classifier fusion results, it is clear that although the
individual performances of the other low-level classifiers
are below the color-based results, they contribute to the
training of the meta-classifier, yielding improved results.
Data Fusion for Local Terrain—Local classification of
terrain based on fusion of vibration and color features was
tested using data captured by the vibration sensor and belly-
mounted camera. These data were collected while the rover
traversed sand, beach grass, and rock. A total of 21 minutes
of vibration data were collected (1260 one-second
segments), with over 2500 associated local images. Half of
the data was used for establishing the meta-parameters and
training each SVM classifier. The other half was used to test
the classifiers.
The results for local terrain classification are shown in
Figure 9. The left plot shows results for pure vibration-
based classification. It can be seen that all terrains are
moderately well distinguished, with an average accuracy of
65% at full classification. The center plot shows results for
pure color-based classification. Here “beach grass” is nearly
all detected, with very few false positives. “Rock” and
“sand” are also well distinguished. The average accuracy is
77% at full classification. Finally, the right plot shows the
results for data fusion of color and vibration. An
improvement over vibration-only and color-only classifiers
was exhibited, with an average accuracy of 84%. This result
suggests improved classification performance can be
derived from fusion of visual and tactile information. This is
likely due to the insensitivity of tactile features to variations
in illumination.
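In contrast to classifier fusion, the data fusion result above concatenates the color and vibration feature vectors for each one-second segment and trains a single SVM on the joint vector. A minimal sketch of that pattern, using scikit-learn and synthetic stand-in features (the dimensions and class labels here are hypothetical, not the paper's actual feature sets):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic stand-ins: 8-D color features and 12-D vibration features
n = 300
y = rng.integers(0, 3, n)  # 0 = rock, 1 = sand, 2 = beach grass
color = y[:, None] + rng.normal(0, 1.0, (n, 8))
vib = y[:, None] + rng.normal(0, 1.2, (n, 12))

# Data fusion: concatenate both feature vectors per segment
fused = np.hstack([color, vib])

# Train on half the segments, test on the rest
half = n // 2
clf = SVC(kernel="linear").fit(fused[:half], y[:half])
acc = clf.score(fused[half:], y[half:])
print(round(acc, 2))
```

The joint vector lets the SVM weigh tactile evidence when visual evidence is ambiguous (e.g., under illumination changes), which is the behavior the 84% fused accuracy reflects.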
Computation Times
All algorithms in this work except SVM classification were
implemented in Matlab. On a Pentium 1.8 GHz desktop
computer, pixel-wise MoG classification of a 512 x 512
image took an average of 5.2 seconds. Patch-wise MoG
classification (for range-based, data fusion and meta-
classifier fusion) required an average of 2.4 seconds.
Bayesian fusion took 1.2 seconds to form classifier
decisions. The most computationally expensive element of
the algorithms is texture feature extraction, requiring
approximately 14.8 seconds to compute three levels of Haar
wavelet transforms and the pixel-wise texture signature of a
512 x 512 grayscale image. In
total, classifying a 512 x 512 frame takes approximately
29.0 sec/frame. These times could be significantly reduced
in a C-code implementation.
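The Haar wavelet transform that dominates the runtime can be sketched in a few lines. The following is an illustrative NumPy implementation of a single-level 2-D Haar decomposition applied recursively, with a simple per-level detail-energy signature — a stand-in for, not a reproduction of, the paper's wavelet-based texture signature (reference [25]).

```python
import numpy as np

def haar2d(img):
    """One level of the 2-D Haar wavelet transform.

    Returns the approximation (LL) and detail (LH, HL, HH) subbands,
    each half the input size. Assumes even image dimensions.
    """
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # row-pair average
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # row-pair difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, (lh, hl, hh)

def texture_energy(img, levels=3):
    """Mean absolute detail energy per level: a simple texture signature."""
    sig = []
    ll = img.astype(float)
    for _ in range(levels):
        ll, details = haar2d(ll)
        sig.append(float(np.mean([np.abs(s).mean() for s in details])))
    return sig

img = np.random.default_rng(2).random((512, 512))
sig = texture_energy(img)
print(sig)  # one detail-energy value per decomposition level
```

Each level halves the image resolution, so three levels on a 512 x 512 image touch progressively smaller arrays; the quoted 14.8-second cost comes from computing such signatures pixel-wise in interpreted Matlab rather than from the transform itself.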
SVM classification was implemented in C++, using the
LIBSVM library, with additional optimization for linear
kernels (Chang & Lin, 2001). Classification of a
512x512 color image took an average of 0.61 seconds using
a linear kernel. Classification using a Gaussian kernel took
an average of 77.5 seconds for a 512x512 color image.
After feature extraction, texture classification times were
identical to those for color classification. Patch-wise
classification (for range and data fusion) averaged less than
0.01 seconds per patch for the linear SVM, and less than
0.04 seconds per patch for the Gaussian SVM. The number
of patches in each image varied from 10 to 400.
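The roughly 100x gap between linear and Gaussian kernel timings above has a structural cause: a trained linear SVM collapses to a single weight vector, so prediction costs one dot product per sample, whereas a Gaussian-kernel SVM must evaluate the kernel against every support vector. A small scikit-learn sketch (illustrative only; the paper's implementation used LIBSVM in C++) makes the point:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(400, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

lin = SVC(kernel="linear").fit(X, y)

# Prediction with a linear kernel reduces to sign(w @ x + b): one dot
# product per sample, independent of the number of support vectors.
w = lin.coef_[0]
b = lin.intercept_[0]
x = X[0]
manual = int(w @ x + b > 0)
assert manual == lin.predict([x])[0]
print(lin.n_support_.sum(), "support vectors vs. 1 weight vector")
```

A Gaussian-kernel model has no such collapsed form, so its per-sample cost grows with the support-vector count — which is why the linear kernel was the practical choice for pixel-wise classification of full images.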
5. CONCLUSION
Knowledge of the physical properties of terrain surrounding
a planetary exploration rover can be used to allow a rover
system to fully exploit its mobility capabilities. The ability
to detect or estimate terrain physical properties would allow
a rover to predict its mobility performance, and knowledge
of terrain properties could allow a system to adapt its
control and planning strategies to enhance performance.
This paper has compared the performance of various
methods for terrain classification based on the fusion of
visual and tactile features. It was shown that classifier
fusion methods can improve overall classification
performance in two ways compared to low-level methods.
First, classifier fusion yielded a more descriptive class set
than any of the low-level classifiers could attain individually.
Second, the rate of false positives decreased significantly
while the rate of true positives increased. This shows that on
challenging planetary surfaces, stand-alone visual features
may not be sufficiently robust for mobile robot sensing;
however, classifier fusion techniques improve sensing
performance significantly.
Future research will focus on integrating additional tactile
sensing modes such as wheel sinkage and torque with visual
classifier fusion algorithms.
Figure 9: Classifier results for local vibration-based classification (left), color-based classification (middle), and data
fusion of color and vibration (right). Each panel plots % true positive vs. % false positive for the rock, sand, and beach
grass classes.
ACKNOWLEDGMENT
This work was supported by the NASA Jet Propulsion
Laboratory (JPL) through the Mars Technology Program.
REFERENCES
[1] Urquhart, M. and Gulick, V. (2003). “Lander detection
and identification of hydrothermal deposits,” abstract
presented at the First Landing Site Workshop for MER.
[2] Iagnemma, K. and Dubowsky, S. (2002, March). “Terrain
estimation for high speed rough terrain autonomous
vehicle navigation,” Proceedings of the SPIE Conference
on Unmanned Ground Vehicle Technology IV.
[3] Kelly, A., et al. (2006, June). “Toward Reliable Off Road
Autonomous Vehicles Operating in Challenging
Environments,” The International Journal of Robotics
Research. 25(5/6).
[4] Dima, C.S., Vandapel, N., and Hebert, M. (2004).
“Classifier fusion for outdoor obstacle detection,”
Proceedings of the IEEE International Conference on
Robotics and Automation (ICRA), 1, 665-671, doi:
10.1109/ROBOT.2004.1307225.
[5] Manduchi, R., Castano, A., Talukder, A., and Matthies,
L. (2005, May). “Obstacle detection and terrain
classification for autonomous off-road navigation,”
Autonomous Robots, 18, 81-102.
[6] Squyres, S. W., et al., (2003). “Athena Mars rover science
investigation,” J. Geophys. Res., 108(E12), 8062,
doi:10.1029/2003JE002121.
[7] Rasmussen, C. (2001, December). “Laser Range-, Color-,
and Texture-based Classifiers for Segmenting Marginal
Roads,” Proceedings of the Conference on Computer
Vision and Pattern Recognition Technical Sketches, Kauai,
HI.
[8] Angelova, A., Matthies, L., Helmick, D., Sibley, G.,
Perona, P. (2006). “Learning to predict slip for ground
robots,” Proceedings of the IEEE International
Conference on Robotics and Automation (ICRA),
Orlando, Florida. May, 2006.
[9] Bellutta, P., Manduchi, R., Matthies, L., Owens, K., and
Rankin, A. (2000, October). “Terrain perception for Demo
III,” Proceedings of the Intelligent Vehicles Symposium,
326-331, doi: 10.1109/IVS.2000.898363.
[10] Vandapel, N., Huber, D.F., Kapuria, A., Hebert, M.
(2004). Natural Terrain Classification using 3-D Ladar
Data. Proceedings of the International Conference on
Robotics and Automation (ICRA), 5, 5117- 5122.
[11] Mandelbaum, R., McDowell, L., Bogoni, L., Reich, B.,
and Hansen, M. (1998). “Real-Time Stereo Processing,
Obstacle Detection and Terrain Estimation from
Vehicle-Mounted Stereo Cameras,” Proceedings of the
4th IEEE Workshop on Applications of Computer Vision,
288, Princeton, New Jersey.
[12] Brooks, C. and Iagnemma, K. (2005). “Vibration-based
Terrain Classification for Planetary Rovers,” IEEE
Transactions on Robotics, 21, 6, 1185-1191.
[13] Sadhukhan, D., Moore, C., and Collins, E. (2004).
“Terrain Estimation Using Internal Sensors,” in
Proceedings of International Conference on Robotics and
Applications (IASTED), 84(11), 1684-1704, doi:
10.1109/5.542415.
[14] Ojeda, L., Borenstein, J., Witus, G., and Karlsen, R.
(2006). “Terrain characterization and classification with a
mobile robot,” Journal of Field Robotics, 23(2), 103-122,
doi: 10.1002/rob.20113.
[15] Castano, R., et al. (2005). “Current Results from a
Rover Science Data Analysis System,” Proceedings of
2005 IEEE Aerospace Conference, Big Sky. 356-365, doi:
10.1109/AERO.2005.1559328.
[16] Gor, V., Castaño, R., Manduchi, R., Anderson, R., and
Mjolsness, E. (2001). “Autonomous Rock Detection for
Mars Terrain,” Space 2001, AIAA.
[17] Thompson, D. R., Niekum, S., Smith, T., and
Wettergreen, D. (2005). “Automatic Detection and
Classification of Features of Geologic Interest,”
Proceedings of the IEEE Aerospace Conference, 366-377,
doi: 10.1109/AERO.2005.1559329.
[18] McGuire, P. C., et al., (2005). “The Cyborg
Astrobiologist: scouting red beds for uncommon features
with geological significance,” International Journal of
Astrobiology, 4, 101-113.
[19] Dima, C.S., Vandapel, N., and Hebert, M. (2003).
“Sensor and classifier fusion for outdoor obstacle
detection: an application of data fusion to autonomous
road detection,” Applied Imagery Pattern Recognition
Workshop, 255- 262, doi: 10.1109/AIPR.2003.1284281.
[20] Bishop, C.M., (1995). Neural networks for pattern
recognition. New York: Oxford University Press.
[21] Bilmes, J. (1997). “A Gentle Tutorial on the EM
Algorithm and its Application to Parameter Estimation for
Gaussian Mixture and Hidden Markov Models,” Technical
Report, University of California, Berkeley.
[22] Vapnik, V.N. (1995). The Nature of Statistical Learning
Theory. New York: Springer.
[23] Chang, C.-C. and Lin, C.-J. (2001). LIBSVM: a
library for support vector machines. Software retrieved
January 2006 from
http://www.csie.ntu.edu.tw/~cjlin/libsvm
[24] Goldberg, S., Maimone, M., and Matthies, L. (2002).
“Stereo vision and rover navigation software for planetary
exploration,” IEEE Aerospace Conference, Big Sky, 5,
2025-2036, doi: 10.1109/AERO.2002.1035370.
[25] Espinal, F., Huntsberger, T.L., Jawerth, B., and Kubota
T. (1998). “Wavelet-based fractal signature analysis for
automatic target recognition,” Optical Engineering,
Special Section on Advances in Pattern Recognition,
37(1), 166-174.
[26] Manduchi, R. (1999). “Bayesian Fusion of Color and
Texture Segmentations,” Proceedings of the International
Conference on Computer Vision (ICCV), 2, 956-962, doi:
10.1109/ICCV.1999.790351.
[27] Wolpert, D. H. (1990). “Stacked generalization,” Los
Alamos National Laboratory, Los Alamos, NM, Tech.
Rep. LA-UR-90-3460.
[28] Mars Analyst’s Notebook (2006). Retrieved May 24,
2006, from http://anserver1.eprsl.wustl.edu/.
[29] Ansar, A., Castano, A., and Matthies, L. (2004,
September). “Enhanced real-time stereo using bilateral
filtering,” 2nd International Symposium on 3D Data
Processing, Visualization, and Transmission, 455-462,
doi: 10.1109/TDPVT.2004.1335273.
BIOGRAPHY
Ibrahim Halatci is a technical support
engineer in the Engineering
Development Group at the Mathworks
Inc. He has recently received his MS
degree from the Mechanical
Engineering department of the
Massachusetts Institute of Technology.
He received his B.S. degree with honor
in Mechatronics Engineering from Sabanci University in
2004. His research interests include control systems, their
application to robotics, and learning for mobile robots.
Christopher Brooks is a graduate
student in the Mechanical Engineering
department of the Massachusetts
Institute of Technology. He received his
B.S. degree with honor in engineering
and applied science from the California
Institute of Technology in 2000, and his
M.S. degree from the Massachusetts
Institute of Technology in 2004. His
research interests include mobile robot control, terrain
sensing, and their application to improving autonomous
robot mobility. He is a member of Tau Beta Pi.
Karl Iagnemma is a principal research
scientist in the Mechanical Engineering
department of the Massachusetts
Institute of Technology. He received his
B.S. degree summa cum laude in
mechanical engineering from the
University of Michigan in 1994, and his
M.S. and Ph.D. from the Massachusetts
Institute of Technology, where he was a
National Science Foundation graduate fellow, in 1997 and
2001, respectively. He has been a visiting researcher at the
Jet Propulsion Laboratory. His research interests include
rough-terrain mobile robot control and motion planning,
robot-terrain interaction, and robotic mobility analysis. He
is author of the monograph Mobile Robots in Rough
Terrain: Estimation, Motion Planning, and Control with
Application to Planetary Rovers (Springer, 2004). He is a
member of IEEE and Sigma Xi.