Figure 10.13  Working on the segmentation data with the User Classifier: (a) the data visualizer and (b) the tree visualizer.  390
Figure 10.14  Configuring a metalearner for boosting decision stumps.  391
Figure 10.15  Output from the Apriori program for association rules.  392
Figure 10.16  Visualizing the Iris dataset.  394
Figure 10.17  Using Weka's metalearner for discretization: (a) configuring FilteredClassifier and (b) the menu of filters.  402
Figure 10.18  Visualizing a Bayesian network for the weather data (nominal version): (a) default output, (b) a version with the maximum number of parents set to 3 in the search algorithm, and (c) probability distribution table for the windy node in (b).  406
Figure 10.19  Changing the parameters for J4.8.  407
Figure 10.20  Using Weka's neural-network graphical user interface.  411
Figure 10.21  Attribute selection: specifying an evaluator and a search method.  420
Figure 11.1  The Knowledge Flow interface.  428
Figure 11.2  Configuring a data source: (a) the right-click menu and (b) the file browser obtained from the Configure menu item.  429
Figure 11.3  Operations on the Knowledge Flow components.  432
Figure 11.4  A Knowledge Flow that operates incrementally: (a) the configuration and (b) the strip chart output.  434
Figure 12.1  An experiment: (a) setting it up, (b) the results file, and (c) a spreadsheet with the results.  438
Figure 12.2  Statistical test results for the experiment in Figure 12.1.  440
Figure 12.3  Setting up an experiment in advanced mode.  442
Figure 12.4  Rows and columns of Figure 12.2: (a) row field, (b) column field, (c) result of swapping the row and column selections, and (d) substituting Run for Dataset as rows.  444
Figure 13.1  Using Javadoc: (a) the front page and (b) the weka.core package.  452
Figure 13.2  DecisionStump: a class of the weka.classifiers.trees package.  454
Figure 14.1  Source code for the message classifier.  463
Figure 15.1  Source code for the ID3 decision tree learner.  473
List of Tables
Table 1.1  The contact lens data.  6
Table 1.2  The weather data.  11
Table 1.3  Weather data with some numeric attributes.  12
Table 1.4  The iris data.  15
Table 1.5  The CPU performance data.  16
Table 1.6  The labor negotiations data.  18
Table 1.7  The soybean data.  21
Table 2.1  Iris data as a clustering problem.  44
Table 2.2  Weather data with a numeric class.  44
Table 2.3  Family tree represented as a table.  47
Table 2.4  The sister-of relation represented in a table.  47
Table 2.5  Another relation represented as a table.  49
Table 3.1  A new iris flower.  70
Table 3.2  Training data for the shapes problem.  74
Table 4.1  Evaluating the attributes in the weather data.  85
Table 4.2  The weather data with counts and probabilities.  89
Table 4.3  A new day.  89
Table 4.4  The numeric weather data with summary statistics.  93
Table 4.5  Another new day.  94
Table 4.6  The weather data with identification codes.  103
Table 4.7  Gain ratio calculations for the tree stumps of Figure 4.2.  104
Table 4.8  Part of the contact lens data for which astigmatism = yes.  109
Table 4.9  Part of the contact lens data for which astigmatism = yes and tear production rate = normal.  110
Table 4.10  Item sets for the weather data with coverage 2 or greater.  114
Table 4.11  Association rules for the weather data.  116
Table 5.1  Confidence limits for the normal distribution.  148
Table 5.2  Confidence limits for Student's distribution with 9 degrees of freedom.  155
Table 5.3  Different outcomes of a two-class prediction.  162
Table 5.4  Different outcomes of a three-class prediction: (a) actual and (b) expected.  163
Table 5.5  Default cost matrixes: (a) a two-class case and (b) a three-class case.  164
Table 5.6  Data for a lift chart.  167
Table 5.7  Different measures used to evaluate the false positive versus false negative tradeoff.  172
Table 5.8  Performance measures for numeric prediction.  178
Table 5.9  Performance measures for four numeric prediction models.  179
Table 6.1  Linear models in the model tree.  250
Table 7.1  Transforming a multiclass problem into a two-class one: (a) standard method and (b) error-correcting code.  335
Table 10.1  Unsupervised attribute filters.  396
Table 10.2  Unsupervised instance filters.  400
Table 10.3  Supervised attribute filters.  402
Table 10.4  Supervised instance filters.  402
Table 10.5  Classifier algorithms in Weka.  404
Table 10.6  Metalearning algorithms in Weka.  415
Table 10.7  Clustering algorithms.  419
Table 10.8  Association-rule learners.  419
Table 10.9  Attribute evaluation methods for attribute selection.  421
Table 10.10  Search methods for attribute selection.  421
Table 11.1  Visualization and evaluation components.  430
Table 13.1  Generic options for learning schemes in Weka.  457
Table 13.2  Scheme-specific options for the J4.8 decision tree learner.  458
Table 15.1  Simple learning schemes in Weka.  472