5 1 2
I N D E X
Experimenter, 437–447
    advanced panel, 443–445
    Analyze panel, 443–445
    analyzing the results, 440–441
    distributing processing over several machines, 445–447
    running an experiment, 439–440
    simple setup, 441–442
    starting up, 438–441
    subexperiments, 447
Explorer, 369–425
    ARFF, 370, 371, 380–382
    Associate panel, 392
    association-rule learners, 419–420
    attribute evaluation methods, 421, 422–423
    attribute selection, 392–393, 420–425
    boosting, 416
    classifier errors, 379
    Classify panel, 384
    clustering algorithms, 418–419
    Cluster panel, 391–392
    CSV format, 370, 371
    error log, 378
    file format converters, 380–382
    filtering algorithms, 393–403
    filters, 382–384
    J4.8, 373–377
    learning algorithms, 403–414. See also learning algorithms
    metalearning algorithms, 414–418
    models, 377–378
    panels, 380
    Preprocess panel, 380
    search methods, 421, 423–425
    Select attributes panel, 392–393
    starting up, 369–379
    supervised filters, 401–403
    training/testing learning schemes, 384–387
    unsupervised attribute filters, 395–400
    unsupervised instance filters, 400–401
    User Classifier, 388–391
    Visualize panel, 393
extraction problems, 353, 354
F
Fahrenheit, Daniel, 51
fallback heuristic, 239
false negative (FN), 162
false positive (FP), 162
false positive rate, 163
False positive rate, 378
Familiar, 360
family tree, 45
    tabular representation of, 46
FarthestFirst, 419
features. See attributes
feature selection, 341. See also attribute selection
feedforward networks, 233
fielded applications, 22
    continuous monitoring, 28–29
    customer support and service, 28
    cybersecurity, 29
    diagnosis, 25–26
    ecological applications, 23, 28
    electricity supply, 24–25
    hazard detection system, 23–24
    load forecasting, 24–25
    loan application, 22–23
    manufacturing processes, 28
    marketing and sales, 26–28
    oil slick detection, 23
    preventive maintenance of electromechanical devices, 25–26
    scientific applications, 28
file format converters, 380–382
file mining, 49
filter, 290
filter in Weka, 382–384
FilteredClassifier, 401, 414
filtering algorithms in Weka, 393–403
    sparse instances, 401
    supervised filters, 401–403
    unsupervised attribute filters, 395–400
    unsupervised instance filters, 400–401
filtering approaches, 315
filters menu, 383
finite mixture, 262, 263
FirstOrder, 399
P088407-INDEX.qxd 4/30/05 11:25 AM Page 512
Fisher, R. A., 15
flat file, 45
F-measure, 172
FN (false negatives), 162
folds, 150
forward pruning, 34, 192
forward selection, 292, 294
forward stagewise additive modeling, 325–327
Fourier analysis, 25
FP (false positives), 162
freedom, degrees of, 93, 155
functional dependencies, 350
functions in Weka, 404–405, 409–410
G
gain ratio, 104
GainRatioAttributeEval, 423
gambling, 160
garbage in, garbage out. See cost of errors; data cleaning; error rate
Gaussian-distribution assumption, 92
Gaussian kernel function, 252
generalization as search, 30–35
    bias, 32–35
    enumerating concept space, 31–32
generalized distance functions, 241–242
generalized exemplars, 236
general-to-specific search bias, 34
genetic algorithms, 38
genetic algorithm search procedures, 294, 341
GeneticSearch, 424
getOptions(), 482
getting to know your data, 60
global discretization, 297
globalInfo(), 472
global optimization, 205–207
Gosset, William, 184
gradient descent, 227, 229, 230
Grading, 417
graphical models, 283
GraphViewer, 431
gray bar in margin of textbook (optional sections), 30
greedy search, 33
GreedyStepwise, 423–424
growing set, 202
H
Hamming distance, 335
hand-labeled data, 338
hapax legomena, 310
hard instances, 322
hash table, 280
hazard detection system, 23–24
hidden attributes, 272
hidden layer, 226, 231, 232
hidden units, 226, 231, 234
hierarchical clustering, 139
highly-branching attribute, 86
high-performance rule inducers, 188
histogram equalization, 298
historical literary mystery, 358
holdout method, 146, 149–150, 333
homeland defense, 357
HTML, 355
hypermetrope, 13
hyperpipes, 139
Hyperpipes, 414
hyperplane, 124, 125
hyperrectangle, 238–239
hyperspheres, 133
hypertext markup language (HTML), 355
hypothesis testing, 29
I
IB1, 413
IB3, 237
IBk, 413
ID3, 105
Id3, 404
identification code, 86, 102–104
implementation—real-world schemes,
187–283
    Bayesian networks, 271–283
    classification rules, 200–214
    clustering, 254–271
    decision tree, 189–199
    instance-based, 236–243
    linear models, 214–235