Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

INDEX

Experimenter, 437–447

  advanced panel, 443–445

  Analyze panel, 443–445

  analyzing the results, 440–441

  distributing processing over several machines, 445–447

  running an experiment, 439–440

  simple setup, 441–442

  starting up, 438–441

  subexperiments, 447

Explorer, 369–425

  ARFF, 370, 371, 380–382

  Associate panel, 392

  association-rule learners, 419–420

  attribute evaluation methods, 421, 422–423

  attribute selection, 392–393, 420–425

  boosting, 416

  classifier errors, 379

  Classify panel, 384

  clustering algorithms, 418–419

  Cluster panel, 391–392

  CSV format, 370, 371

  error log, 378

  file format converters, 380–382

  filtering algorithms, 393–403

  filters, 382–384

  J4.8, 373–377

  learning algorithms, 403–414. See also learning algorithms

  metalearning algorithms, 414–418

  models, 377–378

  panels, 380

  Preprocess panel, 380

  search methods, 421, 423–425

  Select attributes panel, 392–393

  starting up, 369–379

  supervised filters, 401–403

  training/testing learning schemes, 384–387

  unsupervised attribute filters, 395–400

  unsupervised instance filters, 400–401

  User Classifier, 388–391

  Visualize panel, 393

extraction problems, 353, 354

F

Fahrenheit, Daniel, 51

fallback heuristic, 239

false negative (FN), 162

false positive (FP), 162

false positive rate, 163

False positive rate, 378

Familiar, 360

family tree, 45

  tabular representation of, 46

FarthestFirst, 419

features. See attributes

feature selection, 341. See also attribute selection

feedforward networks, 233

fielded applications, 22

  continuous monitoring, 28–29

  customer support and service, 28

  cybersecurity, 29

  diagnosis, 25–26

  ecological applications, 23, 28

  electricity supply, 24–25

  hazard detection system, 23–24

  load forecasting, 24–25

  loan application, 22–23

  manufacturing processes, 28

  marketing and sales, 26–28

  oil slick detection, 23

  preventive maintenance of electromechanical devices, 25–26

  scientific applications, 28

file format converters, 380–382

file mining, 49

filter, 290

filter in Weka, 382–384

FilteredClassifier, 401, 414

filtering algorithms in Weka, 393–403

  sparse instances, 401

  supervised filters, 401–403

  unsupervised attribute filters, 395–400

  unsupervised instance filters, 400–401

filtering approaches, 315

filters menu, 383

finite mixture, 262, 263

FirstOrder, 399

Fisher, R. A., 15

flat file, 45

F-measure, 172

FN (false negatives), 162

folds, 150

forward pruning, 34, 192

forward selection, 292, 294

forward stagewise additive modeling, 325–327

Fourier analysis, 25

FP (false positives), 162

freedom, degrees of, 93, 155

functional dependencies, 350

functions in Weka, 404–405, 409–410

G

gain ratio, 104

GainRatioAttributeEval, 423

gambling, 160

garbage in, garbage out. See cost of errors; data cleaning; error rate

Gaussian-distribution assumption, 92

Gaussian kernel function, 252

generalization as search, 30–35

  bias, 32–35

  enumerating concept space, 31–32

generalized distance functions, 241–242

generalized exemplars, 236

general-to-specific search bias, 34

genetic algorithms, 38

genetic algorithm search procedures, 294, 341

GeneticSearch, 424

getOptions(), 482

getting to know your data, 60

global discretization, 297

globalInfo(), 472

global optimization, 205–207

Gosset, William, 184

gradient descent, 227, 229, 230

Grading, 417

graphical models, 283

GraphViewer, 431

gray bar in margin of textbook (optional sections), 30

greedy search, 33

GreedyStepwise, 423–424

growing set, 202

H

Hamming distance, 335

hand-labeled data, 338

hapax legomena, 310

hard instances, 322

hash table, 280

hazard detection system, 23–24

hidden attributes, 272

hidden layer, 226, 231, 232

hidden units, 226, 231, 234

hierarchical clustering, 139

highly-branching attribute, 86

high-performance rule inducers, 188

histogram equalization, 298

historical literary mystery, 358

holdout method, 146, 149–150, 333

homeland defense, 357

HTML, 355

hypermetrope, 13

hyperpipes, 139

Hyperpipes, 414

hyperplane, 124, 125

hyperrectangle, 238–239

hyperspheres, 133

hypertext markup language (HTML), 355

hypothesis testing, 29

I

IB1, 413

IB3, 237

IBk, 413

ID3, 105

Id3, 404

identification code, 86, 102–104

implementation: real-world schemes, 187–283

  Bayesian networks, 271–283

  classification rules, 200–214

  clustering, 254–271

  decision tree, 189–199

  instance-based, 236–243

  linear models, 214–235
