Data Mining: Practical Machine Learning Tools and Techniques, Second Edition




5 1 2

I N D E X

Experimenter, 437–447

advanced panel, 443–445



Analyze panel, 443–445

analyzing the results, 440–441

distributing processing over several machines, 445–447

running an experiment, 439–440

simple setup, 441–442

starting up, 438–441

subexperiments, 447

Explorer, 369–425

ARFF, 370, 371, 380–382



Associate panel, 392

association-rule learners, 419–420

attribute evaluation methods, 421, 422–423


attribute selection, 392–393, 420–425

boosting, 416

classifier errors, 379

Classify panel, 384

clustering algorithms, 418–419



Cluster panel, 391–392

CSV format, 370, 371

error log, 378

file format converters, 380–382

filtering algorithms, 393–403

filters, 382–384

J4.8, 373–377

learning algorithms, 403–414. See also learning algorithms

metalearning algorithms, 414–418

models, 377–378

panels, 380



Preprocess panel, 380

search methods, 421, 423–425



Select attributes panel, 392–393

starting up, 369–379

supervised filters, 401–403

training/testing learning schemes, 384–387

unsupervised attribute filters, 395–400



unsupervised instance filters, 400–401

User Classifier, 388–391



Visualize panel, 393

extraction problems, 353, 354



F

Fahrenheit, Daniel, 51

fallback heuristic, 239

false negative (FN), 162

false positive (FP), 162

false positive rate, 163



False positive rate, 378

Familiar, 360

family tree, 45

tabular representation of, 46



FarthestFirst, 419

features. See attributes

feature selection, 341. See also attribute selection

feedforward networks, 233

fielded applications, 22

continuous monitoring, 28–29

customer support and service, 28

cybersecurity, 29

diagnosis, 25–26

ecological applications, 23, 28

electricity supply, 24–25

hazard detection system, 23–24

load forecasting, 24–25

loan application, 22–23

manufacturing processes, 28

marketing and sales, 26–28

oil slick detection, 23

preventive maintenance of electromechanical devices, 25–26

scientific applications, 28

file format converters, 380–382

file mining, 49

filter, 290

filter in Weka, 382–384

FilteredClassifier, 401, 414

filtering algorithms in Weka, 393–403

sparse instances, 401

supervised filters, 401–403

unsupervised attribute filters, 395–400

unsupervised instance filters, 400–401

filtering approaches, 315

filters menu, 383

finite mixture, 262, 263

FirstOrder, 399

P088407-INDEX.qxd  4/30/05  11:25 AM  Page 512






Fisher, R. A., 15

flat file, 45

F-measure, 172

FN (false negatives), 162

folds, 150

forward pruning, 34, 192

forward selection, 292, 294

forward stagewise additive modeling, 325–327

Fourier analysis, 25

FP (false positives), 162

freedom, degrees of, 93, 155

functional dependencies, 350

functions in Weka, 404–405, 409–410

G

gain ratio, 104



GainRatioAttributeEval, 423

gambling, 160

garbage in, garbage out. See cost of errors; data cleaning; error rate

Gaussian-distribution assumption, 92

Gaussian kernel function, 252

generalization as search, 30–35

bias, 32–35

enumerating concept space, 31–32

generalized distance functions, 241–242

generalized exemplars, 236

general-to-specific search bias, 34

genetic algorithms, 38

genetic algorithm search procedures, 294, 341

GeneticSearch, 424

getOptions(), 482

getting to know your data, 60

global discretization, 297

globalInfo(), 472

global optimization, 205–207

Gosset, William, 184

gradient descent, 227, 229, 230



Grading, 417

graphical models, 283



GraphViewer, 431

gray bar in margin of textbook (optional sections), 30

greedy search, 33



GreedyStepwise, 423–424

growing set, 202



H

Hamming distance, 335

hand-labeled data, 338

hapax legomena, 310

hard instances, 322

hash table, 280

hazard detection system, 23–24

hidden attributes, 272

hidden layer, 226, 231, 232

hidden units, 226, 231, 234

hierarchical clustering, 139

highly-branching attribute, 86

high-performance rule inducers, 188

histogram equalization, 298

historical literary mystery, 358

holdout method, 146, 149–150, 333

homeland defense, 357

HTML, 355

hypermetrope, 13

hyperpipes, 139



Hyperpipes, 414

hyperplane, 124, 125

hyperrectangle, 238–239

hyperspheres, 133

hypertext markup language (HTML), 355

hypothesis testing, 29



I

IB1, 413

IB3, 237


IBk, 413

ID3, 105


Id3, 404

identification code, 86, 102–104

implementation—real-world schemes,

187–283


Bayesian networks, 271–283

classification rules, 200–214

clustering, 254–271

decision tree, 189–199

instance-based, 236–243

linear models, 214–235



