Data Mining: Practical Machine Learning Tools and Techniques, Second Edition





INDEX

anomaly detection systems, 357

antecedent, of rule, 65



AODE, 405

Apriori, 419

Apriori method, 141

area under the curve (AUC), 173

ARFF format, 53–55

    converting files to, 380–382

    Weka, 370, 371



ARFFLoader, 381, 427

arithmetic underflow, 276

assembling the data, 52–53

assessing performance of learning scheme, 286

assignment of key phrases, 353

Associate panel, 392

association learning, 43

association rules, 69–70, 112–119

    binary attributes, 119

    generating rules efficiently, 117–118

    item sets, 113, 114–115

    Weka, 419–420

association-rule learners in Weka, 419–420

attackers, 357

Attribute, 451

attribute(), 480

attribute discretization. See discretizing numeric attributes

attribute-efficient, 128

attribute evaluation methods in Weka, 421, 422–423


attribute filters in Weka, 394, 395–400, 402–403

attributeIndices, 382

attribute noise, 313

attribute-relation file format. See ARFF format

attributes, 49–52

    adding irrelevant, 288

    Boolean, 51

    class, 53

    as columns in tables, 49

    combinations of, 65

    continuous, 49

    discrete, 51

    enumerated, 51

    highly branching, 86

    identification code, 86

    independent, 267

    integer-valued, 49

    nominal, 49

    non-numeric, 17

    numeric, 49

    ordinal, 51

    relevant, 289

    rogue, 59

    selecting, 288

    subsets of values in, 80

    types in ARFF format, 56

    weighting, 237

    See also orderings

AttributeSelectedClassifier, 417

attribute selection, 286–287, 288–296

    attribute evaluation methods in Weka, 421, 422–423

    backward elimination, 292, 294

    beam search, 293

    best-first search, 293

    forward selection, 292, 294

    race search, 295

    schemata search, 295

    scheme-independent selection, 290–292

    scheme-specific selection, 294–296

    searching the attribute space, 292–294

    search methods in Weka, 421, 423–425

    Weka, 392–393, 420–425

AttributeSelection, 403

attribute subset evaluators in Weka, 421, 422



AttributeSummarizer, 431

attribute transformations, 287, 305–311

    principal components analysis, 306–309

    random projections, 309

    text to attribute vectors, 309–311

    time series, 311

attribute weighting method, 237–238

AUC (area under the curve), 173

audit logs, 357

authorship ascription, 353

AutoClass, 269–270, 271

automatic data cleansing, 287, 312–315

    anomalies, 314–315

    improving decision trees, 312–313

    robust regression, 313–314

P088407-INDEX.qxd  4/30/05  11:25 AM  Page 506






automatic filtering, 315

averaging over subnetworks, 283

axis-parallel class boundaries, 242

B

background knowledge, 348

backpropagation, 227–233

backtracking, 209

backward elimination, 292, 294

backward pruning, 34, 192

bagging, 316–319

Bagging, 414–415

bagging with costs, 319–320

bag of words, 95

balanced Winnow, 128

ball tree, 133–135

basic methods. See algorithms-basic methods

batch learning, 232

Bayes, Thomas, 141

Bayesian classifier. See Naïve Bayes

Bayesian clustering, 268–270

Bayesian multinet, 279–280

Bayesian network, 141, 271–283

    AD tree, 280–283

    Bayesian multinet, 279–280

    caveats, 276, 277

    counting, 280

    K2, 278

    learning, 276–283

    making predictions, 272–276

    Markov blanket, 278–279

    multiplication, 275

    Naïve Bayes classifier, 278

    network scoring, 277

    simplifying assumptions, 272

    structure learning by conditional independence tests, 280

    TAN (Tree Augmented Naïve Bayes), 279

    Weka, 403–406

Bayesian network learning algorithms, 277–283

Bayesian option trees, 328–331, 343

Bayesians, 141

Bayesian scoring metrics, 277–280, 283

Bayes information, 271

BayesNet, 405

Bayes’s rule, 90, 181

beam search, 34, 293

beam width, 34

beer purchases, 27

Ben Ish Chai, 358

Bernoulli process, 147

BestFirst, 423

best-first search, 293

best-matching node, 257

bias, 32

    defined, 318

    language, 32–33

    multilayer perceptrons, 225, 226

    overfitting-avoidance, 34–35

    perceptron learning rule, 124

    search, 33–34

    what is it, 317

bias-variance decomposition, 317, 318

big data (massive datasets), 346–349

binning

    equal-frequency, 298

    equal-interval, 298

    equal-width, 342

binomial coefficient, 218

bits, 102

boolean, 51, 68

boosting, 321–325, 347

boosting in Weka, 416

bootstrap aggregating, 318

bootstrap estimation, 152–153

British Petroleum, 28

buildClassifier(), 453, 472, 482

C

C4.5, 105, 198–199

C5.0, 199

calm computing, 359, 362

capitalization conventions, 310

CAPPS (Computer Assisted Passenger Pre-Screening System), 357

CART (Classification And Regression Tree), 29, 38, 199, 253

categorical attributes, 49. See also nominal attributes

category utility, 260–262



