Data Mining. Concepts and Techniques, 3rd Edition


HAN 22-ind-673-708-9780123814791




HAN 22-ind-673-708-9780123814791 2011/6/1 3:27 Page 696 #24

Index

partitioning (Continued)

recursive, 335

tuples, 334

Partitioning Around Medoids (PAM) algorithm, 455–457


partitioning methods, 448, 451–457, 491

centroid-based, 451–454

global optimality, 449

iterative relocation techniques, 448



k-means, 451–454

k-medoids, 454–457

k-modes, 454

object-based, 454–457



See also cluster analysis

path-based similarity, 594

pattern analysis, in recommender systems, 282


pattern clustering, 308–310

pattern constraints, 297–300

pattern discovery, 601

pattern evaluation, 8

pattern evaluation measures, 267–271

all confidence, 268

comparison, 269–270



cosine, 268

Kulczynski, 268

max confidence, 268


null-invariant, 270–271

See also measures

pattern space pruning, 295

pattern-based classification, 282, 318

pattern-based clustering, 282, 516

Pattern-Fusion, 302–307

characteristics, 304

core pattern, 304–305

initial pool, 306

iterative, 306

merging subpatterns, 306

shortcuts identification, 304

See also colossal patterns

pattern-guided mining, 30

patterns

actionable, 22

co-location, 319

colossal, 301–307, 320

combined significance, 312

constraint-based generation, 296–301

context modeling of, 314–315

core, 304–305

distance, 309

evaluation methods, 264–271

expected, 22

expressed, 309

frequent, 17

hidden meaning of, 314

interesting, 21–23, 33

metric space, 306–307

negative, 280, 291–294, 320

negatively correlated, 292, 293

rare, 280, 291–294, 320

redundancy between, 312

relative significance, 312

representative, 309

search space, 303

strongly negatively correlated, 292

structural, 282

type specification, 15–23

unexpected, 22

See also frequent patterns

pattern-trees, 264

Pearson’s correlation coefficient, 222

percentiles, 48

perception-based classification (PBC), 348

illustrated, 349

as interactive visual approach, 607

pixel-oriented approach, 348–349

split screen, 349

tree comparison, 350

phylogenetic trees, 590

pivot (rotate) operation, 148

pixel-oriented visualization, 57

planning and analysis tools, 153

point queries, 216, 217, 220

pool-based approach, 433

positive correlation, 55, 56

positive tuples, 364

positively skewed data, 47

possibility theory, 428

posterior probability, 351

postpruning, 344–345, 346

power law distribution, 592

precision measure, 368–369

predicate sets

frequent, 288–289



k, 289

predicates

repeated, 288

variables, 295

prediction, 19

classification, 328

link, 593–594

loan payment, 608–609

with naive Bayesian classification, 353–355

numeric, 328, 385





prediction cubes, 227–230, 235

example, 228–229

Probability-Based Ensemble, 229–230

predictive analysis, 18–19

predictive mining tasks, 15

predictive statistics, 24

predictors, 328

prepruning, 344, 346

prime relations

contrasting classes, 175, 177

deriving, 174

target classes, 175, 177

principal components analysis (PCA), 100, 102–103

application of, 103

correlation-based clustering with, 511

illustrated, 103

in lower-dimensional space extraction, 578

procedure, 102–103

prior probability, 351

privacy-preserving data mining, 33, 621, 626

distributed, 622



k-anonymity method, 621–622

l-diversity method, 622

as mining trend, 624–625

randomization methods, 621

results effectiveness, downgrading, 622

probabilistic clusters, 502–503

probabilistic hierarchical clustering, 467–470

agglomerative clustering framework, 467, 469


algorithm, 470

drawbacks of using, 469–470

generative model, 467–469

interpretability, 469

understanding, 469

See also hierarchical methods

probabilistic model-based clustering, 497–508, 538

expectation-maximization algorithm, 505–508

fuzzy clusters and, 499–501

product reviews example, 498

user search intent example, 498



See also cluster analysis

probability

estimation techniques, 355

posterior, 351

prior, 351

probability and statistical theory, 601

Probability-Based Ensemble (PBE), 229–230

PROCLUS, 511

profiles, 614

proximity measures, 67

for binary attributes, 70–72

for nominal attributes, 68–70

for ordinal attributes, 74–75

proximity-based methods, 552, 560–567, 581

density-based, 564–567

distance-based, 561–562

effectiveness, 552

example, 552

grid-based, 562–564

types of, 552, 560



See also outlier detection

pruning


cost complexity algorithm, 345

data space, 300–301

decision trees, 331, 344–347

in k-nearest neighbor classification, 425

network, 406–407

pattern space, 295, 297–300

pessimistic, 345

postpruning, 344–345, 346

prepruning, 344, 346

rule, 363

search space, 263, 301

sets, 345

shared dimensions, 205

sub-itemset, 263

pyramid algorithm, 101

Q

quality control, 600

quantile plots, 51–52

quantile-quantile plots, 52

example, 53–54

illustrated, 53



See also graphic displays

quantitative association rules, 281, 283, 288, 320

clustering-based mining, 290–291



data cube-based mining, 289–290

exceptional behavior disclosure, 291

mining, 289

quartiles, 48

first, 49

third, 49

queries, 10

intercuboid expansion, 223–225

intracuboid expansion, 221–223

language, 10

OLAP, 129, 130

point, 216, 217, 220

processing, 163–164, 218–227

range, 220

relational operations, 10


