HAN
22-ind-673-708-9780123814791
2011/6/1
3:27
Page 684
#12
684
Index
data transformation (Continued)
attribute construction, 112
in back-end tools/utilities, 134
concept hierarchy generation, 112, 120
discretization, 111, 112, 120
normalization, 112, 113–115, 120
smoothing, 112
strategies, 112–113
See also data preprocessing
data types
complex, 166
complex, mining, 585–598
for data mining, 8
data validation, 592–593
data visualization, 56–65, 79, 602–603
complex data and relations, 64–65
geometric projection techniques, 58–60
hierarchical techniques, 63–64
icon-based techniques, 60–63
mining process, 603
mining result, 603, 605
pixel-oriented techniques, 57–58
in science applications, 613
summary, 65
tag clouds, 64, 66
techniques, 39–40
data warehouses, 10–13, 26, 33, 125–185
analytical processing, 153
back-end tools/utilities, 134, 178
basic concepts, 125–135
bottom-up design approach, 133, 151–152
business analysis framework for, 150
business query view, 151
combined design approach, 152
data mart, 132, 142
data mining, 154
data source view, 151
design process, 151
development approach, 133
development tools, 153
dimensions, 10
enterprise, 132
extractors, 151
fact constellation, 141–142
for financial data, 608
framework illustration, 11
front-end client layer, 132
gateways, 131
geographic, 595
implementation, 156–165
information processing, 153
integrated, 126
metadata, 134–135
modeling, 10, 135–150
models, 132–134
multitier, 134
multitiered architecture, 130–132
nonvolatile, 127
OLAP server, 132
operational database systems versus, 128–129
planning and analysis tools, 153
retail industry, 609–610
in science applications, 612
snowflake schema, 140–141
star schema, 139–140
subject-oriented, 126
three-tier architecture, 131, 178
time-variant, 127
tools, 11
top-down design approach, 133, 151
top-down view, 151
update-driven approach, 128
usage for information processing, 153
view, 151
virtual, 133
warehouse database server, 131
database management systems (DBMSs), 9
database queries. See queries
databases, 9
inductive, 601
relational. See relational databases
research, 26
statistical, 148–149
technology evolution, 3
transactional, 13–15
types of, 32
web-based, 4
data/pattern analysis. See data mining
DBSCAN, 471–473
algorithm illustration, 474
core objects, 472
density estimation, 477
density-based cluster, 472
density-connected, 472, 473
density-reachable, 472, 473
directly density-reachable, 472
neighborhood density, 471
See also cluster analysis; density-based methods
DDPMine, 422
decimal scaling, normalization by, 115
decision tree analysis, discretization by, 116
decision tree induction, 330–350, 385
algorithm differences, 336
algorithm illustration, 333
attribute selection measures, 336–344
attribute subset selection, 105
C4.5, 332
CART, 332
CHAID, 343
gain ratio, 340–341
Gini index, 332, 341–343
ID3, 332
incremental versions, 336
information gain, 336–340
multivariate splits, 344
parameters, 332
scalability and, 347–348
splitting criterion, 333
from training tuples, 332–333
tree pruning, 344–347, 385
visual mining for, 348–350
decision trees, 18, 330
branches, 330
illustrated, 331
internal nodes, 330
leaf nodes, 330
pruning, 331, 344–347
root node, 330
rule extraction from, 357–359
deep web, 597
default rules, 357
DENCLUE, 476–479
advantages, 479
clusters, 478
density attractor, 478
density estimation, 476
kernel density estimation, 477–478
kernels, 478
See also cluster analysis; density-based methods
dendrograms, 460
densification power law, 592
density estimation, 476
DENCLUE, 477–478
kernel function, 477–478
density-based methods, 449, 471–479, 491
DBSCAN, 471–473
DENCLUE, 476–479
object division, 449
OPTICS, 473–476
STING as, 480
See also cluster analysis
density-based outlier detection, 564–567
local outlier factor, 566–567
local proximity, 564
local reachability density, 566
relative density, 565
descendant cells, 189
descriptive mining tasks, 15
DIANA (Divisive Analysis), 459, 460
dice operation, 148
differential privacy, 622
dimension tables, 136
dimensional cells, 189
dimensionality reduction, 86, 99–100, 120
dimensionality reduction methods, 510, 519–522, 538
list of, 587
spectral clustering, 520–522
dimension/level
application of, 297
constraints, 294
dimensions, 10, 136
association rule, 281
cardinality of, 159
concept hierarchies and, 142–144
in multidimensional view, 33
ordering of, 210
pattern, 281
ranking, 225
relevance analysis, 175
selection, 225
shared, 204
See also data warehouses
direct discriminative pattern mining, 422
directed acyclic graphs, 394–395
discernibility matrix, 427
discovery-driven exploration, 231–234, 235
discrepancy detection, 91–93
discrete attributes, 44
discrete Fourier transform (DFT), 101, 587
discrete wavelet transform (DWT), 100–102, 587
discretization, 112, 120
by binning, 115
by clustering, 116
by correlation analysis, 117
by decision tree analysis, 116
by histogram analysis, 115–116
techniques, 113
discriminant analysis, 600
discriminant rules, 16
discriminative frequent pattern-based classification, 416, 419–422, 437
basis for, 419
feature generation, 420
feature selection, 420–421
framework, 420–421
learning of classification model, 421