Data Mining: Practical Machine Learning Tools and Techniques, Second Edition




Contents

Foreword  v
Preface  xxiii
Updated and revised content  xxvii
Acknowledgments  xxix

Part I Machine learning tools and techniques  1

1 What’s it all about?  3
1.1 Data mining and machine learning  4
    Describing structural patterns  6
    Machine learning  7
    Data mining  9
1.2 Simple examples: The weather problem and others  9
    The weather problem  10
    Contact lenses: An idealized problem  13
    Irises: A classic numeric dataset  15
    CPU performance: Introducing numeric prediction  16
    Labor negotiations: A more realistic example  17
    Soybean classification: A classic machine learning success  18
1.3 Fielded applications  22
    Decisions involving judgment  22
    Screening images  23
    Load forecasting  24
    Diagnosis  25
    Marketing and sales  26
    Other applications  28


P088407-FM.qxd  4/30/05  10:55 AM  Page vii


1.4 Machine learning and statistics  29
1.5 Generalization as search  30
    Enumerating the concept space  31
    Bias  32
1.6 Data mining and ethics  35
1.7 Further reading  37

2 Input: Concepts, instances, and attributes  41
2.1 What’s a concept?  42
2.2 What’s in an example?  45
2.3 What’s in an attribute?  49
2.4 Preparing the input  52
    Gathering the data together  52
    ARFF format  53
    Sparse data  55
    Attribute types  56
    Missing values  58
    Inaccurate values  59
    Getting to know your data  60
2.5 Further reading  60

3 Output: Knowledge representation  61
3.1 Decision tables  62
3.2 Decision trees  62
3.3 Classification rules  65
3.4 Association rules  69
3.5 Rules with exceptions  70
3.6 Rules involving relations  73
3.7 Trees for numeric prediction  76
3.8 Instance-based representation  76
3.9 Clusters  81
3.10 Further reading  82





4 Algorithms: The basic methods  83
4.1 Inferring rudimentary rules  84
    Missing values and numeric attributes  86
    Discussion  88
4.2 Statistical modeling  88
    Missing values and numeric attributes  92
    Bayesian models for document classification  94
    Discussion  96
4.3 Divide-and-conquer: Constructing decision trees  97
    Calculating information  100
    Highly branching attributes  102
    Discussion  105
4.4 Covering algorithms: Constructing rules  105
    Rules versus trees  107
    A simple covering algorithm  107
    Rules versus decision lists  111
4.5 Mining association rules  112
    Item sets  113
    Association rules  113
    Generating rules efficiently  117
    Discussion  118
4.6 Linear models  119
    Numeric prediction: Linear regression  119
    Linear classification: Logistic regression  121
    Linear classification using the perceptron  124
    Linear classification using Winnow  126
4.7 Instance-based learning  128
    The distance function  128
    Finding nearest neighbors efficiently  129
    Discussion  135
4.8 Clustering  136
    Iterative distance-based clustering  137
    Faster distance calculations  138
    Discussion  139
4.9 Further reading  139
