Data Mining: Practical Machine Learning Tools and Techniques, Second Edition




The rules we have seen so far are classification rules: they predict the classification of the example in terms of whether to play or not. It is equally possible to disregard the classification and just look for any rules that strongly associate different attribute values. These are called association rules. Many association rules can be derived from the weather data in Table 1.2. Some good ones are as follows:


If temperature = cool then humidity = normal
If humidity = normal and windy = false then play = yes
If outlook = sunny and play = no then humidity = high
If windy = false and play = no then outlook = sunny and humidity = high.


All these rules are 100% correct on the given data; they make no false predictions. The first two apply to four examples in the dataset, the third to three examples, and the fourth to two examples. There are many other rules: in fact, nearly 60 association rules can be found that apply to two or more examples of the weather data and are completely correct on this data. If you look for rules that are less than 100% correct, then you will find many more. There are so many because unlike classification rules, association rules can “predict” any of the attributes, not just a specified class, and can even predict more than one thing. For example, the fourth rule predicts both that outlook will be sunny and that humidity will be high.
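The coverage and accuracy figures quoted above can be checked mechanically. The following is a minimal sketch, not the book's own code: Table 1.2 is not reproduced in this section, so the rows below are an assumed transcription of the nominal weather data, and the names `ROWS`, `matches`, and `evaluate` are hypothetical helpers introduced only for illustration.

```python
# Assumed transcription of the nominal weather data (Table 1.2 is not shown here).
FIELDS = ("outlook", "temperature", "humidity", "windy", "play")
ROWS = [
    ("sunny", "hot", "high", "false", "no"),
    ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"),
    ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"),
    ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"),
    ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"),
    ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"),
    ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"),
    ("rainy", "mild", "high", "true", "no"),
]
DATA = [dict(zip(FIELDS, row)) for row in ROWS]


def matches(example, conditions):
    """True if the example has every attribute value listed in conditions."""
    return all(example[attr] == value for attr, value in conditions.items())


def evaluate(antecedent, consequent, data=DATA):
    """Return (examples the rule applies to, examples it predicts correctly)."""
    covered = [ex for ex in data if matches(ex, antecedent)]
    correct = [ex for ex in covered if matches(ex, consequent)]
    return len(covered), len(correct)


# The four association rules listed in the text, as (antecedent, consequent).
RULES = [
    ({"temperature": "cool"}, {"humidity": "normal"}),
    ({"humidity": "normal", "windy": "false"}, {"play": "yes"}),
    ({"outlook": "sunny", "play": "no"}, {"humidity": "high"}),
    ({"windy": "false", "play": "no"}, {"outlook": "sunny", "humidity": "high"}),
]

RESULTS = [evaluate(antecedent, consequent) for antecedent, consequent in RULES]
for (antecedent, consequent), (covered, correct) in zip(RULES, RESULTS):
    print(f"{antecedent} -> {consequent}: applies to {covered}, correct on {correct}")
```

Under that assumption about the data, the rules apply to 4, 4, 3, and 2 examples respectively and are correct on every example they cover. Note that the consequent is handled exactly like the antecedent, as a set of attribute values, which is what lets an association rule predict more than one thing.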

CHAPTER 1 | WHAT’S IT ALL ABOUT?



Table 1.3  Weather data with some numeric attributes.

Outlook    Temperature  Humidity  Windy  Play
sunny      85           85        false  no
sunny      80           90        true   no
overcast   83           86        false  yes
rainy      70           96        false  yes
rainy      68           80        false  yes
rainy      65           70        true   no
overcast   64           65        true   yes
sunny      72           95        false  no
sunny      69           70        false  yes
rainy      75           80        false  yes
sunny      75           70        true   yes
overcast   72           90        true   yes
overcast   81           75        false  yes
rainy      71           91        true   no





Contact lenses: An idealized problem

The contact lens data introduced earlier tells you the kind of contact lens to prescribe, given certain information about a patient. Note that this example is intended for illustration only: it grossly oversimplifies the problem and should certainly not be used for diagnostic purposes!

The first column of Table 1.1 gives the age of the patient. In case you’re wondering, presbyopia is a form of longsightedness that accompanies the onset of middle age. The second gives the spectacle prescription: myope means shortsighted and hypermetrope means longsighted. The third shows whether the patient is astigmatic, and the fourth relates to the rate of tear production, which is important in this context because tears lubricate contact lenses. The final column shows which kind of lenses to prescribe: hard, soft, or none. All possible combinations of the attribute values are represented in the table.

A sample set of rules learned from this information is shown in Figure 1.1. This is a rather large set of rules, but they do correctly classify all the examples. These rules are complete and deterministic: they give a unique prescription for every conceivable example. Generally, this is not the case. Sometimes there are situations in which no rule applies; other times more than one rule may apply, resulting in conflicting recommendations. Sometimes probabilities or weights

1.2 SIMPLE EXAMPLES: THE WEATHER PROBLEM AND OTHERS


If tear production rate = reduced then recommendation = none
If age = young and astigmatic = no and
   tear production rate = normal then recommendation = soft
If age = pre-presbyopic and astigmatic = no and
   tear production rate = normal then recommendation = soft
If age = presbyopic and spectacle prescription = myope and
   astigmatic = no then recommendation = none
If spectacle prescription = hypermetrope and astigmatic = no and
   tear production rate = normal then recommendation = soft
If spectacle prescription = myope and astigmatic = yes and
   tear production rate = normal then recommendation = hard
If age = young and astigmatic = yes and
   tear production rate = normal then recommendation = hard
If age = pre-presbyopic and
   spectacle prescription = hypermetrope and astigmatic = yes
   then recommendation = none
If age = presbyopic and spectacle prescription = hypermetrope
   and astigmatic = yes then recommendation = none



Figure 1.1 Rules for the contact lens data.
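The claim that the rules in Figure 1.1 are complete and deterministic can be tested by enumeration. The sketch below is illustrative only: it encodes the nine rules as condition/recommendation pairs, takes the attribute values from the rules and the prose above, and treats the rules as an unordered set (an assumption; the text does not say how they are meant to be applied). The names `VALUES`, `recommendations`, `UNCOVERED`, and `CONFLICTS` are hypothetical.

```python
from itertools import product

# Attribute values as they appear in the rules and the surrounding text.
VALUES = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle prescription": ["myope", "hypermetrope"],
    "astigmatic": ["no", "yes"],
    "tear production rate": ["reduced", "normal"],
}

# The nine rules of Figure 1.1 as (conditions, recommendation).
RULES = [
    ({"tear production rate": "reduced"}, "none"),
    ({"age": "young", "astigmatic": "no",
      "tear production rate": "normal"}, "soft"),
    ({"age": "pre-presbyopic", "astigmatic": "no",
      "tear production rate": "normal"}, "soft"),
    ({"age": "presbyopic", "spectacle prescription": "myope",
      "astigmatic": "no"}, "none"),
    ({"spectacle prescription": "hypermetrope", "astigmatic": "no",
      "tear production rate": "normal"}, "soft"),
    ({"spectacle prescription": "myope", "astigmatic": "yes",
      "tear production rate": "normal"}, "hard"),
    ({"age": "young", "astigmatic": "yes",
      "tear production rate": "normal"}, "hard"),
    ({"age": "pre-presbyopic", "spectacle prescription": "hypermetrope",
      "astigmatic": "yes"}, "none"),
    ({"age": "presbyopic", "spectacle prescription": "hypermetrope",
      "astigmatic": "yes"}, "none"),
]


def recommendations(patient):
    """Set of distinct recommendations made by every rule that fires."""
    return {
        rec
        for conditions, rec in RULES
        if all(patient[attr] == value for attr, value in conditions.items())
    }


# Evaluate every combination of attribute values (3 * 2 * 2 * 2 = 24).
OUTCOME = {
    combo: recommendations(dict(zip(VALUES, combo)))
    for combo in product(*VALUES.values())
}
UNCOVERED = [combo for combo, recs in OUTCOME.items() if not recs]      # no rule applies
CONFLICTS = [combo for combo, recs in OUTCOME.items() if len(recs) > 1]  # rules disagree

print(len(OUTCOME), "combinations,", len(UNCOVERED), "uncovered,", len(CONFLICTS), "conflicting")
```

Several rules can fire for the same patient (a young, astigmatic myope with normal tear production matches two of them), but whenever that happens they agree, so both lists come out empty. A rule set for which `UNCOVERED` or `CONFLICTS` were non-empty would show the two failure modes the text describes: no rule applying, or conflicting recommendations.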
