Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

1.2 Simple examples: The weather problem and others

These meanings have some shortcomings when it comes to talking about computers. For the first two, it is virtually impossible to test whether learning has been achieved or not. How do you know whether a machine has got knowledge of something? You probably can’t just ask it questions; even if you could, you wouldn’t be testing its ability to learn but would be testing its ability to answer questions. How do you know whether it has become aware of something? The whole question of whether computers can be aware, or conscious, is a burning philosophic issue. As for the last three meanings, although we can see what they denote in human terms, merely “committing to memory” and “receiving instruction” seem to fall far short of what we might mean by machine learning. They are too passive, and we know that computers find these tasks trivial. Instead, we are interested in improvements in performance, or at least in the potential for performance, in new situations. You can “commit something to memory” or “be informed of something” by rote learning without being able to apply the new knowledge to new situations. You can receive instruction without benefiting from it at all.

Earlier we defined data mining operationally as the process of discovering patterns, automatically or semiautomatically, in large quantities of data, and the patterns must be useful. An operational definition can be formulated in the same way for learning:

Things learn when they change their behavior in a way that makes them perform better in the future.

This ties learning to performance rather than knowledge. You can test learning by observing the behavior and comparing it with past behavior. This is a much more objective kind of definition and appears to be far more satisfactory.
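To make the operational definition concrete, here is a minimal Python sketch of testing learning by comparing behavior before and after experience. Everything in it is an assumption made for illustration: the `FrequencyLearner` class, the `accuracy` helper, and the small weather-style examples are made up and are not taken from the book's datasets or software.

```python
from collections import Counter, defaultdict


class FrequencyLearner:
    """Hypothetical learner: predicts a label from one attribute value."""

    def __init__(self, default="yes"):
        self.default = default               # answer given before any experience
        self.counts = defaultdict(Counter)   # attribute value -> label counts

    def observe(self, value, label):
        """Change future behavior by recording one labelled example."""
        self.counts[value][label] += 1

    def predict(self, value):
        """Return the label seen most often with this value, else the default."""
        seen = self.counts.get(value)
        if not seen:
            return self.default
        return seen.most_common(1)[0][0]


def accuracy(learner, examples):
    """Fraction of examples whose label the learner predicts correctly."""
    hits = sum(learner.predict(value) == label for value, label in examples)
    return hits / len(examples)


# Made-up (outlook, play) pairs, for illustration only.
experience = [
    ("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
    ("rainy", "yes"), ("rainy", "yes"), ("rainy", "no"),
]
new_situations = [
    ("sunny", "no"), ("overcast", "yes"), ("rainy", "yes"), ("sunny", "no"),
]

learner = FrequencyLearner()
before = accuracy(learner, new_situations)   # past behavior
for value, label in experience:
    learner.observe(value, label)
after = accuracy(learner, new_situations)    # behavior after experience

print(f"accuracy before: {before:.2f}, after: {after:.2f}")
# prints "accuracy before: 0.50, after: 1.00"
```

The comparison is made on situations kept apart from the experience, in keeping with the earlier emphasis on performance in new situations.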

But there’s still a problem. Learning is a rather slippery concept. Lots of things change their behavior in ways that make them perform better in the future, yet we wouldn’t want to say that they have actually learned. A good example is a comfortable slipper. Has it learned the shape of your foot? It has certainly changed its behavior to make it perform better as a slipper! Yet we would hardly want to call this learning. In everyday language, we often use the word “training” to denote a mindless kind of learning. We train animals and even plants, although it would be stretching the word a bit to talk of training objects such as slippers that are not in any sense alive. But learning is different. Learning implies thinking. Learning implies purpose. Something that learns has to do so intentionally. That is why we wouldn’t say that a vine has learned to grow round a trellis in a vineyard; we’d say it has been trained. Learning without purpose is merely training. Or, more to the point, in learning the purpose is the learner’s, whereas in training it is the teacher’s.

Thus on closer examination the second definition of learning, in operational, performance-oriented terms, has its own problems when it comes to talking about computers. To decide whether something has actually learned, you need to see whether it intended to or whether there was any purpose involved. That makes the concept moot when applied to machines because whether artifacts can behave purposefully is unclear. Philosophic discussions of what is really meant by “learning,” like discussions of what is really meant by “intention” or “purpose,” are fraught with difficulty. Even courts of law find intention hard to grapple with.

Data mining

Fortunately, the kinds of learning techniques explained in this book do not present these conceptual problems: they are called machine learning without really presupposing any particular philosophic stance about what learning actually is. Data mining is a practical topic and involves learning in a practical, not a theoretical, sense. We are interested in techniques for finding and describing structural patterns in data as a tool for helping to explain that data and make predictions from it. The data will take the form of a set of examples: examples of customers who have switched loyalties, for instance, or situations in which certain kinds of contact lenses can be prescribed. The output takes the form of predictions about new examples: a prediction of whether a particular customer will switch or a prediction of what kind of lens will be prescribed under given circumstances. But because this book is about finding and describing patterns in data, the output may also include an actual description of a structure that can be used to classify unknown examples and to explain the decision. As well as performance, it is helpful to supply an explicit representation of the knowledge that is acquired. In essence, this reflects both definitions of learning considered previously: the acquisition of knowledge and the ability to use it.
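As a rough illustration of output that includes both predictions and an explicit description, the following Python sketch learns rules from a single attribute and prints them in readable form. It is only a sketch under stated assumptions: the customer attributes (`contract`, `support_calls`), the six examples, and the function names are hypothetical, and the single-attribute method is a stand-in chosen for brevity, not a technique this section prescribes.

```python
from collections import Counter, defaultdict


def learn_single_attribute_rules(examples, attributes, target):
    """Return (attribute, rules, errors) for the attribute whose
    value -> majority-class rules make the fewest training errors."""
    best = None
    for attr in attributes:
        by_value = defaultdict(Counter)
        for ex in examples:
            by_value[ex[attr]][ex[target]] += 1
        # One rule per attribute value: predict the most frequent class.
        rules = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(sum(c.values()) - c[rules[v]] for v, c in by_value.items())
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best


def describe(attr, rules, target):
    """Explicit, human-readable representation of what was learned."""
    return [f"if {attr} = {value} then {target} = {label}"
            for value, label in sorted(rules.items())]


def predict(attr, rules, example, default=None):
    """Prediction for a new example, using the learned rules."""
    return rules.get(example[attr], default)


# Hypothetical customer records, invented for this illustration.
customers = [
    {"contract": "monthly", "support_calls": "many", "switched": "yes"},
    {"contract": "monthly", "support_calls": "few",  "switched": "yes"},
    {"contract": "monthly", "support_calls": "many", "switched": "yes"},
    {"contract": "annual",  "support_calls": "few",  "switched": "no"},
    {"contract": "annual",  "support_calls": "many", "switched": "no"},
    {"contract": "annual",  "support_calls": "few",  "switched": "no"},
]

attr, rules, errors = learn_single_attribute_rules(
    customers, ["contract", "support_calls"], "switched")
description = describe(attr, rules, "switched")
new_customer = {"contract": "monthly", "support_calls": "few"}
prediction = predict(attr, rules, new_customer)

for line in description:
    print(line)
print("prediction for new customer:", prediction)
```

The rule list plays the role of the acquired knowledge, and `predict` plays the role of using it; the same learned structure serves both purposes.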

Many learning techniques look for structural descriptions of what is learned, descriptions that can become fairly complex and are typically expressed as sets of rules such as the ones described previously or the decision trees described later in this chapter. Because they can be understood by people, these descriptions serve to explain what has been learned and explain the basis for new predictions. Experience shows that in many applications of machine learning to data mining, the explicit knowledge structures that are acquired, the structural descriptions, are at least as important, and often very much more important, than the ability to perform well on new examples. People frequently use data mining to gain knowledge, not just predictions. Gaining knowledge from data certainly sounds like a good idea if you can do it. To find out how, read on!

1.2 Simple examples: The weather problem and others

We use a lot of examples in this book, which seems particularly appropriate considering that the book is all about learning from examples! There are several
