whether
or not a fault existed, but to diagnose the kind of fault, given that one
was there. Thus there was no need to include fault-free cases in the training set.
The measured attributes were rather low level and had to be augmented by inter-
mediate concepts, that is, functions of basic attributes, which were defined in
consultation with the expert and embodied some causal domain knowledge.
The derived attributes were run through an induction algorithm to produce a
set of diagnostic rules. Initially, the expert was not satisfied with the rules
because he could not relate them to his own knowledge and experience. For
him, mere statistical evidence was not, by itself, an adequate explanation.
Further background knowledge had to be used before satisfactory rules were
generated. Although the resulting rules were quite complex, the expert liked
them because he could justify them in light of his mechanical knowledge. He
was pleased that a third of the rules coincided with ones he used himself and
was delighted to gain new insight from some of the others.
Performance tests indicated that the learned rules were slightly superior to
the handcrafted ones that had previously been elicited from the expert, and this
result was confirmed by subsequent use in the chemical factory. It is interesting
to note, however, that the system was put into use not because of its good per-
formance but because the domain expert approved of the rules that had been
learned.
Marketing and sales
Some of the most active applications of data mining have been in the area of
marketing and sales. These are domains in which companies possess massive
volumes of precisely recorded data, data which—it has only recently been real-
ized—is potentially extremely valuable. In these applications, predictions them-
selves are the chief interest: the structure of how decisions are made is often
completely irrelevant.
We have already mentioned the problem of fickle customer loyalty and the
challenge of detecting customers who are likely to defect so that they can be
wooed back into the fold by giving them special treatment. Banks were early
adopters of data mining technology because of their successes in the use of
machine learning for credit assessment. Data mining is now being used to
reduce customer attrition by detecting changes in individual banking patterns
that may herald a change of bank or even life changes—such as a move to
another city—that could result in a different bank being chosen. It may reveal,
for example, a group of customers with above-average attrition rate who do
most of their banking by phone after hours when telephone response is slow.
Data mining may determine groups for whom new services are appropriate,
such as a cluster of profitable, reliable customers who rarely get cash advances
from their credit card except in November and December, when they are pre-
pared to pay exorbitant interest rates to see them through the holiday season. In
another domain, cellular phone companies fight churn by detecting patterns of
behavior that could benefit from new services, and then advertise such services
to retain their customer base. Incentives provided specifically to retain existing
customers can be expensive, and successful data mining allows them to be precisely
targeted to those customers for whom they are likely to yield maximum benefit.
Market basket analysis is the use of association techniques to find groups of
items that tend to occur together in transactions, typically supermarket check-
out data. For many retailers this is the only source of sales information that is
available for data mining. For example, automated analysis of checkout data
may uncover the fact that customers who buy beer also buy chips, a discovery
that could be significant from the supermarket operator’s point of view
(although rather an obvious one that probably does not need a data mining
exercise to discover). Or it may come up with the fact that on Thursdays, cus-
tomers often purchase diapers and beer together, an initially surprising result
that, on reflection, makes some sense as young parents stock up for a weekend
at home. Such information could be used for many purposes: planning store
layouts, limiting special discounts to just one of a set of items that tend to be
purchased together, offering coupons for a matching product when one of them
is sold alone, and so on. There is enormous added value in being able to identify
individual customers’ sales histories. In fact, this value is leading to a proliferation
of discount cards or “loyalty” cards that allow retailers to identify
individual customers whenever they make a purchase; the personal data that
results will be far more valuable than the cash value of the discount. Identifica-
tion of individual customers not only allows historical analysis of purchasing
patterns but also permits precisely targeted special offers to be mailed out to
prospective customers.
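As a rough illustration of finding items that tend to occur together, the sketch below counts item pairs in a handful of transactions and reports simple rules with their support (the fraction of baskets containing both items) and confidence (the fraction of baskets with the first item that also contain the second). The baskets, the function name `pair_rules`, and the thresholds are made up for the example. Real association-rule miners handle larger item sets and far more data than this pairwise count does.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))          # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Highest confidence first; ties broken alphabetically.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))

# Hypothetical checkout data, invented for illustration.
baskets = [
    ["beer", "chips", "diapers"],
    ["beer", "diapers"],
    ["beer", "chips"],
    ["milk", "bread"],
    ["diapers", "milk"],
]

rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

With identified customers rather than anonymous baskets, the same counting could be done per customer over time, which is what makes loyalty-card data so much richer than checkout data alone.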
This brings us to direct marketing, another popular domain for data mining.
Promotional offers are expensive and have an extremely low—but highly
profitable—response rate. Any technique that allows a promotional mailout to
be more tightly focused, achieving the same or nearly the same response from
a much smaller sample, is valuable. Commercially available databases contain-
ing demographic information based on ZIP codes that characterize the associ-
ated neighborhood can be correlated with information on existing customers
to find a socioeconomic model that predicts what kind of people will turn out
to be actual customers. This model can then be used on information gained in
response to an initial mailout, where people send back a response card or call
an 800 number for more information, to predict likely future customers. Direct
mail companies have the advantage over shopping-mall retailers of having com-
plete purchasing histories for each individual customer and can use data mining
to determine those likely to respond to special offers. Targeted campaigns are
cheaper than mass-marketed campaigns because companies save money by