8 Moving on: Extensions and applications 345
8.1 Learning from massive datasets 346
8.2 Incorporating domain knowledge 349
8.3 Text and Web mining 351
8.4 Adversarial situations 356
8.5 Ubiquitous data mining 358
8.6 Further reading 361
Part II The Weka machine learning workbench 363
9 Introduction to Weka 365
9.1 What’s in Weka? 366
9.2 How do you use it? 367
9.3 What else can you do? 368
9.4 How do you get it? 368
10 The Explorer 369
10.1 Getting started 369
Preparing the data 370
Loading the data into the Explorer 370
Building a decision tree 373
Examining the output 373
Doing it again 377
Working with models 377
When things go wrong 378
10.2 Exploring the Explorer 380
Loading and filtering files 380
Training and testing learning schemes 384
Do it yourself: The User Classifier 388
Using a metalearner 389
Clustering and association rules 391
Attribute selection 392
Visualization 393
10.3 Filtering algorithms 393
Unsupervised attribute filters 395
Unsupervised instance filters 400
Supervised filters 401
CONTENTS
xiii
P088407-FM.qxd 4/30/05 10:55 AM Page xiii
10.4 Learning algorithms 403
Bayesian classifiers 403
Trees 406
Rules 408
Functions 409
Lazy classifiers 413
Miscellaneous classifiers 414
10.5 Metalearning algorithms 414
Bagging and randomization 414
Boosting 416
Combining classifiers 417
Cost-sensitive learning 417
Optimizing performance 417
Retargeting classifiers for different tasks 418
10.6 Clustering algorithms 418
10.7 Association-rule learners 419
10.8 Attribute selection 420
Attribute subset evaluators 422
Single-attribute evaluators 422
Search methods 423
11 The Knowledge Flow interface 427
11.1 Getting started 427
11.2 The Knowledge Flow components 430
11.3 Configuring and connecting the components 431
11.4 Incremental learning 433
12 The Experimenter 437
12.1 Getting started 438
Running an experiment 439
Analyzing the results 440
12.2 Simple setup 441
12.3 Advanced setup 442
12.4 The Analyze panel 443
12.5 Distributing processing over several machines 445
13 The command-line interface 449
13.1 Getting started 449
13.2 The structure of Weka 450
Classes, instances, and packages 450
The weka.core package 451
The weka.classifiers package 453
Other packages 455
Javadoc indices 456
13.3 Command-line options 456
Generic options 456
Scheme-specific options 458
14 Embedded machine learning 461
14.1 A simple data mining application 461
14.2 Going through the code 462
main() 462
MessageClassifier() 462
updateData() 468
classifyMessage() 468
15 Writing new learning schemes 471
15.1 An example classifier 471
buildClassifier() 472
makeTree() 472
computeInfoGain() 480
classifyInstance() 480
main() 481
15.2 Conventions for implementing classifiers 483
References 485
Index 505
About the authors 525