Book on Parallel Data Mining


Alex A. Freitas & Simon H. Lavington.
Mining Very Large Databases
with Parallel Processing.
Kluwer Academic Publishers, 1998.
224 pp. ISBN 0-7923-8048-7


TABLE OF CONTENTS

PREFACE

ACKNOWLEDGMENTS

INTRODUCTION
The Motivation for Data Mining and Knowledge Discovery
The Inter-disciplinary Nature of Knowledge Discovery in Databases
The Challenge of Efficient Knowledge Discovery in Large Databases and Data Warehouses
Organization of the Book

PART I - KNOWLEDGE DISCOVERY AND DATA MINING

1 KNOWLEDGE DISCOVERY TASKS
1.1 Discovery of Association Rules
1.2 Classification
1.3 Other KDD Tasks

2 KNOWLEDGE DISCOVERY PARADIGMS
2.1 Rule Induction (RI)
2.2 Instance-Based Learning (IBL)
2.3 Neural Networks (NN)
2.4 Genetic Algorithms (GA)
2.5 On-Line Analytical Processing (OLAP)
2.6 Focus on Rule Induction

3 THE KNOWLEDGE DISCOVERY PROCESS
3.1 An Overview of the Knowledge Discovery Process
3.2 Data Warehouse (DW)
3.3 Attribute Selection
3.4 Discretization
3.5 Rule-Set Refinement

4 DATA MINING
4.1 Decision-Tree Building
4.2 Overfitting
4.3 Data-Mining-Algorithm Bias
4.4 Improved Representation Languages
4.5 Integrated Data Mining Architectures

5 DATA MINING TOOLS
5.1 Clementine
5.2 Darwin
5.3 MineSet
5.4 Intelligent Miner
5.5 Decision-Tree-Building Tools

PART II - PARALLEL DATABASE SYSTEMS

6 BASIC CONCEPTS ON PARALLEL PROCESSING
6.1 Temporal and Spatial Parallelism
6.2 Granularity, Level and Degree of Parallelism
6.3 Shared and Distributed Memory
6.4 Evaluating the Performance of a Parallel System
6.5 Communication Overhead
6.6 Load Balancing
6.7 Approaches for Exploiting Parallelism

7 DATA PARALLELISM, CONTROL PARALLELISM AND RELATED ISSUES
7.1 Data Parallelism and Control Parallelism
7.2 Ease of Use and Automatic Parallelization
7.3 Machine-Architecture Independence
7.4 Scalability
7.5 Data Partitioning
7.6 Data Placement (Declustering)

8 PARALLEL DATABASE SERVERS
8.1 Architectures of Parallel Database Servers
8.2 From the Teradata DBC 1012 to the NCR WorldMark 5100
8.3 ICL Goldrush Running Oracle Parallel Server
8.4 IBM SP2 Running DB2 Parallel Edition (DB2-PE)
8.5 Monet

PART III - PARALLEL DATA MINING

9 APPROACHES TO SPEED UP DATA MINING
9.1 Overview of Approaches to Speed up Data Mining
9.2 Discretization
9.3 Attribute Selection
9.4 Sampling and Related Approaches
9.5 Fast Algorithms
9.6 Distributed Data Mining
9.7 Parallel Data Mining
9.8 Discussion

10 PARALLEL DATA MINING WITHOUT DBMS FACILITIES
10.1 Parallel Rule Induction
10.2 Parallel Decision-Tree Building
10.3 Parallel Instance-Based Learning
10.4 Parallel Genetic Algorithms
10.5 Parallel Neural Networks
10.6 Discussion

11 PARALLEL DATA MINING WITH DATABASE FACILITIES
11.1 An Overview of Integrated Data Mining/Data Warehouse Frameworks
11.2 The Case for Integrating Data Mining and the Data Warehouse
11.3 Server-Based KDD Systems
11.4 Hybrid Client/Server-Based KDD Systems
11.5 Generic, Set-Oriented Primitives for the Hybrid Client/Server-Based KDD Framework
11.6 A Generic, Set-Oriented Primitive for Candidate-Rule (CR) Evaluation in Rule Induction
11.7 A Generic, Set-Oriented Primitive for Computing Distance Metrics in Instance-Based Learning
11.8 Parallel Data Mining with Specialized-Hardware Parallel Database Servers

12 SUMMARY AND SOME OPEN PROBLEMS
12.1 Data-Parallel vs. Control-Parallel Data Mining
12.2 Client/Server Frameworks for Parallel Data Mining
12.3 Open Problems

REFERENCES

INDEX

More information:
Kluwer Academic Publishers
101 Philip Drive, Norwell, MA 02061
Phone: (781) 871-6600, Fax: (781) 871-6528
E-mail: kluwer@wkap.com, URL: http://www.wkap.nl


Last modified Friday July 19 15:20:47 BST 2002
http://www.cs.ukc.ac.uk/people/staff/aaf/book-kluwer-ukc.html