Computer Science at Kent

Alex A. Freitas & Simon H. Lavington.

Mining Very Large Databases

with Parallel Processing.

Kluwer Academic Publishers, 1998.

224 pp. ISBN 0-7923-8048-7

TABLE OF CONTENTS

PREFACE

ACKNOWLEDGMENTS

INTRODUCTION

The Motivation for Data Mining and Knowledge Discovery

The Inter-disciplinary Nature of Knowledge Discovery in Databases

The Challenge of Efficient Knowledge Discovery in Large Databases and Data Warehouses

Organization of the Book

PART I - KNOWLEDGE DISCOVERY AND DATA MINING

1 KNOWLEDGE DISCOVERY TASKS

1.1 Discovery of Association Rules

1.2 Classification

1.3 Other KDD Tasks

2 KNOWLEDGE DISCOVERY PARADIGMS

2.1 Rule Induction (RI)

2.2 Instance-Based Learning (IBL)

2.3 Neural Networks (NN)

2.4 Genetic Algorithms (GA)

2.5 On-Line Analytical Processing (OLAP)

2.6 Focus on Rule Induction

3 THE KNOWLEDGE DISCOVERY PROCESS

3.1 An Overview of the Knowledge Discovery Process

3.2 Data Warehouse (DW)

3.3 Attribute Selection

3.4 Discretization

3.5 Rule-Set Refinement

4 DATA MINING

4.1 Decision-Tree Building

4.2 Overfitting

4.3 Data-Mining-Algorithm Bias

4.4 Improved Representation Languages

4.5 Integrated Data Mining Architectures

5 DATA MINING TOOLS

5.1 Clementine

5.2 Darwin

5.3 MineSet

5.4 Intelligent Miner

5.5 Decision-Tree-Building Tools

PART II - PARALLEL DATABASE SYSTEMS

6 BASIC CONCEPTS ON PARALLEL PROCESSING

6.1 Temporal and Spatial Parallelism

6.2 Granularity, Level and Degree of Parallelism

6.3 Shared and Distributed Memory

6.4 Evaluating the Performance of a Parallel System

6.5 Communication Overhead

6.6 Load Balancing

6.7 Approaches for Exploiting Parallelism

7 DATA PARALLELISM, CONTROL PARALLELISM AND RELATED ISSUES

7.1 Data Parallelism and Control Parallelism

7.2 Ease of Use and Automatic Parallelization

7.3 Machine-Architecture Independence

7.4 Scalability

7.5 Data Partitioning

7.6 Data Placement (Declustering)

8 PARALLEL DATABASE SERVERS

8.1 Architectures of Parallel Database Servers

8.2 From the Teradata DBC 1012 to the NCR WorldMark 5100

8.3 ICL Goldrush Running Oracle Parallel Server

8.4 IBM SP2 Running DB2 Parallel Edition (DB2-PE)

8.5 Monet

PART III - PARALLEL DATA MINING

9 APPROACHES TO SPEED UP DATA MINING

9.1 Overview of Approaches to Speed up Data Mining

9.2 Discretization

9.3 Attribute Selection

9.4 Sampling and Related Approaches

9.5 Fast Algorithms

9.6 Distributed Data Mining

9.7 Parallel Data Mining

9.8 Discussion

10 PARALLEL DATA MINING WITHOUT DBMS FACILITIES

10.1 Parallel Rule Induction

10.2 Parallel Decision-Tree Building

10.3 Parallel Instance-Based Learning

10.4 Parallel Genetic Algorithms

10.5 Parallel Neural Networks

10.6 Discussion

11 PARALLEL DATA MINING WITH DATABASE FACILITIES

11.1 An Overview of Integrated Data Mining/Data Warehouse Frameworks

11.2 The Case for Integrating Data Mining and the Data Warehouse

11.3 Server-Based KDD Systems

11.4 Hybrid Client/Server-Based KDD Systems

11.5 Generic, Set-Oriented Primitives for the Hybrid Client/Server-Based KDD Framework

11.6 A Generic, Set-Oriented Primitive for Candidate-Rule (CR) Evaluation in Rule Induction

11.7 A Generic, Set-Oriented Primitive for Computing Distance Metrics in Instance-Based Learning

11.8 Parallel Data Mining with Specialized-Hardware Parallel Database Servers

12 SUMMARY AND SOME OPEN PROBLEMS

12.1 Data-Parallel vs. Control-Parallel Data Mining

12.2 Client/Server Frameworks for Parallel Data Mining

12.3 Open Problems

REFERENCES

INDEX

More information:

Kluwer Academic Publishers

101 Philip Drive, Norwell, MA 02061

Phone: (781) 871-6600, Fax: (781) 871-6528

E-mail: kluwer@wkap.com, URL: http://www.wkap.nl

`Last modified Friday July 19 15:20:47 BST 2002`
`http://www.cs.ukc.ac.uk/people/staff/aaf/book-kluwer-ukc.html`
