Tuesday, July 29, 2014

Data Warehousing Tutorial Part - 1

Data Warehousing Concepts
This chapter provides an overview of the Oracle data warehousing implementation. It includes

1) What is a Data Warehouse?
2) Data Warehouse Architectures?

Note that this book is meant as a supplement to standard texts about data warehousing. This book focuses on Oracle-specific material and does not reproduce in detail material of a general nature. Two standard texts are:
  • The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)
  • Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)

WHAT IS A DATA WAREHOUSE?

A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.

In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

A common way of introducing data warehousing is to refer to Inmon's characteristics of a data warehouse, who says that they are:
  • Subject Oriented
  • Integrated
  • Nonvolatile
  • Time Variant

Subject Oriented

Data warehouses are designed to help you analyze your data. For example, you might want to learn more about your company's sales data. To do this, you could build a warehouse concentrating on sales. In this warehouse, you could answer questions like "Who was our best customer for this item last year?" This kind of focus on a topic, sales in this case, is what is meant by subject oriented.

Integrated

Integration is closely related to subject orientation. Data warehouses need to have the data from disparate sources put into a consistent format. This means that naming conflicts have to be resolved and problems like data being in different units of measure must be resolved.

Nonvolatile

Nonvolatile means that the data should not change once entered into the warehouse. This is logical because the purpose of a warehouse is to analyze what has occurred.

Time Variant

Most business analysis requires analyzing trends. Because of this, analysts tend to need large amounts of data. This is very much in contrast to OLTP systems, where performance requirements demand that historical data be moved to an archive

No comments:

Post a Comment