Database and Data Warehousing Design

Student’s name:

Institutional affiliation:

Design Document

Data warehouses support business decisions by collecting, consolidating, and organizing data for reporting and for analysis with tools such as data mining and online analytical processing (OLAP). Although data warehouses are usually built on relational database technology, their design differs substantially from that of an online transaction processing (OLTP) database. To design an effective database, it is essential to consider its structure and its viability as storage for the company’s data. An OLTP system is built to capture a high rate of data changes, whereas the purpose of a data warehouse is to organize large amounts of stable data for easy analysis and retrieval.

It is essential to understand the main attributes of a data warehouse, starting with its definition. A data warehouse, often referred to as an enterprise data warehouse, is a database used chiefly for data analysis and reporting. It is usually the central repository of data, created by integrating data from one or more disparate sources. In addition, a data warehouse stores both current and historical data, and is therefore commonly used to create trending reports for senior management, such as annual or quarterly comparisons (Singh, 2011). The company should take this capability into account in its plan to design a database using data warehousing.

Unlike a data mart, a data warehouse is not limited to a single department. Instead, it represents the database of the entire organization. The data it stores is more detailed, and indexing is kept light because the warehouse holds large amounts of information. A warehouse is usually difficult to manage and takes a long time to process. Data marts, by contrast, are quicker and easier to work with because they hold only small amounts of data. For the same reason, a data warehouse is considerably more expensive.

A data warehouse brings several benefits to the organizations that implement it. First, it saves time: users can quickly access vital data from numerous sources in one place (Ballard, 2009). Second, it enhances data quality and consistency, because data from the various sections and departments is standardized, so each department produces results that are in line with those of every other department. Third, it enhances business intelligence: because it provides data from numerous sources, executives and managers no longer need to make business decisions on the basis of limited data or gut feeling (Ballard, 2009).

The company’s data warehouse can be designed as illustrated by the following database model, which uses both a star and a snowflake schema:

The figure above represents a data warehouse model that utilizes both the snowflake and star schema.

From the diagram, Sales is the fact table, surrounded by the Date, Product, and Store dimension tables. The Product dimension is a snowflake dimension with three levels stored in three tables, since a large row count is expected. The Date dimension is a star dimension comprising four levels: YEAR, QUARTER, MONTH, and DAY. These four levels should be stored in a single physical table, since a low row count is anticipated. The Store dimension is partly denormalized: one table contains the STORE level, while a second table contains the CITY and REGION levels.
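The layout described above can be sketched as a small schema. The following is a minimal illustration using Python’s built-in sqlite3 module, not the company’s actual physical model. The table and column names, the measures (quantity, amount), and the Product level names (product, subcategory, category) are assumptions made for the example, since the design names only the dimensions and the Date and Store levels.

```python
import sqlite3

# All table and column names below are illustrative assumptions.
DDL = """
-- Date: star dimension, four levels kept in one physical table.
CREATE TABLE date_dim (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER NOT NULL,
    quarter INTEGER NOT NULL,
    month   INTEGER NOT NULL,
    day     INTEGER NOT NULL
);

-- Product: snowflake dimension, one table per level (level names assumed).
CREATE TABLE product_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
CREATE TABLE product_subcategory (
    subcategory_id   INTEGER PRIMARY KEY,
    subcategory_name TEXT NOT NULL,
    category_id      INTEGER NOT NULL REFERENCES product_category(category_id)
);
CREATE TABLE product (
    product_id     INTEGER PRIMARY KEY,
    product_name   TEXT NOT NULL,
    subcategory_id INTEGER NOT NULL REFERENCES product_subcategory(subcategory_id)
);

-- Store: partly denormalized; CITY and REGION share a single table.
CREATE TABLE city_region (
    city_id INTEGER PRIMARY KEY,
    city    TEXT NOT NULL,
    region  TEXT NOT NULL
);
CREATE TABLE store (
    store_id   INTEGER PRIMARY KEY,
    store_name TEXT NOT NULL,
    city_id    INTEGER NOT NULL REFERENCES city_region(city_id)
);

-- Sales: fact table referencing the three dimensions (measures assumed).
CREATE TABLE sales (
    date_id    INTEGER NOT NULL REFERENCES date_dim(date_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    store_id   INTEGER NOT NULL REFERENCES store(store_id),
    quantity   INTEGER NOT NULL,
    amount     REAL NOT NULL
);
"""


def build_schema() -> sqlite3.Connection:
    """Create the example schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def table_names(conn: sqlite3.Connection) -> list:
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def sales_by_region_and_quarter(conn: sqlite3.Connection) -> list:
    """Total sales amount per region, year, and quarter."""
    return conn.execute(
        """
        SELECT c.region, d.year, d.quarter, SUM(s.amount)
        FROM sales s
        JOIN date_dim d    ON d.date_id = s.date_id
        JOIN store st      ON st.store_id = s.store_id
        JOIN city_region c ON c.city_id = st.city_id
        GROUP BY c.region, d.year, d.quarter
        ORDER BY c.region, d.year, d.quarter
        """
    ).fetchall()
```

The query reaches YEAR and QUARTER with a single join because the Date levels share one table, while REGION needs two joins because the Store dimension is split across two tables. That difference is the practical cost of normalizing a dimension.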

Implementing the design represented in the schema requires a precise understanding of the database’s physical data model. First, the physical data model is transformed into a physical database by generating a SQL data definition language (DDL) script, which creates all the objects in the database. Second, the physical data model is implemented in a production environment and populated with data. Notably, the ability to change the implementation afterwards will likely be limited by the data volumes in the production environment of the data warehouse. The main goal of the physical data warehouse design is good query performance. This goal is achieved by facilitating collocated queries and by distributing data evenly across all of the company’s database partitions (Ballard, 2009). A collocated query has all the data it needs on the same database partition. An even distribution of data places the same number of rows of a partitioned table on each database partition (Harrington, 2009). Improving one of these aspects can come at the expense of the other, so the system administrator should strike a balance between collocated queries and even data distribution in the data warehouse database.
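The trade-off between collocation and even distribution can be illustrated with a short sketch. It assumes that rows are assigned to partitions by hashing a distribution key, which is a common approach but is not confirmed as the mechanism of the company’s database. The partition count and the function names are hypothetical.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 4  # assumed number of database partitions


def partition_for(key, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a distribution key to a partition using a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def distribution(keys, num_partitions: int = NUM_PARTITIONS) -> list:
    """Count how many rows land on each partition for the given keys."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def is_collocated(fact_key, dim_key, num_partitions: int = NUM_PARTITIONS) -> bool:
    """Two rows can be joined locally when they hash to the same partition."""
    return partition_for(fact_key, num_partitions) == partition_for(
        dim_key, num_partitions
    )
```

If a fact table and a dimension table are both distributed on their join key, matching rows always share a partition, so `is_collocated(42, 42)` is true. The cost appears when that key has few distinct values: `distribution([7] * 100)` places all 100 rows on a single partition, which is the opposite of an even distribution.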

The database design can be illustrated using an entity relationship diagram (ERD), which shows a snapshot of the data structures. In this case, the ERD is drawn using the Dia application. An ER diagram for this company’s database design involves three basic elements, which are as follows:

  1. Entities are the elements about which information is stored, which here covers the data collected from the company’s website as well as other data collection platforms.
  2. Attributes are the individual data items that the company collects about each entity.
  3. Relationships describe the associations that exist between the company’s different entities.
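The three elements above can be sketched as simple data structures. This is only an illustration of how entities, attributes, and relationships fit together. The class names, the attribute names, and the "records" relationship are hypothetical, and the Store and Sales entities are borrowed from the warehouse model described earlier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Attribute:
    """A single data item stored about an entity."""
    name: str
    data_type: str
    is_key: bool = False


@dataclass
class Entity:
    """Something the company stores information about."""
    name: str
    attributes: List[Attribute] = field(default_factory=list)

    def key_names(self) -> List[str]:
        """Names of the attributes that identify an instance of the entity."""
        return [a.name for a in self.attributes if a.is_key]


@dataclass(frozen=True)
class Relationship:
    """An association between two entities, with its cardinality."""
    name: str
    source: Entity
    target: Entity
    cardinality: str  # for example "1:N"


def describe(rel: Relationship) -> str:
    """Render a relationship as a short, readable sentence."""
    return f"{rel.source.name} {rel.name} {rel.target.name} ({rel.cardinality})"


# Hypothetical example: one store records many sales.
store = Entity("Store", [
    Attribute("store_id", "INTEGER", is_key=True),
    Attribute("store_name", "TEXT"),
])
sales = Entity("Sales", [
    Attribute("store_id", "INTEGER", is_key=True),
    Attribute("amount", "REAL"),
])
records = Relationship("records", store, sales, "1:N")
```

Calling `describe(records)` yields `"Store records Sales (1:N)"`, which corresponds to one connecting line between two boxes in an ER diagram.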

The database being designed can also be represented in a data flow diagram (DFD) to show the relation between the tables in the database schema. This diagram has been drawn using the Dia application.

The flow of data in the design presented above includes both the inputs and the outputs of the data warehouse. The submitted data is the input, while the central data exchange represents the continuous process by which collected data enters the database.

References

Ballard, C. (2009). Data warehousing with the Informix Dynamic Server. United States: IBM, International Technical Support Organization.

Harrington, J. L. (2009). Relational database design and implementation: Clearly explained. Amsterdam: Morgan Kaufmann/Elsevier.

Singh, S. K. (2011). Database systems: Concepts, design and applications. Delhi: Dorling Kindersley (India).