Friday, July 30, 2010

Separation of Warehouse Data

One example of separation is that a terabyte of data may have 50GB that are actively used and 950GB that are accessed perhaps only once a month or once a quarter. The organization pays the same for the data regardless of how frequently it is used. The data warehouse administrator can either archive the inactive data or place it in near-line storage. Accessing the inactive data, moving it to near-line storage, and then deleting it from the data warehouse defines the separation.
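The separation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `last_accessed` field, the in-memory lists standing in for the warehouse and near-line storage, and the 90-day idle threshold (roughly "once a quarter") are all hypothetical and would differ in a real warehouse.

```python
from datetime import date, timedelta


def split_by_last_access(rows, today, max_idle_days=90):
    """Split rows into (active, inactive) by how recently each was accessed."""
    cutoff = today - timedelta(days=max_idle_days)
    active, inactive = [], []
    for row in rows:
        # Assumption: every row carries a hypothetical "last_accessed" date.
        (active if row["last_accessed"] >= cutoff else inactive).append(row)
    return active, inactive


def separate(warehouse, near_line, today, max_idle_days=90):
    """Move inactive rows to near-line storage, then delete them from the warehouse."""
    active, inactive = split_by_last_access(warehouse, today, max_idle_days)
    near_line.extend(inactive)  # copy to near-line storage first
    warehouse[:] = active       # then remove the inactive rows from the warehouse
    return len(inactive)
```

In the terabyte example above, a rule like this would leave about 5% of the data (50GB of 1,000GB) in the warehouse and move the rest.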

While it is true that all data warehouses face separation, the degree of separation varies among warehouses, based on these factors:

* Size of the warehouse
* Type of business the warehouse supports
* Who uses the warehouse
* What kind of processing is being done
* Level of sophistication of end-user analysts

Critical Success Factors

There are three critical success factors that each company needs to identify before addressing data quality:

* Commitment by senior management to the quality of corporate data
* Definition of data quality
* Quality assurance of data

Senior management can demonstrate its commitment to the quality of corporate data by instituting a data administration department that oversees the management of corporate data. The role of this department is to establish data management standards, policies, procedures, and guidelines pertaining to data and data quality.

Data Quality

Beyond the usefulness of the data, data quality has to be defined in terms of five criteria. Quality data is:

1. Complete
2. Timely
3. Accurate
4. Valid
5. Consistent

The definition of data quality must include the degree of quality that is required for each element being loaded into the data warehouse. If, for example, customer addresses are stored, it might be acceptable that the four-digit extension to the ZIP code, or the three-digit extension to a postal code, is missing. However, the street address, city, and state or province are of much higher importance. The required degree of quality must be defined by each individual company for each item that is used in the data warehouse.
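A per-element quality requirement of this kind can be written down as a small rule table. The sketch below covers only the completeness criterion and uses the address example from the text. The field names and the `check_completeness` helper are hypothetical, and each company would supply its own rules.

```python
# Hypothetical per-field rules: True means the field must be present.
ADDRESS_RULES = {
    "street": True,
    "city": True,
    "state_or_province": True,
    "postal_extension": False,  # e.g. the ZIP code extension; tolerated if missing
}


def check_completeness(record, rules):
    """Return (errors, warnings): missing required fields and missing optional fields."""
    errors, warnings = [], []
    for field, required in rules.items():
        value = record.get(field)
        # Treat absent, None, and blank values as missing.
        if value is None or str(value).strip() == "":
            (errors if required else warnings).append(field)
    return errors, warnings
```

A record with a missing postal extension would produce only a warning, while a missing city would produce an error that blocks or flags the load.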

A third factor that needs to be considered is the quality assurance of data. Since data is moved from transactional/legacy systems to the data warehouse, the accuracy of this data needs to be verified and corrected if necessary, and this will often involve cleansing of existing data. Since no company is able to rectify all of its unclean data, procedures have to be put in place to ensure data quality at the source.
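One simple way to verify data after it has been moved is to reconcile the warehouse against the source by key. The following is a minimal sketch, assuming rows are plain dictionaries that share a hypothetical `id` key. Real loads usually transform the data, so a direct equality comparison like this one would need to be adapted.

```python
def reconcile(source_rows, warehouse_rows, key="id"):
    """Compare rows by key and report problems found in the warehouse copy.

    Returns (missing, mismatched): keys absent from the warehouse, and keys
    whose warehouse row differs from the source row.
    """
    loaded = {row[key]: row for row in warehouse_rows}
    missing, mismatched = [], []
    for row in source_rows:
        target = loaded.get(row[key])
        if target is None:
            missing.append(row[key])
        elif target != row:
            mismatched.append(row[key])
    return missing, mismatched
```

The two lists give the cleansing step a concrete worklist: rows to reload and rows to correct.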

Modify Business Processes

This task can only be achieved by modifying business processes and designing data quality into the system. By identifying every data item and its usefulness to the ultimate users of this data, data quality requirements can be established. One might argue that this is too costly, but it has to be kept in mind that increasing the quality of data as an after-the-fact task is five to ten times more costly than capturing it correctly at the source.

If companies want to use a data warehouse for competitive advantage and reap its benefits, the issue of data quality is extremely important. Only when data quality is recognized as a corporate asset by every member of the organization will the benefits of data warehousing and CRM initiatives be realized.



SOURCE:

http://customer-relations.suite101.com/article.cfm/separation_of_warehouse_data
