The Relationship Between Data Warehouses and Data Marts

2 May 2012. John Page, an independent expert at Citia BTC, offers advice on what relationship between Data Warehouses and Data Marts to aim for. Five variants are possible in all, but only one of them minimizes inaccuracies and loss of information. (The original material is in English.)

There are five possibilities at least worthy of discussion:

Possibility One

There is no Data Warehouse and just a single Data Mart: in this case, the Data Mart is really a Data Warehouse – it may be small or it may be poorly defined but there is no clear difference between the two concepts at this level.

Possibility Two

There is no Data Warehouse but more than one Data Mart: this type of deployment is a mess and needs sorting out as a matter of priority. Such Data Marts are called ‘independent’ but should be called ‘unreliable’ instead.

Possibility Three

There is a Data Warehouse and one or many Data Marts which get all of their data from the Data Warehouse: this is a sound basic architecture and in this case the Data Marts are termed ‘dependent’ as they are dependent on the Data Warehouse for all of their data.

Possibility Four

There is a Data Warehouse and one or several Data Marts, but the Marts get data not only from the Warehouse but from other systems as well: this is the worst-case scenario and disaster is looming.

Possibility Five

There is a Data Warehouse and no Data Marts: the preferred solution architecture, enabling the actual ‘single version of the truth’ – well, almost.
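
The five possibilities can be told apart by three facts about a deployment: whether a Data Warehouse exists, how many Data Marts there are, and whether any Mart loads data from systems other than the Warehouse. A minimal Python sketch of that decision logic follows. The names `Deployment` and `classify` are illustrative assumptions, not part of any product, and the case with neither a Warehouse nor a Mart is rejected because the list above does not cover it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """Hypothetical description of a reporting landscape."""
    has_warehouse: bool
    mart_count: int
    # True if any Data Mart also loads data from systems other than the Warehouse.
    marts_use_other_sources: bool = False


def classify(d: Deployment) -> int:
    """Return the number (1-5) of the possibility that the deployment matches."""
    if d.mart_count < 0:
        raise ValueError("mart_count must be non-negative")
    if not d.has_warehouse:
        if d.mart_count == 0:
            raise ValueError("no Data Warehouse and no Data Marts: not one of the five cases")
        # A single Mart without a Warehouse is effectively the Warehouse (One);
        # several 'independent' Marts are the case that needs sorting out (Two).
        return 1 if d.mart_count == 1 else 2
    if d.mart_count == 0:
        return 5  # Warehouse only: the preferred architecture
    # With a Warehouse and Marts, the source of the Marts' data decides the case.
    return 4 if d.marts_use_other_sources else 3


if __name__ == "__main__":
    print(classify(Deployment(has_warehouse=True, mart_count=2)))
```

Run against an inventory of existing systems, a check like this makes it easy to spot deployments that fall under Possibility Two or Four and therefore need attention first.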

Classic DW Application Areas

Key to success with the Data Warehouse is selecting the right application areas and deploying applications (reports) in an orderly fashion against a pre-defined roadmap. Data Warehouse techniques and capabilities favour some types of application above others, with ideal candidates sharing some of the following characteristics:
  • They need detailed, historic data.
  • They need data from different systems.
  • They process data in complex hierarchies.
  • They tend to look for trends and correlations rather than run short, simple queries.
  • They can live with data that is not current.
  • They do not calculate complex figures where 100% accuracy is needed.
  • The reports created are unique across the enterprise.
  • Requirements are always changing.

Bearing in mind the list above, perhaps the most common use of the Data Warehouse is in marketing, because marketing performance is notoriously difficult to measure. Traditionally, campaigns are created and executed, and if sales happen to increase at around the same time, the campaign is deemed a success. My guess is that 90% of all marketing campaigns are designed with no scientific or factual input. I also guess that 90% of all responses to campaigns cannot be traced back to the campaign responsible, and that 90% of all campaigns have minimal effect on a company’s bottom line. This is pretty depressing, bearing in mind all the data available to marketers and the impressive tools and huge computers at their disposal. Even a simple Data Warehouse has the potential to improve the efficiency of marketing many times over.
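
The checklist of characteristics can be turned into a simple screening aid for candidate applications. The sketch below is only an illustration: the identifiers in `CHARACTERISTICS` and the function `matching_characteristics` are assumptions made for this example, and since the article says only that ideal candidates share "some" of the characteristics, the code reports the matches without imposing a threshold.

```python
from typing import Iterable, List

# Keys are hypothetical short identifiers; values paraphrase the checklist.
CHARACTERISTICS = {
    "detailed_history": "needs detailed, historic data",
    "multiple_sources": "needs data from different systems",
    "complex_hierarchies": "processes data in complex hierarchies",
    "trend_analysis": "looks for trends and correlations, not short simple queries",
    "tolerates_stale_data": "can live with data that is not current",
    "no_exact_figures": "does not calculate complex figures needing 100% accuracy",
    "unique_reports": "reports created are unique across the enterprise",
    "changing_requirements": "requirements are always changing",
}


def matching_characteristics(traits: Iterable[str]) -> List[str]:
    """Return the checklist items an application matches, in checklist order."""
    traits = set(traits)
    unknown = traits - set(CHARACTERISTICS)
    if unknown:
        raise ValueError(f"unknown characteristics: {sorted(unknown)}")
    return [key for key in CHARACTERISTICS if key in traits]


if __name__ == "__main__":
    # Hypothetical candidate: marketing campaign analysis.
    campaign_analysis = {"detailed_history", "multiple_sources", "trend_analysis"}
    for key in matching_characteristics(campaign_analysis):
        print(f"- {CHARACTERISTICS[key]}")
```

How many matches make an application a good candidate remains a judgment call for the roadmap owner.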


Source: Business Intelligence in the 21st Century, John Page's blog