Page 1 of 2

What is a typical size of DataWarehouse and Data Marts?

Posted: Sat Jan 06, 2007 11:50 pm
by chandankambli
Dear Experts:

What is a typical size of DataWarehouse and Data Marts?

Posted: Sun Jan 07, 2007 12:07 am
by ray.wurlod
Welcome aboard. :D

Your question is in the "how long is a piece of string?" category.

The only valid answer is something like "big enough, and growing over time".

I have worked with star schemas as small as a few Megabytes and as large as a few Petabytes.

It's all driven by the business needs of the owners of the data.

Posted: Sun Jan 07, 2007 1:06 am
by kumar_s
Hi,
And btw, pls avoid interview question in this forum.

Posted: Sun Jan 07, 2007 2:40 am
by DSguru2B
Welcome Aboard
Answer to such questions can be found by a simple google. Try doing some research.

Posted: Sun Jan 07, 2007 2:36 pm
by ray.wurlod
DH Architects are, in my experience, governed by consultation with Business Analysts who prepare target-from-source mapping specifications which, along with row counts from source, allows the size of the target to be estimated.

But there's no way I'm going to put a figure on it, which your original question asks. There is no "typical size".

Growth is an important factor too. It is often more economical to buy larger hardware up front that to increase it later, but try explaining that to the average bean counter. By its very nature (being periodic snapshots of business data, possibly enriched with other data) a DW or DM will grow.

Too simplistically consider the following scenario. Each week you capture 5% of the business data. Ignoring the initial load and any load of historical/legacy data, in only 20 weeks you have captured 100% of the business data volume; in one year 260%, in five years more than 1000%. Don't get caught under-sizing!

Posted: Sun Jan 07, 2007 3:38 pm
by chandankambli
ray.wurlod wrote:DH Architects are, in my experience, governed by consultation with Business Analysts who prepare target-from-source mapping specifications which, along with row counts from source, allows the size of ...
Ray, a perfect understanding of the question and a perfect response that's what I can appreciate.

Posted: Sun Jan 07, 2007 4:35 pm
by kumar_s
Planning has its own weightage. As mentioned, its purely depends on your source data and the business needs. The business requirements will be in such a way that, the DM1 will required only 5% of the operational data. Ray has mentioned, how even the 5% grows to 1000%.
Calculate the capacity of individual datamart, excluding the confirmed facts, you ll get the DWH size. (If you are on bottom up.)

Posted: Sun Jan 07, 2007 4:53 pm
by ray.wurlod
... of course, I meant DWH Architects, not DH Architects! :oops:

Posted: Sun Jan 07, 2007 6:34 pm
by chulett
chandankambli wrote:Ray, a perfect understanding of the question and a perfect response that's what I can appreciate.
So say we all. Nothing like shooting for the moon in your first thread here, however. :wink:

I would think you would need to start with the perfect question before you could see perfect understanding and receive a perfect answer... and that only in a perfect world. :cry:

Posted: Sun Jan 07, 2007 9:34 pm
by narasimha
Like everybody stated above there is no typical size for a Datawarehouse.
It depends on your intended business needs.

You can determine the size, growth of your DWH with the help of the Business Analysts, Domain experts who are supposed to have knowledge of how much of data you get on a daily, monthly and yearly basis.

It is not a easy task. You need to have a complete knowledge of your Data Model.
You can do it yourself or take the help of your DBA's to estimate the size of individual tables.
You can calculate the Average Row length of your tables, multiply it with your estimated data to get an estimate of the space required for that table.

Your estimates may not match the actual numbers all the time.
It is better to ask for more space before hand rather than getting caught with space issues later.
(With the cost of disk space decreasing day by day, the fight for disk space is not a major issue in most of the companies nowadays)

Posted: Sun Jan 07, 2007 9:50 pm
by ray.wurlod
Once you've done all those calculations, add 42% as a safety margin. Document it as a safety margin. When challenged, mutter something about the "ultimate answer (to life, the universe, and everything)".

Posted: Mon Jan 08, 2007 12:07 am
by kumar_s
42% With any specific formula you arrived this number :?:

Posted: Mon Jan 08, 2007 12:36 am
by chulett
The reference to 'Life, the Universe and Everything' was supposed to clue you into a Hitchhiker's Guide to the Galaxy reference - and 42 was a pivotal number from the series and was, in fact as Ray notes, the answer to that very question. :wink:

Posted: Mon Jan 08, 2007 1:03 am
by kumar_s
So that was an ultimate answer to the amateur question.

Posted: Mon Jan 08, 2007 4:31 am
by ray.wurlod
Concurrent knowledge of the answer and the question is not permitted.

-- Douglas Adams, in one of the five books in the inaccurately named Hitchhiker's Guide to the Galaxy trilogy.