What is a typical size of DataWarehouse and Data Marts?
-
- Participant
- Posts: 14
- Joined: Sun Jun 11, 2006 2:16 pm
What is a typical size of DataWarehouse and Data Marts?
Dear Experts:
What is a typical size of DataWarehouse and Data Marts?
What is a typical size of DataWarehouse and Data Marts?
Thanks experts.
datastage_learner
datastage_learner
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Welcome aboard. :D
Your question is in the "how long is a piece of string?" category.
The only valid answer is something like "big enough, and growing over time".
I have worked with star schemas as small as a few Megabytes and as large as a few Petabytes.
It's all driven by the business needs of the owners of the data.
Your question is in the "how long is a piece of string?" category.
The only valid answer is something like "big enough, and growing over time".
I have worked with star schemas as small as a few Megabytes and as large as a few Petabytes.
It's all driven by the business needs of the owners of the data.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
DH Architects are, in my experience, governed by consultation with Business Analysts who prepare target-from-source mapping specifications which, along with row counts from source, allows the size of the target to be estimated.
But there's no way I'm going to put a figure on it, which your original question asks. There is no "typical size".
Growth is an important factor too. It is often more economical to buy larger hardware up front that to increase it later, but try explaining that to the average bean counter. By its very nature (being periodic snapshots of business data, possibly enriched with other data) a DW or DM will grow.
Too simplistically consider the following scenario. Each week you capture 5% of the business data. Ignoring the initial load and any load of historical/legacy data, in only 20 weeks you have captured 100% of the business data volume; in one year 260%, in five years more than 1000%. Don't get caught under-sizing!
But there's no way I'm going to put a figure on it, which your original question asks. There is no "typical size".
Growth is an important factor too. It is often more economical to buy larger hardware up front that to increase it later, but try explaining that to the average bean counter. By its very nature (being periodic snapshots of business data, possibly enriched with other data) a DW or DM will grow.
Too simplistically consider the following scenario. Each week you capture 5% of the business data. Ignoring the initial load and any load of historical/legacy data, in only 20 weeks you have captured 100% of the business data volume; in one year 260%, in five years more than 1000%. Don't get caught under-sizing!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 14
- Joined: Sun Jun 11, 2006 2:16 pm
Ray, a perfect understanding of the question and a perfect response that's what I can appreciate.ray.wurlod wrote:DH Architects are, in my experience, governed by consultation with Business Analysts who prepare target-from-source mapping specifications which, along with row counts from source, allows the size of ...
Thanks experts.
datastage_learner
datastage_learner
Planning has its own weightage. As mentioned, its purely depends on your source data and the business needs. The business requirements will be in such a way that, the DM1 will required only 5% of the operational data. Ray has mentioned, how even the 5% grows to 1000%.
Calculate the capacity of individual datamart, excluding the confirmed facts, you ll get the DWH size. (If you are on bottom up.)
Calculate the capacity of individual datamart, excluding the confirmed facts, you ll get the DWH size. (If you are on bottom up.)
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
So say we all. Nothing like shooting for the moon in your first thread here, however.chandankambli wrote:Ray, a perfect understanding of the question and a perfect response that's what I can appreciate.
I would think you would need to start with the perfect question before you could see perfect understanding and receive a perfect answer... and that only in a perfect world.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Like everybody stated above there is no typical size for a Datawarehouse.
It depends on your intended business needs.
You can determine the size, growth of your DWH with the help of the Business Analysts, Domain experts who are supposed to have knowledge of how much of data you get on a daily, monthly and yearly basis.
It is not a easy task. You need to have a complete knowledge of your Data Model.
You can do it yourself or take the help of your DBA's to estimate the size of individual tables.
You can calculate the Average Row length of your tables, multiply it with your estimated data to get an estimate of the space required for that table.
Your estimates may not match the actual numbers all the time.
It is better to ask for more space before hand rather than getting caught with space issues later.
(With the cost of disk space decreasing day by day, the fight for disk space is not a major issue in most of the companies nowadays)
It depends on your intended business needs.
You can determine the size, growth of your DWH with the help of the Business Analysts, Domain experts who are supposed to have knowledge of how much of data you get on a daily, monthly and yearly basis.
It is not a easy task. You need to have a complete knowledge of your Data Model.
You can do it yourself or take the help of your DBA's to estimate the size of individual tables.
You can calculate the Average Row length of your tables, multiply it with your estimated data to get an estimate of the space required for that table.
Your estimates may not match the actual numbers all the time.
It is better to ask for more space before hand rather than getting caught with space issues later.
(With the cost of disk space decreasing day by day, the fight for disk space is not a major issue in most of the companies nowadays)
Narasimha Kade
Finding answers is simple, all you need to do is come up with the correct questions.
Finding answers is simple, all you need to do is come up with the correct questions.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Once you've done all those calculations, add 42% as a safety margin. Document it as a safety margin. When challenged, mutter something about the "ultimate answer (to life, the universe, and everything)".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
The reference to 'Life, the Universe and Everything' was supposed to clue you into a Hitchhiker's Guide to the Galaxy reference - and 42 was a pivotal number from the series and was, in fact as Ray notes, the answer to that very question.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Concurrent knowledge of the answer and the question is not permitted.
-- Douglas Adams, in one of the five books in the inaccurately named Hitchhiker's Guide to the Galaxy trilogy.
-- Douglas Adams, in one of the five books in the inaccurately named Hitchhiker's Guide to the Galaxy trilogy.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.