Star vs. Snowflake Schema - What's Your Religion?


The two most common data warehouse designs are known as the "Star Schema" and the "Snowflake Schema". When we discuss the differences and personal preferences, the gloves usually come off, and the DBAs put on the boxing kind. It becomes a religious debate, full of passion and zeal. Let's take a look at why...

The Star Schema is defined as a design in which every dimension table is related directly to the fact table through a foreign key relationship. It is the simplest form of data warehouse design, with multi-level dimension relationships collapsed into a single layer of dimension tables. A diagram of the design shows a "star" shape with the fact table at the center, hence the name.

The Snowflake Schema gets its name from the extra layer or layers of dimension tables that are not directly related to the fact table. The extra layers make the star look more like a "snowflake", although it usually takes a few eggnogs to see the resemblance. The dimension tables are effectively normalized to produce the additional relationships.

Normalized design is essential for a high-performance OLTP database, where each piece of data needs to be stored in one place, and one place only, to ensure efficient updates and space savings. The one exception is where we selectively choose to violate the rules of normalization in the name of performance, at the cost of redundant storage. This is called "denormalization" and should be performed with extreme care. With a Data Warehouse we can relax the rules of normalization a little more and denormalize, as long as we buy off on the extra space needed. As always, it comes down to performance versus storage.

The Star Schema is effectively a denormalized schema that costs extra storage. Proponents would argue this is done for performance reasons, and will also emphasize the "keep it simple, stupid" or "KISS" approach to database development, just to wind up the Snowflake gurus. The Snowflake Schema is more normalized, which saves space and reduces redundancy. Skeptics would argue that it does this at the expense of performance, because of the excessive joins.

Of course, there's a bit of Star and Snowflake in all of us. The typical Time Dimension in both schemas is really a snowflake-turned-star design, with the Year, Quarter, and Month dimensions collapsed into a single table.

Some older analysis products actually required a Star Schema. Fortunately, Microsoft SQL Server Analysis Services (SSAS) now allows us a choice, as it supports both schema designs. To calm people down, I usually introduce the concept of a "Constellation" or "Galaxy" schema, which contains multiple Star and/or Snowflake Schemas combined, so we can look to the heavens knowing there is a common solution for all of us. And SSAS supports that too, just so we can all coexist. Thankfully.

Cheers, Brian.
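
For readers who would rather see the join-versus-redundancy trade-off than argue about it, here is a minimal sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the table names (`fact_sales`, `dim_product_star`, `dim_product_snow`, `dim_category`), the columns, and the sample rows are all hypothetical and do not come from any real warehouse or from SSAS.

```python
import sqlite3

# In-memory database, so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: the category is folded into the product dimension (denormalized).
CREATE TABLE dim_product_star (
    product_id    INTEGER PRIMARY KEY,
    product_name  TEXT,
    category_name TEXT
);

-- Snowflake: the category is normalized into its own table,
-- which has no direct relationship to the fact table.
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product_snow (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_category(category_id)
);

-- One hypothetical fact table, shared by both designs for brevity.
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER,
    amount     REAL
);

INSERT INTO dim_category VALUES (1, 'Bikes'), (2, 'Helmets');
INSERT INTO dim_product_snow VALUES
    (10, 'Road Bike', 1), (11, 'Trail Bike', 1), (20, 'Kids Helmet', 2);
INSERT INTO dim_product_star VALUES
    (10, 'Road Bike', 'Bikes'), (11, 'Trail Bike', 'Bikes'),
    (20, 'Kids Helmet', 'Helmets');
INSERT INTO fact_sales VALUES
    (1, 10, 900.0), (2, 11, 600.0), (3, 20, 25.0), (4, 20, 25.0);
""")

# Star: one join from the fact table gets us to the category.
star_sql = """
SELECT p.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_star p ON p.product_id = f.product_id
GROUP BY p.category_name
ORDER BY p.category_name
"""

# Snowflake: the same question needs a second join through dim_category.
snow_sql = """
SELECT c.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product_snow p ON p.product_id = f.product_id
JOIN dim_category c ON c.category_id = p.category_id
GROUP BY c.category_name
ORDER BY c.category_name
"""

star_totals = conn.execute(star_sql).fetchall()
snow_totals = conn.execute(snow_sql).fetchall()
print(star_totals)
print(snow_totals)
```

Both queries return the same totals per category. The star version gets there with one join but stores the text 'Bikes' once per product row; the snowflake version stores each category name once and pays for it with an extra join.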


