Basics of SSAS (Analysis Services)

Understanding Analysis Services
The basic idea of OLAP is fairly simple. Let’s think about that book ordering data for
a moment. Suppose you want to know how many people ordered a particular book
during each month of the year. You could write a fairly simple query to get the
information you want. The catch is that it might take a long time for SQL Server to
churn through that many rows of data.
And what if the data was not all in a single SQL Server table, but scattered around in
various databases throughout your organization? The customer info, for example,
might be in an Oracle database, and supplier information in a legacy xBase database.
SQL Server can handle distributed heterogeneous queries, but they’re slower.
What if, after seeing the monthly numbers, you wanted to drill down to weekly or
daily numbers? That would be even more time-consuming and require writing even
more queries.
This is where OLAP comes in. The basic idea is to trade off increased storage space
now for speed of querying later. OLAP does this by precalculating and storing
aggregates. When you identify the data that you want to store in an OLAP database,
Analysis Services analyzes it in advance and figures out those daily, weekly, and
monthly numbers and stores them away (and stores many other aggregations at the
same time). This takes up plenty of disk space, but it means that when you want to
explore the data you can do so quickly.
The basic concepts of OLAP include:-
• Cube
• Dimension table
• Dimension
• Hierarchy
• Level
• Fact table
• Measure
• Schema

Cube
The basic unit of storage and analysis in Analysis Services is the cube. A cube is a
collection of data that’s been aggregated to allow queries to return data quickly. For
example, a cube of order data might be aggregated by time period and by title,
making the cube fast when you ask questions concerning orders by week or orders
by title.
Cubes are ordered into dimensions and measures. The data for a cube comes from a
set of staging tables, sometimes called a star-schema database. Dimensions in the
cube come from dimension tables in the staging database, while measures come from
fact tables in the staging database.

Dimension table
A dimension table lives in the staging database and contains data that you’d like to
use to group the values you are summarizing. Dimension tables contain a primary
key and any other attributes that describe the entities stored in the table. Examples
would be a Customers table that contains city, state and postal code information to
be able to analyze sales geographically, or a Products table that contains categories
and product lines to break down sales figures.

Dimension
Each cube has one or more dimensions, each based on one or more dimension tables.
A dimension represents a category for analyzing business data: country or product
line in the examples above. Frequently, a dimension has a natural hierarchy so that
lower results can be “rolled up” into higher results. For example, in a geographical
level you might have city totals aggregated into state totals, and state totals into
country totals.

Hierarchy
A hierarchy can be best visualized as a node tree. A company’s organizational chart is
an example of a hierarchy. Each dimension can contain multiple hierarchies; some of
them are natural hierarchies (the parent-child relationship between attribute values
occur naturally in the data), others are navigational hierarchies (the parent-child
relationship is established by developers.)

Level
Each layer in a hierarchy is called a level. For example, you can speak of a week level
or a month level in a fiscal time hierarchy, and a city level or a country level in a
geography hierarchy.

Fact table
A fact table lives in the staging database and contains the basic information that you
wish to summarize. This might be order detail information, payroll records, drug
effectiveness information, or anything else that’s amenable to summing and
averaging. Any table that you’ve used with a Sum or Avg function in a totals query
is a good bet to be a fact table. The fact tables contain fields for the individual facts
as well as foreign key fields relating the facts to the dimension tables.

Measure
Every cube will contain one or more measures, each based on a column in a fact table
that you’d like to analyze. In the cube of book order information, for example, the
measures would be things such as unit sales and profit.

Schema
Fact tables and dimension tables are related, which is hardly surprising, given that
you use the dimension tables to group information from the fact table. The relations
within a cube form a schema. There are two basic OLAP schemas: star and
snowflake. In a star schema, every dimension table is related directly to the fact table.
In a snowflake schema, some dimension tables are related indirectly to the fact table.
For example, if your cube includes OrderDetails as a fact table, with Customers
and Orders as dimension tables, and Customers is related to Orders, which in
turn is related to OrderDetails, then you’re dealing with a snowflake schema.

Additional useful Information :-
There are additional schema types besides the star and snowflake
schemas supported by SQL Server 2008, including parent-child
schemas and data-mining schemas. However, the star and
snowflake schemas are the most common types in normal cubes.

Comments

Popular posts from this blog

SSIS vs. T-SQL

Sync SSAS Cube From One Environment to Another

SQL Function to replace special characters from a string