Browsed by
Category: SQL

Using SSAS dimension data with different labels

Using SSAS dimension data with different labels

In relation to my article about using date pickers to filter SSRS cube data, I have been asked a few times how it would be possible to use one set of parameters to select the date range that points to dimension data from different cubes.
For example, I have a report that pulls data from two different cubes. In the one cube, the main time dimension is called “Date”, and in the other cube there is a time dimension that is calle “Value Date”.

Basically there are two ways to accomplish this. The first one is exactly how I described in my previous post about this topic. For each date dimension you would have to create two extra parameters and use the expression as I mentioned. In the example above, you would have to create four parameters:

Parameter: DateFrom
Default value: =”[Date].[Date].&[” & Format(Parameters!FromDatePicker.Value, “yyyy-MM-dd”) & “T00:00:00]”

Parameter: DateTo
Default value: =”[Date].[Date].&[” & Format(Parameters!ToDatePicker.Value, “yyyy-MM-dd”) & “T00:00:00]”

Parameter: ValueDateFrom
Default value: =”[Value Date].[Date].&[” & Format(Parameters!FromDatePicker.Value, “yyyy-MM-dd”) & “T00:00:00]”

Parameter: ValueDateTo
Default value: =”[Value Date].[Date].&[” & Format(Parameters!ToDatePicker.Value, “yyyy-MM-dd”) & “T00:00:00]”

The second way (I don’t particularly see on beneficial way) is to use the Replace-function. As you can see above, the difference between the “DateFrom” parameter and the “ValueDateFrom” parameter is only in the first couple of characters. As a default value for the “ValueDateFrom” parameter, I also could have put this:

=Replace(Parameters!DateFrom.Value, “[Date].[Date]”, “[Value Date].[Date]”)

One little side note: When you add the date dimensions from different cubes in your report and let SSRS just create the parameters for you, you don’t have to create those parameters additionally. Make sure though that in the above example, there would have been two datasets for the available values for the ValueDateFrom and ValueDateTo parameters. Since we don’t see these parameters in the report anyway, you can set the available values to None and delete the datasets.

Various SQL date formats to use with SSIS, SSAS and/or SSRS

Various SQL date formats to use with SSIS, SSAS and/or SSRS

While designing data warehouses, SSIS packages, etc., like myself you’ve probably ran into the issue of having to format different date and time values that ends up in a long string of SQL code or expressions in SSIS. The purpose of this post is to collect some links to various pages where these kind of date and time formatting can be found.

Getting only date part of a datetime field

First of all, the function that I used the most is when I’m trying to get only the date part of a datetime field type. For example, when I’m looking at a sales order fact table, an order could have been placed at August 15th 2011 at 9:15am. This date would be stored as 2011-08-15 09:15:00:000. The issue here, is that it’s difficult to design your SSAS cube and use this field as a link to your time dimension table, because the time dimension table usually holds the dates only. Therefore, the following function could be used:

CAST(
FLOOR( CAST( GETDATE() AS FLOAT ) )
AS DATETIME
)

(thanks to Ben Nadel)

When you’re using SQL Server 2008 and above, you can also use the following function that is a lot easier to use. Since the date and time fields were introduced in SQL Server 2008, this won’t work in 2005 and earlier versions:

DATE(datefield)

Date formats

Some simple date formats can be found here. These are just some standard SQL date formats but these could help you tremendously to avoid having to concatenate several strings together.

Calculations: First and last day of the month

You might have ran into the issue of calculating dates in relation to a certain input date. This page shows you how to get the first or last day of the month in relation to a certain date.

Calculations: First and last day of the year

Just as calculating the first and last day of the month, you can see here how you can calculate the first and last day of the year.

Standard functions

Other than the ones above, if you are looking for a calculation in terms of number of days, weeks, months, years, etc., the standard DATEADD and DATEDIFF functions in SQL would do the trick.

Report on your reports

Report on your reports

When you deploy your SSRS reports, you might want to know about how they perform. Or maybe your reports perform allright, but there might be improvements on how the end users can be presented with the reports even faster. Also, you might have deployed a bunch of reports that nobody reads but they’re just sitting there… To get insight into these things, the ReportServer database has a few tables that contain log records of your reports.

First of all, by default the records are kept for 60 days. Older log entries are deleted. Since I always want to have a little bigger time frame of log data, I have changed the setting of this. You can find this setting in the table ConfigurationInfo in the ReportServer database. For example, I’ve used the following query to change the settings so the log entries are kept for a whole year:

UPDATE ConfigurationInfo
SET Value = ‘365’
WHERE Name = ‘ExecutionLogDaysKept’

Right, so if you also want to have log data for a year to use this functionality, please come back in 10 months… just kidding, the 60 days by default should usually generate enough data that you can test out the functionality described in this article.

As of SQL Server 2008, the log data is stored in the table dbo.ExecutionLogStorage (in earlier versions, the table is called ExecutionLog). For some reason Microsoft does not recommend to query against tables directly, therefore there are three views created to use for querying the log. By default this view is called ExecutionLog, in SQL Server 2008 the view ExecutionLog2 was introduced, and in 2008 R2 the view ExecutionLog3 was introduced. The only difference between ExecutionLog2 and ExecutionLog3 is two column names. ReportPath is called ItemPath and ReportAction is called ItemAction.

On to the next step… getting an idea how your end users view the reports. Well, this is a very straightforward thing, if you log table has grown proportionally, it means that your work has paid off. There are several columns in this table/view that gives you an idea what the users have been browsing for and the type of actions that have been performed.

To measure the performance of your reports, there are five columns that are important to take a look at. These columns are:

  • TimeDataRetrieval: This value reflects the number of milliseconds spent interacting with data sources and data extensions. This includes all data sets for the report itself and datasets for possible subreports. This value includes the time that is spent opening connections to the data source and the time spent reading the data from the database. This retrieval executes in parallel (in some cases this is not possible), therefore the number that is displayed here is not the sum of the retrieval of the datasets, but the dataset that took the longest to retrieve. In those cases where the datasets are retrieved in sequence, this number is the sum of the retrieval of all data sets.
  • TimeProcessing: This value (also in milliseconds) is the time spent on processing the data in the reporting engine. This phase also includes the processing time of the tablixes, for example sorting, filtering, grouping, etc. 
  • TimeRendering: This value (I’m sure it’s clear now in which measurement unit…) reflects the time spent on rendering the reports and applying the pagination to the reports.
  • ByteCount: This is the number of bytes that are generated for the output of this report request.
  • RowCount: The number of rows that were retrieved from the data source to generate this report.

I’ll conclude with some tips how to prevent reports from taking a long time to load based on the values in the above five columns:

  • When TimeDataRetrieval has a high value, try to modify your query to the data source. Narrow it down to the data that you need. Try to avoid SQL statements with “SELECT * “, where all the columns of a table are retrieved.
  • When TimeProcessing is high, try to “prepare” the data differently that is used in your report. It’s sometimes easier to use an ORDER BY in your query that eliminates the need of sorting in your tablixes. Also, try to use as many filters in your SQL query to reduce the data that is processed.
  • When TimeRendering is high, try to rearrange items on your report. For example, one list with one chart that is repeated 10 times is better than having 10 different charts each filtered on a different column value.
  • When ByteCount and/or RowCount is high, try to reduce the amount of data that you use for your report. When you have a tablix that shows the top 10 best performing sales agents, it’s better to have a TOP 10 condition in your SQL query than pulling the whole set and filter on your tablix.
Refresh local cache for IntelliSense

Refresh local cache for IntelliSense

I have come across many situations where I had a SQL Server query window open and I changed something in my table definitions. This resulted in not being able to use the IntelliSense on those new columns or tables.

In SQL Server Management Studio, you can refresh the IntelliSence cache by pressing Ctrl+Shift+R, or go to Edit – IntelliSense – Refresh Local Cache:
It takes a second or two when the query has been updated, but at least you can use the IntelliSense on your new table and columns without having to close the query window and open up a new one.
Monitor your SQL database size and disk space

Monitor your SQL database size and disk space

I faced the challenge of retrieving information about the size of different databases on different servers and the disk space left on the server. Basically I wanted to find a way how I can retrieve this data periodically so I could monitor the growth of the databases and the disk space left.

I decided to create an SSIS package from this that loops through the servers and databases to get the information out of it from one single package. In this post I’ll describe how I set this up. In case you’re facing a similar issue, bear in mind that you can basically get every piece of information about the databases as long as you can query it. I’ll stick to the basics to show how it works, I won’t go too much into detail which different data you can get out of it.
I’ll split the whole thing in parts so it’s well readable… the first couple of tasks look like this:
Before I start with the tasks in the package, I have created an OLE DB connection with a random server name and database. The location of the server and database name are really not important here because they will be changed later on.
The first task is a very straightforward task that retrieves all the SQL Servers where you want to retrieve the database information from. For now I just hardcoded the servers, you could have a table somewhere where these servers are listed.
Some settings that are important here is that on the general page, the ResultSet is set to Full result set. On the page Result Set, add one result set with the name 0 (zero) and in the variable name, select a variable that you have create before (the variable needs to be of type Object!).
Right after this package I have added a foreach loop container. This container has an ADO Enumerator with the result set variable from the previous package selected. On the section Variable mappings, add a variable of type String, that holds the name of the server in each iteration of this loop. The variable that I have used here is called “Server”.
The next package is a script task, and this task is responsible for setting the connection to the right server in each iteration. It’s a VB .NET script package, and you have to make sure that the variable “Server” is on the ReadOnly list.
Edit the script, and I have added the following code:
Dim ConnMgr As ConnectionManager = Dts.Connections(“ds.ServerName.master”)
Dim ConnStr As String = ConnMgr.ConnectionString
ConnStr = “Provider=SQLNCLI10.1;Data Source=” + Dts.Variables(“Server”).Value.ToString + “;Integrated Security=SSPI;Initial Catalog=master”
ConnMgr.ConnectionString = ConnStr
ConnMgr.AcquireConnection(Nothing)
Dts.TaskResult = ScriptResults.Success
This code basically sets the connection manager that you have created before with a connection string for the different servers in each iteration.
Back to the control flow… the next package is an SQL task that retrieves all the databases from the server. This package has similar settings to the server, also a full result set in an object variable but then for the different databases on the server.
Time for the second part of the control flow:
In the tempdb of the server, I have created a table that holds the physical disk space of the server. The first task in the screenshot above creates the table, the second task inserts the data with the following statement:
INSERT INTO [tempdb].[dbo].PhysicalDisk
EXEC xp_fixeddrives
In my original setup I had an OLE source connection in the data flow task that retrieved the disk space with the EXEC xp_fixeddrives statement, but apparently SSIS didn’t like that very much… therefore I changed it to this setup, mainly because I couldn’t find a way to use a source connection component that executes a stored procedure.
The next step is a loop through the databases, and for each database, the data flow task is executed. The data flow task looks like this:
On the left side, several pieces of information about the database is retrieved, like the logical name, physical name, size, etc. Since this query is different for every loop, I have put the query in an expression variable.
On the right side, I retrieve the physical disk space that I have added in the previous steps in the control flow.
I’ve joined this information together, and just for testing purposes I have put this in a text file. Eventually I’ll put a database connection as the destination in order to keep track of the database size.
Group small values into an “other” category in a Reporting Services chart

Group small values into an “other” category in a Reporting Services chart

The following chart is a simple pie chart from the Adventure Works SSAS cube (Analysis Services) where I show the reseller’s gross profit by product in the United States.

As you can see, it’s trying to squeeze in all the products into the pie chart with very small numbers for a lot of products. But what to do when you are only interested in the products that are selling the best? So basically I would like to show the top x products and the rest grouped in a category called “other”.

This option is available in SQL Server Reporting Services, although in my opinion there’s a little extra that could be added to this functionality in SSRS to make it even better. I’ll show you:

In the chart, click on the colored bit to make the series properties active. In the title bar of the property window, you’ll see the name of the object as well as the type, which should say “Chart Series”. In the list of properties, expand the option CustomAttributes. Change the attribute CollectedStyle to for example SingleSlice. With the attribute CollectedThreshold you define the level of the grouping. For example, when you set this option to 5, every piece of the pie that is 5 percent or lower of the total value, will be collected in the group “Other”.

I have set my properties to the following values:

and here is the result when I run the report with these settings:

That looks a lot better, right?

Here are some variations to the different options in the attributes:
– Attribute CollectedStyle to CollectedPie: This creates another pie chart next to the original pie chart, where the details of the categories that are grouped in “Other” are shown.
– Attribute CollectedThresholdUsePercent: With this attribute, you define if you are going to use the threshold as a percentage of the total value, or an absolute value.

Most of the other options in the custom attributes are layout-related, therefore I won’t go into any of those.

I promised you before that I would share my thought about how this functionality could be improved a little better…well, what if I want to show the details of the top 5 selling products, and the rest of the products categorized in “other”? I can’t, because I have to set a threshold to whether a product is shown separately or if it’s included into “other”. It would be nice to have the option that let’s you choose a number of the top x you want to show and the rest in “other”.