SSDT Data Source Views and MDS Metadata
Summary: When input to an SSDT DSV involves complex metadata such as nested views and/or MDS subscription views, performance of the DSV ‘refresh’ function may degrade to the point where it is unusable. Simplifying the metadata represented by the input solves the problem.
SQL Server Data Tools (SSDT) data source views (DSVs) are used to present the physical input structure of a data mart to an SSAS multidimensional (‘traditional’) model cube project. If you supply some of the data from Master Data Services (MDS) subscription views, you might reasonably want to add them directly to the DSV. You can link MDS subscription views from a given MDS model and version to each other using the ID columns, and you can link them to other non-MDS entities using the Code or other columns.
Since MDS has a limited set of data types, another approach can be to link indirectly to the subscription views via your own views in another database that translate data types and codings, and provide access control security.
If, however, the resulting nested views are complex, you can run into a limitation of SSDT (or even BIDS) DSVs. The way this has emerged for me is that SSDT will choke when I click the DSV ‘refresh’ button. The volume of metadata from the nested views — and especially the MDS subscription views — is too much for the tool.
This overload can result in timeouts during processing, which SSDT handles very in-elegantly by offering to delete the affected object from the DSV. If you don’t catch that, and don’t have the DSV saved somewhere, you may be in for some cleanup work.
It appears that the reason for the timeouts is that the memory requirements of SSDT can expand enormously when presented with complex metadata, causing swapping and then thrashing as the amount of active memory used by the tool expands beyond what is physically available. Even if timeouts do not occur, the time required to do a refresh can become completely unreasonable, and you may not be able to do anything else with the machine while the refresh is running, or with the VM host, if running in a VM.
The solution I have come up with is to simplify the metadata seen by SSDT. For now, I am replacing source views (especially MDS subscription views) with ‘cache’ tables that I create from a script that I run when I publish a new version of MDS production data.
The idea is simple. For each view that you wish to replace with a table, create a script that executes a SELECT … INTO from the existing view into a new table, creates indexes as required, and grants access. If you are clever about naming the source view and the new table, you may not even have to modify the DSV itself. Then when you commit a new version of an MDS model, run the script to refresh the information cached in the new tables.
While the biggest performance improvement is seen when performing DSV refreshes, significant improvements in cube processing time could sometimes be seen as well, especially when MDS subscription views are involved and columns other than ID are being used as indexes on those views.
Comments
SSDT Data Source Views and MDS Metadata — No Comments
HTML tags allowed in your comment: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>