What's new in Denali CTP3, DMV-wise?

August 7, 2011 22:33 by Matt Whitfield

Ok, so for researching what's new in Denali CTP3, and therefore what schema inspector has to cope with, I have had to look at the DMV differences between CTP3 and 2008. So, here's a summary of the differences. I'm not going to replicate BOL here, just provide a simple lookup for 'what's new?'.

So, DMVs that are in Denali CTP3 but not 2008:

  • [sys].[availability_group_listener_ip_addresses]
  • [sys].[availability_group_listeners]
  • [sys].[availability_groups]
  • [sys].[availability_read_only_routing_lists]
  • [sys].[availability_replicas]
  • [sys].[column_store_dictionaries]
  • [sys].[column_store_index_stats]
  • [sys].[column_store_segments]
  • [sys].[database_filestream_options]
  • [sys].[dm_db_fts_index_physical_stats]
  • [sys].[dm_db_log_space_usage]
  • [sys].[dm_db_uncontained_entities]
  • [sys].[dm_filestream_non_transacted_handles]
  • [sys].[dm_fts_semantic_similarity_population]
  • [sys].[dm_hadr_auto_page_repair]
  • [sys].[dm_hadr_availability_group_states]
  • [sys].[dm_hadr_availability_replica_states]
  • [sys].[dm_hadr_cluster]
  • [sys].[dm_hadr_cluster_members]
  • [sys].[dm_hadr_cluster_networks]
  • [sys].[dm_hadr_database_replica_cluster_states]
  • [sys].[dm_hadr_database_replica_states]
  • [sys].[dm_hadr_instance_node_map]
  • [sys].[dm_hadr_name_id_map]
  • [sys].[dm_logpool_hashentries]
  • [sys].[dm_logpool_stats]
  • [sys].[dm_os_cluster_properties]
  • [sys].[dm_os_memory_broker_clerks]
  • [sys].[dm_os_server_diagnostics_log_configurations]
  • [sys].[dm_os_windows_info]
  • [sys].[dm_server_memory_dumps]
  • [sys].[dm_server_registry]
  • [sys].[dm_server_services]
  • [sys].[dm_tcp_listener_states]
  • [sys].[filetable_system_defined_objects]
  • [sys].[filetables]
  • [sys].[fulltext_semantic_language_statistics_database]
  • [sys].[fulltext_semantic_languages]
  • [sys].[registered_search_properties]
  • [sys].[registered_search_property_lists]
  • [sys].[sequences]

DMVs that are in 2008 but not Denali CTP3:

  • [sys].[database_principal_aliases]

DMVs columns that are in Denali CTP3 but not 2008:

  • [sys].[change_tracking_databases].[max_cleanup_version]
  • [sys].[data_spaces].[is_system]
  • [sys].[database_principals].[authentication_type]
  • [sys].[database_principals].[authentication_type_desc]
  • [sys].[database_principals].[default_language_lcid]
  • [sys].[database_principals].[default_language_name]
  • [sys].[databases].[containment]
  • [sys].[databases].[containment_desc]
  • [sys].[databases].[default_fulltext_language_lcid]
  • [sys].[databases].[default_fulltext_language_name]
  • [sys].[databases].[default_language_lcid]
  • [sys].[databases].[default_language_name]
  • [sys].[databases].[group_database_id]
  • [sys].[databases].[is_nested_triggers_on]
  • [sys].[databases].[is_transform_noise_words_on]
  • [sys].[databases].[replica_id]
  • [sys].[databases].[target_recovery_time_in_seconds]
  • [sys].[databases].[two_digit_year_cutoff]
  • [sys].[dm_clr_appdomains].[compatibility_level]
  • [sys].[dm_clr_appdomains].[survived_memory_kb]
  • [sys].[dm_clr_appdomains].[total_allocated_memory_kb]
  • [sys].[dm_clr_appdomains].[total_processor_time_ms]
  • [sys].[dm_database_encryption_keys].[encryptor_type]
  • [sys].[dm_db_file_space_usage].[allocated_extent_page_count]
  • [sys].[dm_db_file_space_usage].[filegroup_id]
  • [sys].[dm_db_file_space_usage].[total_page_count]
  • [sys].[dm_db_task_space_usage].[is_remote_task]
  • [sys].[dm_db_task_space_usage].[task_address]
  • [sys].[dm_exec_cached_plans].[parent_plan_handle]
  • [sys].[dm_exec_query_stats].[last_rows]
  • [sys].[dm_exec_query_stats].[max_rows]
  • [sys].[dm_exec_query_stats].[min_rows]
  • [sys].[dm_exec_query_stats].[total_rows]
  • [sys].[dm_exec_sessions].[authenticating_database_id]
  • [sys].[dm_exec_sessions].[database_id]
  • [sys].[dm_exec_sessions].[open_transaction_count]
  • [sys].[dm_os_buffer_descriptors].[read_microsec]
  • [sys].[dm_os_cluster_nodes].[is_current_owner]
  • [sys].[dm_os_cluster_nodes].[status]
  • [sys].[dm_os_cluster_nodes].[status_description]
  • [sys].[dm_os_memory_cache_counters].[pages_in_use_kb]
  • [sys].[dm_os_memory_cache_counters].[pages_kb]
  • [sys].[dm_os_memory_cache_entries].[pages_kb]
  • [sys].[dm_os_memory_clerks].[page_size_in_bytes]
  • [sys].[dm_os_memory_clerks].[pages_kb]
  • [sys].[dm_os_memory_nodes].[cpu_affinity_mask]
  • [sys].[dm_os_memory_nodes].[foreign_committed_kb]
  • [sys].[dm_os_memory_nodes].[online_scheduler_mask]
  • [sys].[dm_os_memory_nodes].[pages_kb]
  • [sys].[dm_os_memory_nodes].[processor_group]
  • [sys].[dm_os_memory_objects].[max_pages_in_bytes]
  • [sys].[dm_os_memory_objects].[pages_in_bytes]
  • [sys].[dm_os_nodes].[online_scheduler_mask]
  • [sys].[dm_os_nodes].[processor_group]
  • [sys].[dm_os_sys_info].[affinity_type]
  • [sys].[dm_os_sys_info].[affinity_type_desc]
  • [sys].[dm_os_sys_info].[committed_kb]
  • [sys].[dm_os_sys_info].[committed_target_kb]
  • [sys].[dm_os_sys_info].[physical_memory_kb]
  • [sys].[dm_os_sys_info].[process_kernel_time_ms]
  • [sys].[dm_os_sys_info].[process_user_time_ms]
  • [sys].[dm_os_sys_info].[time_source]
  • [sys].[dm_os_sys_info].[time_source_desc]
  • [sys].[dm_os_sys_info].[virtual_machine_type]
  • [sys].[dm_os_sys_info].[virtual_machine_type_desc]
  • [sys].[dm_os_sys_info].[virtual_memory_kb]
  • [sys].[dm_os_sys_info].[visible_target_kb]
  • [sys].[dm_os_threads].[processor_group]
  • [sys].[dm_os_worker_local_storage].[federatedxact_address]
  • [sys].[dm_os_worker_local_storage].[query_driver_address]
  • [sys].[dm_os_workers].[processor_group]
  • [sys].[dm_tran_session_transactions].[open_transaction_count]
  • [sys].[filegroups].[is_system]
  • [sys].[fulltext_index_columns].[statistical_semantics]
  • [sys].[fulltext_indexes].[property_list_id]
  • [sys].[partition_functions].[is_system]
  • [sys].[partition_schemes].[is_system]
  • [sys].[server_audits].[predicate]
  • [sys].[server_event_session_events].[predicate_xml]
  • [sys].[server_file_audits].[max_files]
  • [sys].[server_file_audits].[predicate]
  • [sys].[server_principals].[is_fixed_role]
  • [sys].[server_principals].[owning_principal_id]
  • [sys].[server_trigger_events].[is_trigger_event]
  • [sys].[service_queues].[is_poison_message_handling_enabled]
  • [sys].[stats].[is_temporary]
  • [sys].[system_internals_partition_columns].[default_value]
  • [sys].[system_internals_partition_columns].[has_default]
  • [sys].[system_internals_partitions].[is_sereplicated]
  • [sys].[tables].[is_filetable]
  • [sys].[trigger_events].[is_trigger_event]

DMVs columns that are in 2008 but not Denali CTP3:

  • [sys].[dm_os_memory_cache_counters].[multi_pages_in_use_kb]
  • [sys].[dm_os_memory_cache_counters].[multi_pages_kb]
  • [sys].[dm_os_memory_cache_counters].[single_pages_in_use_kb]
  • [sys].[dm_os_memory_cache_counters].[single_pages_kb]
  • [sys].[dm_os_memory_cache_entries].[pages_allocated_count]
  • [sys].[dm_os_memory_clerks].[multi_pages_kb]
  • [sys].[dm_os_memory_clerks].[page_size_bytes]
  • [sys].[dm_os_memory_clerks].[single_pages_kb]
  • [sys].[dm_os_memory_nodes].[multi_pages_kb]
  • [sys].[dm_os_memory_nodes].[single_pages_kb]
  • [sys].[dm_os_memory_objects].[max_pages_allocated_count]
  • [sys].[dm_os_memory_objects].[pages_allocated_count]
  • [sys].[dm_os_sys_info].[bpool_commit_target]
  • [sys].[dm_os_sys_info].[bpool_committed]
  • [sys].[dm_os_sys_info].[bpool_visible]
  • [sys].[dm_os_sys_info].[physical_memory_in_bytes]
  • [sys].[dm_os_sys_info].[virtual_memory_in_bytes]
  • [sys].[dm_os_workers].[locale]

So, some neat stuff added. I'm not surprised to see database_principal_aliases disappear, but I was a little surprised to see some of the dm_os* stuff disappear... Interesting, Mr. Bond, very interesting.

Tags:

metadata

Statistics metadata

August 30, 2010 00:14 by Matt Whitfield

So, statistics. The information relating to statistics comes from [sys].[stats] and [sys].[stats_columns], so let's get straight in and have a look at those two tables:

object_idThis is the ID of the object on which the statistics were created.
nameThis is the name of the statistics. Like indexes, you cannot have more than one set of stats on an object with the same name, but sets of stats on different objects can share the same name. Auto-generated stats have names starting with '_WA_Sys'.
stats_idThis is the unique ID of the statistics within the entry.
auto_createdThis is 1 if the statistics were auto-created by the query processor. These will only appear if AUTO_CREATE_STATISTICS has been set on for the database using ALTER DATABASE.
user_createdThis is 1 if the statistics were created manually
no_recomputeThis is 1 if the statistics were created with the NORECOMPUTE option, meaning that statistics are not recomputed automatically. This can be a good idea for static data, but seeing as statistics recompution is triggered by 20% of the rows of the underlying object changing, is mostly without merit.
has_filterLike indexes, from SQL Server 2008 onwards, statistics could be filtered using a WHERE clause style filter. Again, like indexes, this column will be 1 if such a filter exists.
filter_definitionThis column will contain the definition of the statistics filter if one is specified, or NULL if not.

So, that's the basics of statistics covered - so what about the involved columns? For that information, we look to [sys].[stats_columns]:

object_idThis is the ID of the object on which the statistics were created.
stats_idThis is the same as the stats_id column from [sys].[stats].
stats_column_idThis is the ID of the column within in the statistics (i.e. giving a sequential number to the columns involved).
column_idThis is the ID of the column covered by the statistics, and can be matched with [sys].[columns] for any given object_id value.

Job done - we now know everything we need to about stats and their metadata. Notice that there are a lot less columns than for indexes, because there isn't any scope for included columns or partitioning, and the order and ascendency of the columns sampled does not matter.

We'll take a break from the metadata series for a while, and posts will be on a variety of topics relating to SQL Server.

Tags: ,

metadata

Index metadata

August 23, 2010 12:32 by Matt Whitfield

Ok, so it's been a while, but we're finally onto index metadata. We have three types of indexes in SQL Server, Relational, XML, Spatial. Okay, so that's really only three types of index and last time I said there were four, but the metadata for statistics is very similar, and we'll cover that next time. Our main table for indexes is [sys].[indexes] - with further information available in [sys].[xml_indexes] and [sys].[spatial_indexes]. For the columns participating in indexes, we will look at [sys].[index_columns].

So let's look at the most basic data first, from [sys].[indexes]:

object_idThis is the ID of the object that the index is created on.
nameThis is the name of the index. Each object cannot have more than index with the same name, but indexes on different objects can have the same names. If this is NULL, then the index represents a heap, which means the table has no clustered index - this is a universally bad thing.
index_idThis is the ID of the index - and is unique per object. However, the index ID also has special meaning:
  • 0 - Heap
  • 1 - Clustered index
  • 2+ - Nonclustered index
  • 256,000+ - XML index
  • 384,000+ - XML index
It is interesting to note these values - the numeric space given far exceeds the specified maximum numbers of indexes per object, and they are not even convenient bit masks.
typeThis column gives us an explicit type reference for the index:
  • 0 - Heap
  • 1 - Clustered
  • 2 - Nonclustered
  • 3 - XML
  • 4 - Spatial
type_descThis is the description that relates to the value in the type column.
is_uniqueThis will be 1 if the index enforces uniqueness, either because it was created using CREATE UNIQUE INDEX, or because it is enforcing a unique or primary key constraint.
data_space_idThis is the ID of the data space in which the data for the index is stored. Note that because heaps and clustered indexes actually store the data of the underlying object, these indexes tell us where the actual data is stored.
ignore_dup_keyThis is 1 if the IGNORE_DUP_KEY option is ON for the index.
is_primary_keyThis is 1 if the index was created to enforce a PRIMARY KEY constraint. These types of indexes cannot be created directly, they are created by using ALTER TABLE to create the relevant type of constraint.
is_unique_constraintThis is 1 if the index was created to enforce a UNIQUE constraint. These types of indexes cannot be created directly, they are created by using ALTER TABLE to create the relevant type of constraint.
fill_factorIf this is non-zero, then it represents the specified FILLFACTOR when the index was created or updated. Fill factor represents a percentage of how full an index page will be when the index is newly created or rebuilt. If this is zero, then the database default value is used. The database default value is typically 0, which, interestingly, means 100 (i.e. full pages), but can be changed using sp_configure.
is_paddedIf this is 1, then PAD_INDEX was specified for the index, which means that the intermediate or leaf nodes of the index will also be spaced according to the fill_factor value.
is_disabledIf this is 1 then the index is disabled.
is_hypotheticalIf this is 1, then the index is hypothetical, which means it can't actually be used for any useful querying. If you're wondering, hypothetical indexes are created by the database tuning advisor (DTA).
allow_row_locksThis is 1 if the index allows row level locks.
allow_page_locksThis is 1 if the index allows page level locks.
has_filterFrom SQL Server 2008 onwards, indexes could effectively have a WHERE clause specified for them, allowing indexes to be created on partial data. If the index has such a filter, then this column is 1.
filter_definitionIf the index has a filter, this contains the definition of it, otherwise this is NULL.

Ok, cooking with gas. We've got a lot of information above, which enables us to specify everything we need to in order to re-create or understand indexes, apart from the columns that are present. So, let's look at the [sys].[index_columns] table to look at what we get there:

object_idThis is the ID of the object that the index is created on.
index_idThis is the same as the index_id column from [sys].[indexes].
index_column_idThis is the ID of the column within in the index (i.e. giving a sequential number to the columns involved).
column_idThis is the ID of the indexed column, and can be matched with [sys].[columns] for any given object_id value.
key_ordinalThis gives the order in which the index key columns are specified - sorting the rows by this value gives you the key columns in the right order.
partition_ordinalThis gives you order in which the columns are involved in partitioning for the index. Because partition functions currently only accept one parameter, this is either 0 or 1 (i.e. is not or is the partitioning key), but support for multiple parameter partition functions could be included in the future.
is_descending_keyIf this is 1, then the index is arranged with this key descending.
is_included_columnIf this is 1, then the column is an included column, and is present only on the data page level of the index. Applies to non-clustered indexes only

That's given us everything we need to understand and re-create indexes. We can specify the key column list, with appropriate ordering, the inclusion list and even the partitioning key. So what extra information is available for XML indexes? Let's have a look at the extra columns we get in [sys].[xml_indexes] over and above what we get in [sys].[indexes]:

using_xml_index_idFor secondary XML indexes, a primary XML index is needed, and this column tells us what primary XML index the secondary index is referencing. For primary XML indexes, this column will contain NULL.
secondary_typeThis tells us the type of secondary XML index, if it is a secondary XML index:
  • P - PATH secondary XML index
  • V - VALUE secondary XML index
  • R - PROPERTY secondary XML index
If this is a primary XML index, then this column contains NULL.
secondary_type_descThis is the description that relates to the value in the secondary_type column.

So, there's not actually that much extra there - but it does cover what we need to know in order to accurately represent and recreate XML indexes.

What about spatial indexes? They have a lot of extra syntax over and above normal index creation, so we'd expect to find a lot of extra information in [sys].[spatial_indexes] over and above what we get in [sys].[indexes]:

spatial_index_typeThis is the type of spacial index:
  • 1 - Geometric spatial index
  • 2 - Geographic spatial index
spatial_index_type_descThis is the description that relates to the value in the spatial_index_type_desc column.
tessellation_schemeThis is the style of the tessellation scheme, either GEOMETRY_GRID or GEOGRAPHY_GRID. GEOMETRY_GRID is a bounded grid in flat space, and GEOGRAPHY_GRID is an unbounded grid in curved space (accounting for the shape of the Earth).

Hang on a minute. That wasn't the tasty snack we were expecting! Where is the information about bounding boxes for geometry grids, and the grid densities? For some unfathomable reason, this information is stored in [sys].[spatial_index_tessellations] - let's take a look:

object_idThis is the ID of the object that the index is created on.
index_idThis is the same as the index_id column from [sys].[indexes].
tessellation_schemeThis is the style of the tessellation scheme, either GEOMETRY_GRID or GEOGRAPHY_GRID. GEOMETRY_GRID is a bounded grid in flat space, and GEOGRAPHY_GRID is an unbounded grid in curved space (accounting for the shape of the Earth).
bounding_box_xminWhen using a GEOMETRY_GRID, these four values will define the bounding box in which the spatial index is defined. When using a GEOGRAPHY_GRID, the four values will be NULL.
bounding_box_ymin
bounding_box_xmax
bounding_box_ymax
level_1_gridGrid density appropriate grid level, and is one of the following values:
  • 16 = 4 by 4 grid
  • 64 = 8 by 8 grid
  • 256 = 16 by 16 grid
level_2_grid
level_3_grid
level_4_grid
level_1_grid_descDescription of the grid density for the appropriate grid level, either LOW, MEDIUM or HIGH.
level_2_grid_desc
level_3_grid_desc
level_4_grid_desc
cells_per_objectThis is the number of cells per object, as specified in the index DML.

That's better - now we've got the information we need for spatial indexes too. Statistics up next!

Tags: , , ,

metadata

Partition function metadata

June 23, 2010 23:03 by Matt Whitfield

So, we need to know about partition functions, and how they are represented in meta-data. We have three tables to look at, [sys].[partition_functions] which gives us information about the partition functions themselves, [sys].[partition_parameters] which gives us information about the parameter(s) to a partition function, and [sys].[partition_range_values] which tells us the actual boundary values for a partition function.

nameThe name of the partition function, which must be unique.
function_idThe unique ID of the partition function. This is the function_id referred to in [sys].[partition_schemes].
typeThis is the type code of the partition function, which can currently only be 'R' for a RANGE partition. If I was a betting man, I would bet on this being the only available option for a good few years.
type_descThis is the description that relates to the value in the type column.
fanoutThis is the number of partitions that result from the partition function. This one is a future compatibility option, seeing as the partition count is always the count of boundaries + 1 for a RANGE partition.
boundary_value_on_rightThis column is 1 if the boundary values themselves are placed on the right or left partition. For example, consider a range partition with boundaries 10, 20 and 30. So you get four partitions. Does 10 go into the first or the second partition? For a RANGE LEFT partition, it goes into the first, and for a RANGE RIGHT partition, it goes into the second, and in the latter case, this column shows a 1.
create_dateThis is the date on which the partition function was created.
modify_dateThis is the date on which the partition function was last altered.

We get what looks like a good bit of information there, but, in effect, only tells us whether the partition function is a RANGE LEFT or a RANGE RIGHT function.

So, what about [sys].[partition_parameters]? What we get here is unsurprisingly very similar to the column and parameter metadata, so I have copied my descriptions from that post here:

function_idThis is the ID of the partition function that the parameter is for.
parameter_idThis is the ID of the parameter. As for normal parameters, it is 1 based, and you would not expect to see any gaps. However, currently, you definitely won't see any gaps, as you can only have 1 parameter to RANGE partition functions - which are the only supported type.
system_type_idThis column relates the column or parameter to [sys].[types]. However, this column tells you the class of data, rather than the exact type. So, if a user defined data type 'myBigInteger' is based on a bigint, then this is the column that tells you that the data in question is a bigint.
max_lengthThis column, you might think, is the maximum length of the column. It's not. It's the storage space that's used. So for a [text] column, you'll see 16, even though the maximum length of the data that can be stored in the text type is a byte under 2GB. (MAX) and [xml] data types are the exception here - you'll see -1 in the [max_length] column for those. And a word of warning about nchar and nvarchar types - here you will see double the number declared as the data length, because those types take 2 bytes per stored character.
precisionPrecision is useful for numeric types. You will see values for other types, but they don't bear a lot of relevance. For example, do you really need to look at meta-data to know how many digits a 32-bit integer would store? However, for numeric types (i.e. numeric & decimal) the precision column is genuinely useful, as it tells you information that you need to know in order to determine how the data is defined, how it would be scripted and how the system will treat it. This you would not know without looking at the metadata.
scaleScale again is useful for numeric types. Exactly the same applies as for precision, you will see values for other types.
collation_nameThis is the name of the collation that the column is declared with, if the column is a character data type. Note that this is a [sys].[columns] only column because collation does not, and cannot, apply to parameters.
user_type_idThis column again relates the column or parameter to [sys].[types]. This column does tell you the exact type of the data, and allows you to derive the exact type name for the column or parameter.

A note on precision and scale, again copied from my previous post on columns and parameters - precision and scale are possibly the most misleading column names ever. You would think that precision would determine how precise a number was, and that scale would determine how big it could be. This leads to a common misconception that precision is the number of digits after the decimal point, and scale is the number of digits before the decimal point. It is actually the case that precision determines how many digits there are, and scale determines the number of digits to the right of the decimal point. In a way, the names do apply, but really I think total_digits and decimal_places would have been far more sensible in terms of column names.

We now know the partition functions name, range boundary application and parameter type, basically. So the only thing left to explore is boundary values - exposed in the cunningly named [sys].[partition_range_values]:

function_idThis is the ID of the partition function that the boundary value is for.
boundary_idThis is the ID of the boundary, starting at 1, again with no gaps. This order is based on the natural ordering of the data type that the partition scheme parameter operates on. If you create an out-of-order partition scheme, the boundary values are sorted, and you get a warning.
parameter_idThis is the ID of the parameter in [sys].[partition_parameters]. Seeing as the only supported type can only have 1 parameter, this column is effectively useless at the moment. In the future - who knows?
valueThis is the actual value of the boundary. The column is a sql_variant so it's type varies according to the type of the parameter.

Excellent - we now have a complete knowledge of the storage subsystem, and using this data can confidently predict what destination filegroup a particular value in a partitioned table would be sent to, in code.

Next time we'll look at indexes, in their four different flavours.

Tags: ,

metadata

Data space metadata

June 12, 2010 20:31 by Matt Whitfield

Ok, so I lied. I said next time we'd look at key and certificate based catalog views. But, quite frankly, they're intensely boring and my brain started to wander. So, I thought I'd just skip that chapter, and talk about the storage catalog views instead. Much better.

So, we're talking about filegroups, partition functions and partition schemes... The views we're interested in are [sys].[data_spaces], [sys].[filegroups], [sys].[partition_schemes] and [sys].[destination_data_spaces].

Let's look at [sys].[data_spaces] first:

nameThe name of the data space, which must be unique. Rocket science factor = 0.
data_space_idThe unique ID of the data space, used everywhere that metadata has to reference a data space (for example, index metadata, partition scheme destinations etc).
typeThis is the type code of the data space, one of the following:
  • FG - Filegroup
  • PS - Partition scheme
  • FD - FILESTREAM data filegroup
type_descThis is the description that relates to the value in the type column.
is_defaultThis is 1 if the data space is the default, and is therefore used for tables and indices. To set a filegroup as the default, you can use the ALTER DATABASE [name] MODIFY FILEGROUP [name] DEFAULT. Note that although the documentation talks about default data spaces, partition functions cannot be default data spaces, because they require the use of a partition function. In the future, there may be other types of data space which can be default, however.

So we get a basic overview of data spaces with that view. For some more specifics, we need to delve deeper. Looking at [sys].[filegroups] tells us more about filegroups. Like the schema-scoped object catalog views, we inherit columns from [sys].[data_spaces], and the extra columns we get are:

filegroup_guidThe name of the data space, which must be unique. Rocket science factor = 0.
log_filegroup_idA wholly and completely useless column. I think they kept this one in there just to remind us of how bad the pre-2005 metadata tables were.
is_read_onlyThis is 1 if the filegroup is read-only. This is effected using ALTER DATABASE [name] MODIFY FILEGROUP [name] READONLY.

Not really much extra there, but what more do you need to know about filegroups? Oh... you wanted to know where the files actually are? [sys].[database_files] is your friend, but we'll cover that another time. Next on our journey this time is [sys].[partition_schemes]. Again, inheriting from [sys].[data_spaces], the extra columns we get are:

function_idThis is the ID of the partition function that determines on what boundaries data on the partition scheme.

Was that it? How do we tell what filegroups the partition scheme points to? [sys].[destination_data_spaces] holds the answer.

partition_scheme_idThis column is an interesting choice of name, because it's actually a data_space_id. I guess they felt the name reinforces the fact that it relates to partition scheme data spaces only.
destination_idThis is the ordinal of the destination. So, for a partition scheme with 12 partitions, you'll have 12 rows, with destination_id values from 1 to 12.
data_space_idThis column again is interestingly named. It represents the data space to which data for the partition is sent. However, given that it's name is the same as the column in [sys].[data_spaces] that you're trying to relate to, you could be forgiven for thinking that this column stored the partition scheme's data_space_id. I would have preferred a column name of destination_data_space_id which would have clarified the column's meaning, and kept it in the same style as the table name.

Ok, useful, if annoyingly named.

We now have all the information we need about storage - but we have a gap in terms of what the partition functions themselves actually do... Next time!

Tags: , ,

metadata

Database and Server Principal Metadata

June 2, 2010 23:56 by Matt Whitfield

So, let's have a look at principals. There are two levels of principals - those that participate in the database (users and roles) and those that have access to the server (logins). Let's look at the principals at the database level first, by looking at [sys].[database_principals]:

nameThis is the name of the database principal. This is the name that will be referenced for statements such as ALTER AUTHORIZATION.
principal_idThis is the ID of the principal, referenced by principal_id columns.
typeThis is a single character that determines the type of the principal:
  • S = SQL user
  • U = Windows user
  • G = Windows group
  • A = Application role
  • R = Server role
  • C = User mapped to a certificate
  • K = User mapped to an asymmetric key
type_descThis is the description that relates to the value in the type column.
default_schema_nameThis is the name of the default schema for the database principal. Note that a name is used rather than the ID, because otherwise there would be a circular dependency when creating principals and schemas with owners. Which came first, the chicken principal or the egg schema?
create_dateThis is the date on which the principal was created.
modify_dateThis is the date on which the principal was last modified.
owning_principal_idThis is the ID of the database principal that owns the current principal.
sidThis is the SID for SQL users, Windows users and Windows groups.
is_fixed_roleThis is 1 if the entry is one of the fixed database roles:
  • db_owner
  • db_accessadmin
  • db_datareader
  • db_datawriter
  • db_ddladmin
  • db_securityadmin
  • db_backupoperator
  • db_denydatareader
  • db_denydatawriter

We get a good bit of information there, enough to re-create the database user / role if necessary. Let's look at the principals at the server level, this time looking at [sys].[server_principals]:

nameThis is the name of the server principal - this will be referenced when creating a database user.
principal_idThe is the ID of the server principal within the server.
sidThis is the Security ID of the principal, which will be the same as the Windows SID for a Windows login / group.
typeThis is a single character that determines the type of the principal:
  • S = SQL login
  • U = Windows login
  • G = Windows group
  • R = Server role
  • C = Login mapped to a certificate
  • K = Login mapped to an asymmetric key
type_descThis is the description that relates to the value in the type column.
is_disabledThis is 1 if the login is disabled.
create_dateThis is the date on which the principal was created.
modify_dateThis is the date on which the principal was last modified.
default_database_nameThis is the name of the default database for this principal.
default_language_nameThis is the name of the default language for this principal.
credential_idThis is the ID of the credential, in [sys].[credentials] that relates to this principal, if applicable.

Now we have enough information to create both the user and login... almost. Note that passwords and other sensitive information are never available in metadata for security reasons (somewhat obviously). The other pertinent information that we want is the members of each role - for which we look to [sys].[database_role_members]:

role_principal_idThis is the ID of the database principal in [sys].[database_principals] that represents the role.
member_principal_idThis is the ID of the database principal in [sys].[database_principals] that represents the role member.

Excellent, just what we need, nothing more, nothing less.

Next time, we'll take a look at the key and certificate based views.

Tags: ,

metadata

Assembly and Assembly Module Metadata

May 26, 2010 21:31 by Matt Whitfield

So, let's look at assembly, or CLR, objects. There are four principal types of CLR object - Aggregate Functions, Stored Procedures, Table Functions and Scalar Functions. Of course, there are also Assembly Types - but these were covered previously in the post on types, and are not objects, as such.

Assembly modules, as they are called, are all covered by one metadata view - [sys].[assembly_modules]. Let's look at what we get:

object_idThis is the object_id of the object to which the assembly module relates.
assembly_idThis is the ID of the assembly in [sys].[assemblies] that contains the class and, if applicable, method that defines the assembly module.
assembly_classThis is the name of the class that contains the method that defines the module, or the name of the class that implements the module in the case of aggregate functions.
assembly_methodThis is the name of the method within the class that implements the module (NULL for aggregate functions). Note that the method must be marked with an attribute which specifies that the method is a SQL method - these attributes are called SqlFunctionAttribute, SqlProcedureAttribute and SqlTriggerAttribute for functions, procedures and triggers respectively.
null_on_null_inputThis is 1 if the module was declared to produce NULL on any NULL input, by specifying WITH RETURNS NULL ON NULL INPUT. Note that this only applies to functions.
execute_as_principal_idThis is the ID of the database principal specified in the WITH EXECUTE AS clause. If there is no WITH EXECUTE AS clause, then this will be NULL. Explicitly specifying WITH EXECUTE AS CALLER will also result in this column being NULL. Specifying EXECUTE AS OWNER will result in the value being -2. For EXECUTE AS [database_principal_name] or EXECUTE AS SELF, this will be the ID of the relevant database principal in [sys].[database_principals].

So, we get a bit of information there. Certainly enough to re-create the modules, given the source assemblies. However, as virtually all of the 'juicy stuff' is defined within the assembly itself, there's not much to look at.

Let's look at what we're told about the assemblies themselves then, by the metadata view [sys].[assemblies]:

nameThis is the name of the assembly within the database, which is not necessarily the same as the assembly's natural CLR name.
principal_idThis is the ID of the database principal in [sys].[database_principals] that owns the assembly.
assembly_idThis is the ID of the assembly within the database.
clr_nameThis is the fully qualified, version-number-and-all version of the assembly name. This name uniquely identifies not only the assembly, but also the version of the assembly.
permission_setThis is the permission set that the assembly has - 1 for SAFE, 2 for EXTERNAL_ACCESS and 3 for UNSAFE. Permission sets are reasonably complex - and would warrant a post on their own. Basically speaking, however, the SAFE permission set restricts what the assembly can do, as well as limiting it's access to resources within the SQL Server instance. The EXTERNAL_ACCESS permission set still restricts what the assembly can do, but allows it access to resources outside of SQL Server (for example, the event log). Access to external resources is usually performed via the SQL Server service account. The UNSAFE permission set does not restrict what the assembly can do at all, and you have to be very careful when writing UNSAFE. Under this permission set you can do things like start threads, etc.
permission_set_descThis is the description that relates to the value in the permission_set column.
is_visibleThis is 1 if the assembly is 'visible' - i.e. it can be used to create assembly modules. An assembly that is not visible is only used by other assemblies in the database (containing common core functionality, for example).
create_dateThis is the date on which the assembly was created in the database.
modify_dateThis is the date on which the assembly was last modified using ALTER ASSEMBLY.
is_user_definedThis is 1 if the assembly is a user assembly. System assemblies are those that ship with the system (for example Microsoft.SqlServer.Types which defines the geometry, geography and hierarchyid data types).

That tells us quite a lot about the assemblies present in the database, but doesn't actually give us any idea of what the actual assembly DLL would contain, binary-wise. Enter [sys].[assembly_files]:

assembly_idThis is the ID of the assembly in [sys].[assemblies] to which the file belongs.
nameThis is the name of the assembly file. For the main assembly DLL file, this will be the name of the assembly.
file_idThis is the ID of the file within the assembly. ID 1 is always the assembly DLL, i.e. the content of the file as it would be on-disk. Other files can often be present when deploying from Visual Studio, which will include the source files as well as the assembly config files.
contentThis is the varbinary(MAX) content of the file - again, exactly as it would appear on disk.

That's better, that gives us the ability to actually re-create the DLL on-disk. SQL Everywhere, our SQL Server IDE, routinely uses the information contained in [sys].[assembly_files] to re-create the assembly on disk, and inspect it using reflection in order to provide intellisense for assembly classes, methods and type methods.

Next time, we'll be looking at some of the metadata available that gives us information about database principals, server principals and roles.

Tags: , ,

metadata

Procedure and Trigger Metadata

May 19, 2010 20:11 by Matt Whitfield

There are a couple of extra views which give us some more information about SQL Modules. They relate to specific types of SQL Modules, namely procedures and triggers.

Let's look at [sys].[procedures] first, which gives us some more information about procedures, over and above the information in [sys].[objects]:

is_auto_executed This is 1 if the procedure is marked for auto-execution at server startup. Note that only procedures in the master database can receive this flag. To mark a procedure for startup execution, we use sp_procoption. For example, to set a procedure called sp_my_startup_procedure to run at startup, we would use the following:
exec sp_procoption
     N'sp_my_startup_procedure',
     'startup',
     'on'.
is_execution_replicatedThis is 1 if the procedure's execution is replicated. This is achieved by using the proc exec value for the @type parameter of sp_addarticle.
is_repl_serializable_onlyThis is 1 if the procedure's execution is replicated, but only when the transaction is serializable. This is achieved by using the serializable proc exec value for the @type parameter of sp_addarticle.
skips_repl_constraintsThis is 1 if the procedure skips contraints marked NOT FOR REPLICATION, usually true only on subscribers.

So, not really much extra information about procedures. Let's look at [sys].[triggers] which gives us information about triggers, and does not inherit columns from [sys].[objects]:

nameThis is the name of the trigger.
object_idThis is the object ID of the trigger. Both DML and DDL triggers are assigned object IDs, even though DDL triggers are not visible in [sys].[objects].
parent_classThis is the parent class of the trigger - specifying whether the trigger is a database parented trigger (for DDL triggers) or an object parented trigger (for DML triggers). It will be 0 for DDL triggers and 1 for DML triggers.
parent_class_descThis is the description that relates to the value in the parent_class column. The current values are 'DATABASE' or 'OBJECT_OR_COLUMN'.
parent_idThis is the ID of the parent object (table or view) for normal DML triggers. For DDL triggers, it will always be 0.
typeThis is the tye code of the trigger, the same as it would be in [sys].[objects]. The trigger type codes are 'TA' for an assembly trigger, or 'TR' for a SQL trigger.
type_descThis is the description that relates to the value in the type column. Currently the values are either 'CLR_TRIGGER' or 'SQL_TRIGGER'.
create_dateThis is the date on which the trigger was created.
modify_dateThis is the date on which the trigger was last modified by using an ALTER statement.
is_ms_shippedThis is 1 if the trigger was created by an internal process, or if the trigger was created as part of the SQL Server installation.
is_disabledThis is 1 if the trigger has been disabled by using a DISABLE TRIGGER statement. It can be re-enabled by using an ENABLE TRIGGER statement.
is_not_for_replicationThis is 1 if the trigger was created with the NOT FOR REPLICATION option, meaning that it will not fire on subscribers.
is_instead_of_triggerThis is 1 if the trigger is an INSTEAD OF trigger. These can be useful for creating complex views that can be updated, inserted into or deleted from using normal DML.

There seems to be quite a lot of information about triggers, but, in effect we're not told that much. We simply find out if the trigger is marked NOT FOR REPLICATION, whether it is enabled and whether it is an INSTEAD OF or AFTER trigger.

One thing that we still don't know about triggers is what events they fire on. For example - does a particular DML trigger fire on an INSERT, an UPDATE, a DELETE or a combination of the three? For the answer, we have to look to another table, [sys].[trigger_events]:

object_idThis is the object_id of the trigger for which the event is defined. You may find multiple rows for a single object_id for triggers that fire on more than one event.
typeThis is the type number of the event that causes the trigger to fire. Triggers that fire on multiple events have multiple rows, with each row specifying a different type value.
type_descThis is the description that relates to the value in the type column.
event_group_typeThis is the event group identifier that tells us which event group the trigger was created on. If the trigger was not created on an event group, then this is null.
event_group_type_descThis is the description that relates to the value in the event_group_type_desc column.
is_firstThis is 1 if the trigger is marked as the first trigger to fire for the specified event, using the sp_settriggerorder procedure.
is_lastThis is 1 if the trigger is marked as the last trigger to fire for the specified event, using the sp_settriggerorder procedure.

We now have all the information we need about triggers. A note, though, about the is_first and is_last columns - a lot of people mistakenly believe that you can actually specify the order in which multiple triggers fire. You can't. You can just specify that a trigger must execute first for a particular event, or that a trigger must execute last for a particular event, and not the explicit order of any other triggers in between. However, if you run across this problem, then you probably have bigger issues, as having more than 3 triggers on the same event type for the same parent object would not be a brilliant idea, performance-wise.

Next time we'll look at some of the metadata that surrounds the CLR objects, as well as the assemblies that contain them.

Tags: , ,

metadata

SQL Module metadata

May 11, 2010 21:38 by Matt Whitfield

So what are SQL modules? Simply put, SQL modules are objects in the database that are programmed in SQL. This isn't entirely accurate, as there are a few SQL programmable object types that don't fall under the 'sql module' banner. These are:

  • Computed columns
  • Check constraints
  • Default constraints

So, effectively, that leaves us with stored procedures, replication filter procedures, the three types of functions (scalar, in-line table-valued and multi-statement table-valued) as well as both DML and DDL triggers. Also, the legacy 'standalone rule' and 'standalone default' object definitions count as SQL modules. However, the less said about them the better, frankly! The meta-data for these objects is stored in the system view [sys].[sql_modules] - so let's look at what we're offered:

object_idThis is the ID of the object to which this definition applies. Unlike previous versions of SQL Server, there will be only one entry per object ID, since the entire definition can be stored in a varchar(MAX) field.
definitionThis is the actual text that represents the module at the time that it was created. Note that this may not be the same as the text needed to re-create an object, because name changes effected with sp_rename are not reflected in this definition. Client side code, whether in SSMS, our very own sql editor or any other editor is responsible for replacing the old object name with the new. Some (SSMS included) fail to re-create the object correctly under various circumstances (for example, if comments are placed between the schema and object names), so tread with care if using this definition to recreate objects. This column will show as NULL if the was created using WITH ENCRYPTION keyword.
uses_ansi_nullsThis is 1 if SET ANSI_NULLS was ON when the module was created.
uses_quoted_identifierThis is 1 if SET QUOTED_IDENTIFIER was ON when the module was created.
is_schema_boundThis is 1 if the module is schema bound - i.e. was defined with the WITH SCHEMABINDING keyword. Note that this is only relevant to views and functions - procedures and triggers cannot be schema bound.
uses_database_collationThis is 1 if there is a dependency between the collation of the database and the sql module, where the sql module is also schema-bound. This type of dependency means that the database default collation cannot be changed.
is_recompiledThis is 1 if the module was created with the WITH RECOMPILE keyword.
null_on_null_inputThis is 1 if the module will produce a null result on null input. This applies only to functions.
execute_as_principal_idThis is the ID of the database principal specified in the WITH EXECUTE AS clause. If there is no WITH EXECUTE AS clause, then this will be NULL. Explicitly specifying WITH EXECUTE AS CALLER will also result in this column being NULL. Specifying EXECUTE AS OWNER will result in the value being -2. For EXECUTE AS [database_principal_name] or EXECUTE AS SELF, this will be the ID of the relevant database principal in [sys].[database_principals].

So we get nearly enough to create the script for all SQL modules from this one view. The major exception being the correction of object names for sql modules that have had their name changed using sp_rename. Interestingly, if a procedure is altered using ALTER PROCEDURE, then the definition in [sys].[sql_modules] will still contain the CREATE keyword.

About system stored procedures

System stored procedures are not magic, they are just SQL modules. Their defintion can be found in [sys].[all_sql_modules], in exactly the same format as for [sys].[sql_modules]. Ok, so they are slightly magic - they often reference internal objects that you simply don't have access to as a normal SQL user. But, looking at the source of system stored procedures can be a great way to learn more about the internals of SQL server, as well as giving you the basis for creating your own customised versions.

Next time we'll look at some of the extra meta-data that is available around SQL modules, specifically triggers, stored procedures and replication filter procedures.

Tags: , , , ,

metadata

Constraint metadata

May 5, 2010 22:25 by Matt Whitfield

There are five types of constraint in SQL Server, and they fall into three categories - programmable constraints, local key constraints and foreign key constraints.

So let's look at the programmable constraints first. These are default constraints, which provide a simple default value for a column, and check constraints, which validate that the data entered for a column passes business rules.

Default constraints are quite simple, and the meta data is available in [sys].[default_constraints]. The columns that this gives us over [sys].[objects] are as follows:

parent_column_idThis is the ID of the column to which the default constraint belongs. You can look up the column using the [parent_object_id] and [parent_column_id] columns to reference [object_id] and [column_id] in [sys].[columns]
definitionThis is the actual definition of the default. Note that the definition listed here will be bracketed and white space will be removed - therefore it's not uncommon to see a default value of 0 defined as ((0)). This is worth bearing in mind - your original SQL defintion of a constraint will not necessarily be re-creatable from meta-data.
is_system_namedThis is 1 if the constraint was system named - i.e. it was not given a name when it was defined. System named constraints are best avoided, as they lead to difficulty in matching up schemas between databases, and documentation of them becomes an onerous task.

So what about check constraints? There's a little more depth to these - the meta data is available in [sys].[check_constraints] and the columns that this gives us over [sys].[objects] are as follows:

is_disabledThis is 1 if the constraint has been disabled using ALTER TABLE ... NOCHECK CONSTRAINT.
is_not_for_replicationThis is 1 if the constraint was created with the NOT FOR REPLICATION keyword. This effectively means that replicated databases do not enforce the constraint during DML operations.
is_not_trustedThis is 1 if the constraint hasn't been verified for all rows. This will be 1 if the constraint has been disabled for some time, or was created on an existing table with the WITH NOCHECK keyword.
parent_column_idThis is much the same as for [parent_column_id] in [sys].[default_constraints] (see above) with the exception that if the value is 0, then the constraint is a check constraint which checks multiple columns, and is therefore a 'table level' check constraint.
definitionThis is the actual definition of the constraint. Note that the same rules apply as for the default constraint definition, and the format of the expression may well be different to the format with which it was created.
uses_database_collationThis is 1 if the database collation makes a difference when evaluating the check constraint. If it does, then the default collation of the database can't be changed.
is_system_namedAs for [sys].[default_constraints], with the same caveats (see above).

So that is the programmable constraints. Pretty straight forward. So what about the two types of key constraint? Well, these are neatly separated in the meta data into the [sys].[key_constraints] and [sys].[foreign_keys] columns.

Let's look at [sys].[key_constraints] first. The additional columns over [sys].[objects] here are:

unique_index_idThis is the ID of the index that enforces the constraint. You can use [parent_object_id] and [unique_index_id] to reference the [object_id] and [index_id] columns in [sys].[indexes]. Note that this is particularly important, because looking at [sys].[indexes], and, in turn, [sys].[index_columns], tells us what columns are involved in the constraint.
is_system_namedAgain, as for [sys].[default_constraints], with the same caveats (see above).

One thing that is uncovered here, through implication, is that functionally there is very little difference between a UNIQUE INDEX, a PRIMARY KEY and a UNIQUE constraint. The only major differences are:

  • A table can have only one PRIMARY KEY
  • A PRIMARY KEY does not accept NULL values

Apart from that, they are, to all intents and purposes, exactly the same. And ignore people who tell you that you can only create a foreign key to a primary key, because that's simply not true.

So, on to the last one, [sys].[foreign_keys]. The additional columns over [sys].[objects] are:

referenced_object_idThis is the ID of the object that the foreign key targets. Note that the ID of the object that is the source of the link is in the [parent_object_id] column.
key_index_idThis is the ID of the index on the referenced object that is targetted by the foreign key. This can be a UNIQUE INDEX, a PRIMARY KEY constraint or a UNIQUE constraint. You can use [referenced_object_id] and [key_index_id] to reference the [object_id] and [index_id] columns in [sys].[indexes].
is_disabledAs for [sys].[check_constraints], this is 1 if the constraint has been disabled using ALTER TABLE ... NOCHECK CONSTRAINT.
is_not_for_replicationAgain, as for [sys].[check_constraints], this is 1 if the constraint was created with the NOT FOR REPLICATION keyword. This effectively means that replicated databases do not enforce the constraint during DML operations.
is_not_trustedThird time lucky, as for [sys].[check_constraints], this is 1 if the constraint hasn't been verified for all rows. This will be 1 if the constraint has been disabled for some time, or was created on an existing table with the WITH NOCHECK keyword.
delete_referential_actionThis is one of the referential actions taken when a parent row is deleted - the referential actions are defined below.
delete_referential_action_descThis is the description of the delete referential action.
update_referential_actionThis is one of the referential actions taken when a parent row is updated - the referential actions are defined below.
update_referential_action_descThis is the description of the update referential action.
is_system_namedThis column, by now, needs no introduction! See above.

The referential actions are:

Keyword Value Description
NO ACTION 0This means that if the delete or update operation would leave orphaned rows in the parent table, then the DML operation fails.
CASCADE 1This means that the delete or update is 'cascaded' from the referenced table to the parent table. A delete of a referenced row deletes the parent table rows, an update of a referenced row updates the key values in the parent table row to match.
SET NULL 2This means that the delete or update causes rows in the parent table to have their key values set to NULL when rows in the referenced table are updated or deleted.
SET DEFAULT 3This means that the delete or update causes rows in the parent table to have their key values set to their default values when rows in the referenced table are updated or deleted.

So - does that tell us everything we need to know about foreign key constraints? No. We still need to find out which columns are involved in the key. Much the same as we could look this up via [sys].[indexes] and [sys].[index_columns] for the key constraints, we need some extra information matching each parent column to each referenced column. Enter [sys].[foreign_key_columns]:

constraint_object_idThis is the ID of the foreign key object itself.
constraint_column_idThis is a slightly misleading column name - I personally feel [key_ordinal] would have been a better choice here. The value is used to order the column reference pairs, and will always be contiguous, starting at 1.
parent_object_idThis is the ID of the parent table - duplicated from [sys].[foreign_keys].
parent_column_idThis is the ID of the column in the parent table - this can be used along with [parent_object_id] to look up the relevant column in [sys].[columns].
referenced_object_idThis is the ID of the referenced table - duplicated from [sys].[foreign_keys].
referenced_column_idThis is the ID of the column in the referenced table - this can be used along with [referenced_object_id] to look up the relevant column in [sys].[columns].

Now we know everything we need to know in order to be able to understand, work with, and re-create constraints from meta data.

Next time we'll take a look at SQL modules, and understand how they tie in with the meta data views.

SQL Everywhere

Tag cloud

Calendar

<<  May 2017  >>
MoTuWeThFrSaSu
24252627282930
1234567
891011121314
15161718192021
22232425262728
2930311234

View posts in large calendar