SQL Server Compression 101

The data compression feature in SQL Server can help reduce the size of a database as well as improve the performance of I/O-intensive workloads, especially on data warehouses. The I/O gain is offset against the extra CPU required to compress and decompress the data as it is exchanged with the application. With data warehouses there is typically spare CPU capacity while storage and memory are at a premium, so Microsoft’s recommendation is to compress all objects to the highest level. However, if nothing is compressed yet, it is better to take the cautious approach and evaluate each database object individually, along with the effect on the workload, particularly if CPU headroom is limited. Continue reading “SQL Server Compression 101”
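
If you want to weigh up the trade-off on a single object first, a minimal sketch along these lines can help; the instance, database and object names below are assumptions for illustration:

# Estimate how much space PAGE compression would save on one table
$estimate = Invoke-Sqlcmd -ServerInstance "." -Database "AdventureWorks" -Query "
    EXEC sp_estimate_data_compression_savings
         @schema_name = 'Sales', @object_name = 'SalesOrderDetail',
         @index_id = NULL, @partition_number = NULL, @data_compression = 'PAGE';"
$estimate | Format-Table object_name, index_id,
    'size_with_current_compression_setting(KB)', 'size_with_requested_compression_setting(KB)'

# If the saving justifies the extra CPU, rebuild the table with page compression
Invoke-Sqlcmd -ServerInstance "." -Database "AdventureWorks" -Query "
    ALTER TABLE Sales.SalesOrderDetail REBUILD WITH (DATA_COMPRESSION = PAGE);"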

SSDT and Pre-Model Scripts

Hello!

Earlier this week I posted about the need to create the database as a pre-model task: that is, creating the database before we run sqlpackage/DacFx against a server, so that the deployment script itself does not have to create the database.

Going a step further than this, pre-deploy scripts in SSDT are executed before the main diff script is executed, but not before the diff script is generated. This is an important fact to understand. If you want scripts to be executed before the diff is generated, then you need to execute pre-model scripts.

How you go about this is up to you, but there are a few rules that must be followed:

Idempotent: a big word, borrowed from maths, but don’t let that put you off. What it means in this context is that a script can be re-run and the same result happens. So in the context of altering a primary key, a re-runnable script would look something like this (the table and constraint names are just placeholders):

-- assumes this runs in the context of the target database; against a brand new database the checks simply fail and nothing runs
IF OBJECT_ID('dbo.MyTable', 'U') IS NOT NULL
   AND EXISTS (SELECT 1 FROM sys.key_constraints WHERE [name] = 'oldname' AND [type] = 'PK')
BEGIN
    ALTER TABLE dbo.MyTable DROP CONSTRAINT [oldname];
    ALTER TABLE dbo.MyTable ADD CONSTRAINT [newname] PRIMARY KEY CLUSTERED (MyTableId);
END

and then you add primary key “newname” to the solution. That way the drop-and-add part of the script will only ever be run once, or run again against an older version of the database that still has the old primary key. The “newname” of the primary key guarantees that this will be the case.

Stored in the Solution: within database solutions you can add scripts that are part of the solution but are not executed:

If you copy the file to the output, what you’ll be able to do is add the files required to your nuspec file and get them deployed as part of the NuGet package. Neat!

(screenshot: the pre-model script files listed in the nuspec file)

So if you have the scripts in the solution and they are re-runnable, then you can be confident that they will run against older versions of the database, and you keep a level of audit and history of the database.

Now, how to run the scripts? As with the create database script, SMO is probably the easiest way to do this. You could use SQLCMD, or Invoke-Sqlcmd, but it’s entirely up to you.
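
For the sake of illustration, here is a minimal SMO-based sketch that runs every .sql file in a folder against an instance; the folder path, server name and file ordering are assumptions, and the assembly version will depend on the SMO release you have installed:

# Load SMO (version-dependent; adjust for the SMO version installed on the build/deploy agent)
Add-Type -AssemblyName "Microsoft.SqlServer.Smo, Version=13.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"
$server = New-Object Microsoft.SqlServer.Management.Smo.Server "."
Get-ChildItem -Path "C:\Deploy\PreModelScripts" -Filter *.sql | Sort-Object Name | ForEach-Object {
    Write-Verbose "Executing $($_.Name)" -Verbose
    $sql = Get-Content -Path $_.FullName -Raw
    # ServerConnection.ExecuteNonQuery honours GO batch separators, unlike a plain SqlCommand
    $server.ConnectionContext.ExecuteNonQuery($sql) | Out-Null
}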

Now if you want to use SMO, and you are deploying using Octopus Deploy, then you’re in (some) luck: there are scripts available in the Octopus Library that do something like this, but you will have to alter them to execute a folder full of SQL scripts. Fortunately, I have already completed this, and I will add my script to the GitHub solution so that it is available on the Library. When this is available, assuming it meets the quality control, I’ll blog about how to use it.

edit: it is now available on the Octopus Library. There is also a part two on how to use the step template.

Dude, Where’s My Gold Code?

Let me tell you about a conversation I had yesterday with some developers in a team. They needed help with their database CI/CD process, and so I wanted to see the database solutions they were using.

Dev1: We don’t have solutions at the moment.

Me: OK, how come?

Dev1: Because we are making so many changes right now, there seemed to be little point in creating solutions, so we’re just using scripts.

[At this point, I bite my tongue and choose not to reply with ‘If you’re continuously making changes, wouldn’t it make sense to continuously integrate and continuously deploy the changes? I mean, you’d hardly be taxing the definition of the word “continuously…” now are you…’]

Me: So how are you making changes?

Dev1: Oh well, we’re just using scripts to deploy the changes now.

Me: And where are the scripts?

Dev1: They’re in the wiki.

Dev2: No they’re not, because we don’t put anything in the wiki that changes regularly.

Dev1: Oh. So where are they?

[Dev2 stares blankly].

Dev1: So why do you want solutions?

Me: To help you build a release pipeline.

Dev2: But we’re making lots of changes regularly.

[Again, me, keeping cool in spite of the blindingly obvious contradiction presented to me.]

Me: But let’s get something built to help you make those changes regularly.

Dev1: OK, I’ll get something together by the end of the week. In the meantime we’re too busy making changes to get this together…

Sound familiar? I’m sure many of you have been in a situation like this before. It’s not the first time I’ve had a conversation like this either.

Now I can go on about how the business doesn’t buy into maintenance, or Continuous Integration, or Continuous Delivery. Or I could talk about how CI/CD needs to be pushed by the devs, and that the associated tasks need items in the sprint to make sure that they are completed and audited. And I can talk about how the “build once, deploy many” process reduces the number of bugs and speeds up the deployment process. And sure, wearing my DBA hat I have had to write some wacky scripts because dev teams haven’t been deploying to environments in a controlled manner, but I’d really rather be focusing on infrastructure. All these points are well documented.

But these points are consequences of implementing (or not) Continuous Delivery and Continuous Deployment. The most fundamental point of adopting CI/CD is getting your code into some sort of source control. It all starts with getting the code into source control, writing pre- and post-deployment scripts that are re-runnable, developers checking in changes, building the code, and taking the resulting artefact (in SSDT terms, a dacpac) and deploying it to your database. Then you take the same dacpac and deploy it again and again, all the way up to production. This process can be achieved in an hour. Actually, in 55 minutes! And AdventureWorks can be tough to automate! OK, it may take longer than that, but getting a repository of your code that can be audited, rolled back, tagged, branched, deployed, merged and so on is the first step towards achieving Database DevOps.

SqlServer PowerShell Modules NuGet Package Now Available

Howdy.

Back in the July update of SSMS 2016, a bunch of new SQL PowerShell functions were added, plus two neat additions to Invoke-Sqlcmd: -OutputAs, which allows you to output the result set as a data object (e.g. data rows, data tables or a data set), and -ConnectionString, which allows you to pass in a connection string instead of using the pre-defined parameters. All very useful stuff, go and have a read.
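
As a rough sketch of the -OutputAs switch (the instance name and query here are just for illustration):

# Return the whole result set as a DataTable rather than a stream of data rows
$dt = Invoke-Sqlcmd -ServerInstance "." -Query "SELECT name, database_id FROM sys.databases;" -OutputAs DataTables
$dt.Rows.Count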

However, this update has two issues: firstly, it does not update the classic sqlps module, but rather creates a new module: sqlserver. The new module will be updated regularly and delivered with SSMS updates, while sqlps will only be affected by changes to the Database Engine (Cumulative Updates, Service Packs and so on). So there’s quite a bit of overlap here.

Secondly, I don’t like how this new module is being delivered via a monthly release of SSMS. That’s a 900MB download and a long install to gain access to a module that is only 2MB in size. To make matters worse, the SQL Server 2016 Feature Pack, which includes a SQL PowerShell msi, IS NOT being updated! Waah!

So this may just seem like a whinge, but I have checked, and it is possible to install this module on its own and use it without anything else installed other than a couple of prerequisites. So I’m going to be pro-active: I have taken it upon myself to upload the module to NuGet for people to consume. As far as prerequisites go, you only need SMO and SQLCLR installed, available through the Feature Packs, though they do have to be the SQL Server 2016 versions of these objects.
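
If you want to pull the package down with nuget.exe, something like this should do it (the package id and folder below match the import example further down; treat the version number as an assumption):

nuget install PowerShell.SqlServer.Modules -Version 1.0.1 -OutputDirectory C:\nuget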

As mentioned in one of the links above, you have to Remove-Module sqlps before you import the new module.

I’ll update the NuGet package every time there’s a new release of SSMS.

Oh, and here’s a quick example of how to import the module and run a query:


# Remove the old sqlps module first (see the note above), then import the sqlserver module from the NuGet package folder
Remove-Module sqlps -ErrorAction SilentlyContinue
Import-Module -Name "C:\nuget\PowerShell.SqlServer.Modules.1.0.1\sqlserver" -Verbose
$conn = "Data Source=.;Integrated Security=True"
Invoke-Sqlcmd -ConnectionString $conn -Query "SELECT GETDATE() AS TimeOfQuery;"

Turbo LogShip 1.0 Released

Earlier this year, I wrote a long post about log shipping. One of the key points I mentioned was how slow restoring can be for a read-only log shipped database. If going and reading the whole thing is too much effort for you now, I’ll save you the effort (you child of the internet, you) and tell you it’s because the database needs to be kept transactionally consistent between restores when it is made available for read-only access. It creates a .tuf file (transaction undo file) to keep track of all the pages that have uncommitted transactions saved to them. The checking, creation, applying and re-applying of this file can take some time when there is a significant number of uncommitted transactions within a log backup. Continue reading “Turbo LogShip 1.0 Released”
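
To make the mechanism concrete, this is roughly what a read-only (standby) log restore looks like; the database name, backup path and undo file location are assumptions for illustration:

# Restoring with STANDBY keeps the database readable between restores; the undo (.tuf) file
# is where uncommitted work is parked and then re-applied before the next restore
Invoke-Sqlcmd -ServerInstance "." -Query "
    RESTORE LOG LogShippedDb
    FROM DISK = N'\\backupshare\LogShippedDb\LogShippedDb_20160101.trn'
    WITH STANDBY = N'D:\SQLData\LogShippedDb.tuf';"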

Import-Module SqlPs Takes Forever To Load

Hello!

I’ve been working on some pre-deployment scripts for SQL Server using PowerShell, and I chose to use Invoke-Sqlcmd as opposed to creating a SQL connection*. However, the first time I ran the script on my local box to test, it seemed to hang. I thought it might be that my local instance wasn’t running, but it was running fine. So I created the simplest command possible and tried it:


Invoke-Sqlcmd -Query "SELECT GETDATE() AS TimeOfQuery;" -ServerInstance "."

But still it was hanging. And then warning messages started to appear:

(screenshot: the warning messages output while the module loads)

And at the end of the messages, my query returned. Subsequent re-runs of Invoke-Sqlcmd ran almost instantly. But the errors pertain to Azure, and I’m not using Azure! I do, however, have the Azure modules installed on my machine. A little bit of Googling led me to this Connect issue. Essentially what is happening is that the Import-Module that is run implicitly when using Invoke-Sqlcmd “incorrectly loads files (e.g. DLLs, .ps1xml, etc) in the context of the current provider, as opposed to the context of the “file system” provider (this causes the SQL Provider to try to connect to entities that may look like SQL Server, but they are not – as stated in the description of this issue; there isn’t anything intrinsically wrong that the SQL Provider is doing here.)”
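
Given that explanation, a workaround you sometimes see suggested (whether it helps in your particular setup is only an assumption) is to make sure the import happens while your current location is a plain file-system path:

# Import from a file-system location so the provider context is the file system, not another provider
Push-Location C:\
Import-Module sqlps -DisableNameChecking
Pop-Location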

And now for the good news: it’s fixed. Hurrah! But now for the bad news: you need to install SQL Server 2016 to get the fix. I did install just the June 2016 release of SSMS, as it’s far more realistic for people to upgrade that than the database engine, but unfortunately this did not resolve the issue, even after a restart. It was a long shot, but one worth trying. You can also uninstall the Azure modules; it’s arguably easier to uninstall these and run them on another box than to upgrade to SQL Server 2016, but I’ll admit that’s hardly a great solution.

*I have found that it’s easier to run Invoke-Sqlcmd with verbose output and save any PRINT messages from SQL to a file, which can then be kept as a deployment artifact, than to do it any other way.
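
For example, a quick sketch of capturing those PRINT messages via the verbose stream (the path and query are assumptions):

# PRINT output surfaces on the verbose stream, so redirect stream 4 to a log file
Invoke-Sqlcmd -ServerInstance "." -Query "PRINT 'starting pre-deployment step';" -Verbose 4> C:\temp\pre-deploy.log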

 

An Alternative Method of Creating a ResultSet in SQL

Happy Mardi Gras!

I’ve got a few ideas for posts that take more time than the 5 minutes I can spare each day. However today’s a quick one as I’m snowed under with work and the joys of house hunting. I came across this in an old stored proc:

SELECT TOP 0 * INTO Sales.SalesPersons FROM Sales.SalesPerson;

and after a bit of Googling I found out it’s a way of creating a duplicate of the columns in a table. What it does not do is include the indexes, constraints and keys. So when is it useful? Well, the stored proc it is used in creates a table, switches a partition into that table, and then drops it. This is to speed up deletes (yes, there are checks around dropping multiple runs inside one partition; I am disregarding this for this post). I guess it takes the pragmatic approach that the data would not be in there if there were any constraint violations, that we’re not querying this table so indexes are not required, and that keys are not required for reference or uniqueness.
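
A quick way to see that for yourself (the instance and database are assumptions, and the example tidies up after itself):

# SELECT TOP 0 ... INTO copies just the column definitions: no rows, indexes, constraints or keys
Invoke-Sqlcmd -ServerInstance "." -Database "AdventureWorks" -Query "
    SELECT TOP 0 * INTO Sales.SalesPersons FROM Sales.SalesPerson;
    SELECT COUNT(*) AS IndexCount FROM sys.indexes
    WHERE object_id = OBJECT_ID('Sales.SalesPersons') AND [type] > 0;  -- returns 0: the new table is a bare heap
    DROP TABLE Sales.SalesPersons;"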