Spinach and Database Development- SQLSatExeter Keynote

Last weekend, we held our SQL Saturday event in Exeter. It was a brilliant event for many reasons but we were delighted to have a world exclusive keynote video by Phil Factor about Spinach and Database Development. With many thanks to those that made it possible and particularly to Phil Factor I have linked to the video here and also transcribed it. Please watch and read and understand the message

 

 

What has spinach got to do with Database Development?

 

 

Generations of children were fed spinach in preference to more nutritious things, such as cardboard, because of the persistence of bad data.

 

It wasn’t in fact the decimal point error of legend but confusion over the way that iron was measured in the late 19th century data. As a result nutritionists persisted in believing for generations that it was a rich source of iron that the body needs in order to create bloodcells. In fact, the very little iron that there is in spinach isn’t in a form that can be readily absorbed by the body anyway.

 

The consequences of bad data can be dire

 

Guarding the quality of your data is about the most important thing that you as a data professional can do. You may think that performance is important but it would just deliver you the wrong answer faster. Resilience? it would just make it more likely that you’d be able to deliver the wrong answer. Delivery? Yep you got it, the wrong answer quicker.

 

The spinach example is a good one because bad data is hard to detect and can go unnoticed for generations. This is probably because people don’t inspect and challenge data as much as they should. You would have thought it strange that a vegetable like spinach should have fifty times as much iron as any other vegetable but the fact came from a very reputable source so people just shrugged and accepted it

 

We have a touching faith in data,

 

We, as a culture, assume its correct and complete, we like to believe that it’s impossible that either prejudice, bias, criminality or foolishness could affect the result, worse we think that valuable truth can be sifted from any data no matter the source. If there’s enough of it then there must be value in it. It’s like panning for gold dust from a river. The sad truth is that this is a delusion but very common in our society. We are, in our mass culture, in the bronze age rather than the information age struggling with silvery toys imbued with mystical magical powers

 

A good database professional must be unequivocal.

 

Bad data cannot be cleaned in the same way that one can clean mud of a diamond. If data contains bad data then the entire data set must be rejected

 

There’s no such thing as data cleansing.

 

You as a DBA may be asked to take out data that seems absurd such as ages that are negative or ages that are so great that the person couldn’t possibly be alive but then that leaves you in the same dataset, data that is plausible but wrong.

 

Only in very exceptional circumstances when you know precisely why a minority of your data is wrong would you be justified in correcting it.

 

Statistics can help us to make very confident assertions about large datasets if they conform to one of the common distributions but they cannot tell us anything about individual items of data. You can of course remove outliers but in fact outliers are just items of data that don’t conform to your assumptions about the data and the whole point of data analysis is to test your assumptions. By cleaning data, by removing outliers you can prove almost anything scientifically

 

A well designed database is defended in depth at every possible opportunity.

 

Depth is actually an interesting analogy because experience tells us that bad data seems to leak in under pressure, through every crack when the database is working hard. Like you will see in a World War 2 submarine movie, in a well-used OLTP database, we are like the crew, swivelling our eyes in terror savouring the futility of any remediation as ghastly drips run down the walls of our database and wishing we had put in more constraints.

 

In terms of the defence of data, check constraints and foreign key constraints are excellent of course and triggers are good but there are other ways of getting warnings of errors in data such as sudden changes in the distribution of data and other anomalies. One check I like to do is the tourism check where you check your data all the way through back to source, this technique once famously picked up the fact that a famous motor manufacturer was reporting its deceleration figures in yards per second when it should have been metres per second.

 

When you start putting in check constraints you say to yourself, this couldn’t possibly happen. This is the voice of superstition. A famous programmer of the 1970’s took to putting a message in his code saying “this error could never happen” and he put it in places where it couldn’t possibly ever be executed and the funny thing was the more he tested the programme, the more that error appeared on the screen and it is the same with constraints, the more traps you set the more critters you catch and you’re left wondering how on earth all that bad data was getting in

 

Its misleading to go on about the value of the great flood of big data. There’s a strong superstition that data has some sort of intrinsic mystical value all of its own.

 

Unless you can prove that data is correct its valueless because if you trust it you can end up with generations of children compelled to eat spinach for no good reason at all.

PowerShelling SQL Saturday Sessions to the Guidebook app

Following on from my previous post about parsing XML where I used the information from Steve Jones blog post to get information from the SQL Saturday web site I thought that this information and script may be useful for others performing the same task.

  1. Edit – This post was written prior to the updates to the SQL Saturday website over the weekend. When it can back up the script worked perfectly but the website is unavailable at the moment again so I will check and update as needed once it is back.

    We are looking at using the Guidebook app to provide an app for our attendees with all the session details for SQL Saturday Exeter

    The Guidebook admin website requires the data for the sessions in a certain format. You can choose CSV or XLS.

    In the admin portal you can download the template

    down

    which gives an Excel file like this

-excel

 

So now all we need to do is to fill it with data.

I have an Excel Object Snippet which I use to create new Excel Objects when using Powershell to manipulate Excel. Here it is for you. Once you have run the code you will be able to press CTRL + J and be able to choose the New Excel Object Snippet any time.


$snippet = @{
Title = "New Excel Object";
Description = "Creates a New Excel Object";
Text = @"
# Create a .com object for Excel
`$xl = new-object -comobject excel.application
`$xl.Visible = `$true # Set this to False when you run in production
`$wb = `$xl.Workbooks.Add() # Add a workbook

`$ws = `$wb.Worksheets.Item(1) # Add a worksheet

`$cells=`$ws.Cells
<#
Do Some Stuff

perhaps

`$cells.item(`$row,`$col)="Server"
`$cells.item(`$row,`$col).font.size=16
`$Cells.item(`$row,`$col).Columnwidth = 10
`$col++
#>

`$wb.Saveas("C:\temp\Test`$filename.xlsx")
`$xl.quit()
"@
}
New-IseSnippet @snippet

I needed to change this to open the existing file by using

$wb = $xl.Workbooks.Open($GuideBookPath)

In the more help tab of the Excel workbook it says

2.     Make sure that your dates are in the following format: MM/DD/YYYY (i.e. 4/21/2011).  If the dates are in any other format, such
as “April 21, 2011” or “3-Mar-2012”, Gears will not be able to import the data and you will receive an error message.
3.     Make sure that your times are in the following format: HH:MM AM/PM (i.e. 2:30 PM, or 11:15 AM). If the times are in any other
format, such as “3:00 p.m.” or “3:00:00 PM”, Gears will not be able to import the data and you will receive an error message.

So we need to do some manipulation of the data we gather. As before I selected the information from the XML as follows

$Speaker = @{Name="Speaker"; Expression = {$_.speakers.speaker.name}}
$Room = @{Name="Room"; Expression = {$_.location.name}}
$startTime = @{Name="StartTime"; Expression = {[datetime]($_.StartTime)}}
$Endtime = @{Name ="EndTime"; Expression = {[datetime]($_.EndTime)}}
$Talks = $Sessions.event|Where-Object {$_.title -ne 'Coffee Break' -and $_.title -ne 'Room Change' -and $_.title -ne 'Lunch Break' -and $_.title -ne 'Raffle and Cream Tea'}| select $Speaker,$Room,$Starttime,$Endtime,Title,Description |Sort-Object StartTime

I then looped through the $Talks array and wrote each line to Excel like this


foreach ($Talk in $Talks)
{
$Date = $Talk.StartTime.ToString('MM/dd/yyyy') ## to put the info in the right format
$Start = $talk.StartTime.ToString('hh:mm tt') ## to put the info in the right format
$End = $Talk.Endtime.ToString('hh:mm tt') ## to put the info in the right format
$Title = $Talk.Title
$Description = $Talk.Description
$Room = $Talk.Room
$col = 2
$cells.item($row,$col) = $Title
$col ++
$cells.item($row,$col) = $Date
$col ++
$cells.item($row,$col) = $Start
$col ++
$cells.item($row,$col) = $End
$col ++
$cells.item($row,$col) = $Room
$col ++
$col ++
$cells.item($row,$col) = $Description
$row++
}

I know that I converted the String to DateTime and then back to a String again but that was the easiest (quickest) way to obtain the correct format for the Excel file

Then to finish save the file and quit Excel

$wb.Save()
$xl.quit()

Then you upload the file in the Guidebook admin area
import

wait for the email confirmation and all your sessions are available in the guidebook

sched

I hope that is useful to others. The full script is below

## From http://www.sqlservercentral.com/blogs/steve_jones/2015/01/26/downloading-sql-saturday-data/

$i = 372
$baseURL = “http://www.sqlsaturday.com/eventxml.aspx?sat=”
$DestinationFile = “E:\SQLSatData\SQLSat” + $i + “.xml”
$GuideBookPath = 'C:\temp\Guidebook_Schedule_Template.xls'
$sourceURL = $baseURL + $i

$doc = New-Object System.Xml.XmlDocument
$doc.Load($sourceURL)
$doc.Save($DestinationFile)

$Sessions = $doc.GuidebookXML.events
$Speaker = @{Name="Speaker"; Expression = {$_.speakers.speaker.name}}
$Room = @{Name="Room"; Expression = {$_.location.name}}
$startTime = @{Name="StartTime"; Expression = {[datetime]($_.StartTime)}}
$Endtime = @{Name ="EndTime"; Expression = {[datetime]($_.EndTime)}}

$Talks = $Sessions.event|Where-Object {$_.title -ne 'Coffee Break' -and $_.title -ne 'Room Change' -and $_.title -ne 'Lunch Break' -and $_.title -ne 'Raffle and Cream Tea'}| select $Speaker,$Room,$Starttime,$Endtime,Title,Description |Sort-Object StartTime

# Create a .com object for Excel
$xl = new-object -comobject excel.application
$xl.Visible = $true # Set this to False when you run in production
$wb = $xl.Workbooks.Open($GuideBookPath)
$ws = $wb.Worksheets.item(1)

$cells=$ws.Cells

$cells.item(2,1) = '' # To clear that entry
$cells.item(3,1) = '' # To clear that entry

$col = 2
$row = 2

foreach ($Talk in $Talks)
{
$Date = $Talk.StartTime.ToString('MM/dd/yyyy') ## to put the info in the right format
$Start = $talk.StartTime.ToString('hh:mm tt') ## to put the info in the right format
$End = $Talk.Endtime.ToString('hh:mm tt') ## to put the info in the right format
$Title = $Talk.Title
$Description = $Talk.Description
$Room = $Talk.Room
$col = 2
$cells.item($row,$col) = $Title
$col ++
$cells.item($row,$col) = $Date
$col ++
$cells.item($row,$col) = $Start
$col ++
$cells.item($row,$col) = $End
$col ++
$cells.item($row,$col) = $Room
$col ++
$col ++
$cells.item($row,$col) = $Description
$row++
}

$wb.Save()
$xl.quit()

Parsing XML Child Nodes and Converting to DateTime with PowerShell

As part of my organiser role for SQLSaturday Exeter (Training Day Information here and Saturday Information here) I needed to get some schedule information to input into a database.

I had read Steve Jones blog posts on Downloading SQL Saturday Data and followed the steps there to download the data from the SQL Saturday website for our event.

A typical session is held in the XML like this

<event>
    <importID>27608</importID>
    <speakers>
        <speaker>
            <id>27608</id>
            <name>William Durkin</name>
        </speaker>
    </speakers>
    <track>Track 2</track>
    <location>
        <name>Buccaneer's Refuge </name>
    </location>
    <title>Stories from the Trenches: Upgrading SQL with Minimal Downtime</title>
    <description>SQL Server has come a long way in the last few years, with Microsoft investing heavily in High Availability features. This session will show you how to use these features to enable you to safely upgrade a SQL Server, while ensuring you have a return path if things should go wrong. You will leave the session knowing what features you can use to upgrade either the OS, Hardware or SQL Server version while keeping your maintenance window to a minimum. The session will apply to Standard Edition as well as Enterprise Edition, so doesn't only apply to 'High Rollers'!</description>
    <startTime>4/25/2015 3:20:00 PM</startTime>
    <endTime>4/25/2015 4:10:00 PM</endTime>
</event>

 

I needed to output the following details – Speaker Name , Room , Start time,Duration and Title

To accomplish this I examined the node for Williams session

$i = 372
$baseURL = “http://www.sqlsaturday.com/eventxml.aspx?sat=”
$DestinationFile = “E:\SQLSatData\SQLSat” + $i + “.xml”
$sourceURL = $baseURL + $i

$doc = New-Object System.Xml.XmlDocument
$doc.Load($sourceURL)
$doc.Save($DestinationFile)

$Sessions = $doc.GuidebookXML.events

$Sessions.event[39]

I then established that to get the speakers name I had to obtain the value from the child node which I accomplished as follows

$Speaker = @{Name="Speaker"; Expression = {$_.speakers.speaker.name}}
$Sessions.event[39]|select $Speaker #To check that it worked

This is an easy way to obtain sub(or child) properties within a select in PowerShell and I would recommend that you practice and understand that syntax of @{Name=””; Expression = {} } which will enable you to perform all kinds of manipulation on those objects. You are not just limited to obtaining child properties but can perform calculations as well

I did the same thing to get the room and the start time

$Room = @{Name="Room"; Expression = {$_.location.name}}
$StartTime = @{Name="StartTime"; Expression = {$_.StartTime}}
$Sessions.event[39]|select $Speaker,$Room,$StartTime #To check that it worked

I then needed duration and thought that I could use

$Duration = @{Name ="Duration"; Expression = {($_.EndTime) - ($_.StartTime)}}
$Sessions.event[39]|select $duration

However that just gave me a blank result so to troubleshoot I ran

$Sessions.event[39].endtime - $sessions.event[39].startTime

Which errored with the (obvious when I thought about it) message

Cannot convert value “4/25/2015 4:10:00 PM” to type “System.Int32”. Error: “Input string was not in a correct format.”
At line:1 char:1
+ $Sessions.event[39].endtime – $sessions.event[39].startTime
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidArgument: (:) [], RuntimeException
+ FullyQualifiedErrorId : InvalidCastFromStringToInteger

The value was stored as a string

Running

$Sessions.event[39].endtime |Get-Member

showed me that there was a method called ToDateTime but there is an easier way. By defining the datatype of an object Powershell will convert it for you so the resulting code looks like this

$Sessions = $doc.GuidebookXML.events
$Speaker = @{Name="Speaker"; Expression = {$_.speakers.speaker.name}}
$Room = @{Name="Room"; Expression = {$_.location.name}}
$Duration = @{Name ="Duration"; Expression = {[datetime]($_.EndTime) - [datetime]($_.StartTime)}}
$startTime = @{Name="StartTime"; Expression = {[datetime]($_.StartTime)}}
$Sessions.event|select $Speaker,$Room,$Starttime,$Duration,Title |Format-Table -AutoSize -Wrap

and the resulting entry is finally as I required it. I believe that this will use the regional settings from the installation on the machine that you are using but I have not verified that. If anyone in a different region would like to run this code and check that that is the case I will update the post accordingly

zzCapture

Hopefully you have learnt from this how you can extend select from the pipeline and how defining the datatype can be beneficial. Any questions please comment below

Why You Should Visit the Sponsors at #SQLSatExeter and Other Community Events

 

 

SQL Saturdays and other community events rely on sponsors and you know that you will often get entered into a raffle for a prize in exchange for your contact details and there will be freebies of various types from many vendors but there is more that you can get from visiting the sponsors.

FREE COFFEE

At SQL Saturday Exeter on the 22nd March 2014  we are putting free coffee amongst the sponsors. Yes, it’s a ruse in some ways to put you in the same room as the sponsors whilst your mind is buzzing with all the new SQL learning you have been doing and you are feeling confident and inspired about SQL and what you can achieve.

We need the sponsors to put on the events and the sponsors need us to help put them in contact with purchasers of their wares. It is good for all community events if the sponsors can put SQL Saturday Exeter (or another community event} into their CRM as the point of first contact or the place a decision was made for a purchase as it will mean that when they analyse their data in readiness for next years budget community events will still be important to them and they will spend their money and we will continue to be able to benefit from superb free or very cheap training and learning, networking and down right good fun at next years events

DIRECT CONTACT WITH THE PEOPLE WHO MAKE YOUR TOOLS

For example, I use Red Gate’s SQL Monitor and make use of the graphs to baseline, to see when there are variations to that baseline and to get alerted about long running queries, deadlocks and many other useful DBA information.

Whilst at the Red Gate stand at SQL Saturday in Cambridge I got talking to Daniel Rothig who is one of the developers for SQL Monitor and I was able to ask him about using SQL Monitor.

I wanted to know how best to use the base lining feature and how best to describe some of the detail I was seeing to none-technical people. He and Jonathan Allen was able to give me some examples and knowledge to improve my capabilities in this area. It was fantastic to be able to discuss the product with him and see where they are wanting to take it.

THE SPONSORS WIN TOO

Obviously the sponsors need customers and that is why they put a large amount of marketing activity into SQL Community Events. They want to put their products in front of the people who will be using them and make sales.

But there is a further benefit too Daniel asked me to show him how I used the tool and what I would improve if I could.  I explained that I was having trouble getting the Regex correct for writing exceptions for the alerts for long running queries and I said that I wished there was button I could press to automatically ignore that query that sometimes. He said he would take that back to the team. Excellent, I was able to get a way to improve a good tool to make me work smarter and my experience better

I don’t know if my idea will make it to Production but I hope so. Daniel also said it was useful to see the way users of their software navigated the application and used the features and that that knowledge would help future development

Daniel said

“I’m sure we can make a sale or two on a SQL Saturday – but then, why am I there, and not a sales team? We’ve found it’s more valuable to meet people in the community, learn about their jobs and problems, and search for a gleam in their eyes when we show them our solutions. We take home those first impressions, and the feedback from long-time users, to make our software more focused, relevant, and useful.

And the conversations are always great fun – so come and say hi!”

It’s a win all ways round

WE’LL PUT BISCUITS WITH THE COFFEE TOO AT SQL SAT EXETER!!

Find out more about SQL Saturday Exeter at http://sqlsouthwest.co.uk/ 

SQL Saturday Exeter–What’s the Point? My Experience of 2013 SQLSatExeter

 

Disclaimer – I am on the committee organising the next SQL Saturday Exeter. To be kept up to date about SQL Saturday #269 in the South West, follow @SQLSatExeter and#SQLSatExeter on twitter and see details at the bottom. This post is about my experience at this years event.

In March this year the SQL South West User Group hosted SQL Saturday #194 in Exeter. I was a new member to the User Group having finally been able to join them for the first time in January. At that meeting Chris Testa O’Neill presented a session and was very passionate about the SQL Community and the benefit of the SQL Saturdays and other events.  I am always keen to learn new things and find ways of developing my skills. As I haven’t won the lottery I also look out for good deals as well!!

SQL SATURDAY PRE-CONS ARE EXCEPTIONAL VALUE

It was relatively easy to persuade my bosses to pay for my pre-con. For £150 I was able to spend a whole day in a room with about a dozen people being trained in SQL Server Security by Denny Cherry @mrdenny. The conversation went along the lines of

“I want to go to this training session being delivered by this guy. Link to MVP page. It’s £150 and is in Exeter so no other costs required”

My boss – “OK”

Of course there was a little more fun and games to be had with the payment but it was easy for me to get training sorted and £150 is not going to break the training budget.

Looking back through my notes from the session today I realise quite how much I have taken from it into my role at work. I can’t really comment which and what though that wouldn’t be good security!!

I remember an enjoyable day with plenty of technical learning, a lot of questions and answers and plenty of laughs as well. But more than that was the opportunity to mix with other professionals and talk with them. During the breaks and at lunch there were plenty of opportunities to chew the fat, learn how others do things, make new friends and put faces to twitter handles. (NOTE : I do look pretty much like my twitter profile picture so if you see me at SQL Community events I expect you to come up and say hi, that’s part of the benefit of attending these events, having a good natter)

Take a look at the end of this post for details of 2014 Pre-Cons

SQL SATURDAY – CAN’T GET CHEAPER THAN FREE

SQL Saturdays are FREE

SQL Saturdays offer sessions from internationally renowned and local SQL speakers on subjects relevant to you and your job, your future career, your development plan or just to challenge yourself by learning about something outside of your comfort zone. For Nothing. Add in the networking opportunities, the prizes from the sponsors, (if you were at Exeter this year the beer and the pasty) and if you added it up its a sizeable investment in yourself, your career and your development (did I mention a free beer and pasty?)

NOT BAD FOR FREE!!

To enable that, SQL Saturday organisers have to go out and talk sponsors into putting their hands into their pockets. They will only do that if it is worthwhile to them. You can make it easier for the organisers by going and spending time with the sponsors during the breaks, chatting with them and giving them your details. Also, if you choose to use one of their products please tell the sponsors you spoke to them at a SQL Saturday. They are (usually) data professionals who will record that and use that to make future decisions which will we hope include sponsoring SQL Saturdays.

This year on the Saturday I went to the following sessions

A temporary fix for a short term problem by Ian Meade
Advanced SQL Server 2012 HA and DR Architectures by Christian Bolton
Busting common T-SQL myths by Dave Morrison
Power View and the Cube by Régis Baccaro
Natural Born Killers, performance issues to avoid by Richard Douglas
Tracking server performance without slowing it down by Jonathan Allen which I also Room Monitored
Increasing Business and IT collaboration by Chris Testa-O’Neill

It was a really good day. I learnt so much from all those knowledgeable and talented people. It really kicked me on in my development at work. I was able to take from each of those sessions and use that knowledge to do my job better and I made new friends and new contacts. Just going back to my notes today has reminded me of something that I need to look into for work Smile Some of the conversations I have had at events this year have been fascinating – learning how other people do the same thing you do in a completely different but equally valid way,  problem-solving with a different set and type of minds than the ones at MyWork, laughing at the same things and moaning about similar frustrations. All have been both entertaining and rewarding and I think are worth mentioning as things I enjoyed about going to SQL Community events this year and play a part in the reason I shall continue to go to them (Just hope my boss doesn’t read this and think he won’t have to pay as I will go anyway!)

It’s busy and hectic, the sessions come along thick and fast and there are lots of people around to talk to. I wish I had made use of the SQL Saturday mobile phone app and I definitely recommend researching ahead of time and planning your day out.

This years sessions have not been decided yet but I have seen some of the submissions and there are some fabulous sessions there. You could also submit a session yourself. Choosing the sessions will be tough, but we want to offer the opportunity to speak to as many people as possible both new and experienced speakers.

You can submit your sessions at this link http://www.sqlsaturday.com/269/callforspeakers.aspx

ROUND-UP SQL SATURDAY EXETER WHY WOULDN’T YOU COME

For a newbie, as I was last time, SQL Saturday Exeter was a revelation.

An opportunity to learn without spending thousands of my own or MyWorks money to sit in a lecture room and listen to a trainer.

A chance to develop my understanding in a friendly environment amongst my peers where I could ask questions.

A place to meet new people and build relationships who have helped me with situations at work throughout the year. I reckon I’m in credit already

This year I have attended SQL Bits and SQL Saturday Cambridge and this month I shall be at SQL Relay in Cardiff and in Bristol. That all started with SQL Saturday 194 in Exeter 2013

WHAT ABOUT NEXT YEARS SQL SATURDAY EXETER?

Next years SQL Saturday in Exeter, SQL Saturday #269, will be held at the same place – Jury’s Inn Hotel Exeter on March 21/22nd 2014.

We had such amazing submissions for our pre-cons that we have had to find more rooms to be able to fit them all in.. You can see for yourself the quality of the sessions and speakers for SQL Saturday Exeter 2014 at the following link

http://sqlsouthwest.co.uk/sql-saturday-269-precon-training-day-details/

What do you think? I want to split myself into 8 and go to every one!

WHAT SHOULD YOU DO NOW?

I suggest that you should book Saturday 22nd March 2014 out in your calendar right this minute. Done that? Good.

Now go to this link

http://www.sqlsaturday.com/269/

and register for FREE to attend and let us know @SQLSatExeter

Next make yourself a coffee (Other beverages are available) and head to the pre-con page

http://sqlsouthwest.co.uk/sql-saturday-269-precon-training-day-details/

This bit is up to you, the choice is hard. I can’t tell you which one of our eight fabulous sessions you want to go to. It’s not for me to say which amazing speaker you want to spend a day with for a bargain price but if you need further info please get in touch and we will try and help. Unfortunately our human cloning experiment is not stable enough to allow you to go to more than one!

Then, let me know you have done so and come and say hi when you are here.