Dennis Burton's Develop Using .NET

Change is optional. Survival is not required.

What do you do when the limitations of SQL Management Studio keep you from seeing all of your results?

Our application works with an XML file defined by an external industry specification. We found that a significant portion of this file is not required by our application. So, when loading this file into our system, we remove the portions that we do not care about and stuff the rest into a database. What is left can still be a pretty good-sized XML string; in common usage the reduced file is still around 300-500 KB.

Recently, we needed to investigate the contents of one of these modified files as well as the history of what was loaded. Using the standard developer playbook, the first response is to quickly fire up SQL Management Studio and craft a query to get the ntext column. At that point, the 65K limit on a column in Management Studio yields about a quarter of the file.

It would be pretty easy to pull up Visual Studio and put together a quick utility to extract the XML into a file. As a developer, I would likely keep that project around for an excessively long time. After all, it is now part of my code toolbox. So, rather than squirrel away a project file, source file, and all the related baggage of a VS project, I chose PowerShell as the hammer for this nail. The C# code that would have been written in Visual Studio is, in this case, just one language for getting at functionality in the .NET Framework. Since PowerShell has access to the .NET Framework as well, the code is very similar, simply expressed in a different language. Additionally, the PowerShell script is a single file without the weight of a VS project. The only thing that I keep in my toolbox is the code required to complete the task.

What is required is the pretty common code of opening a connection, executing a query, and using the results. This is how that common C# task is represented in PowerShell. We are pulling back several rows that contain this large XML field and tagging the results by upload date.

param( $theId, $server, $db )

# Open a connection to the target database using integrated security
$connection = New-Object System.Data.SqlClient.SqlConnection("data source=$server;initial catalog=$db;Integrated Security=SSPI")
$command = New-Object System.Data.SqlClient.SqlCommand("select * from MyTable where theItemId=$theId", $connection)

$connection.Open()

# Write each row's large XML column to its own file, named by item id and upload date
$reader = $command.ExecuteReader()
while( $reader.Read() ) {
    $fileName = "Item_{0}_DateLoaded_{1:yyyy_MM_dd_HH_mm_ss}.xml" -f $theId,$reader['dStart']
    Out-File -FilePath $fileName -InputObject $reader['reallyBigXMLText']
}

# Close the reader before closing the connection it depends on
$reader.Close()
$connection.Close()

Save that text into a .ps1 file and that is all that is necessary to extract the large text field. I am not someone who has cut over completely from the classic command line to PowerShell. At the same time, it is important to know when this tool can be helpful.
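For completeness, invoking the saved script is just a matter of supplying the three parameters. The script name, server, database, and item id below are made-up placeholders, not values from our system:

# Hypothetical example - substitute your own file name, server, database, and item id
.\ExtractXml.ps1 -theId 42 -server 'SQLSERVER01' -db 'MyAppDb'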

Tags: community | powershell

At this month's GLUGnet Flint meeting, I will be giving a presentation on PowerShell. What drove me to put this presentation together is all of the recent conversations I have had about PowerShell. It seems to be one of those topics that developers intend to look at but never really get to. The presentation starts from the ground up and works through several real-world examples that have come up in my work environment. I sincerely hope that those who have not used PowerShell yet will walk away from this meeting with another tool in their arsenal.

Hope to see you there.

UPDATE: Adding files and ppt

glugnet.zip (89.02 KB)

PowerShell has been a topic of interest for me for the last couple of weeks. I have been squeezing in some reading on it here and there for a little over a week now. Out of the blue pops up one of those issues at work that does not involve making an elegant object model, but rather digging through log files. How cool is that? For more years than I care to mention, it seems like every time I start looking into a new topic, something comes up where that research saves a ton of time. So there I am whipping up this script, when SharpReader pops up toast that says Using an IDE to write PowerShell Scripts. If you are doing any PowerShell work at all, go ahead, stop reading this post and go get that tool. The post will be here when you get back. The PowerShell Suite is discounted for two weeks in connection with Hanselman's post, and there is a free version for non-commercial use.

The issue was that users in Germany were getting access denied during a particular time frame. Since this time frame overlapped with the nightly import of user information, there was some speculation that there was a collision with the import. We wanted to get a list of users that got access denied and correlate that with the users that were imported. That leaves us with 500MB for 2 hours on each server. 8 web servers and a 4 hour period in question yields 16GB worth of log files to mine. Definitely an automation task.

The log file looks like:

... cut
[8364/18416][Tue Feb 07 04:30:27 2008][..\..\..\CSmHttpPlugin.cpp:402][INFO:1] PLUGIN: ProcessResource - Resolved Url '/security/accessdenied.aspx?ReturnUrl=index.aspx'.
[8364/18416][Tue Feb 07 04:30:27 2008][..\..\..\CSmHttpPlugin.cpp:515][INFO:1] PLUGIN: ProcessResource - Resolved Method 'GET'.
[8364/18416][Tue Feb 07 04:30:27 2008][..\..\..\CSmHttpPlugin.cpp:566][INFO:2] PLUGIN: ProcessResource - Resolved cookie domain '.SITENAMEWASHERE.com'.
[8364/18416][Tue Feb 07 04:30:27 2008][..\..\..\CSmHttpPlugin.cpp:3727][INFO:1] PLUGIN: EstablishSession - Decoded SMSESSION cookie -  - User = 'uid=USERIDHERE,dcxdealercode=DEALERCODEWASHERE,ou=dealerships,ou=dealers,o=DOMAINWASHERE.com', IP address = 'IPADDRESSWASHERE'.
... cut

What we needed was each occurrence of access denied where the source was the home page. A few lines later in the log, the SiteMinder user information was present. The script therefore needs two modes: looking for access denied and looking for the user info. Get-Content provides reader-like functionality, emitting one line of text at a time from the provided file. Piping to ForEach-Object calls the associated script block for each line of text in the file. A bit of regex magic finds the correct string and extracts information from the user information line. The result of the regex is where I ran into the biggest hang up. PowerShell has an automatic variable $matches that is only populated when a match succeeds, not every time a match is attempted, so a stale value from the previous match survives a failed -match. So I had to clear out the $matches variable after processing a match. The only output was the parsed user information, which could be passed off to Brian, who deals with the imports. It is a pretty simple piece of script that looks like:

param ( $infile )

# State flag: after an access denied hit, switch to looking for the SiteMinder user line
$lookingForSiteMinderCookie = $FALSE
$matches = $null

Get-Content $infile | ForEach-Object {
  if( $lookingForSiteMinderCookie )
  {
    # Pull the timestamp, user id, and dealer code out of the SMSESSION cookie line
    if ( $_ -match '^.*\[.*\](\[.*\])\[.*\]\[.*\].*uid=([a-zA-Z0-9]*).*dcxdealercode=([0-9]*).*$' )
    {
      "TimeStamp: {0}`tUser: {1}`tDealer: {2}" -f $matches[1],$matches[2],$matches[3]
      $matches = $null
      $lookingForSiteMinderCookie = $FALSE
    }
  }
  else
  {
    # Watch for an access denied entry where the user was headed to the home page
    if ( $_ -match '^.*accessdenied\.aspx\?ReturnUrl=\%2fhome\%2fmain\.aspx.*$' )
    {
      $matches = $null
      $lookingForSiteMinderCookie = $TRUE
    }
  }
}

 

This was saved as LogParse.ps1.
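As an aside, the $matches behavior that tripped me up is easy to demonstrate in isolation. This little sketch is only an illustration, not part of the log parser:

$null = 'abc123' -match '\d+'          # match succeeds; $matches[0] is now '123'
$null = 'no digits here' -match '\d+'  # match fails; $matches is NOT cleared
$matches[0]                            # still prints 123 - the stale value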

From the PowerShell command line:

  • Use Get-ChildItem with a filter that matches the log file names to get the list of log files.
  • Iterate through each file with ForEach-Object, calling LogParse on each.
  • Pipe to Out-File to save the results.

Get-ChildItem logfilepattern | ForEach-Object { .\LogParse $_ } | Out-File deniedUsers.results

I have found PowerShell to be a very useful tool for quick and dirty tasks, and I can see that with a bit more experience, a command line .NET interpreter could be very handy. Check it out. You can rarely go wrong by having another tool in the toolbox. I do want to point out that this was primarily an exercise in using PowerShell because I wanted to learn it better. The power of PowerShell is not going to come from text processing; there are many tools that have been around for a very long time that do that job very well. Many shells pass plain text when piping between commands, and passing a string object instead is not much better. The real power comes when you start passing full objects, and even more when you use those objects in the pipeline. That is the feature unique to PowerShell.
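To give a flavor of that object pipeline, here is a small sketch; the C:\Logs path and *.log filter are placeholders. Every stage receives real FileInfo objects, so properties like Length and LastWriteTime are available without any text parsing:

# Objects, not text, flow down this pipeline: each stage works with FileInfo properties
Get-ChildItem C:\Logs -Filter *.log |
  Where-Object { $_.Length -gt 1MB } |
  Sort-Object LastWriteTime -Descending |
  Select-Object Name, Length, LastWriteTime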

Dennis Burton
