PowerShell and ADO.NET

Lots of times when you are working with a SQL Server all you want to do is to poke at the data. You aren’t doing any real heavy duty transactional processing, you just want to get a look at what is going on. Of course you can fire up SSMS (SQL Server Management Studio) or SqlCmd and issue some ad hoc queries. Or you can copy all the tables you are interested in to a local instance of SQL Server on your system and do the poking there. Doing the latter is often times a lot more convienient but you need a copy of SQL Server locally and then you have to copy all the data, yadda, yadda, yadda.

Actually it’s pretty common to take the “work on the data locally” approach in C# applications. These applications use ADO.NET to build a DataSet in the application, then data bind, or add a little procedural programming or otherwise poke at it. But writing, compiling and running an ad hoc application each time you want to make a little change isn’t really practical. What does make this technique practical though is PowerShell.

In this article we’re going to look at the basics of using ADO.NET inside of PowerShell. This article assumes you know the basics of using ADO.NET in C#.

We will start with the DataSet. A DataSet is just a fancy container, just a bit more fancy than a hash table. It has tables and views and relations and queries sort of like a database does. You can use a DataSet without a database, but if you are going to use it with a database then the SqlDataAdapter is a handy way to make a snapshot of some data and save it into a dataset. Let’s start by using PowerShell to get a snapshot of the data in the Customers table of the Northwind database.

First of all we will need a query that returns all the data from the Customers table.

$query = "SELECT * FROM [Customers]"

We will also need a connection string. This one assumes a local SQL Server.

$connString = "server=.;integrated security;database=northwind"

Now we will need a DataSet to copy the customers data into. PowerShell does not initialize a variable just because it is declared so we will have to make an instance of the DataSet class.

$dataset = new-object "System.Data.DataSet" "MyDataSet"
PS C:\demos> $dataset
RemotingFormat          : Xml
SchemaSerializationMode : IncludeSchema
CaseSensitive           : False
DefaultViewManager      : {System.Data.DataViewManagerListItemTypeDescriptor}
EnforceConstraints      : True
DataSetName             : MyDataSet
Namespace               :
Prefix                  :
ExtendedProperties      : {}
HasErrors               : False
IsInitialized           : True
Locale                  : en-US
Site                    :
Relations               : {}
Tables                  : {}
Container               :
DesignMode              : False

We use the new-object command to make an instance of a DataSet. The second string passed to new-object is the name we want for the DataSet. Typing the name of a variable on the command line in PowerShell returns the properties of that object, in most cases. Here we see that the dataset has a DataSetName of “MyDataSet” as we wanted and that the Tables collection is empty.

Next we need a SqlDataAdapter so we can fill our dataset.

PS C:\demos> $da = new-object "System.Data.SqlClient.SqlDataAdapter" ($query, $connString)

Here we have used the new-object command again and passed into the SqlDataAdapter constructor both the query and the connection string. Note that when want to pass more than one parameter to a constructor for a .NET class in PowerShell, you must pass all the parameters in a single array of objects. That is why $query and $connString have been passed separated by commas and enclosed in parentheses. A comma separted list enclosed in parentheses in PowerShell creates an array of objects.

We can check to see that our DataAdapter was properly constructed by looking at the text for the SelectCommand and the ConnectionString for the connection the SelectCommand uses.

PS C:\demos> $da.SelectCommand.CommandText 
SELECT * FROM [Customers]
PS C:\demos> $da.SelectCommand.Connection.ConnectionString
server=.;integrated security=true;database=northwind

Now we have everything we need to get our snapshot.

PS C:\demos> $da.Fill($dataset)
92

All we did was to call the Fill method on $da and pass in our $dataset variable. Once we have done this we can take a look at what is changed about our dataset.

PS C:\demos> $dataset

RemotingFormat          : Xml
SchemaSerializationMode : IncludeSchema
CaseSensitive           : False
DefaultViewManager      : {System.Data.DataViewManagerListItemTypeDescriptor}
EnforceConstraints      : True
DataSetName             : MyDataSet
Namespace               :
Prefix                  :
ExtendedProperties      : {}
HasErrors               : False
IsInitialized           : True
Locale                  : en-US
Site                    :
Relations               : {}
Tables                  : {}
Container               :
DesignMode              : False

Nothing has really changed. Notice that the Table container is still empty. Don’t worry about this, is just an artifact of how PowerShell treats DataTables. If we ask how many DataTables are in $dataset we will see it contains one, as we would expect.

PS C:\demos> $dataset.Tables.Count
1
PS C:\demos>

It would be nice of the Tables container listed the names of the tables in the DataSet, but it doesn’t so we will have to write a little script to see what is there.

PS C:\demos> $dataset.Tables | %{$_.TableName}
Table

The Tables property of the $dataset is a container, so if we pass it into a pipe it sends each of its tables down the pipe one at a time. The script inside of %{} is executed once for each item that passes through the pipe and $_ is the item being passed. $_.TableName gets the name of each table that passes through the pipe. Of course there is only one in this case.

As expected, the first table that a DataAdapter loads into a DataSet is named “Table”. The Tables collection acts like an array so we can access the individual tables in the Tables collection and set their names to whatever we want.

PS C:\demos> $dataset.Tables[0].TableName = "Customers"
PS C:\demos> $dataset.Tables | %{$_.TableName}
Customers

If we want to make it a bit easier to work with the Customers table we can pull out a reference to if from the DataSet.

PS C:\demos> $customers = $dataset.Tables["Customers"]
PS C:\demos> $customers.Rows.Count
92

Note that just as we would do in a C# program we can index items in the Tables collection by their name. Once we have the $customers reference to the Customers table we can begin to work on it. We tried about the most simple thing you can do with a DataTable, we looked at how many rows there are in it.

We can look to see what kinds of columns are in the table.

PS C:\demos> $customers.Columns | Format-Table -autosize -property ColumnName, DataType
ColumnName   DataType
----------   --------
CustomerID   System.String
CompanyName  System.String
ContactName  System.String
ContactTitle System.String
Address      System.String
City         System.String
Region       System.String
PostalCode   System.String
Country      System.String
Phone        System.String
Fax          System.String

We pipe the Columns from the $customers variable into the pipe. Here we use the format-table command from PowerShell and specify that we only want to see the ColumnName and DataType properties of each column. The -autosize tells PowerShell squeeze the columns as tightly together as possible.

Lets try using PowerShell to process the data. Will start by doing something a little silly, we want the ContactNames from all of the customers whose CompanyName has at least 30 characters in it.

PS C:\demos> $customers | ?{$_["CompanyName"].Length -ge 30} | %{$_["ContactName"]}
Ana Trujillo
Diego Roel
Lino Rodriguez
Helvetius Nagy

When you pass a DataTable into a pipe it passes each of the DataTable’s rows in one at a time. The script inside of ?{} is a test, if the calculation done inside of the ?{} is true, then the object that was passed into this segment of the pipe is passed to the next. So ?{} acts as a filter and it is testing each row which is represented by $_. The last segment of the pipe just returns the ContactName from the rows that made it through the filter.

Notice that we accessed the columns of the row by using an indexer and the name of the column we wanted. That is way you access individual columns of a row when you write a C# program. However PowerShell gives us an alternate way to access a column, it add properties to the row whose names are the names of the columns. The value of each one of these column properties is the value of the column it represents. The following script does the same thing as the previous but uses the column properties.

PS C:\demos> $customers | ?{$_.CompanyName.Length -ge 30} | %{$_.ContactName}
Ana Trujillo
Diego Roel
Lino Rodriguez
Helvetius Nagy

You might find this more convient. There are a couple of gotcha’s here though. One thing to keep in mind is that the scripting language in PowerShell is very simlar to C# on purpose. It’s meant to make it straightforward to move back and forth from using PowerShell to C# without constantly have to switch your mental typing context. Another reason for this similarity is that it makes it a reasonably straightforward mechanical transformation to move code from C# to PowerShell and vica versa. So if you use this shortcut you may end up unconciously typing errors in you C# code when you are working with ADO.NET… pays your money and takes your choice.

One more gotcha with this feature is that sometimes column names in a database table have spaces in them and these have to be handled a bit differently when you use this PowerShell property short cut. Here is a simple table definition for a table in the Scratch database.

Create table TestPSData
(
[my id] INT
)

Notice that is has a single column named “my id”, which has a space in it. We can quickly fix up our data adapter to access this table and add it to our dataset.

PS C:\demos> $da.SelectCommand.CommandText="SELECT * FROM Scratch..TestPSData"
PS C:\demos> $da.Fill($dataset)
2

We can print out the ID’s in the TestPSData table using the ADO.NET indexer method.

PS C:\demos> $dataset.Tables[1] | %{$_["my id"]}
1
1

As you can see it printed out the “my id” column value for the two rows in the table.

If we try to use the PowerShell column property method we may get an error…

PS C:\demos> $dataset.Tables[1] | %{$_.my id}
Unexpected token 'id' in expression or statement.
At line:1 char:32
+ $dataset.Tables[1] | %{$_.my id} <<<<

because the space in the name makes PowerShell read an extra value that it doesn’t know how to handle. But PowerShell is a scriping language so this is easy to fix.

PS C:\demos> $dataset.Tables[1] | %{$_."my id"}
1
1

Just put some quotes around the property name so PowerShell treats it as a single word.

Columns names in SQL Server my also begin with a “$” character and this requires a bit more effort to handle, even when you are using the ADO indexer to get the column. Here is another table definition

Create table TestPSData2
(
[$myid] INT
)

Assuming we have filled out dataset up with this table instead of the previous TestPSData table then an attempt to access the $myid column can result in an error in a number of ways.

PS C:\demos> $ds2.Tables[0] | %{$_."$myid"}
PS C:\demos>

Notice that this did not produce any results but it should have produced two rows. That is becuase $myid is interpreted by PowerShell as a variable which has no value, so it is the same as using [“”]

If we try to use indexer then we get a hard error.

PS C:\demos> $ds2.Tables[0] | %{$_["$myid"]}
Unable to index into an object of type System.Data.DataRow.
At line:1 char:23
+ $ds2.Tables[0] | %{$_[" <<<< $myid"]}
Unable to index into an object of type System.Data.DataRow.
At line:1 char:23
+ $ds2.Tables[0] | %{$_[" <<<< $myid"]}
PS C:\demos>

That is because there is no column named “”.

The way to handle this problem in an both cases is to escape the $ with a backtick and enclose the whole thing in quotes.

PS C:\demos> $ds2.Tables[0] | %{$_["`$myid"]}
1
2
PS C:\demos> $ds2.Tables[0] | %{$_."`$myid"}
1
2
PS C:\demos>

OK, enough property name trivia for now.

DataTable supports a SQL-like syntax for queries. We can use that feature from PowerShell to redo are “long company name” query from before.

PS C:\demos> $customers.Select("len(CompanyName) > 30") | %{$_["ContactName"]}
Ana Trujillo
Diego Roel
Helvetius Nagy

The Select method of a Table returns a rowset. When you put a rowset into a pipe, each row is passed down the pipe as it is for a Table. In this example we, as we did before, pull the ContactNames from the rows that made it through the ADO Select filter.

We can use a rowset in PowerShell the same way we would in any other language that supports ADO.NET. For example we can use it to fill a DataTable.

Let’s start by building a DataTable from scratch.

PS C:\demos> $datatable = new-object "System.Data.DataTable"
PS C:\demos> $datatable
PS C:\demos>

It looks like we didn’t build a DataTable, but it just that PowerShell is giving us the rows of the table instead its properties and there are not yet any rows in this datatable. Let’s proceed as though there is a table and make a column for it. In fact lets make this datatable the same a the TestPSData table we used previously.

PS C:\demos> $column = new-object "System.Data.DataColumn" ("my id", [Int32])
PS C:\demos> $datatable.Columns.Add($column)
PS C:\demos> $datatable.Columns | format-table -autosize -property ColumnName, DataType
ColumnName DataType
---------- --------
my id      System.Int32

We started by making a DataColumn whose name is “my id” with datatype Int32. When you need to specify a type in PowerShell [Int32] is the equivalent of typeof(Int32) in C#.

We added the column to the $datatable columns collection then, as we did previously for the Customers table, check to see what columns the table has. Next let’s fill this datatable with the results captured in a DataReader.

The typical way we get a DataReader is to use ExecuteReader method on a SqlCommand. So lets build and execute a SqlCommand. First we will need a connection. We can use the one that is already in our DataAdapter.

PS C:\demos> $conn = $da.SelectCommand.Connection
PS C:\demos> $conn

StatisticsEnabled                : False
ConnectionString                 : server=.;integrated security=true;database=northwind
ConnectionTimeout                : 15
Database                         : northwind
DataSource                       : .
PacketSize                       : 8000
ServerVersion                    :
WorkstationId                    : PARSEC5
FireInfoMessageEventOnUserErrors : False
State                            : Closed
Site                             :
Container                        :

All we did here was to pull a reference to the SqlConnection that our DataAdapter was using. Note that the connection is closed.

Next we need a command to execute. That easy once we have a connection.

PS C:\demos> $cmd = New-Object "System.Data.SqlClient.SqlCommand" ("SELECT * FROM Scratch..TestPSData", $conn)
PS C:\demos> $cmd

Connection             : System.Data.SqlClient.SqlConnection
NotificationAutoEnlist : True
Notification           :
Transaction            :
CommandText            : SELECT * FROM Scratch..TestPSData
CommandTimeout         : 30
CommandType            : Text
DesignTimeVisible      : True
Parameters             : {}
UpdatedRowSource       : Both
Site                   :
Container              :

Now we need to execute the command and capture the results. Of course we want to be sure that the $conn is open before we use it.

PS C:\demos> $conn.Open()
PS C:\demos> $rdr = $cmd.ExecuteReader()
PS C:\demos>

What can we do with a reader? If we were writing a C# program we would probably put $rdr in a loop and call the Read() method on it until we got to the last row. We could do that in PowerShell too if we wanted to, but there is an alternate way which is more “pipeline” oriented.

PS C:\demos> $rdr | &{
>> begin{$values = new-object "System.Object[]" $rdr.FieldCount}
>> process {$_.GetValues($values); $datatable.Rows.Add($values)}
>> }
>>
1
                                                                          my id
                                                                          -----
                                                                              1
1
                                                                              1
PS C:\demos> $datatable | format-table -autosize
my id
-----
    1
    1

We start by putting the DataReader we got in previous step into the pipeline. This causes a DbDataRecord to be sent down the pipeline once for each row in the result. We process the DbDataRecords in an inline function, that is the script inside of &{}. The begin part of the function runs just once before anything is passed into the segement of the pipeline it occupies. It uses the FieldCount of the DataReader to initialize $values variable to an array that can hold all of the values of the columns in a row.

Once the begin part of the inline function has completed the process part is executed once for each DbDataRecord. It uses the GetValues method of the DbDataRecord to extract all of the values of the row and put them into the $values variable. Then it uses the Add method of the Rows property of the $datatable to add a row to the table.

Lastly we send the $datatable variable to format-table and see the the two rows from the TestPSData table in the database have been added to it.

Note that there is one artifact between the filling of the $datatable and the final command… we see the content of the table and some other stuff. This is due to the fact that the GetValues and Rows.Add methods both return values and we did not capture so they “leaked” out of the script. We can fix that just by adding a dummy variable to capture that output.

PS C:\demos> $rdr | &{
>> begin{$values = new-object "System.Object[]" $rdr.FieldCount}
>> process {$d=$_.GetValues($values); $d=$datatable.Rows.Add($values)}
>> }
>>
PS C:\demos>

So we have seen that we can use ADO.NET inside of PowerShell. It a quick way to work out some ad hoc queries and do some processing on the results. You can easily do ad hocSQL queries form SQL Server management Studio or SqlCmd, but with PowerShell you can mix the results of those queries with either pipeline or conventional procedural programming.

Dan dan@pluralsight.com

PS Thanks to Jeffery Stover for pointing out the column properties on DataTables.


Posted Oct 29 2006, 02:48 PM by dan-sullivan
Filed under: ,

Comments

Ross Neilson wrote re: PowerShell and ADO.NET
on 04-05-2007 7:09 AM
Thanks for a useful tutorial Dan- I enjoyed following it and it is an excellent intro to PowerShell.
Gregory Barton wrote re: PowerShell and ADO.NET
on 05-29-2007 1:41 PM
Thanks for this info. Although Powershell is built on .NET, it seems all of Microsoft's examples tend to still use COM objects. I figured they would try to move scripters away from that, but the evidence is not there.
Cash Foley wrote re: PowerShell and ADO.NET
on 06-30-2007 8:17 AM
Gregory (from previous comment). This is not COM objects. This is using .NET ADO.NET

As to the broad objection that MS is not "trying to move scripters away from [COM]", they are making it more unnecessary to use COM. Powershell enables access to many "Object" technologies on the platform including COM and WMI. Access to .Net classes opens huge alternatives.

I'm glad MS didn't take an approach of steering me away from COM by not supporting it in Powershell. Powershell is all about "getting things done" and its seamless bridging of these technologies is amazing. If you don't like COM, you dont' have to use it.
Dmitri Zimin(e) wrote re: PowerShell and ADO.NET
on 07-31-2007 2:46 PM
Dan
thank you for a good walk-through, it is great to show the powererful way of using .NET directly, and very practical - I was looking for exactly this - the way to use ADO.NET rather then ADO for hooking into DB.

Bug report:
$connString = "server=.;integrated security;database=northwind"
but should be:
$connString = "server=.;integrated security=true;database=northwind"
bk wrote re: PowerShell and ADO.NET
on 03-18-2008 3:07 PM
Awesome tutorial!!
Nick wrote re: PowerShell and ADO.NET
on 08-01-2008 12:10 PM

Hey,

Thanks for this blog entry, I figured out how to add a column to an existing dataset table. Now how would I populate the values in the column / rows (I need to combine a 1st name field and a last name field into a "full_name" field).

Mosaic wrote re: PowerShell and ADO.NET
on 09-19-2008 3:31 PM

Great - easy to find and just what I was looking for!

Showing how to use both the DataTable and the DataReader allows me to handle both small and very large sets of data.

Many thanks

Add a Comment

(required)  
(optional)
(required)  
Remember Me?