Extending Application Discovery in vSphere with VMware Tools 11

Application Discovery

I read with great interest the recent article from William Lam regarding application discovery being included in VMware Tools version 11.  I’ve worked for many organisations in the past where having access to this information has been crucial for both strategic planning and operational activities.  Previously, to get that information to support those activities, I’ve used tools such as Invoke-VMScript or agents already installed on servers to run scripts that return the required information.

I wanted to investigate what we could do to gather and present this data in a way that would make it useful for informing operational or strategic decision making.  Luckily for me, I have a colleague, Dean Lewis, author of the blog site vEducate, who I happened to know had been working on a VMC Horizon deployment, so I asked if I could ‘borrow’ his lab.

The Lab Setup

It didn’t really matter to me how the borrowed lab had been configured, just that there were some virtual machines available that had been updated to VMware Tools version 11.  The lab had 22 virtual machines available to me, of which 20 acted as remote desktop sources for Horizon and 2 provided supporting services.

The Goal

Use William Lam’s Get-VMApplicationInfo function to collect the application data from all the servers in the targeted infrastructure and use that collected information to build a simple visualisation of what was installed in the lab.

I planned to accomplish this by following these steps:

  1. Build a simple “foreach” script that uses the Get-VMApplicationInfo function.
  2. Manipulate the gathered *.csv files, injecting the filename to identify the source server.
  3. Merge them into a single *.csv file.
  4. Build a visualisation.

In order to make this work I needed to make a small alteration to the Get-VMApplicationInfo function.  In its current form it writes the collected CSV information to a filename derived from the virtual machine name, a version counter and some additional standard text.

“$($VM.name)-version-$($appUpdateVersion)-apps”

As I intended to feed the filename back into the CSV, I knew that this format might be a little difficult to work with, especially if any of the virtual machines followed a non-standard naming pattern.  Therefore, for testing, I changed the output file name to be just the virtual machine name.  In effect, this meant editing lines 35 and 40 to:

Line 35 - $fileOutputName = "$($VM.name).csv"

Line 40 - $fileOutputName = "$($VM.name).json" 

To see the full function refer to William’s original article.
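Alternatively, if you’d rather leave William’s function untouched, you could rename the output files after collection instead.  A rough sketch, assuming the default “-version-N-apps” naming pattern shown above:

```powershell
# Strip the "-version-N-apps" suffix so each file is named after its VM.
# Assumes the default output pattern from the unmodified function.
Get-ChildItem *-version-*-apps.csv | ForEach-Object {
    $newName = ($_.BaseName -replace '-version-\d+-apps$', '') + $_.Extension
    Rename-Item -Path $_.FullName -NewName $newName
}
```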

Step One – Build a simple “foreach” script

After connecting to the target vCenter and loading the function into my PowerShell session, I could start.  The first step was to target the entire vCenter inventory I was connected to, so that I pulled the application data for each virtual machine in the lab estate.  For simplicity I built a simple “foreach” script that called the Get-VMApplicationInfo function.

 foreach ($vm in Get-VM) {
    Get-VMApplicationInfo -VM $vm -Output CSV
} 

I told you it was simple 🙂
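For anything beyond a lab, you’d probably want to skip powered-off machines and carry on past any failures.  A hedged variant of the same loop, as a sketch:

```powershell
# Variant: only query powered-on VMs, and keep going if one fails
# (e.g. a VM without VMware Tools 11 won't return application data).
foreach ($vm in Get-VM | Where-Object { $_.PowerState -eq 'PoweredOn' }) {
    try {
        Get-VMApplicationInfo -VM $vm -Output CSV
    }
    catch {
        Write-Warning "Could not collect application data from $($vm.Name): $_"
    }
}
```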

If you’ve loaded the function correctly, and assuming you are targeting virtual machines with VMware Tools version 11, then executing this script should greet you with the following:

This indicates that the function was successfully run across the estate and has pulled our data back to our working directory.

Step 2 – Inject the Filename

Since we’d used the virtual machine name as the filename, I could make use of a snippet of PowerShell I’d previously used to inject the file name into the file as a new column.  I’d used this in the past when working with multiple large exported data sets, to help me identify the data source during later manipulation.

 Get-ChildItem *.csv | ForEach-Object {
    $CSV = Import-Csv -Path $_.FullName -Delimiter ","
    $FileName = $_.Name

    $CSV | Select-Object *, @{N = 'Filename'; E = { $FileName }} | Export-Csv $_.FullName -NoTypeInformation -Delimiter ","
}

This PowerShell looks for any and all *.csv files in the working directory and adds the file name as a new column of data under the “Filename” heading.

I’ve now got 22 *.csv files, each with 3 columns of data: “a” for application, “v” for version and “Filename” for the source virtual machine.
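If you want to sanity-check that before moving on, a short snippet will list the column headers each file now carries:

```powershell
# List the column headers present in each *.csv in the working directory
Get-ChildItem *.csv | ForEach-Object {
    $columns = (Import-Csv -Path $_.FullName | Select-Object -First 1).PSObject.Properties.Name
    "{0}: {1}" -f $_.Name, ($columns -join ', ')
}
```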

Step 3 – Merge

Merging the information into a single *.csv not only makes it easier to work with, but it will allow us to start interrogating the information and gaining insight into the applications running in the lab estate.

 $getFirstLine = $true
$timestamp = Get-Date -Format yy-MM-dd-hh-mm

Get-ChildItem "\path\to\source\directory\*.csv" | ForEach-Object {
    $filePath = $_.FullName

    $lines = Get-Content $filePath
    $linesToWrite = switch ($getFirstLine) {
        $true  { $lines }
        $false { $lines | Select-Object -Skip 1 }
    }

    $getFirstLine = $false
    Add-Content "\path\to\export\directory\merge-$timestamp.csv" $linesToWrite
} 

The above will merge any and all *.csv files located in the source directory and save them to a single *.csv file in the export directory.  My original script didn’t include a timestamp in the export file name; Dean suggested it would be a good idea, and it was simple enough to include once we found the correct syntax for the Get-Date cmdlet.

Interestingly including the timestamp opens up a future use case where we could use the gathered information to identify changes over time, although that’s something for another blog post.
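As a small taste of that future use case, two timestamped merge files could be compared with Compare-Object.  The filenames below are purely illustrative:

```powershell
# Hypothetical example: diff two timestamped merges to spot drift over time.
# The filenames here are illustrative, not from a real run.
$old = Import-Csv "merge-20-01-02-09-15.csv"
$new = Import-Csv "merge-20-01-03-02-46.csv"
Compare-Object -ReferenceObject $old -DifferenceObject $new -Property a, v, Filename
```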

Step 4 – Visualisations

In principle you could use Excel and pivot tables to gain some insight into the data that has been gathered.  However, I have access to PowerBI, so I’m going to go a step further and use that.  After launching PowerBI Desktop, I needed to import my merged data so that I could work with it to create visualisations.  PowerBI works with multiple data sources, ranging from direct connections to Salesforce, Azure-hosted databases, SQL Server, Excel and many more.  Luckily for us, it also works with our humble *.csv file.

Once imported, we can use PowerQuery to tidy up the data, provide some friendlier column names and remove some of the superfluous information (such as the trailing “.csv” in the filename column).

let
    Source = Csv.Document(File.Contents("C:\Users\conyardsi\OneDrive - VMware, Inc\merge-20-01-03-02-46.csv"),[Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
    #"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"a", type text}, {"v", type text}, {"Filename", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"a", "App_Exe"}, {"v", "Version"}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Renamed Columns", "Filename", Splitter.SplitTextByEachDelimiter({"."}, QuoteStyle.Csv, true), {"Filename.1", "Filename.2"}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Filename.1", type text}, {"Filename.2", type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type2",{"Filename.2"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Removed Columns",{{"Filename.1", "Hostname"}})
in
    #"Renamed Columns1"

PowerQuery comes with a powerful advanced editor, but this isn’t the only way of accomplishing tasks within the tool.  What I’ve shown above is the query I built to tidy the data from the *.csv, as it is represented in the advanced editor.  For more details on building queries in PowerBI, take a look at the Microsoft documentation.

Once the query has been built out, PowerBI is really very simple to use for building meaningful visualisations.  The interactive nature of a well-crafted visualisation means that selecting various issues and anomalies provides insight there and then.  Rather than building a step-by-step guide on how to create PowerBI visualisations, I’ll point you to the official Microsoft documentation here.

Results and Findings

I’m happy with the results.  From reading the blog post to pulling this together took less than a couple of hours.  To be honest, writing this blog post has taken longer than gathering and presenting the data!

The first run of the data provided output for the 20 rd-farm servers that formed part of the Horizon environment Dean was building; in an ideal world these should each be identical.

The visualisation above consists of 4 very simple interactive elements: a tabular view of the raw data, a slicer set for filtering by application name, a count of applications running on each host and a count of unique application versions.  There are many different ways the data could have been presented; this was just the first pass.

And we already gained some insight…

As we can see from the visualisation, 5 servers are running an additional process: 81 instead of 80.  It was very straightforward to interact with the visualisation and see that there were a few services that differed from server to server, namely GoogleUpdate, VMwareView-RDEServer and WmiApSrv.

By selecting host rd-farm 11, it highlighted the discrepancies in the application version view.

When the data from the final 2 servers is added, the same visualisation provides yet more insight.  Now, we’re not correlating this data with other sources such as OS version, but from here we can see that server dc2 is running a different version of services.exe, and is therefore likely running a different build of the Windows OS.  Easy enough to check, so I asked Dean to have a quick look and sure enough:

The AD server for the environment is running at a different OS patch level than everything else.

Summary

In a little less than a morning, using William Lam’s Get-VMApplicationInfo function in conjunction with a few snippets of PowerShell and PowerBI, it was possible to gain very granular, highly valuable insight that wasn’t immediately available to the infrastructure owner.

This again highlights the value of VMware Tools, the importance of keeping it up to date and the excellent engineering work that goes into expanding its functionality.

Thanks

Simon