Friday, November 23, 2012

Saving PowerPoint slides to PDF with PowerShell

Today, I’m having a bit of a catch-up day. One task I’ve been meaning to do for a while was to convert some PowerPoint slides to PDF for distribution and printing.

The basic task of converting a presentation to PDF is simple in PowerPoint. First you start PowerPoint and open the presentation. Then you save it as PDF using the Save-As dialog (and close PowerPoint).

The old saying "if you have to do something more than once, write a script" kicked in, and so I did (write a script!). In theory, it should be relatively trivial to write a script that does just that. And once you know the complex PowerPoint object model, so it was! The full script is over on my scripts blog at: http://pshscripts.blogspot.co.uk/2012/11/convert-pptxtopdfps1.html.

It turns out that writing the script was a bit of a walk on the dark side, as in working with COM and the Office COM objects. While what I wanted to do was simple, I needed to do it the way that PowerPoint's COM object wanted to, which was a bit different to working with some other API sets.

Working with Office, in particular the Office object model, from PowerShell is a bit painful. The PowerPoint object model is huge. When you instantiate the PowerPoint.Application object, you get a new object that is extremely rich and deep. You get loads of methods and properties. Many of these properties are themselves objects with methods and properties (that can also be objects), etc., etc., etc. Since they are COM, the normal goodness of .NET reflection (i.e. Get-Member) is missing. Using a good search engine and applying a little ingenuity is the key!

In terms of getting this to work, the first issue I hit was the need to add some assemblies into PowerShell, like this:

Add-Type -AssemblyName office -ErrorAction SilentlyContinue
Add-Type -AssemblyName microsoft.office.interop.powerpoint `
          -ErrorAction SilentlyContinue

Next, you open PowerPoint by instantiating the PowerPoint.Application COM object and making it visible, which is relatively easy:

$ppt = new-object -com powerpoint.application
$ppt.visible = [Microsoft.Office.Core.MsoTriState]::msoTrue

An important point is that you can’t use $True here to make PowerPoint visible. The ‘true’ you have to pass to PowerPoint is based on the enum, not on the normal .NET Boolean true. Finding the full names of these enums for PowerShell requires some search engine fu, as the full class name is missing in the MSDN documentation.
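Once the assemblies are loaded, one way to recover an enum's full name without a search engine is to ask .NET which loaded types match the short name. The following is only a sketch: Find-EnumType is a hypothetical helper name, not part of the posted script, and it can only see assemblies already loaded into the session.

```powershell
# Hypothetical helper (not part of the original script): list the full names
# of every enum called $Name in the assemblies loaded into this session.
function Find-EnumType {
    param([Parameter(Mandatory)][string]$Name)

    foreach ($assembly in [AppDomain]::CurrentDomain.GetAssemblies()) {
        # Some assemblies cannot enumerate their types; skip those.
        try { $types = $assembly.GetTypes() } catch { continue }

        foreach ($type in $types) {
            if ($type.IsEnum -and $type.Name -eq $Name) {
                $type.FullName
            }
        }
    }
}

# Example with an enum that every session has loaded
Find-EnumType -Name DayOfWeek
```

After the two Add-Type calls have succeeded, running Find-EnumType -Name MsoTriState or Find-EnumType -Name PpSaveAsFileType should report the namespace-qualified names used in this post.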

Once you have PowerPoint open, you need to open the presentation and just save it as PDF. This is also easy, although you have to use the right enum (the string ‘pdf’ is not adequate):

$pres = $ppt.Presentations.Open($ifile)
$opt= [Microsoft.Office.Interop.PowerPoint.PpSaveAsFileType]::ppSaveAsPDF
$pres.SaveAs($ofile,$opt)

In addition to the code here, you also need to clean up and close PowerPoint. You can see how to do that in the script on the PowerShell scripts blog.
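For readers who want the pieces in one place, here is a rough sketch of how the open, save and clean-up steps might fit together. It is not the script from the scripts blog: the function name Convert-PptxToPdfSketch and its parameter names are made up for illustration, and the try/finally arrangement is just one reasonable way to make sure PowerPoint is closed even when the save fails. It assumes PowerPoint and its interop assemblies are installed.

```powershell
# Hypothetical wrapper for illustration; NOT the author's Convert-PptxToPDF.
function Convert-PptxToPdfSketch {
    param(
        [Parameter(Mandatory)][string]$IFile,   # full path to the presentation
        [Parameter(Mandatory)][string]$OFile    # full path for the PDF output
    )

    # Load the Office assemblies so the enum types below resolve
    Add-Type -AssemblyName office -ErrorAction SilentlyContinue
    Add-Type -AssemblyName microsoft.office.interop.powerpoint -ErrorAction SilentlyContinue

    $ppt  = New-Object -ComObject PowerPoint.Application
    $pres = $null
    try {
        $ppt.Visible = [Microsoft.Office.Core.MsoTriState]::msoTrue
        $pres = $ppt.Presentations.Open($IFile)
        $opt  = [Microsoft.Office.Interop.PowerPoint.PpSaveAsFileType]::ppSaveAsPDF
        $pres.SaveAs($OFile, $opt)
    }
    finally {
        # Clean up: close the presentation, quit PowerPoint, release the COM object
        if ($pres) { $pres.Close() }
        $ppt.Quit()
        [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($ppt)
        [gc]::Collect()
        [gc]::WaitForPendingFinalizers()
    }
}
```

Starting and quitting PowerPoint for every file is slow; for a large batch you would more likely start it once, loop over the files, and quit at the end.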

To test it, I ran this script (which is part of the script posted on the scripts blog!).

$ipath = "E:\SkyDrive\PowerShell V3 Geek Week\"

Foreach ($ifile in $(ls $ipath -Filter "*.pptx")) {
  # Build name of output file
  $pathname = $ifile.DirectoryName     # folder holding the presentation
  $file     = $ifile.BaseName          # file name without its extension
  $ofile    = Join-Path $pathname ($file + ".pdf")

  # Convert _this_ file to PDF
  Convert-PptxToPDF -ifile $ifile.FullName -OFile $ofile
}
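The output name can also be worked out in a single step with .NET's Path.ChangeExtension, which replaces only the final extension and leaves the folder and any other dots in the file name alone. A minimal sketch follows; Get-PdfPath is a hypothetical helper name used only for this illustration.

```powershell
# Hypothetical helper: swap the final extension of a path for .pdf
function Get-PdfPath {
    param([Parameter(Mandatory)][string]$Path)

    [System.IO.Path]::ChangeExtension($Path, '.pdf')
}

# A file name with an extra dot keeps everything before the last extension
Get-PdfPath -Path 'E:\SkyDrive\PowerShell V3 Geek Week\Intro.v2.pptx'
```

This is pure string handling, so it can be tried on any path without PowerPoint installed and without the file existing.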

So all things considered, it was relatively easy to create a script to do this conversion. And as this sort of thing is something I do all too often, this script will save me time in the future!

4 comments:

Power Point Presentations said...

I’ve recently started a blog, the information you provide on this site has helped me tremendously. Thank you for all of your time & work.
Electrical Engineering

Mr. X said...

For some reason I was unable to get the script working for me. I found a similar script and merged it with this one. Posting it here in case anyone else has the same problem.

# Add the PowerPoint assemblies that we'll need
Add-type -AssemblyName office -ErrorAction SilentlyContinue
Add-Type -AssemblyName microsoft.office.interop.powerpoint -ErrorAction SilentlyContinue

# Start PowerPoint
$ppt = new-object -com powerpoint.application
$ppt.visible = [Microsoft.Office.Core.MsoTriState]::msoTrue

# Set the location where to find the PowerPoint files; the PDFs are stored alongside them
$pptPath = "C:\Users\justin\Desktop\test\"

# Loop through each PowerPoint file
Foreach ($iFile in $(ls $pptPath -Filter "*.ppt")) {
  # Clear the read-only flag on the file before opening it
  Set-ItemProperty $iFile.FullName -Name IsReadOnly -Value $false
  $oFile = Join-Path $pptPath ($iFile.BaseName + ".pdf")

  # Open the PowerPoint file
  $pres = $ppt.Presentations.Open($iFile.FullName)

  # Now save it away as PDF
  $opt = [Microsoft.Office.Interop.PowerPoint.PpSaveAsFileType]::ppSaveAsPDF
  $pres.SaveAs($oFile, $opt)

  # and tidy up
  $pres.Close()
}

#Clean Up
$ppt.quit();
$ppt = $null
[gc]::Collect();
[gc]::WaitForPendingFinalizers();

Anonymous said...

Thank you for the PowerShell script. This will help me a lot! But how can I save it so that 6 slides show on 1 page? Just like when you print it, you have the option to print 2 slides/page, 4 slides/page, 6 slides/page and so on. Please help! Thank you.

Alex Marshall said...

If you're interested in extending this script, it all just uses Microsoft's COM technology under the hood. The $ppt object is just a COM object of type 'PowerPoint.Application'. You can find the documentation of all the methods and properties for that object here:

https://msdn.microsoft.com/en-us/vba/powerpoint-vba/articles/application-object-powerpoint

From there you can figure out what methods to call and properties to set to use options.