To build the application
- Start Microsoft Visual Studio 2010.
- On the File menu, point to New, and then click Project.
- In the New Project dialog box, in the Recent Template pane, expand Visual C#, and then click Windows.
- To the right side of the Recent Template pane, click Console Application.
- By default, Visual Studio creates a project that targets .NET Framework 4. However, you must target .NET Framework 3.5. From the list at the top of the New Project dialog box, select .NET Framework 3.5.
- In the Name box, type the name that you want to use for your project, such as FirstWordAutomationServicesApplication.
- In the Location box, type the location where you want to place the project.
Figure 1. Creating a solution in the New Project dialog box
- Click OK to create the solution.
- By default, Visual Studio 2010 creates projects that target x86 CPUs, but to build SharePoint Server applications, you must target Any CPU.
- If you are building a Microsoft Visual C# application, in the Solution Explorer window, right-click the project, and then click Properties.
- In the project properties window, click Build.
- Point to the Platform Target list, and select Any CPU.
Figure 2. Target Any CPU when building a C# console application
- If you are building a Microsoft Visual Basic .NET Framework application, in the project properties window, click Compile.
Figure 3. Compile options for a Visual Basic application
- Click Advanced Compile Options.
Figure 4. Advanced Compiler Settings dialog box
- Point to the Platform Target list, and then click Any CPU.
- To add a reference to the Microsoft.Office.Word.Server assembly, on the Project menu, click Add Reference to open the Add Reference dialog box.
- Select the .NET tab, and add the component named Microsoft Office 2010 component.
Figure 5. Adding a reference to Microsoft Office 2010 component
- Next, add a reference to the Microsoft.SharePoint assembly.
Figure 6. Adding a reference to Microsoft SharePoint
The following example provides the complete C# listing for the simplest Word Automation Services application.
C#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.SharePoint;
using Microsoft.Office.Word.Server.Conversions;
class Program
{
    static void Main(string[] args)
    {
        string siteUrl = "http://localhost";
        // If you manually installed Word Automation Services, then replace the name
        // in the following line with the name that you assigned to the service when
        // you installed it.
        string wordAutomationServiceName = "Word Automation Services";
        using (SPSite spSite = new SPSite(siteUrl))
        {
            ConversionJob job = new ConversionJob(wordAutomationServiceName);
            job.UserToken = spSite.UserToken;
            job.Settings.UpdateFields = true;
            job.Settings.OutputFormat = SaveFormat.PDF;
            job.AddFile(siteUrl + "/Shared%20Documents/Test.docx",
                siteUrl + "/Shared%20Documents/Test.pdf");
            job.Start();
        }
    }
}
To build and run the example
- Add a Word document named Test.docx to the Shared Documents folder in the SharePoint site.
- Build and run the example.
- After waiting one minute for the conversion process to run, navigate to the Shared Documents folder in the SharePoint site, and refresh the page. The document library now contains a new PDF document, Test.pdf.
Monitoring Conversion Status
In many scenarios, you want to monitor the status of conversions, to inform the user when the conversion process is complete, or to process the converted documents in additional ways. You can use the ConversionJobStatus class to query Word Automation Services about the status of a conversion job. You pass the name of the WordServiceApplicationProxy class as a string (by default, "Word Automation Services"), and the conversion job identifier, which you can get from the ConversionJob object. You can also pass a GUID that specifies a tenant partition. However, if the SharePoint Server farm is not configured for multiple tenants, you can pass null (Nothing in Visual Basic) as the argument for this parameter.
After you instantiate a ConversionJobStatus object, you can access several properties that indicate the status of the conversion job. The following are the three most interesting properties.
ConversionJobStatus Properties
Property | Return Value |
---|---|
Count | Number of documents currently in the conversion job. |
Succeeded | Number of documents successfully converted. |
Failed | Number of documents that failed conversion. |
Whereas the first example specified a single document to convert, the following example converts all documents in a specified document library. You have the option of creating all converted documents in a different document library than the source library, but for simplicity, the following example specifies the same document library for both the input and output document libraries. In addition, the following example specifies that the conversion job should overwrite the output document if it already exists.
C#
// This fragment runs inside the using (SPSite spSite ...) block of the first listing.
// It also requires: using System.Threading;
Console.WriteLine("Starting conversion job");
ConversionJob job = new ConversionJob(wordAutomationServiceName);
job.UserToken = spSite.UserToken;
job.Settings.UpdateFields = true;
job.Settings.OutputFormat = SaveFormat.PDF;
job.Settings.OutputSaveBehavior = SaveBehavior.AlwaysOverwrite;
SPList listToConvert = spSite.RootWeb.Lists["Shared Documents"];
job.AddLibrary(listToConvert, listToConvert);
job.Start();
Console.WriteLine("Conversion job started");
ConversionJobStatus status = new ConversionJobStatus(wordAutomationServiceName,
    job.JobId, null);
Console.WriteLine("Number of documents in conversion job: {0}", status.Count);
while (true)
{
    Thread.Sleep(5000);
    // Create a new status object on each pass to read the current counts.
    status = new ConversionJobStatus(wordAutomationServiceName, job.JobId,
        null);
    if (status.Count == status.Succeeded + status.Failed)
    {
        Console.WriteLine("Completed, Successful: {0}, Failed: {1}",
            status.Succeeded, status.Failed);
        break;
    }
    Console.WriteLine("In progress, Successful: {0}, Failed: {1}",
        status.Succeeded, status.Failed);
}
To run this example, add some WordprocessingML documents to the Shared Documents library. When you run this example, you see output similar to the following.
Starting conversion job
Conversion job started
Number of documents in conversion job: 4
In progress, Successful: 0, Failed: 0
In progress, Successful: 0, Failed: 0
Completed, Successful: 4, Failed: 0
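The polling loop in the preceding example runs until every document is counted as either succeeded or failed, so it never exits if a job stalls. The following is a minimal sketch of one way to bound the wait. The JobPoller class, the Counts struct, and the WaitForCompletion method are hypothetical names introduced here for illustration; they are not part of Word Automation Services. The helper takes a delegate so that the caller decides how to read the counts, for example by creating a new ConversionJobStatus object and copying its Count, Succeeded, and Failed properties.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper (not part of Word Automation Services).
static class JobPoller
{
    // Snapshot of the three counters read from a ConversionJobStatus object.
    public struct Counts
    {
        public int Count;
        public int Succeeded;
        public int Failed;

        // Same completion test that the article's polling loop uses.
        public bool IsComplete
        {
            get { return Count == Succeeded + Failed; }
        }
    }

    // Calls readCounts repeatedly until the job is complete or the timeout
    // elapses. Returns true if the job completed, false if the wait timed out.
    // The last snapshot that was read is returned through the out parameter.
    public static bool WaitForCompletion(Func<Counts> readCounts,
        TimeSpan interval, TimeSpan timeout, out Counts last)
    {
        Stopwatch watch = Stopwatch.StartNew();
        while (true)
        {
            last = readCounts();
            if (last.IsComplete)
                return true;
            if (watch.Elapsed >= timeout)
                return false;
            Thread.Sleep(interval);
        }
    }
}
```

To use the sketch with a conversion job, pass a lambda expression that builds a new ConversionJobStatus from wordAutomationServiceName and job.JobId and returns its counters as a Counts value, together with a polling interval and a timeout of your choosing. Note that an empty job (Count equal to zero) is reported as complete on the first poll, which matches the behavior of the original loop.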
Identifying Documents That Failed to Convert
You may want to determine which documents failed conversion, perhaps to inform the user, or take remedial action such as removing the invalid document from the input document library. You can call the GetItems method, which returns a collection of ConversionItemInfo objects. When you call the GetItems method, you pass a parameter that specifies whether you want to retrieve a collection of failed conversions or successful conversions.
C#
// This fragment runs inside the using (SPSite spSite ...) block of the first listing.
// It also requires: using System.Threading; using System.Collections.ObjectModel;
Console.WriteLine("Starting conversion job");
ConversionJob job = new ConversionJob(wordAutomationServiceName);
job.UserToken = spSite.UserToken;
job.Settings.UpdateFields = true;
job.Settings.OutputFormat = SaveFormat.PDF;
job.Settings.OutputSaveBehavior = SaveBehavior.AlwaysOverwrite;
SPList listToConvert = spSite.RootWeb.Lists["Shared Documents"];
job.AddLibrary(listToConvert, listToConvert);
job.Start();
Console.WriteLine("Conversion job started");
ConversionJobStatus status = new ConversionJobStatus(wordAutomationServiceName,
    job.JobId, null);
Console.WriteLine("Number of documents in conversion job: {0}", status.Count);
while (true)
{
    Thread.Sleep(5000);
    status = new ConversionJobStatus(wordAutomationServiceName, job.JobId, null);
    if (status.Count == status.Succeeded + status.Failed)
    {
        Console.WriteLine("Completed, Successful: {0}, Failed: {1}",
            status.Succeeded, status.Failed);
        // Retrieve only the items that failed conversion.
        ReadOnlyCollection<ConversionItemInfo> failedItems =
            status.GetItems(ItemTypes.Failed);
        foreach (var failedItem in failedItems)
            Console.WriteLine("Failed item: Name:{0}", failedItem.InputFile);
        break;
    }
    Console.WriteLine("In progress, Successful: {0}, Failed: {1}", status.Succeeded,
        status.Failed);
}
To run this example, create an invalid document and upload it to the document library. An easy way to create an invalid document is to rename a WordprocessingML document, appending .zip to the file name. Then delete the main document part (known as document.xml), which is in the word folder of the package. Rename the document, removing the .zip extension so that it again has the normal .docx extension. When you run this example, it produces output similar to the following.
Starting conversion job
Conversion job started
Number of documents in conversion job: 5
In progress, Successful: 0, Failed: 0
In progress, Successful: 0, Failed: 0
In progress, Successful: 4, Failed: 0
In progress, Successful: 4, Failed: 0
In progress, Successful: 4, Failed: 0
Completed, Successful: 4, Failed: 1
Failed item: Name:http://intranet.contoso.com/Shared%20Documents/IntentionallyInvalidDocument.docx
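The manual steps for producing an invalid test document (rename to .zip, delete document.xml, rename back) can also be scripted. The following is a minimal sketch that uses the System.IO.Packaging classes, which are available to .NET Framework 3.5 projects through a reference to WindowsBase.dll. The InvalidDocumentMaker class and its Create method are hypothetical names used only for illustration, and the sketch assumes that the main document part is stored at /word/document.xml, which is where Word places it.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // Requires a reference to WindowsBase.dll.

// Hypothetical helper for creating a deliberately invalid test document.
static class InvalidDocumentMaker
{
    // Copies a valid .docx file to invalidPath and removes the main document
    // part from the copy. The source file is left unchanged.
    public static void Create(string sourcePath, string invalidPath)
    {
        File.Copy(sourcePath, invalidPath, true);

        // Assumption: the main document part is at /word/document.xml.
        Uri mainPartUri = new Uri("/word/document.xml", UriKind.Relative);

        using (Package package = Package.Open(invalidPath, FileMode.Open,
            FileAccess.ReadWrite))
        {
            if (package.PartExists(mainPartUri))
                package.DeletePart(mainPartUri);
        }
    }
}
```

The package relationship that points to the main document part is left in place, so the resulting file refers to a part that no longer exists. Upload the copy to the document library in the same way as any other test document.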
Another approach to monitoring a conversion process is to use event handlers on a SharePoint list to determine when a converted document is added to the output document library.
Deleting Source Files after Conversion
The following example converts the documents in a folder and then deletes only the source documents that were converted successfully.
C#
// This fragment runs inside the using (SPSite spSite ...) block of the first listing.
// It also requires: using System.Threading; using System.Collections.ObjectModel;
Console.WriteLine("Starting conversion job");
ConversionJob job = new ConversionJob(wordAutomationServiceName);
job.UserToken = spSite.UserToken;
job.Settings.UpdateFields = true;
job.Settings.OutputFormat = SaveFormat.PDF;
job.Settings.OutputSaveBehavior = SaveBehavior.AlwaysOverwrite;
SPFolder folderToConvert = spSite.RootWeb.GetFolder("Shared Documents");
job.AddFolder(folderToConvert, folderToConvert, false);
job.Start();
Console.WriteLine("Conversion job started");
ConversionJobStatus status = new ConversionJobStatus(wordAutomationServiceName,
    job.JobId, null);
Console.WriteLine("Number of documents in conversion job: {0}", status.Count);
while (true)
{
    Thread.Sleep(5000);
    status = new ConversionJobStatus(wordAutomationServiceName, job.JobId, null);
    if (status.Count == status.Succeeded + status.Failed)
    {
        Console.WriteLine("Completed, Successful: {0}, Failed: {1}",
            status.Succeeded, status.Failed);
        Console.WriteLine("Deleting only items that successfully converted");
        ReadOnlyCollection<ConversionItemInfo> convertedItems =
            status.GetItems(ItemTypes.Succeeded);
        foreach (var convertedItem in convertedItems)
        {
            Console.WriteLine("Deleting item: Name:{0}", convertedItem.InputFile);
            folderToConvert.Files.Delete(convertedItem.InputFile);
        }
        break;
    }
    Console.WriteLine("In progress, Successful: {0}, Failed: {1}",
        status.Succeeded, status.Failed);
}
Visual Basic
Console.WriteLine("Starting conversion job")
Dim job As ConversionJob = New ConversionJob(wordAutomationServiceName)
job.UserToken = spSite.UserToken
job.Settings.UpdateFields = True
job.Settings.OutputFormat = SaveFormat.PDF
job.Settings.OutputSaveBehavior = SaveBehavior.AlwaysOverwrite
Dim folderToConvert As SPFolder = spSite.RootWeb.GetFolder("Shared Documents")
job.AddFolder(folderToConvert, folderToConvert, False)
job.Start()
Console.WriteLine("Conversion job started")
Dim status As ConversionJobStatus = _
    New ConversionJobStatus(wordAutomationServiceName, job.JobId, Nothing)
Console.WriteLine("Number of documents in conversion job: {0}", status.Count)
While True
    Thread.Sleep(5000)
    status = New ConversionJobStatus(wordAutomationServiceName, job.JobId, _
        Nothing)
    If status.Count = status.Succeeded + status.Failed Then
        Console.WriteLine("Completed, Successful: {0}, Failed: {1}", _
            status.Succeeded, status.Failed)
        Console.WriteLine("Deleting only items that successfully converted")
        Dim convertedItems As ReadOnlyCollection(Of ConversionItemInfo) = _
            status.GetItems(ItemTypes.Succeeded)
        For Each convertedItem In convertedItems
            Console.WriteLine("Deleting item: Name:{0}", convertedItem.InputFile)
            folderToConvert.Files.Delete(convertedItem.InputFile)
        Next
        Exit While
    End If
    Console.WriteLine("In progress, Successful: {0}, Failed: {1}", _
        status.Succeeded, status.Failed)
End While
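The InputFile values that these examples print are absolute, URL-encoded addresses, as the earlier sample output shows. When reporting deleted or failed documents to a user, a plain file name is often easier to read. The following is a minimal sketch of one way to extract it; the ConversionUrls class and its GetFileName method are hypothetical names introduced for illustration and are not part of the SharePoint or Word Automation Services APIs.

```csharp
using System;

// Hypothetical display helper (not part of any SharePoint API).
static class ConversionUrls
{
    // Returns the decoded file name from an absolute document URL, for example
    // "My Report.docx" from ".../Shared%20Documents/My%20Report.docx".
    public static string GetFileName(string fileUrl)
    {
        // AbsolutePath returns the path portion of the URL in escaped form.
        string path = new Uri(fileUrl).AbsolutePath;
        string leaf = path.Substring(path.LastIndexOf('/') + 1);
        return Uri.UnescapeDataString(leaf);
    }
}
```

For example, the logging call in the loop could pass ConversionUrls.GetFileName(convertedItem.InputFile) instead of the full URL, while the call that deletes the file continues to use the original InputFile value.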
Integrating with the Open XML SDK
The power of Word Automation Services becomes clear when you use it in combination with the Open XML SDK 2.0 for Microsoft Office. You can programmatically modify a document in a document library by using the Open XML SDK 2.0, and then use Word Automation Services to perform a task that is difficult to accomplish with the Open XML SDK alone. A common need is to programmatically generate a document, and then generate or update the table of contents of the document. Consider the following document, which contains a table of contents.
Figure 7. Document with a table of contents
Let’s assume you want to modify this document, adding content that should be included in the table of contents. This next example takes the following steps.
- Opens the site and retrieves the Test.docx document by using a Collaborative Application Markup Language (CAML) query.
- Opens the document by using the Open XML SDK 2.0, and adds a new paragraph styled as Heading 1 at the beginning of the document.
- Starts a conversion job, converting Test.docx to TestWithNewToc.docx. It waits for the conversion to complete, and reports whether it was converted successfully.
C#
// This fragment runs inside the using (SPSite spSite ...) block of the first listing.
// It also requires: using System.IO; using System.Threading;
// using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Wordprocessing;
Console.WriteLine("Querying for Test.docx");
SPList list = spSite.RootWeb.Lists["Shared Documents"];
SPQuery query = new SPQuery();
query.ViewFields = @"<FieldRef Name='FileLeafRef' />";
query.Query =
    @"<Where>
        <Eq>
          <FieldRef Name='FileLeafRef' />
          <Value Type='Text'>Test.docx</Value>
        </Eq>
      </Where>";
SPListItemCollection collection = list.GetItems(query);
if (collection.Count != 1)
{
    Console.WriteLine("Test.docx not found");
    Environment.Exit(0);
}
Console.WriteLine("Opening");
SPFile file = collection[0].File;
byte[] byteArray = file.OpenBinary();
using (MemoryStream memStr = new MemoryStream())
{
    memStr.Write(byteArray, 0, byteArray.Length);
    using (WordprocessingDocument wordDoc =
        WordprocessingDocument.Open(memStr, true))
    {
        Document document = wordDoc.MainDocumentPart.Document;
        Paragraph firstParagraph = document.Body.Elements<Paragraph>()
            .FirstOrDefault();
        if (firstParagraph != null)
        {
            // New heading that the updated table of contents should pick up.
            Paragraph newParagraph = new Paragraph(
                new ParagraphProperties(
                    new ParagraphStyleId() { Val = "Heading1" }),
                new Run(
                    new Text("About the Author")));
            Paragraph aboutAuthorParagraph = new Paragraph(
                new Run(
                    new Text("Eric White")));
            firstParagraph.Parent.InsertBefore(newParagraph, firstParagraph);
            firstParagraph.Parent.InsertBefore(aboutAuthorParagraph,
                firstParagraph);
        }
    }
    Console.WriteLine("Saving");
    string linkFileName = file.Item["LinkFilename"] as string;
    file.ParentFolder.Files.Add(linkFileName, memStr, true);
}
Console.WriteLine("Starting conversion job");
ConversionJob job = new ConversionJob(wordAutomationServiceName);
job.UserToken = spSite.UserToken;
job.Settings.UpdateFields = true;
job.Settings.OutputFormat = SaveFormat.Document;
job.AddFile(siteUrl + "/Shared%20Documents/Test.docx",
    siteUrl + "/Shared%20Documents/TestWithNewToc.docx");
job.Start();
Console.WriteLine("After starting conversion job");
while (true)
{
    Thread.Sleep(5000);
    Console.WriteLine("Polling...");
    ConversionJobStatus status = new ConversionJobStatus(
        wordAutomationServiceName, job.JobId, null);
    if (status.Count == status.Succeeded + status.Failed)
    {
        Console.WriteLine("Completed, Successful: {0}, Failed: {1}",
            status.Succeeded, status.Failed);
        break;
    }
}
After running this example with a document similar to the one used earlier in this section, a new document is produced, as shown in Figure 8.
Figure 8. Document with updated table of contents
Conclusion
The Open XML SDK 2.0 is a powerful tool for building server-side document generation and document processing systems. However, some aspects of document manipulation are difficult with the SDK alone, such as converting documents and updating fields, tables of contents, and similar content. Word Automation Services fills this gap with a high-performance solution that can scale out to your requirements. Using the Open XML SDK 2.0 in combination with Word Automation Services enables many scenarios that are difficult when using only the Open XML SDK 2.0.