Saturday, November 1, 2008

C# Tutorial - Combine Multiple Word Documents into One Word Document




Click the link to download the source code for this post

Licensing and Warranty

You may use the code as you wish - it may be used in commercial or other applications, and it may be redistributed and modified. The code is provided "as-is". No claim of suitability, guarantee, or any warranty whatsoever is provided. By downloading the code, you agree to defend, indemnify, and hold harmless the Author and the Publisher from and against any claims, suits, losses, damages, liabilities, costs, and expenses (including reasonable legal or attorneys' fees) resulting from or relating to any use of the code by you.




One of the tasks that I have to do at work is combine a few Word documents into one Word document, convert it to a PDF, and publish the PDF on the web. I didn't like combining the documents manually, so I wrote a quick program that combines them into one document for me.

Overview


The command-line program uses the Microsoft.Office.Interop.Word library to automate Microsoft Word.

In a nutshell, the program does the following:

  • It gets the values that the person entered at the command line

  • It opens a new Microsoft Word file

  • It loops through the parameters and appends each file to the new Microsoft Word file

  • It saves the new file



Highlights


Using an "alias" in the using statement
A quick trick when referencing a library is to assign it a short "alias" in the using statement. You can then use the alias throughout your program. In the example below, the first line declares the alias and the last line references it:

using WORD = Microsoft.Office.Interop.Word;
:
object sectionBreak = WORD.WdBreakType.wdSectionBreakNextPage;


Capturing the Files to Combine
The first two command-line arguments are the destination directory and the name of the new file, so the files to combine start at position 2 in the args array. In the code, I build a new array from the arguments starting at position 2 (line 22 in the source). Here is the code snippet:

String[] parms = new String[args.Length - 2]; // subtract 2 - the parm starts @ pos 2, not pos 0
for (int x = 0; x < parms.Length; x++)
{
 parms[x] = args[x + 2];
}
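
The same split can also be written with Array.Copy, together with a check that enough arguments were supplied. This is only a minimal sketch and not the code from the download: the MergeArguments class and its members are hypothetical names, and it assumes the arguments arrive as directory, new file name, then the files to combine.

```csharp
using System;

// Hypothetical helper: splits the command-line arguments into the
// destination directory, destination file name, and source file names.
public sealed class MergeArguments
{
    public readonly string DestDir;
    public readonly string DestFile;
    public readonly string[] SourceFiles;

    private MergeArguments(string destDir, string destFile, string[] sourceFiles)
    {
        DestDir = destDir;
        DestFile = destFile;
        SourceFiles = sourceFiles;
    }

    public static MergeArguments Parse(string[] args)
    {
        // Assumed layout: <directory> <new file> <file 1> [<file 2> ...]
        if (args == null || args.Length < 3)
        {
            throw new ArgumentException(
                "Expected a directory, a destination file, and at least one file to combine.");
        }

        // Copy everything from position 2 onward into its own array.
        string[] sources = new string[args.Length - 2];
        Array.Copy(args, 2, sources, 0, sources.Length);

        return new MergeArguments(args[0], args[1], sources);
    }
}
```

Checking the argument count up front means a missing parameter produces a readable message instead of an exception from a negative array length.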


Setting Up the Word Document
Below is the code snippet (line 29 in the source):

// Set up the word object
object template = @"Normal.dot";
String fullName = destDir + "\\" + destFile;
//object fname = @"F:\temp\new_file.doc";
object fname = fullName;
object missing = System.Type.Missing;
//object pageBreak = WORD.WdBreakType.wdPageBreak;
object sectionBreak = WORD.WdBreakType.wdSectionBreakNextPage;

// Create Word application
WORD._Application app = new WORD.Application();

In the first 5 lines, I am setting up variables for the following:

  • The Word template to use

  • The destination (new) file. Note that I have used double backslashes, because the backslash is the escape character in C# string literals. If I used a single backslash, it would escape the closing quote and the code would not compile.

  • A reference to the type of section break in Word. Note that if you want a page break instead of a section break, make a reference to the page break instead.

  • A placeholder value (System.Type.Missing) that is passed for the optional Word parameters I don't need to set.


The final code line creates an instance of the actual Word application.
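
As an alternative to concatenating with an escaped backslash, the full name can be built with System.IO.Path.Combine, and literal paths can be written as verbatim strings (prefixed with @), where the backslash is not an escape character. The sketch below is only an illustration: PathExamples and BuildFullName are hypothetical names that do not appear in the downloaded source.

```csharp
using System.IO;

public static class PathExamples
{
    // Joins the directory and file name, adding a separator only if needed.
    public static string BuildFullName(string destDir, string destFile)
    {
        return Path.Combine(destDir, destFile);
    }

    // Both literals describe the same path: the first escapes each
    // backslash, the second uses a verbatim string.
    public const string Escaped = "F:\\temp\\new_file.doc";
    public const string Verbatim = @"F:\temp\new_file.doc";
}
```

Path.Combine also copes with a directory argument that already ends in a separator, which plain concatenation would turn into a doubled backslash.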

The Process of Appending the Document
The code to append the documents into one document starts from line 40 in the source. Below is the code snippet:

try
{
  // Create new file
  WORD._Document doc = app.Documents.Add(ref template, ref missing, ref missing, ref missing);

  WORD.Selection selection = app.Selection;

  // Write text
  //selection.TypeText("Write something...");
  //selection.TypeParagraph();

  // New page
  //selection.InsertBreak(ref pageBreak);

  // Insert file
  String insertFile = "";
  for (int y = 0; y < parms.Length; y++)
  {
    insertFile = destDir + "\\" + parms[y];
    selection.InsertFile(insertFile, ref missing, ref missing, ref missing, ref missing);
    //selection.InsertBreak(ref pageBreak);
    selection.InsertBreak(ref sectionBreak);
  }

  // SaveAs
  doc.SaveAs(ref fname, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
}
finally
{
  // Close Word application
  app.Quit(ref missing, ref missing, ref missing);
}


Let me explain various code blocks:

  // Create new file
  WORD._Document doc = app.Documents.Add(ref template, ref missing, ref missing, ref missing);

  WORD.Selection selection = app.Selection;

This code block creates a new document and gets the Selection object, which acts as the cursor in the new document.


    insertFile = destDir + "\\" + parms[y];
    selection.InsertFile(insertFile, ref missing, ref missing, ref missing, ref missing);
    //selection.InsertBreak(ref pageBreak);
    selection.InsertBreak(ref sectionBreak);

This code inserts the contents of the file in our new document. Then, it inserts a section break. Again, if you want to enter a page break instead of a section break, you will need to reference the page break instead of the section break. Also note that you don't have to insert any breaks if you don't want to.
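
Because the loop inserts a break after every file, including the last one, the combined document can end with an empty section. One way to avoid that is to insert the break only between files. The sketch below shows just the loop logic behind a small hypothetical interface (IDocumentWriter) so that it runs without Word; in the real program, the two methods would correspond to selection.InsertFile and selection.InsertBreak. RecordingWriter and Appender are also made-up names for this illustration.

```csharp
using System.Collections.Generic;

// Hypothetical abstraction over the two Word calls used in the loop.
public interface IDocumentWriter
{
    void InsertFile(string path);
    void InsertBreak();
}

// Stand-in writer that records the calls instead of driving Word.
public sealed class RecordingWriter : IDocumentWriter
{
    public readonly List<string> Calls = new List<string>();

    public void InsertFile(string path) { Calls.Add("file:" + path); }
    public void InsertBreak() { Calls.Add("break"); }
}

public static class Appender
{
    // Inserts each file, adding a break between files but not after the last one.
    public static void AppendAll(IDocumentWriter writer, string[] files)
    {
        for (int y = 0; y < files.Length; y++)
        {
            writer.InsertFile(files[y]);
            if (y < files.Length - 1)
            {
                writer.InsertBreak();
            }
        }
    }
}
```

With two files, the recorded calls are the first file, one break, and then the second file, with nothing after it.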


  doc.SaveAs(ref fname, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);

This line saves the new document.

This code has a boatload of possibilities. For example, if your application needs to write text to a new Word document, this code will work. You'll have to comment out the logic that inserts the contents of the other Word files, and uncomment and modify the logic that writes text (line 47 in the source):

// Write text
//selection.TypeText("Write something...");
//selection.TypeParagraph();



If you have any questions, please post a comment or just ask. :)


Quick Note About the Program: If any of the values contain spaces (ex: C:\documents and settings), you must surround them with double quotes (ex: "C:\documents and settings") because the command line treats each space-delimited value as a separate parameter.

3 comments:

Kevin said...

Hi looks like a good application! I am having trouble getting it to run though. I have been trying to run it from the command prompt via:

Append Documents C:\ConvRel new_file.doc 1.doc 2.doc but I get a 'Word was unable to read this document' error. Am I using the correct command line syntax?

Thanks

Jennifer said...

Hi Kevin,

Yes you are using the correct syntax for the application.

That error usually means that Word can't read one of your documents (1.doc or 2.doc) because it's corrupted OR it was created in an earlier version of Word.

If you are using Word 2003 (note: this app was written for Word 2003), you can take a look at these two posts from the Microsoft KB:
- http://support.microsoft.com/kb/889409
- http://support.microsoft.com/kb/903739

André said...

Hi Jennifer, I found an application on the Internet very similar to yours and adapted it to what I wanted. In my case, that is inserting a section break and changing the page orientation to landscape. The problem is that when I do this, it creates a new blank page at the end. I would like to know why, and how to remove it. Sorry for my English, I'm from Brazil.