Retrieve the contents of a Microsoft Word document using OpenXml and C#
One of my tasks required me to retrieve the contents of a Microsoft Word document (Word 2007 and above). I searched for resources online without much luck; most of the examples cover writing content to a Word document using OpenXml. I decided to blog this for my own reference, and hopefully people who read this post will find it useful as well.
Retrieving the contents of a Microsoft Word document using Open XML is quite simple.
1. First, download and install the Open XML SDK 2.0 for Microsoft Office. (Download link)
2. Create a Console application, then add DocumentFormat.OpenXml.dll and WindowsBase.dll to the project. You can find these DLLs in the .NET tab of the Add Reference window.
3. Write the following code to grab the contents from the Word document and display them in the console window.
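A minimal sketch of what this step could look like, assuming the Open XML SDK 2.0 references from step 2 are in place. The class name DocReader, the helper ReadParagraphs, and the default file name sample.docx are made up for illustration; the SDK calls themselves (WordprocessingDocument.Open, MainDocumentPart, InnerText) are part of the SDK.

```csharp
using System;
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class DocReader
{
    // Hypothetical helper: returns the text of each paragraph in the document body.
    public static List<string> ReadParagraphs(string path)
    {
        var lines = new List<string>();

        // Open the package read-only (second argument: isEditable = false).
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            Body body = doc.MainDocumentPart.Document.Body;

            // Descendants<Paragraph>() also picks up paragraphs inside tables.
            foreach (Paragraph paragraph in body.Descendants<Paragraph>())
            {
                // InnerText concatenates the text of all runs in the paragraph.
                lines.Add(paragraph.InnerText);
            }
        }
        return lines;
    }

    static void Main(string[] args)
    {
        // "sample.docx" is an assumed file name; pass a real path as the first argument.
        string path = args.Length > 0 ? args[0] : "sample.docx";

        foreach (string line in ReadParagraphs(path))
        {
            Console.WriteLine(line);
        }
    }
}
```

This reads only the main document body; headers, footers, and footnotes live in separate parts of the package and would need their own handling.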
4. Writing content to a Word document is very easy too. All you have to do is the following:
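A minimal sketch of the writing side, again assuming the Open XML SDK 2.0 references from step 2. The class name DocWriter, the helper WriteParagraphs, and the output file name output.docx are made up for illustration; this is one plausible way to do it, not necessarily the code in the downloadable source.

```csharp
using System;
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class DocWriter
{
    // Hypothetical helper: creates a new .docx with one paragraph per input string.
    public static void WriteParagraphs(string path, IEnumerable<string> lines)
    {
        // Create a new package; an existing file at this path is overwritten.
        using (WordprocessingDocument doc =
            WordprocessingDocument.Create(path, WordprocessingDocumentType.Document))
        {
            // Every Word document needs a main document part with a Document and a Body.
            MainDocumentPart mainPart = doc.AddMainDocumentPart();
            Body body = new Body();

            foreach (string line in lines)
            {
                // Paragraph -> Run -> Text is the basic WordprocessingML nesting.
                body.AppendChild(new Paragraph(new Run(new Text(line))));
            }

            mainPart.Document = new Document(body);
            mainPart.Document.Save();
        }
    }

    static void Main(string[] args)
    {
        // "output.docx" is an assumed file name; pass a real path as the first argument.
        string path = args.Length > 0 ? args[0] : "output.docx";

        WriteParagraphs(path, new[] { "Hello from Open XML.", "This is the second paragraph." });
        Console.WriteLine("Wrote " + path);
    }
}
```

The sketch writes plain text only; formatting such as bold or styles would be added through RunProperties and ParagraphProperties on the corresponding elements.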
You can download the complete source code here.