H5 Question: I want to read in the contents of an .mht file and extract the text, HTML, or attachments from it. Where do I start?
To start with, it's worth knowing that a .mht file and a .eml file are almost identical in format. A .mht file is a complete web page (HTML plus images and other resources) that Internet Explorer can create when saving a page. The differences are in the headers (a .mht file has no To recipient and a blank Subject) and in the body (a .mht file typically has no text-only portion, but it does have an HTML portion). Because the formats are so similar, you can use the NetEmailReceive._ProcessSplitWholeMessage() method to process a .mht file (see the code in the NetDemo.app – Get Email – Load From File). This can give you the HTML portion of the .mht file. You can then turn the HTML portion into text using the NetWebClient.TextOnly() methods (or the methods that .TextOnly calls), but the result may look a little rough and unformatted.
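To see why an email parser can read a .mht file, here is a minimal sketch that is not NetTalk code. It uses Python's standard email module instead, purely to illustrate the shared format. The sample data is hand-written, and the names SAMPLE_MHT, split_mht and html_to_text are hypothetical, chosen for this example only.

```python
from email import message_from_bytes, policy
from html.parser import HTMLParser

# Hand-written, hypothetical sample: an HTML part plus one tiny image part,
# laid out the way a saved web page (.mht) or an email (.eml) would be.
SAMPLE_MHT = (
    b'From: "Saved by Internet Explorer 11"\r\n'
    b"Subject: \r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; type="text/html"; boundary="----=_Part_0"\r\n'
    b"\r\n"
    b"------=_Part_0\r\n"
    b'Content-Type: text/html; charset="utf-8"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"Content-Location: http://example.com/index.html\r\n"
    b"\r\n"
    b"<html><body><p>Hello =3D world</p></body></html>\r\n"
    b"------=_Part_0\r\n"
    b"Content-Type: image/gif\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"Content-Location: http://example.com/dot.gif\r\n"
    b"\r\n"
    b"R0lGODlhAQABAAAAACw=\r\n"
    b"------=_Part_0--\r\n"
)


def split_mht(raw: bytes):
    """Return (html, attachments) from raw .mht or .eml bytes.

    attachments is a list of (content_location, content_type, payload_bytes).
    """
    msg = message_from_bytes(raw, policy=policy.default)
    html = None
    attachments = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container parts hold no content of their own
        if part.get_content_type() == "text/html" and html is None:
            html = part.get_content()  # transfer encoding and charset are decoded
        else:
            attachments.append((
                str(part.get("Content-Location", "")),
                part.get_content_type(),
                part.get_payload(decode=True),  # raw bytes after base64/QP decoding
            ))
    return html, attachments


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Crude HTML-to-text conversion: one line per text node, no layout."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return "\n".join(collector.chunks)
```

Calling split_mht(SAMPLE_MHT) returns the decoded HTML and a one-item attachment list, and html_to_text() then reduces that HTML to plain text. Like any simple tag stripper, this conversion discards layout and also keeps the contents of script and style elements, so the text can look rough on real pages.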