How to load HTML into mshtml.HTMLDocumentClass with UCOMIPersistFile and my ignorance

[ 2010-02-11 12:26:26 | Author: sunrise_chen ]

What a weird one.  I'm looking at the source for NDoc.Document.HtmlHelp2.Compiler.HtmlHelpFile.  It uses the Microsoft.mshtml interop Assembly to load an HTML file into the HTMLDocumentClass for easy parsing.

Its code looks like this (DOESN'T WORK):

private HTMLDocumentClass GetHtmlDocument( FileInfo f )
{
  HTMLDocumentClass doc = new HTMLDocumentClass();
  UCOMIPersistFile persistFile = (UCOMIPersistFile)doc;
  persistFile.Load( f.FullName, 0 );
  int start = Environment.TickCount;
  // Busy-wait: nothing pumps messages here, so the loop just spins on the CPU
  while( doc.body == null )
  {
    if ( Environment.TickCount - start > 10000 )
    {
      throw new Exception( string.Format( "The document {0} timed out while loading", f.Name ) );
    }
  }
  return doc;
}

I went searching as it was taking up 100% CPU for an hour and never completed.  Now I know why! :)

What's weird is this: the only way I could get it to work (as IPersistFile appears to be loading on another thread) was with this change (NOW IT WORKS):

private HTMLDocumentClass GetHtmlDocument( FileInfo f )
{
  HTMLDocumentClass doc = new HTMLDocumentClass();
  UCOMIPersistFile persistFile = (UCOMIPersistFile)doc;
  persistFile.Load( f.FullName, 0 );
  int start = Environment.TickCount;
  while( doc.readyState != "complete" )
  {
    // Run the message pump so the load can make progress
    System.Windows.Forms.Application.DoEvents();
    if ( Environment.TickCount - start > 10000 )
    {
      throw new Exception( string.Format( "The document {0} timed out while loading", f.Name ) );
    }
  }
  return doc;
}

When I Reflector into DoEvents() I can see that it's doing more than a Sleep(0) (yield): it's actually running the message pump.  Am I missing something?  Apparently IPersistFile needs the message pump?  Well, it works, but it's gross.
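
If the inline loop feels too gross, the "poll, pump, give up after a timeout" pattern can be pulled out into a small helper. This is only a sketch: PumpingWait and WaitUntil are names I made up for illustration, nothing from NDoc or mshtml. It also swaps Environment.TickCount for a Stopwatch, which is my own substitution to avoid worrying about the tick counter wrapping around.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of NDoc or mshtml.
static class PumpingWait
{
    // Polls isDone, calling pump between checks, until isDone returns true
    // or the timeout elapses. Returns true on success, false on timeout.
    public static bool WaitUntil(Func<bool> isDone, Action pump, TimeSpan timeout)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (!isDone())
        {
            if (elapsed.Elapsed > timeout)
            {
                return false;
            }
            // Give pending work a chance to run before checking again.
            pump();
        }
        return true;
    }
}
```

In GetHtmlDocument the wait would then presumably look like PumpingWait.WaitUntil( () => doc.readyState == "complete", System.Windows.Forms.Application.DoEvents, TimeSpan.FromSeconds( 10 ) ), throwing the timeout exception when it returns false. It is still DoEvents underneath, so it is no less of a hack, just a tidier one.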

[Last Modified By sunrise_chen, at 2010-02-11 12:30:39]