Convert Word documents to clean HTML.
Remove empty paragraphs
Convert <b> to <strong>, <i> to <em>
Replace non-ASCII characters with HTML entities
Replace smart quotes with ASCII equivalents
Indent with tabs, not spaces
Replace non-breaking spaces with ordinary spaces
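
The options above can be approximated with a few regular expressions and string operations. What follows is only a minimal sketch in Python, not the converter's actual implementation: all function names are hypothetical, non-ASCII characters are written as numeric character references rather than named entities, and the tab conversion assumes four spaces per indent level.

```python
import re

# Hypothetical helpers; each one sketches a single option from the list above.

# Curly quotes mapped to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def replace_smart_quotes(text):
    return text.translate(str.maketrans(SMART_QUOTES))


def replace_nbsp(text):
    # Handle both the raw U+00A0 character and the &nbsp; entity.
    return text.replace("\u00a0", " ").replace("&nbsp;", " ")


def remove_empty_paragraphs(html):
    # Drop <p> elements that hold only whitespace, &nbsp; or <br>.
    pattern = r"<p(?:\s[^>]*)?>(?:\s|&nbsp;|\u00a0|<br\s*/?>)*</p>"
    return re.sub(pattern, "", html, flags=re.IGNORECASE)


def modernize_tags(html):
    # <b> becomes <strong> and <i> becomes <em>; attributes are kept.
    # The pattern requires whitespace or ">" after the tag name, so
    # <br>, <body> and <img> are left alone.
    html = re.sub(r"<(/?)b((?:\s[^>]*)?)>", r"<\1strong\2>", html, flags=re.IGNORECASE)
    html = re.sub(r"<(/?)i((?:\s[^>]*)?)>", r"<\1em\2>", html, flags=re.IGNORECASE)
    return html


def escape_non_ascii(text):
    # Assumption: numeric references (e.g. &#233;) are acceptable in place
    # of named entities (e.g. &eacute;).
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def indent_with_tabs(html, spaces_per_level=4):
    # Assumption: each indent level is spaces_per_level spaces wide.
    def to_tabs(match):
        width = len(match.group(0))
        return "\t" * (width // spaces_per_level) + " " * (width % spaces_per_level)

    return re.sub(r"^ +", to_tabs, html, flags=re.MULTILINE)


def clean_html(html):
    # Smart quotes and non-breaking spaces are handled before entity
    # escaping so that they end up as plain ASCII, not as entities.
    html = replace_smart_quotes(html)
    html = replace_nbsp(html)
    html = remove_empty_paragraphs(html)
    html = modernize_tags(html)
    html = escape_non_ascii(html)
    return indent_with_tabs(html)
```

The order inside clean_html matters: if escape_non_ascii ran first, smart quotes and non-breaking spaces would already have been turned into entities, and the later replacements would no longer find them. A real converter would more likely work on a parsed document tree than on regular expressions, which cannot handle every malformed or nested case.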