Expanding HtmlConvert.cs to handle List Items

Jul 12, 2012 at 1:49 PM

Hello there

We've tried using the excellent HtmlToTxt sample.

Whilst it converts text well and removes the CSS etc., it fails to handle <li> tags: the list items all appear on one line. How would one adapt HtmlConvert.cs to replace <li></li> with a line break?

This would also need to handle a missing </li>.

Any ideas?

Many thanks



Apr 28, 2016 at 8:53 PM
Edited Apr 28, 2016 at 8:54 PM
This works; however, all ordered lists are numbered with decimals. If you can figure out how to change the style (decimal, then alpha, then roman...) for nested ordered lists, let me know.

Add the following cases to the switch in: internal static void ConvertTo(HtmlNode node, TextWriter outText, PreceedingDomTextInfo textInfo)

case "li":
    if (textInfo.ListIndex > 0)
        outText.Write("\r\n\t{0}.", textInfo.ListIndex++); // numbered item inside an <ol>
    else
        outText.Write("\r\n\t•"); // using '•' (U+2022) as the bullet char after a tab, but use whatever you want, e.g. "\t->"
    isInline = false;
    break;
case "ol":
    listIndex = 1; // start numbering; an index of 0 means "bulleted"
    goto case "ul";
case "ul": // not handling nested lists any differently at this stage - that is getting close to rendering problems
    endElementString = "\r\n";
    isInline = false;
    break;
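
On the open question of switching marker style for nested ordered lists, here is a minimal sketch of one possible approach, not something from the sample itself. It assumes you track the nesting depth yourself (for example, a hypothetical depth counter carried alongside ListIndex in PreceedingDomTextInfo and incremented on each nested "ol"). ListMarker and Format are illustrative names, and the decimal / alpha / roman cycle by depth is just one convention.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of HtmlConvert.cs.
public static class ListMarker
{
    // depth 0 -> decimal, 1 -> lower alpha, 2 -> lower roman, then the cycle repeats.
    public static string Format(int index, int depth)
    {
        if (index < 1) throw new ArgumentOutOfRangeException(nameof(index));
        if (depth < 0) throw new ArgumentOutOfRangeException(nameof(depth));

        switch (depth % 3)
        {
            case 0: return index.ToString();
            case 1: return ToAlpha(index);
            default: return ToRoman(index);
        }
    }

    // 1 -> a, 26 -> z, 27 -> aa (bijective base 26, spreadsheet-column style).
    private static string ToAlpha(int n)
    {
        var sb = new StringBuilder();
        while (n > 0)
        {
            n--;
            sb.Insert(0, (char)('a' + n % 26));
            n /= 26;
        }
        return sb.ToString();
    }

    // Greedy subtraction over the standard roman numeral value table.
    private static string ToRoman(int n)
    {
        int[] values = { 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 };
        string[] symbols = { "m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i" };
        var sb = new StringBuilder();
        for (int i = 0; i < values.Length; i++)
        {
            while (n >= values[i])
            {
                sb.Append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.ToString();
    }
}
```

With something like this in place, the numbered branch of the "li" case could write ListMarker.Format(textInfo.ListIndex++, depth) instead of the raw index, where depth is whatever nesting counter you add. The sample's "ol" case resets the index to 1, so whether an outer list resumes its numbering after a nested one depends on how the surrounding ConvertTo code passes textInfo around, which I haven't checked.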