How do I preserve line breaks when using jsoup to convert html to plain text?

Reference:How do I preserve line breaks when using jsoup to convert html to plain text?

1
2
3
4
5
6
7
8
9
10
public static String br2nl(String html) {
if(html==null)
return html;
Document document = Jsoup.parse(html);
document.outputSettings(new Document.OutputSettings().prettyPrint(false));//makes html() preserve linebreaks and spacing
document.select("br").append("\\n");
document.select("p").prepend("\\n\\n");
String s = document.html().replaceAll("\\\\n", "\n");
return Jsoup.clean(s, "", Whitelist.none(), new Document.OutputSettings().prettyPrint(false));
}

It satisfies the following requirements:

  1. if the original html contains newline(\n), it gets preserved
  2. if the original html contains br or p tags, they gets translated to newline(\n).
0%