A simple Golang package to convert HTML to plain text.
It strips HTML tags and decodes HTML entities into the characters they represent. The head section of the HTML document is removed and most other tags are stripped, but links are converted into the value of their href attribute.
It can be used for converting HTML emails into text.
Some tests are included as well.
Feel free to open a pull request if you have suggestions for improvement.
go get github.com/k3a/html2text
package main

import (
	"fmt"

	"github.com/k3a/html2text"
)

func main() {
	html := `<html><head><title>Good</title></head><body><strong>clean</strong> text</body>`

	plain := html2text.HTML2Text(html)

	fmt.Println(plain)
}
/* Outputs:
clean text
*/
MIT