summaryrefslogtreecommitdiffstats
path: root/vendor/github.com/jaytaylor
diff options
context:
space:
mode:
authorChristopher Speller <crspeller@gmail.com>2017-03-13 12:54:22 -0400
committerGitHub <noreply@github.com>2017-03-13 12:54:22 -0400
commitc281ee3b61e8ab53ff118866d72618ae8cce582b (patch)
tree776e7bdf6c8bfbb9a1dee5976496ab065959991f /vendor/github.com/jaytaylor
parent3ada7a41a7fb13abef19dd63dc56b720900dbaa9 (diff)
downloadchat-c281ee3b61e8ab53ff118866d72618ae8cce582b.tar.gz
chat-c281ee3b61e8ab53ff118866d72618ae8cce582b.tar.bz2
chat-c281ee3b61e8ab53ff118866d72618ae8cce582b.zip
Updating server dependancies. Also adding github.com/jaytaylor/html2text and gopkg.in/gomail.v2 (#5748)
Diffstat (limited to 'vendor/github.com/jaytaylor')
-rw-r--r--vendor/github.com/jaytaylor/html2text/.gitignore24
-rw-r--r--vendor/github.com/jaytaylor/html2text/.travis.yml14
-rw-r--r--vendor/github.com/jaytaylor/html2text/LICENSE22
-rw-r--r--vendor/github.com/jaytaylor/html2text/README.md116
-rw-r--r--vendor/github.com/jaytaylor/html2text/html2text.go312
-rw-r--r--vendor/github.com/jaytaylor/html2text/html2text_test.go674
-rwxr-xr-xvendor/github.com/jaytaylor/html2text/testdata/utf8.html22
-rwxr-xr-xvendor/github.com/jaytaylor/html2text/testdata/utf8_with_bom.xhtml24
8 files changed, 1208 insertions, 0 deletions
diff --git a/vendor/github.com/jaytaylor/html2text/.gitignore b/vendor/github.com/jaytaylor/html2text/.gitignore
new file mode 100644
index 000000000..daf913b1b
--- /dev/null
+++ b/vendor/github.com/jaytaylor/html2text/.gitignore
@@ -0,0 +1,24 @@
+# Compiled Object files, Static and Dynamic libs (Shared Objects)
+*.o
+*.a
+*.so
+
+# Folders
+_obj
+_test
+
+# Architecture specific extensions/prefixes
+*.[568vq]
+[568vq].out
+
+*.cgo1.go
+*.cgo2.c
+_cgo_defun.c
+_cgo_gotypes.go
+_cgo_export.*
+
+_testmain.go
+
+*.exe
+*.test
+*.prof
diff --git a/vendor/github.com/jaytaylor/html2text/.travis.yml b/vendor/github.com/jaytaylor/html2text/.travis.yml
new file mode 100644
index 000000000..6c7f48efd
--- /dev/null
+++ b/vendor/github.com/jaytaylor/html2text/.travis.yml
@@ -0,0 +1,14 @@
+language: go
+go:
+ - tip
+ - 1.8
+ - 1.7
+ - 1.6
+ - 1.5
+ - 1.4
+ - 1.3
+ - 1.2
+notifications:
+ email:
+ on_success: change
+ on_failure: always
diff --git a/vendor/github.com/jaytaylor/html2text/LICENSE b/vendor/github.com/jaytaylor/html2text/LICENSE
new file mode 100644
index 000000000..24dc4abec
--- /dev/null
+++ b/vendor/github.com/jaytaylor/html2text/LICENSE
@@ -0,0 +1,22 @@
+The MIT License (MIT)
+
+Copyright (c) 2015 Jay Taylor
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
diff --git a/vendor/github.com/jaytaylor/html2text/README.md b/vendor/github.com/jaytaylor/html2text/README.md
new file mode 100644
index 000000000..ac1124739
--- /dev/null
+++ b/vendor/github.com/jaytaylor/html2text/README.md
@@ -0,0 +1,116 @@
+# html2text
+
+[![Documentation](https://godoc.org/github.com/jaytaylor/html2text?status.svg)](https://godoc.org/github.com/jaytaylor/html2text)
+[![Build Status](https://travis-ci.org/jaytaylor/html2text.svg?branch=master)](https://travis-ci.org/jaytaylor/html2text)
+[![Report Card](https://goreportcard.com/badge/github.com/jaytaylor/html2text)](https://goreportcard.com/report/github.com/jaytaylor/html2text)
+
+### Converts HTML into text
+
+
+## Introduction
+
+Ensure your emails are readable by all!
+
+Turns HTML into raw text, useful for sending fancy HTML emails with a equivalently nicely formatted TXT document as a fallback (e.g. for people who don't allow HTML emails or have other display issues).
+
+html2text is a simple golang package for rendering HTML into plaintext.
+
+There are still lots of improvements to be had, but FWIW this has worked fine for my [basic] HTML-2-text needs.
+
+It requires go 1.x or newer ;)
+
+
+## Download the package
+
+```bash
+go get github.com/jaytaylor/html2text
+```
+
+## Example usage
+
+```go
+package main
+
+import (
+ "fmt"
+
+ "github.com/jaytaylor/html2text"
+)
+
+func main() {
+ inputHtml := `
+ <html>
+ <head>
+ <title>My Mega Service</title>
+ <link rel=\"stylesheet\" href=\"main.css\">
+ <style type=\"text/css\">body { color: #fff; }</style>
+ </head>
+
+ <body>
+ <div class="logo">
+ <a href="http://mymegaservice.com/"><img src="/logo-image.jpg" alt="Mega Service"/></a>
+ </div>
+
+ <h1>Welcome to your new account on my service!</h1>
+
+ <p>
+ Here is some more information:
+
+ <ul>
+ <li>Link 1: <a href="https://example.com">Example.com</a></li>
+ <li>Link 2: <a href="https://example2.com">Example2.com</a></li>
+ <li>Something else</li>
+ </ul>
+ </p>
+ </body>
+ </html>
+ `
+
+ text, err := html2text.FromString(inputHtml)
+ if err != nil {
+ panic(err)
+ }
+ fmt.Println(text)
+}
+```
+
+Output:
+```
+Mega Service ( http://mymegaservice.com/ )
+
+******************************************
+Welcome to your new account on my service!
+******************************************
+
+Here is some more information:
+
+* Link 1: Example.com ( https://example.com )
+* Link 2: Example2.com ( https://example2.com )
+* Something else
+```
+
+
+## Unit-tests
+
+Running the unit-tests is straightforward and standard:
+
+```bash
+go test
+```
+
+
+# License
+
+Permissive MIT license.
+
+
+## Contact
+
+You are more than welcome to open issues and send pull requests if you find a bug or want a new feature.
+
+If you appreciate this library please feel free to drop me a line and tell me! It's always nice to hear from people who have benefitted from my work.
+
+Email: jay at (my github username).com
+
+Twitter: [@jtaylor](https://twitter.com/jtaylor)
+
diff --git a/vendor/github.com/jaytaylor/html2text/html2text.go b/vendor/github.com/jaytaylor/html2text/html2text.go
new file mode 100644
index 000000000..2a013a039
--- /dev/null
+++ b/vendor/github.com/jaytaylor/html2text/html2text.go
@@ -0,0 +1,312 @@
+package html2text
+
+import (
+ "bytes"
+ "io"
+ "regexp"
+ "strings"
+ "unicode"
+
+ "github.com/ssor/bom"
+
+ "golang.org/x/net/html"
+ "golang.org/x/net/html/atom"
+)
+
+var (
+ spacingRe = regexp.MustCompile(`[ \r\n\t]+`)
+ newlineRe = regexp.MustCompile(`\n\n+`)
+)
+
+type textifyTraverseCtx struct {
+ Buf bytes.Buffer
+
+ prefix string
+ blockquoteLevel int
+ lineLength int
+ endsWithSpace bool
+ endsWithNewline bool
+ justClosedDiv bool
+}
+
+func (ctx *textifyTraverseCtx) traverse(node *html.Node) error {
+ switch node.Type {
+ default:
+ return ctx.traverseChildren(node)
+
+ case html.TextNode:
+ data := strings.Trim(spacingRe.ReplaceAllString(node.Data, " "), " ")
+ return ctx.emit(data)
+
+ case html.ElementNode:
+ return ctx.handleElementNode(node)
+ }
+}
+
+func (ctx *textifyTraverseCtx) handleElementNode(node *html.Node) error {
+ ctx.justClosedDiv = false
+ switch node.DataAtom {
+ case atom.Br:
+ return ctx.emit("\n")
+
+ case atom.H1, atom.H2, atom.H3:
+ subCtx := textifyTraverseCtx{}
+ if err := subCtx.traverseChildren(node); err != nil {
+ return err
+ }
+
+ str := subCtx.Buf.String()
+ dividerLen := 0
+ for _, line := range strings.Split(str, "\n") {
+ if lineLen := len([]rune(line)); lineLen-1 > dividerLen {
+ dividerLen = lineLen - 1
+ }
+ }
+ divider := ""
+ if node.DataAtom == atom.H1 {
+ divider = strings.Repeat("*", dividerLen)
+ } else {
+ divider = strings.Repeat("-", dividerLen)
+ }
+
+ if node.DataAtom == atom.H3 {
+ return ctx.emit("\n\n" + str + "\n" + divider + "\n\n")
+ }
+ return ctx.emit("\n\n" + divider + "\n" + str + "\n" + divider + "\n\n")
+
+ case atom.Blockquote:
+ ctx.blockquoteLevel++
+ ctx.prefix = strings.Repeat(">", ctx.blockquoteLevel) + " "
+ if err := ctx.emit("\n"); err != nil {
+ return err
+ }
+ if ctx.blockquoteLevel == 1 {
+ if err := ctx.emit("\n"); err != nil {
+ return err
+ }
+ }
+ if err := ctx.traverseChildren(node); err != nil {
+ return err
+ }
+ ctx.blockquoteLevel--
+ ctx.prefix = strings.Repeat(">", ctx.blockquoteLevel)
+ if ctx.blockquoteLevel > 0 {
+ ctx.prefix += " "
+ }
+ return ctx.emit("\n\n")
+
+ case atom.Div:
+ if ctx.lineLength > 0 {
+ if err := ctx.emit("\n"); err != nil {
+ return err
+ }
+ }
+ if err := ctx.traverseChildren(node); err != nil {
+ return err
+ }
+ var err error
+ if ctx.justClosedDiv == false {
+ err = ctx.emit("\n")
+ }
+ ctx.justClosedDiv = true
+ return err
+
+ case atom.Li:
+ if err := ctx.emit("* "); err != nil {
+ return err
+ }
+
+ if err := ctx.traverseChildren(node); err != nil {
+ return err
+ }
+
+ return ctx.emit("\n")
+
+ case atom.B, atom.Strong:
+ subCtx := textifyTraverseCtx{}
+ subCtx.endsWithSpace = true
+ if err := subCtx.traverseChildren(node); err != nil {
+ return err
+ }
+ str := subCtx.Buf.String()
+ return ctx.emit("*" + str + "*")
+
+ case atom.A:
+ // If image is the only child, take its alt text as the link text
+ if img := node.FirstChild; img != nil && node.LastChild == img && img.DataAtom == atom.Img {
+ if altText := getAttrVal(img, "alt"); altText != "" {
+ ctx.emit(altText)
+ }
+ } else if err := ctx.traverseChildren(node); err != nil {
+ return err
+ }
+
+ hrefLink := ""
+ if attrVal := getAttrVal(node, "href"); attrVal != "" {
+ attrVal = ctx.normalizeHrefLink(attrVal)
+ if attrVal != "" {
+ hrefLink = "( " + attrVal + " )"
+ }
+ }
+
+ return ctx.emit(hrefLink)
+
+ case atom.P, atom.Ul, atom.Table:
+ if err := ctx.emit("\n\n"); err != nil {
+ return err
+ }
+
+ if err := ctx.traverseChildren(node); err != nil {
+ return err
+ }
+
+ return ctx.emit("\n\n")
+
+ case atom.Tr:
+ if err := ctx.traverseChildren(node); err != nil {
+ return err
+ }
+
+ return ctx.emit("\n")
+
+ case atom.Style, atom.Script, atom.Head:
+ // Ignore the subtree
+ return nil
+
+ default:
+ return ctx.traverseChildren(node)
+ }
+}
+func (ctx *textifyTraverseCtx) traverseChildren(node *html.Node) error {
+ for c := node.FirstChild; c != nil; c = c.NextSibling {
+ if err := ctx.traverse(c); err != nil {
+ return err
+ }
+ }
+
+ return nil
+}
+
+func (ctx *textifyTraverseCtx) emit(data string) error {
+ if len(data) == 0 {
+ return nil
+ }
+ lines := ctx.breakLongLines(data)
+ var err error
+ for _, line := range lines {
+ runes := []rune(line)
+ startsWithSpace := unicode.IsSpace(runes[0])
+ if !startsWithSpace && !ctx.endsWithSpace {
+ ctx.Buf.WriteByte(' ')
+ ctx.lineLength++
+ }
+ ctx.endsWithSpace = unicode.IsSpace(runes[len(runes)-1])
+ for _, c := range line {
+ _, err = ctx.Buf.WriteString(string(c))
+ if err != nil {
+ return err
+ }
+ ctx.lineLength++
+ if c == '\n' {
+ ctx.lineLength = 0
+ if ctx.prefix != "" {
+ _, err = ctx.Buf.WriteString(ctx.prefix)
+ if err != nil {
+ return err
+ }
+ }
+ }
+ }
+ }
+ return nil
+}
+
+func (ctx *textifyTraverseCtx) breakLongLines(data string) []string {
+ // only break lines when we are in blockquotes
+ if ctx.blockquoteLevel == 0 {
+ return []string{data}
+ }
+ var ret []string
+ runes := []rune(data)
+ l := len(runes)
+ existing := ctx.lineLength
+ if existing >= 74 {
+ ret = append(ret, "\n")
+ existing = 0
+ }
+ for l+existing > 74 {
+ i := 74 - existing
+ for i >= 0 && !unicode.IsSpace(runes[i]) {
+ i--
+ }
+ if i == -1 {
+ // no spaces, so go the other way
+ i = 74 - existing
+ for i < l && !unicode.IsSpace(runes[i]) {
+ i++
+ }
+ }
+ ret = append(ret, string(runes[:i])+"\n")
+ for i < l && unicode.IsSpace(runes[i]) {
+ i++
+ }
+ runes = runes[i:]
+ l = len(runes)
+ existing = 0
+ }
+ if len(runes) > 0 {
+ ret = append(ret, string(runes))
+ }
+ return ret
+}
+
+func (ctx *textifyTraverseCtx) normalizeHrefLink(link string) string {
+ link = strings.TrimSpace(link)
+ link = strings.TrimPrefix(link, "mailto:")
+ return link
+}
+
+func getAttrVal(node *html.Node, attrName string) string {
+ for _, attr := range node.Attr {
+ if attr.Key == attrName {
+ return attr.Val
+ }
+ }
+
+ return ""
+}
+
+func FromHtmlNode(doc *html.Node) (string, error) {
+ ctx := textifyTraverseCtx{
+ Buf: bytes.Buffer{},
+ }
+ if err := ctx.traverse(doc); err != nil {
+ return "", err
+ }
+
+ text := strings.TrimSpace(newlineRe.ReplaceAllString(
+ strings.Replace(ctx.Buf.String(), "\n ", "\n", -1), "\n\n"))
+ return text, nil
+
+}
+
+func FromReader(reader io.Reader) (string, error) {
+ newReader, err := bom.NewReaderWithoutBom(reader)
+ if err != nil {
+ return "", err
+ }
+ doc, err := html.Parse(newReader)
+ if err != nil {
+ return "", err
+ }
+ return FromHtmlNode(doc)
+}
+
+func FromString(input string) (string, error) {
+ bs := bom.CleanBom([]byte(input))
+ text, err := FromReader(bytes.NewReader(bs))
+ if err != nil {
+ return "", err
+ }
+ return text, nil
+}
diff --git a/vendor/github.com/jaytaylor/html2text/html2text_test.go b/vendor/github.com/jaytaylor/html2text/html2text_test.go
new file mode 100644
index 000000000..b30d68ac9
--- /dev/null
+++ b/vendor/github.com/jaytaylor/html2text/html2text_test.go
@@ -0,0 +1,674 @@
+package html2text
+
+import (
+ "bytes"
+ "fmt"
+ "io/ioutil"
+ "path"
+ "regexp"
+ "strings"
+ "testing"
+)
+
+const (
+ destPath = "testdata"
+)
+
+func TestParseUTF8(t *testing.T) {
+ htmlFiles := []struct {
+ file string
+ keywordShouldNotExist string
+ keywordShouldExist string
+ }{
+ {
+ "utf8.html",
+ "学习之道:美国公认学习第一书title",
+ "次世界冠军赛上,我几近疯狂",
+ },
+ {
+ "utf8_with_bom.xhtml",
+ "1892年波兰文版序言title",
+ "种新的波兰文本已成为必要",
+ },
+ }
+
+ for _, htmlFile := range htmlFiles {
+ bs, err := ioutil.ReadFile(path.Join(destPath, htmlFile.file))
+ if err != nil {
+ t.Fatal(err)
+ }
+ text, err := FromReader(bytes.NewReader(bs))
+ if err != nil {
+ t.Fatal(err)
+ }
+ if !strings.Contains(text, htmlFile.keywordShouldExist) {
+ t.Fatalf("keyword %s should exists in file %s", htmlFile.keywordShouldExist, htmlFile.file)
+ }
+ if strings.Contains(text, htmlFile.keywordShouldNotExist) {
+ t.Fatalf("keyword %s should not exists in file %s", htmlFile.keywordShouldNotExist, htmlFile.file)
+ }
+ }
+}
+
+func TestStrippingWhitespace(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "test text",
+ "test text",
+ },
+ {
+ " \ttext\ntext\n",
+ "text text",
+ },
+ {
+ " \na \n\t \n \n a \t",
+ "a a",
+ },
+ {
+ "test text",
+ "test text",
+ },
+ {
+ "test&nbsp;&nbsp;&nbsp; text&nbsp;",
+ "test    text",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+}
+
+func TestParagraphsAndBreaks(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "Test text",
+ "Test text",
+ },
+ {
+ "Test text<br>",
+ "Test text",
+ },
+ {
+ "Test text<br>Test",
+ "Test text\nTest",
+ },
+ {
+ "<p>Test text</p>",
+ "Test text",
+ },
+ {
+ "<p>Test text</p><p>Test text</p>",
+ "Test text\n\nTest text",
+ },
+ {
+ "\n<p>Test text</p>\n\n\n\t<p>Test text</p>\n",
+ "Test text\n\nTest text",
+ },
+ {
+ "\n<p>Test text<br/>Test text</p>\n",
+ "Test text\nTest text",
+ },
+ {
+ "\n<p>Test text<br> \tTest text<br></p>\n",
+ "Test text\nTest text",
+ },
+ {
+ "Test text<br><BR />Test text",
+ "Test text\n\nTest text",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+}
+
+func TestTables(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "<table><tr><td></td><td></td></tr></table>",
+ "",
+ },
+ {
+ "<table><tr><td>cell1</td><td>cell2</td></tr></table>",
+ "cell1 cell2",
+ },
+ {
+ "<table><tr><td>row1</td></tr><tr><td>row2</td></tr></table>",
+ "row1\nrow2",
+ },
+ {
+ `<table>
+ <tr><td>cell1-1</td><td>cell1-2</td></tr>
+ <tr><td>cell2-1</td><td>cell2-2</td></tr>
+ </table>`,
+ "cell1-1 cell1-2\ncell2-1 cell2-2",
+ },
+ {
+ "_<table><tr><td>cell</td></tr></table>_",
+ "_\n\ncell\n\n_",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+}
+
+func TestStrippingLists(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "<ul></ul>",
+ "",
+ },
+ {
+ "<ul><li>item</li></ul>_",
+ "* item\n\n_",
+ },
+ {
+ "<li class='123'>item 1</li> <li>item 2</li>\n_",
+ "* item 1\n* item 2\n_",
+ },
+ {
+ "<li>item 1</li> \t\n <li>item 2</li> <li> item 3</li>\n_",
+ "* item 1\n* item 2\n* item 3\n_",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+}
+
+func TestLinks(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ `<a></a>`,
+ ``,
+ },
+ {
+ `<a href=""></a>`,
+ ``,
+ },
+ {
+ `<a href="http://example.com/"></a>`,
+ `( http://example.com/ )`,
+ },
+ {
+ `<a href="">Link</a>`,
+ `Link`,
+ },
+ {
+ `<a href="http://example.com/">Link</a>`,
+ `Link ( http://example.com/ )`,
+ },
+ {
+ `<a href="http://example.com/"><span class="a">Link</span></a>`,
+ `Link ( http://example.com/ )`,
+ },
+ {
+ "<a href='http://example.com/'>\n\t<span class='a'>Link</span>\n\t</a>",
+ `Link ( http://example.com/ )`,
+ },
+ {
+ "<a href='mailto:contact@example.org'>Contact Us</a>",
+ `Contact Us ( contact@example.org )`,
+ },
+ {
+ "<a href=\"http://example.com:80/~user?aaa=bb&amp;c=d,e,f#foo\">Link</a>",
+ `Link ( http://example.com:80/~user?aaa=bb&c=d,e,f#foo )`,
+ },
+ {
+ "<a title='title' href=\"http://example.com/\">Link</a>",
+ `Link ( http://example.com/ )`,
+ },
+ {
+ "<a href=\" http://example.com/ \"> Link </a>",
+ `Link ( http://example.com/ )`,
+ },
+ {
+ "<a href=\"http://example.com/a/\">Link A</a> <a href=\"http://example.com/b/\">Link B</a>",
+ `Link A ( http://example.com/a/ ) Link B ( http://example.com/b/ )`,
+ },
+ {
+ "<a href=\"%%LINK%%\">Link</a>",
+ `Link ( %%LINK%% )`,
+ },
+ {
+ "<a href=\"[LINK]\">Link</a>",
+ `Link ( [LINK] )`,
+ },
+ {
+ "<a href=\"{LINK}\">Link</a>",
+ `Link ( {LINK} )`,
+ },
+ {
+ "<a href=\"[[!unsubscribe]]\">Link</a>",
+ `Link ( [[!unsubscribe]] )`,
+ },
+ {
+ "<p>This is <a href=\"http://www.google.com\" >link1</a> and <a href=\"http://www.google.com\" >link2 </a> is next.</p>",
+ `This is link1 ( http://www.google.com ) and link2 ( http://www.google.com ) is next.`,
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+}
+
+func TestImageAltTags(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ `<img />`,
+ ``,
+ },
+ {
+ `<img src="http://example.ru/hello.jpg" />`,
+ ``,
+ },
+ {
+ `<img alt="Example"/>`,
+ ``,
+ },
+ {
+ `<img src="http://example.ru/hello.jpg" alt="Example"/>`,
+ ``,
+ },
+ // Images do matter if they are in a link
+ {
+ `<a href="http://example.com/"><img src="http://example.ru/hello.jpg" alt="Example"/></a>`,
+ `Example ( http://example.com/ )`,
+ },
+ {
+ `<a href="http://example.com/"><img src="http://example.ru/hello.jpg" alt="Example"></a>`,
+ `Example ( http://example.com/ )`,
+ },
+ {
+ `<a href='http://example.com/'><img src='http://example.ru/hello.jpg' alt='Example'/></a>`,
+ `Example ( http://example.com/ )`,
+ },
+ {
+ `<a href='http://example.com/'><img src='http://example.ru/hello.jpg' alt='Example'></a>`,
+ `Example ( http://example.com/ )`,
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+}
+
+func TestHeadings(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "<h1>Test</h1>",
+ "****\nTest\n****",
+ },
+ {
+ "\t<h1>\nTest</h1> ",
+ "****\nTest\n****",
+ },
+ {
+ "\t<h1>\nTest line 1<br>Test 2</h1> ",
+ "***********\nTest line 1\nTest 2\n***********",
+ },
+ {
+ "<h1>Test</h1> <h1>Test</h1>",
+ "****\nTest\n****\n\n****\nTest\n****",
+ },
+ {
+ "<h2>Test</h2>",
+ "----\nTest\n----",
+ },
+ {
+ "<h1><a href='http://example.com/'>Test</a></h1>",
+ "****************************\nTest ( http://example.com/ )\n****************************",
+ },
+ {
+ "<h3> <span class='a'>Test </span></h3>",
+ "Test\n----",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+
+}
+
+func TestBold(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "<b>Test</b>",
+ "*Test*",
+ },
+ {
+ "\t<b>Test</b> ",
+ "*Test*",
+ },
+ {
+ "\t<b>Test line 1<br>Test 2</b> ",
+ "*Test line 1\nTest 2*",
+ },
+ {
+ "<b>Test</b> <b>Test</b>",
+ "*Test* *Test*",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+
+}
+
+func TestDiv(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "<div>Test</div>",
+ "Test",
+ },
+ {
+ "\t<div>Test</div> ",
+ "Test",
+ },
+ {
+ "<div>Test line 1<div>Test 2</div></div>",
+ "Test line 1\nTest 2",
+ },
+ {
+ "Test 1<div>Test 2</div> <div>Test 3</div>Test 4",
+ "Test 1\nTest 2\nTest 3\nTest 4",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+
+}
+
+func TestBlockquotes(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "<div>level 0<blockquote>level 1<br><blockquote>level 2</blockquote>level 1</blockquote><div>level 0</div></div>",
+ "level 0\n> \n> level 1\n> \n>> level 2\n> \n> level 1\n\nlevel 0",
+ },
+ {
+ "<blockquote>Test</blockquote>Test",
+ "> \n> Test\n\nTest",
+ },
+ {
+ "\t<blockquote> \nTest<br></blockquote> ",
+ "> \n> Test\n>",
+ },
+ {
+ "\t<blockquote> \nTest line 1<br>Test 2</blockquote> ",
+ "> \n> Test line 1\n> Test 2",
+ },
+ {
+ "<blockquote>Test</blockquote> <blockquote>Test</blockquote> Other Test",
+ "> \n> Test\n\n> \n> Test\n\nOther Test",
+ },
+ {
+ "<blockquote>Lorem ipsum Commodo id consectetur pariatur ea occaecat minim aliqua ad sit consequat quis ex commodo Duis incididunt eu mollit consectetur fugiat voluptate dolore in pariatur in commodo occaecat Ut occaecat velit esse labore aute quis commodo non sit dolore officia Excepteur cillum amet cupidatat culpa velit labore ullamco dolore mollit elit in aliqua dolor irure do</blockquote>",
+ "> \n> Lorem ipsum Commodo id consectetur pariatur ea occaecat minim aliqua ad\n> sit consequat quis ex commodo Duis incididunt eu mollit consectetur fugiat\n> voluptate dolore in pariatur in commodo occaecat Ut occaecat velit esse\n> labore aute quis commodo non sit dolore officia Excepteur cillum amet\n> cupidatat culpa velit labore ullamco dolore mollit elit in aliqua dolor\n> irure do",
+ },
+ {
+ "<blockquote>Lorem<b>ipsum</b><b>Commodo</b><b>id</b><b>consectetur</b><b>pariatur</b><b>ea</b><b>occaecat</b><b>minim</b><b>aliqua</b><b>ad</b><b>sit</b><b>consequat</b><b>quis</b><b>ex</b><b>commodo</b><b>Duis</b><b>incididunt</b><b>eu</b><b>mollit</b><b>consectetur</b><b>fugiat</b><b>voluptate</b><b>dolore</b><b>in</b><b>pariatur</b><b>in</b><b>commodo</b><b>occaecat</b><b>Ut</b><b>occaecat</b><b>velit</b><b>esse</b><b>labore</b><b>aute</b><b>quis</b><b>commodo</b><b>non</b><b>sit</b><b>dolore</b><b>officia</b><b>Excepteur</b><b>cillum</b><b>amet</b><b>cupidatat</b><b>culpa</b><b>velit</b><b>labore</b><b>ullamco</b><b>dolore</b><b>mollit</b><b>elit</b><b>in</b><b>aliqua</b><b>dolor</b><b>irure</b><b>do</b></blockquote>",
+ "> \n> Lorem *ipsum* *Commodo* *id* *consectetur* *pariatur* *ea* *occaecat* *minim*\n> *aliqua* *ad* *sit* *consequat* *quis* *ex* *commodo* *Duis* *incididunt* *eu*\n> *mollit* *consectetur* *fugiat* *voluptate* *dolore* *in* *pariatur* *in* *commodo*\n> *occaecat* *Ut* *occaecat* *velit* *esse* *labore* *aute* *quis* *commodo*\n> *non* *sit* *dolore* *officia* *Excepteur* *cillum* *amet* *cupidatat* *culpa*\n> *velit* *labore* *ullamco* *dolore* *mollit* *elit* *in* *aliqua* *dolor* *irure*\n> *do*",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+
+}
+
+func TestIgnoreStylesScriptsHead(t *testing.T) {
+ testCases := []struct {
+ input string
+ output string
+ }{
+ {
+ "<style>Test</style>",
+ "",
+ },
+ {
+ "<style type=\"text/css\">body { color: #fff; }</style>",
+ "",
+ },
+ {
+ "<link rel=\"stylesheet\" href=\"main.css\">",
+ "",
+ },
+ {
+ "<script>Test</script>",
+ "",
+ },
+ {
+ "<script src=\"main.js\"></script>",
+ "",
+ },
+ {
+ "<script type=\"text/javascript\" src=\"main.js\"></script>",
+ "",
+ },
+ {
+ "<script type=\"text/javascript\">Test</script>",
+ "",
+ },
+ {
+ "<script type=\"text/ng-template\" id=\"template.html\"><a href=\"http://google.com\">Google</a></script>",
+ "",
+ },
+ {
+ "<script type=\"bla-bla-bla\" id=\"template.html\">Test</script>",
+ "",
+ },
+ {
+ `<html><head><title>Title</title></head><body></body></html>`,
+ "",
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertString(t, testCase.input, testCase.output)
+ }
+}
+
+func TestText(t *testing.T) {
+ testCases := []struct {
+ input string
+ expr string
+ }{
+ {
+ `<li>
+ <a href="/new" data-ga-click="Header, create new repository, icon:repo"><span class="octicon octicon-repo"></span> New repository</a>
+ </li>`,
+ `\* New repository \( /new \)`,
+ },
+ {
+ `hi
+
+ <br>
+
+ hello <a href="https://google.com">google</a>
+ <br><br>
+ test<p>List:</p>
+
+ <ul>
+ <li><a href="foo">Foo</a></li>
+ <li><a href="http://www.microshwhat.com/bar/soapy">Barsoap</a></li>
+ <li>Baz</li>
+ </ul>
+`,
+ `hi
+hello google \( https://google.com \)
+
+test
+
+List:
+
+\* Foo \( foo \)
+\* Barsoap \( http://www.microshwhat.com/bar/soapy \)
+\* Baz`,
+ },
+ // Malformed input html.
+ {
+ `hi
+
+ hello <a href="https://google.com">google</a>
+
+ test<p>List:</p>
+
+ <ul>
+ <li><a href="foo">Foo</a>
+ <li><a href="/
+ bar/baz">Bar</a>
+ <li>Baz</li>
+ </ul>
+ `,
+ `hi hello google \( https://google.com \) test
+
+List:
+
+\* Foo \( foo \)
+\* Bar \( /\n[ \t]+bar/baz \)
+\* Baz`,
+ },
+ }
+
+ for _, testCase := range testCases {
+ assertRegexp(t, testCase.input, testCase.expr)
+ }
+}
+
+type StringMatcher interface {
+ MatchString(string) bool
+ String() string
+}
+
+type RegexpStringMatcher string
+
+func (m RegexpStringMatcher) MatchString(str string) bool {
+ return regexp.MustCompile(string(m)).MatchString(str)
+}
+func (m RegexpStringMatcher) String() string {
+ return string(m)
+}
+
+type ExactStringMatcher string
+
+func (m ExactStringMatcher) MatchString(str string) bool {
+ return string(m) == str
+}
+func (m ExactStringMatcher) String() string {
+ return string(m)
+}
+
+func assertRegexp(t *testing.T, input string, outputRE string) {
+ assertPlaintext(t, input, RegexpStringMatcher(outputRE))
+}
+
+func assertString(t *testing.T, input string, output string) {
+ assertPlaintext(t, input, ExactStringMatcher(output))
+}
+
+func assertPlaintext(t *testing.T, input string, matcher StringMatcher) {
+ text, err := FromString(input)
+ if err != nil {
+ t.Error(err)
+ }
+ if !matcher.MatchString(text) {
+ t.Errorf("Input did not match expression\n"+
+ "Input:\n>>>>\n%s\n<<<<\n\n"+
+ "Output:\n>>>>\n%s\n<<<<\n\n"+
+ "Expected output:\n>>>>\n%s\n<<<<\n\n",
+ input, text, matcher.String())
+ } else {
+ t.Logf("input:\n\n%s\n\n\n\noutput:\n\n%s\n", input, text)
+ }
+}
+
+func Example() {
+ inputHtml := `
+ <html>
+ <head>
+ <title>My Mega Service</title>
+ <link rel=\"stylesheet\" href=\"main.css\">
+ <style type=\"text/css\">body { color: #fff; }</style>
+ </head>
+
+ <body>
+ <div class="logo">
+ <a href="http://mymegaservice.com/"><img src="/logo-image.jpg" alt="Mega Service"/></a>
+ </div>
+
+ <h1>Welcome to your new account on my service!</h1>
+
+ <p>
+ Here is some more information:
+
+ <ul>
+ <li>Link 1: <a href="https://example.com">Example.com</a></li>
+ <li>Link 2: <a href="https://example2.com">Example2.com</a></li>
+ <li>Something else</li>
+ </ul>
+ </p>
+ </body>
+ </html>
+ `
+
+ text, err := FromString(inputHtml)
+ if err != nil {
+ panic(err)
+ }
+ fmt.Println(text)
+
+ // Output:
+ // Mega Service ( http://mymegaservice.com/ )
+ //
+ // ******************************************
+ // Welcome to your new account on my service!
+ // ******************************************
+ //
+ // Here is some more information:
+ //
+ // * Link 1: Example.com ( https://example.com )
+ // * Link 2: Example2.com ( https://example2.com )
+ // * Something else
+}
diff --git a/vendor/github.com/jaytaylor/html2text/testdata/utf8.html b/vendor/github.com/jaytaylor/html2text/testdata/utf8.html
new file mode 100755
index 000000000..53d401ce9
--- /dev/null
+++ b/vendor/github.com/jaytaylor/html2text/testdata/utf8.html
@@ -0,0 +1,22 @@
+<?xml version='1.0' encoding='utf-8'?>
+<html xmlns="http://www.w3.org/1999/xhtml">
+
+<head>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
+ <title>学习之道:美国公认学习第一书title</title>
+ <link href="stylesheet.css" rel="stylesheet" type="text/css" />
+ <link href="page_styles.css" rel="stylesheet" type="text/css" />
+</head>
+
+<body class="calibre">
+ <p id="filepos9452" class="calibre_"><span class="calibre6"><span class="bold">写在前面的话</span></span>
+ </p>
+ <p class="calibre_12">在台湾的那次世界冠军赛上,我几近疯狂,直至两年后的今天,我仍沉浸在这次的经历中。这是我生平第一次如此深入地审视我自己,甚至是第一次尝试审视自己。这个过程令人很是兴奋,同时也有点感觉怪异。我重新认识了自我,看到了自己的另外一面,自己从未发觉的另外一面。为了生存,为了取胜,我成了一名角斗士,彻头彻尾,简单纯粹。我并没有意识到这一角色早已在我的心中生根发芽,呼之欲出。也许,他的出现已是不可避免。</p>
+ <p class="calibre_7">而我这全新的一面,与我一直熟识的那个乔希,那个曾经害怕黑暗的孩子,那个象棋手,那个狂热于雨水、反复诵读杰克·克鲁亚克作品的年轻人之间,又有什么样的联系呢?这些都是我正在努力弄清楚的问题。</p>
+ <p class="calibre_7">自台湾赛事之后,我急切非常,一心想要回到训练中去,摆脱自己已经达到巅峰的想法。在过去的两年中,我已经重新开始。这是一个新的起点。前方的路还很长,有待进一步的探索。</p>
+ <p class="calibre_7">这本书的创作耗费了相当多的时间和精力。在成长的过程中,我在我的小房间里从未想过等待我的会是这样的战斗。在创作中,我的思想逐渐成熟;爱恋从分崩离析,到失而复得,世界冠军头衔从失之交臂,到囊中取物。如果说在我人生的第一个二十九年中,我学到了什么,那就是,我们永远无法预测结局,无论是重要的比赛、冒险,还是轰轰烈烈的爱情。我们唯一可以肯定的只有,出乎意料。不管我们做了多么万全的准备,在生活的真实场景中,我们总是会处于陌生的境地。我们也许会无法冷静,失去理智,感觉似乎整个世界都在针对我们。在这个时候,我们所要做的是要付出加倍的努力,要表现得比预想得更好。我认为,关键在于准备好随机应变,准备好在所能想象的高压下发挥出创造力。</p>
+ <p class="calibre_7">读者朋友们,我非常希望你们在读过这本书后,可以得到启发,甚至会得到触动,从而能够根据各自的天赋与特长,去实现自己的梦想。这就是我写作此书的目的。我在字里行间所传达的理念曾经使我受益匪浅,我很希望它们可以为大家提供一个基本的框架和方向。如果我的方法言之有理,那么就请接受它,琢磨它,并加之自己的见解。忘记我的那些数字。真正的掌握需要通过自己发现一些最能够引起共鸣的信息,并将其彻底地融合进来,直至成为一体,这样我们才能随心所欲地驾驭它。</p>
+ <div class="mbp_pagebreak" id="calibre_pb_4"></div>
+</body>
+
+</html> \ No newline at end of file
diff --git a/vendor/github.com/jaytaylor/html2text/testdata/utf8_with_bom.xhtml b/vendor/github.com/jaytaylor/html2text/testdata/utf8_with_bom.xhtml
new file mode 100755
index 000000000..68f0ee707
--- /dev/null
+++ b/vendor/github.com/jaytaylor/html2text/testdata/utf8_with_bom.xhtml
@@ -0,0 +1,24 @@
+<?xml version="1.0" encoding="utf-8" ?>
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="zh-CN">
+
+<head>
+ <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
+ <title>1892年波兰文版序言title</title>
+ <link rel="stylesheet" href="css/stylesheet.css" type="text/css" />
+</head>
+
+<body>
+ <div id="page30" />
+ <h2 id="CHP2-6">1892年波兰文版序言<a id="wzyy_18_30" href="#wz_18_30"><sup>[18]</sup></a></h2>
+ <p>出版共产主义宣言的一种新的波兰文本已成为必要,这一事实,引起了许多感想。</p>
+ <p>首先值得注意的是,近来宣言在一定程度上已成为欧洲大陆大工业发展的一种尺度。一个国家的大工业越发展,该国工人中想认清自己作为工人阶级在有产阶级面前所处地位的要求就越增加,他们中间的社会主义运动也越扩大,因而对宣言的需求也越增长。这样,根据宣言用某国文字销行的份数,不仅能够相当确切地断定该国工人运动的状况,而且还能够相当确切地断定该国大工业发展的程度。</p>
+ <p>因此,波兰文的新版本标志着波兰工业的决定性进步。从十年前发表的上一个版本以来确实有了这种进步,对此丝毫不容置疑。俄国的波兰,会议的波兰<a id="wzyy_19_30" href="#wz_19_30"><sup>[19]</sup></a>,成了俄罗斯帝国巨大的工业区。俄国大工业是零星分散的,一部分在芬兰湾沿岸,一部分在中央区(莫斯科和弗拉基米尔),第三部分在黑海和亚速海沿岸,还有另一些散布在别处;而波兰工业则紧缩于相对狭小的地区,享受到由这种积聚引起的长处与短处。这种长处是竞争着的俄罗斯工厂主所承认的,他们要求实行保护关税以对付波兰,尽管他们渴望使波兰人俄罗斯化。这种短处,对波兰工厂主与俄罗斯政府来说,表现在社会主义思想在波兰工人中间的迅速传播和对宣言需求的增长。</p>
+ <p>但是,波兰工业的迅速发展——它超过了俄国工业——本身<a id="page31" />是波兰人民的坚强生命力的一个新证明,是波兰人民临近的民族复兴的一个新保证。而一个独立强盛的波兰的复兴,不只是一件同波兰人有关、而且是同我们大家有关的事情。只有当每个民族在自己内部完全自主时,欧洲各民族间真诚的国际合作才是可能的。1848年革命在无产阶级旗帜下,使无产阶级的战士最终只作了资产阶级的工作,这次革命通过自己遗嘱的执行者路易·波拿巴和俾斯麦也实现了意大利、德国和匈牙利的独立。然而波兰,它从1792年以来为革命做的比所有这三个国家总共做的还要多,而当它1863年失败于强大十倍的俄军的时候,人们却把它抛弃不顾了。贵族既未能保持住、也未能重新争得波兰的独立;今天波兰的独立对资产阶级至少是无所谓的。然而波兰的独立对于欧洲各民族和谐的合作是必需的。这种独立只有年轻的波兰无产阶级才能争得,而且在它的手中会很好地保持住。因为欧洲所有其余的工人都象波兰工人自己一样也需要波兰的独立。</p>
+ <p>弗·恩格斯</p>
+ <p>1892年2月10日于伦敦</p>
+ <div id="page74" />
+ <div><a id="wz_18_30" href="#wzyy_18_30">[18]</a> 恩格斯用德文为《宣言》新的波兰文本写了这篇序言。1892年由波兰社会主义者在伦敦办的《黎明》杂志社出版。序言寄出后,恩格斯写信给门德尔森(1892年2月11日),信中说,他很愿意学会波兰文,并且深入研究波兰工人运动的发展,以便能够为《宣言》的下一版写一篇更详细的序言。——第20页</div>
+ <div><a id="wz_19_30" href="#wzyy_19_30">[19]</a> 指维也纳会议的波兰,即根据1814—1815年维也纳会议的决定,以波兰王国的正式名义割给俄国的那部分波兰土地。——第20页</div>
+</body>
+
+</html> \ No newline at end of file