Skip to content

SwiftSoup: Pure Swift HTML Parser, with best of DOM, CSS, and jquery (Supports Linux, iOS, Mac, tvOS, watchOS)

License

Notifications You must be signed in to change notification settings

scinfu/SwiftSoup

Repository files navigation

SwiftSoup

Build Status Version License Platform

SwiftSoup is a Swift library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jquery-like methods. SwiftSoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.

  • scrape and parse HTML from a URL, file, or string
  • find and extract data, using DOM traversal or CSS selectors
  • manipulate the HTML elements, attributes, and text
  • clean user-submitted content against a safe white-list, to prevent XSS attacks
  • output tidy HTML SwiftSoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; SwiftSoup will create a sensible parse tree.

Installation

SwiftSoup is available through CocoaPods. To install it, simply add the following line to your Podfile:

pod "SwiftSoup"

Exemple

To parse a HTML document:

let html = "<html><head><title>First parse</title></head>"
			+ "<body><p>Parsed HTML into a doc.</p></body></html>"
		let doc: Document = try SwiftSoup.parse(html)
		return try doc.text()
  • unclosed tags (e.g. <p>Lorem <p>Ipsum parses to <p>Lorem</p> <p>Ipsum</p>)
  • implicit tags (e.g. a naked <td>Table data</td> is wrapped into a <table><tr><td>...)
  • reliably creating the document structure (html containing a head and body, and only appropriate elements within the head)

###The object model of a document

  • Documents consist of Elements and TextNodes
  • The inheritance chain is: Document extends Element extends Node.TextNode extends Node.
  • An Element contains a list of children Nodes, and has one parent Element. They also have provide a filtered list of child Elements only.

Author

Nabil Chatbi, scinfu@gmail.com

Note

SwiftSoup was ported to Swift from Java Jsoup library.

License

SwiftSoup is available under the MIT license. See the LICENSE file for more info.

About

SwiftSoup: Pure Swift HTML Parser, with best of DOM, CSS, and jquery (Supports Linux, iOS, Mac, tvOS, watchOS)

Topics

Resources

License

Stars

Watchers

Forks

Sponsor this project

 

Packages

No packages published

Languages