Start documentation
Mon Nov 23 13:13:22 UTC 2009 pix@kepibu.org
* Start documentation
hunk ./notes 2
+* Purpose
+"Oh, Ducks!" is an extension to cl-unification to make parsing
+structured documents easy, using CSS selectors.
+* Installation
+** Prerequisites
+ + cl-unification
+ + cl-ppcre
+ + split-sequence
+ + alexandria
+ + asdf-system-connections
+ * closure-html
+ * cxml
+[+] Mandatory [*] Optional
+** Loading
+Loading "Oh, Ducks!" is just like loading any other ASDF system.
+However, because it does not mandate a particular HTML or XML parser,
+it does not generally become useful until you have also loaded an
+HTML/XML parsing library such as cxml or closure-html.
+
+Start with:
+ :(asdf:oos 'asdf:load-op :oh-ducks)
+If you would like to use the built-in support for parsing via
+closure-html (which you almost certainly do), you'll also want to load
+closure-html:
+ :(asdf:oos 'asdf:load-op :closure-html)
+And, if you want to use DOM objects provided by cxml:
+ :(asdf:oos 'asdf:load-op :cxml)
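+
+On ASDF 2 and later, asdf:load-system is a shorter spelling of the
+same operation. For example, to load everything mentioned above:
+ :(asdf:load-system :oh-ducks)
+ :(asdf:load-system :closure-html)
+ :(asdf:load-system :cxml)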
+
+** Load-order Caveats
+closure-html and cl-unification each define competing readers on #t.
+To avoid load-order issues resulting in an indeterminate reader on #t,
+you'll probably want to add
+ :#.(set-dispatch-macro-character #\# #\T 'unify::|sharp-T-reader|)
+to the top of any file which uses cl-unification's reader templates.
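+
+One way to keep that change from leaking into other files is to give
+the file its own copy of the readtable first. This is only a sketch,
+and relies on the standard behaviour that LOAD and COMPILE-FILE rebind
+*readtable* for the duration of each file:
+ :(eval-when (:compile-toplevel :load-toplevel :execute)
+ :  ;; Work on a private copy so other files keep their own #t reader.
+ :  (setf *readtable* (copy-readtable))
+ :  (set-dispatch-macro-character #\# #\T 'unify::|sharp-T-reader|))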
+
+Please feel free to submit patches to closure-html and cl-unification
+to fix this problem.
+* Usage
+The combination of oh-ducks and closure-html provides an HTML template
+for use with cl-unification, and has the following syntax:
+
+  (match (#t(html [(:model <model>)]
+                  <selectors>+)
+          <document>)
+    &body)
+  selectors := (<selector> . <binding>) |
+               (<selector> . <template>) |
+               (<selector> <selectors>+)
+  document  := <parsed-document> | <document-to-be-parsed>
+
+:model is only necessary for unparsed documents (e.g., a pathname or string).
+
+** Examples
+
+(match (#T(html (:model lhtml)
+                ("#id" . ?div))
+        "<div id=\"id\">I <i>like</i> cheese.</div>")
+  (car div)) =>
+  (:div ((:id "id")) "I " (:i () "like") " cheese.")
+
+(match (#T(html (:model dom)
+                ("i" . #t(list ?j ?i))
+                ("span>i" . ?span))
+        "<div>I do <i>not</i> like cheese.</div><div><span>I like <i>cheese</i>.</span></div>")
+  (values i span)) =>
+  #<ELEMENT i "not">,
+  (#<ELEMENT i "cheese">)
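+
+The nested form (<selector> <selectors>+) from the grammar above is
+not shown in these examples. The following is only a sketch of how it
+would be written. That the inner selector is matched relative to each
+element found by the outer one is an assumption, and no result is
+shown because it has not been verified:
+
+(match (#T(html (:model dom)
+                ("div" ("i" . ?i)))
+        "<div>I do <i>not</i> like cheese.</div>")
+  i)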
+
+** Selectors
+
+The goal is to support all CSS level 3 selectors. See the section
+"To Do > Improve Selector Support" below for a list of currently
+unsupported simple selectors and combinators.
+
+Each selector should match the same elements that the equivalent CSS
+selector would apply to. That is,
+ #id => elements with id of "id".
+ .foo.bar => elements with both "foo" and "bar" classes
+ div => all <div>s
+and so forth.
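+
+As a sketch of how those selectors look inside a template (the
+document string is made up for illustration, and the shape of the
+values bound to each variable is not shown because it depends on the
+model):
+
+(match (#T(html (:model lhtml)
+                (".foo.bar" . ?both)
+                ("div" . ?divs))
+        "<div class=\"foo bar\">a</div><div class=\"foo\">b</div>")
+  (values both divs))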
+
+*** Limitations
+
+Currently, selector terms are limited to alphanumeric characters, and
+do not support CSS-style character escapes. Patches welcome!
+
+** Included Object Models
+*** LHTML (closure-html)
+A list-based structure provided by closure-html. Cannot be used with
+selectors that need to ask about an element's parent or siblings.
+*** PT (closure-html)
+A structure-based representation provided by closure-html.
+*** DOM (cxml)
+DOM objects as provided by cxml and defined by the W3C.
+* Extending
+** Adding an object model
+While the supported models should generally be sufficient, you can add
+your own fairly easily. All models are expected to implement the
+generic functions in <traversal/interface.lisp>. See the other files
+under the traversal/ directory for examples.
+
+You might also want to see chtml.lisp and cxml.lisp.
+** Adding a selector or combinator
+See <selectors.lisp>. Generally, you should add a class which is a
+subclass of combinator or simple-selector, augment parse-selector with
+an appropriate regular expression, and define a method on
+element-matches-p.
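+
+As a rough illustration of that pattern, here is a self-contained
+sketch. It does not use the real definitions: simple-selector and
+element-matches-p below are stand-ins defined in a scratch package,
+the argument order is a guess, and attribute-equals-selector is a
+made-up selector operating on LHTML-style lists. The parse-selector
+step is omitted. Check <selectors.lisp> for the actual signatures.
+
+(defpackage :selector-sketch (:use :cl))
+(in-package :selector-sketch)
+
+;; Stand-ins for the real class and generic function.
+(defclass simple-selector () ())
+(defgeneric element-matches-p (element selector))
+
+;; Hypothetical selector: attribute NAME must equal VALUE.
+(defclass attribute-equals-selector (simple-selector)
+  ((name  :initarg :name  :reader selector-name)
+   (value :initarg :value :reader selector-value)))
+
+;; LHTML elements look like (:tag ((:attr "value") ...) children...).
+(defmethod element-matches-p ((element cons)
+                              (selector attribute-equals-selector))
+  (let ((attribute (assoc (selector-name selector) (second element))))
+    (and attribute
+         (equal (second attribute) (selector-value selector)))))
+
+(element-matches-p '(:div ((:id "id")) "text")
+                   (make-instance 'attribute-equals-selector
+                                  :name :id :value "id")) =>
+  T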
+
+I also recommend submitting a patch. Other people might want to use
+that selector, too!
+* Known Bugs
+** Does not error on unknown CSS selectors
+** Failure to match results in NIL, rather than a unification-failure
hunk ./notes 145
+** Submit patch to cl-unification to add (enable/disable-template-reader) functions
+** Submit patch to closure-html to add (enable/disable-reader) functions
+** non-css templates (e.g., for matching on text of element)?