HTMLtoMochiDOM HTML to MochiDOM script
by Matt Harrison
MochiKit's DOM utilities
MochiKit, a library that aims to "make JavaScript suck less" (that is, it provides AJAX and other JavaScript functionality), includes DOM utilities for creating HTML programmatically rather than embedding HTML in a string. Building the DOM programmatically has a few benefits, such as escaping of entities and guaranteed well-formed HTML. It may seem like extra work, though, to those used to writing their HTML by hand. Or say you are a web developer who gets design code from a GUI person: now you have to convert their pretty HTML to the DOM API, which is no fun and error prone. On that note, I created a script that converts well-formed chunks of HTML (that is, HTML that parses as XML) to JavaScript code for MochiKit.
Here's a quick example. Say my friendly GUI developer gave me the following HTML and I wanted to place it in a div using MochiKit:
<div class="section" id="description">
<h1><a name="description">Description</a></h1>
<p>As you probably know, the DOM APIs are some of the most painful Java-inspired
APIs you'll run across from a highly dynamic language. Don't worry about that
though, because they provide a reasonable basis to build something that
sucks a lot less.</p></div>
The corresponding JavaScript code using MochiKit is this:
DIV({'class': 'section', 'id': 'description'},
 H1(null,
  A({'name': 'description'},
   "Description"
  )
 ),
 P(null,
  "As you probably know, the DOM APIs are some of the most painful Java-inspired
APIs you'll run across from a highly dynamic language. Don't worry about that
though, because they provide a reasonable basis to build something that
sucks a lot less."
 )
);
If you want to do this by hand, by all means go ahead. But I've written a script that takes well-formed HTML and spits out the JavaScript code for you.
Script to convert HTML to MochiDOM
The script is written in Python and requires the ElementTree XML parsing library (part of the standard library as xml.etree.ElementTree since Python 2.5).
To run the script, save the contents of the file shown below to htmlToMochiDom.py. Typing

python htmlToMochiDom.py

will run the unit tests included in the code. If you pass a filename as an additional argument, then instead of running the tests, it will spit out the JavaScript code (or tell you that your HTML wasn't well-formed). If my UI person had sent me a snippet of HTML called snippet.html, I would type

python htmlToMochiDom.py snippet.html

to generate JavaScript from this HTML.
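Whether a snippet will be accepted can be checked before running the converter. The sketch below is not part of the script: it assumes a Python version where xml.etree.ElementTree raises ParseError on malformed input, and is_well_formed is a hypothetical helper name chosen for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(snippet):
    """Return True if snippet parses as a single XML element.

    Hypothetical helper, not part of htmlToMochiDom.py.
    """
    try:
        ET.fromstring(snippet)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<table><tr/></table>"))  # every tag is closed
print(is_well_formed("<table id='foo'>"))      # unclosed tag
print(is_well_formed("<p>one<br>two</p>"))     # HTML-style <br> is never closed
```

Typical culprits in hand-written HTML are unclosed tags such as a bare br or img, and unquoted attribute values.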
This is a live Document
Feel free to edit this document as necessary (it's in a wiki). If you have questions, comments, suggestions, you can either email me (mharrison at spikesource DOT com), or create a FAQ section in this page. Enjoy and have fun creating javascript!
htmlToMochiDom.py
#!/usr/bin/env python
# Copyright(c) 2004, SpikeSource Inc. All Rights Reserved.
# Licensed under the Open Source License version 2.1
# (See http://www.spikesource.com/license.html)
try:
    # ElementTree ships with Python 2.5+ as xml.etree.ElementTree
    from xml.etree.ElementTree import XML
except ImportError:
    # older Pythons need the standalone elementtree package
    from elementtree.ElementTree import XML
import sys
import unittest

__author__ = "matt harrison"
__license__ = "osl 2.1"
__version__ = "0.01"

INDENT_START = 0  # adjust these two values for spacing...
INDENT_INCR = 1

# tags for which a MochiKit DOM function is emitted; other tags are skipped
ELEMENTS = ["A", "DIV", "INPUT", "SPAN", "TABLE", "TBODY", "THEAD", "TFOOT",
            "TR", "TD", "TH", "UL", "OL", "LI", "H1", "H2", "H3", "BR", "HR",
            "LABEL", "TEXTAREA", "FORM", "P", "IMG"]

def _getFileContent(filename):
    fin = open(filename, 'r')
    try:
        return fin.read()
    finally:
        fin.close()

def _getElementTree(content):
    try:
        return XML(content)
    except Exception:
        # the parser raises on anything that is not well-formed XML
        print("Please make sure content is valid XML:\n%s" % content)
        sys.exit(1)

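def _getAttr(elem):
    if elem.attrib:
        # sort by name so the output does not depend on dict ordering
        items = sorted(elem.attrib.items())
        return "{%s}" % ", ".join(["%r: %r" % (k, v) for k, v in items])
    return "null"
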
def _elemToDOM(elem, indent=INDENT_START):
    if elem.tag.upper() in ELEMENTS:
        attrs = _getAttr(elem)
        children = _getChildren(elem, indent+INDENT_INCR)
        if not children:
            if attrs == "null":
                #just close current element
                return "%s%s()"%(indent*" ", elem.tag.upper())
            else:
                #has attribute and no children
                return "%s%s(%s)"%(indent*" ", elem.tag.upper(), attrs)
        else:
            return "%s%s(%s,\n%s\n%s)"%(indent*" ", elem.tag.upper(),
                                        attrs, children, indent*" ")
    return ""

def _getChildren(elem, indent):
    content = []
    if elem.text and elem.text.strip(): #strip to not include newlines...
        content.append('%s"%s"'%(indent*" ",elem.text))
    for child in elem:
        if child.tag.upper() in ELEMENTS:
            content.append(_elemToDOM(child, indent))
        #deal with mixed content... this is a little weird
        #see http://effbot.org/zone/element-infoset.htm#mixed-content
        if child.tail and child.tail.strip():
            content.append('%s"%s"'%(indent*" ",child.tail))
    if content:
        return ",\n".join(content)
    return ""

def textToDOM(text):
    tree = _getElementTree(text)
    dom = _elemToDOM(tree)
    return dom+";"

class TestDom(unittest.TestCase):
    def testSimple(self):
        html = "<table/>"
        dom = textToDOM(html)
        self.assertEqual(dom,"TABLE();")

    def testSimple2(self):
        html = "<table><tr/></table>"
        dom = textToDOM(html)
        self.assertEqual(dom,"""TABLE(null,
 TR()
);""")

    def testMultipleChildren(self):
        html = "<table><tr/><tr/></table>"
        dom = textToDOM(html)
        self.assertEqual(dom,"""TABLE(null,
 TR(),
 TR()
);""")

    def testAttr(self):
        html = "<table id='foo' bar='baz'/>"
        dom = textToDOM(html)
        self.assertEqual(dom,"TABLE({'bar': 'baz', 'id': 'foo'});")

    def testText(self):
        html = "<table>text</table>"
        dom = textToDOM(html)
        self.assertEqual(dom,"""TABLE(null,
 "text"
);""")

    def testMixedContent(self):
        html = "<div>content<a/>more content</div>"
        dom = textToDOM(html)
        self.assertEqual(dom,"""DIV(null,
 "content",
 A(),
 "more content"
);""")

    def testBadXML(self):
        #bad XML makes _getElementTree call sys.exit(1), which raises SystemExit
        html = "<table id='foo' bar='baz'>"
        self.assertRaises(SystemExit, textToDOM, html)

    def testLonger(self):
        txt = """<div class="section" id="description">
<h1><a name="description">Description</a></h1>
<p>As you probably know, the DOM APIs are some of the most painful Java-inspired
APIs you'll run across from a highly dynamic language. Don't worry about that
though, because they provide a reasonable basis to build something that
sucks a lot less.</p></div>"""
        dom = textToDOM(txt)
        self.assertEqual(dom,"""DIV({'class': 'section', 'id': 'description'},
 H1(null,
  A({'name': 'description'},
   "Description"
  )
 ),
 P(null,
  "As you probably know, the DOM APIs are some of the most painful Java-inspired
APIs you'll run across from a highly dynamic language. Don't worry about that
though, because they provide a reasonable basis to build something that
sucks a lot less."
 )
);""")

if __name__ == '__main__':
    if len(sys.argv)>1:
        print(textToDOM(_getFileContent(sys.argv[1])))
    else:
        unittest.main()
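One caveat with the script above: _getChildren copies text nodes verbatim between double quotes, so text containing a double quote, a backslash, or a line break (as in the longer example) produces a string literal that JavaScript will not accept. A possible fix, sketched here on the assumption that Python's json module is available (2.6 and later), is to let json.dumps build the literal. jsString is a hypothetical helper, not part of the original script.

```python
import json


def jsString(text, indent=0):
    """Return text as an indented, escaped JavaScript string literal.

    Hypothetical replacement for the '%s"%s"' formatting in _getChildren.
    """
    # json.dumps escapes quotes, backslashes and control characters such as
    # newlines, and returns a double-quoted literal on a single line
    return "%s%s" % (indent * " ", json.dumps(text))


print(jsString('say "hi"', 1))
print(jsString("two\nlines", 2))
```

With this helper, the two content.append calls in _getChildren would pass elem.text and child.tail through jsString instead of formatting them directly. The expected output in testLonger would change accordingly, since the line breaks inside the paragraph become \n escapes.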
TextMate Modification
For those of you using TextMate, just modify the bottom chunk of the script as follows, so that when no filename is given it converts HTML read from standard input instead of running the tests:
if __name__ == '__main__':
    if len(sys.argv)>1:
        print(textToDOM(_getFileContent(sys.argv[1])))
    else:
        print(textToDOM(sys.stdin.read()))
        #unittest.main()

