Dive Into Python: Python from novice to pro
You've covered a lot of ground. Let's step back and see how all the pieces fit together.
To start with, this is a script that takes its arguments on the command line, using the getopt module.
def main(argv):
    ...
    try:
        opts, args = getopt.getopt(argv, "hg:d", ["help", "grammar="])
    except getopt.GetoptError:
        ...
    for opt, arg in opts:
        ...
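The elided parts of main are not shown here, so the following is only a sketch of how the getopt call above might be completed. The function name parse_args, the default_grammar parameter, the choice to treat the first leftover argument as the source, and the exit codes are all assumptions made for illustration; only the getopt.getopt call itself is taken from the text.

```python
import getopt


def parse_args(argv, default_grammar=None):
    """Return (grammar, source, debug) from argv, which excludes the program name.

    Hypothetical helper: the names and defaults here are illustrative,
    not the script's actual implementation.
    """
    grammar = default_grammar
    debug = False
    try:
        # "hg:d": -h and -d are flags, -g takes a value.
        # "grammar=" means --grammar also takes a value.
        opts, args = getopt.getopt(argv, "hg:d", ["help", "grammar="])
    except getopt.GetoptError as err:
        # Unknown option or missing value: stop with an error message.
        raise SystemExit("error: %s" % err)
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            # A real script would print its usage text before exiting.
            raise SystemExit(0)
        elif opt == "-d":
            debug = True
        elif opt in ("-g", "--grammar"):
            grammar = arg
    # Assumption: whatever is left after the options is the source material.
    source = args[0] if args else None
    return grammar, source, debug
```

Calling parse_args(["-g", "x.xml", "-d", "foo"]) returns ("x.xml", "foo", True): getopt splits the recognized options into (option, value) pairs and hands back everything else as a plain list.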
You create a new instance of the KantGenerator class, and pass it the grammar file and source that may or may not have been specified on the command line.
k = KantGenerator(grammar, source)
The KantGenerator instance automatically loads the grammar, which is an XML file. You use your custom openAnything function to open the file (which could be stored in a local file or on a remote web server), then use the built-in minidom parsing functions to parse the XML into a tree of Python objects.
def _load(self, source):
    sock = toolbox.openAnything(source)
    xmldoc = minidom.parse(sock).documentElement
    sock.close()
Oh, and along the way, you take advantage of your knowledge of the structure of the XML document to set up a little cache of references, which are just elements in the XML document.
def loadGrammar(self, grammar):
    for ref in self.grammar.getElementsByTagName("ref"):
        self.refs[ref.attributes["id"].value] = ref
If you specified some source material on the command line, you use that; otherwise you rip through the grammar looking for the "top-level" reference (that isn't referenced by anything else) and use that as a starting point.
def getDefaultSource(self):
    xrefs = {}
    for xref in self.grammar.getElementsByTagName("xref"):
        xrefs[xref.attributes["id"].value] = 1
    xrefs = xrefs.keys()
    standaloneXrefs = [e for e in self.refs.keys() if e not in xrefs]
    return '<xref id="%s"/>' % random.choice(standaloneXrefs)
Now you rip through the source material. The source material is also XML, and you parse it one node at a time. To keep the code separated and more maintainable, you use separate handlers for each node type.
def parse_Element(self, node):
    handlerMethod = getattr(self, "do_%s" % node.tagName)
    handlerMethod(node)
You bounce through the grammar, parsing all the children of each p element,
def do_p(self, node):
    ...
    if doit:
        for child in node.childNodes: self.parse(child)
replacing choice elements with a random child,
def do_choice(self, node):
    self.parse(self.randomChildElement(node))
and replacing xref elements with a random child of the corresponding ref element, which you previously cached.
def do_xref(self, node):
    id = node.attributes["id"].value
    self.parse(self.randomChildElement(self.refs[id]))
Eventually, you parse your way down to plain text,
def parse_Text(self, node):
    text = node.data
    ...
    self.pieces.append(text)
which you print out.
def main(argv):
    ...
    k = KantGenerator(grammar, source)
    print k.output()
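To see the whole flow in one place, here is a stripped-down sketch of the same pipeline: cache the ref elements, pick a default source, dispatch on node type and tag name, and collect text. It is not the real KantGenerator. The class name MiniGenerator, the inline GRAMMAR string, the snake_case method names, and the top-level dispatch on the node's class name are all assumptions made for illustration. The grammar is parsed from a string rather than opened with openAnything, and the elided logic in do_p and parse_Text is simply left out. The sketch is written to run on Python 3.

```python
import random
from xml.dom import minidom

# A tiny made-up grammar: "sentence" is the only ref that no xref points to.
GRAMMAR = (
    '<grammar>'
    '<ref id="greeting"><p><choice><p>hello</p><p>hi</p></choice></p></ref>'
    '<ref id="sentence"><p><xref id="greeting"/> world</p></ref>'
    '</grammar>'
)


class MiniGenerator:
    def __init__(self, grammar_xml, source=None):
        self.grammar = minidom.parseString(grammar_xml).documentElement
        # Cache every ref element by its id so xrefs can find it quickly.
        self.refs = {}
        for ref in self.grammar.getElementsByTagName("ref"):
            self.refs[ref.attributes["id"].value] = ref
        self.pieces = []
        self.parse(minidom.parseString(source or self.default_source()))

    def default_source(self):
        # The "top-level" refs are the ones no xref ever mentions.
        xrefs = set(x.attributes["id"].value
                    for x in self.grammar.getElementsByTagName("xref"))
        standalone = [ref_id for ref_id in self.refs if ref_id not in xrefs]
        return '<xref id="%s"/>' % random.choice(standalone)

    def random_child_element(self, node):
        elements = [c for c in node.childNodes
                    if c.nodeType == c.ELEMENT_NODE]
        return random.choice(elements)

    def parse(self, node):
        # Assumed dispatch: Document, Element, or Text by class name.
        getattr(self, "parse_%s" % node.__class__.__name__)(node)

    def parse_Document(self, node):
        self.parse(node.documentElement)

    def parse_Element(self, node):
        # Second level of dispatch: one do_* handler per tag name.
        getattr(self, "do_%s" % node.tagName)(node)

    def parse_Text(self, node):
        self.pieces.append(node.data)

    def do_p(self, node):
        for child in node.childNodes:
            self.parse(child)

    def do_choice(self, node):
        self.parse(self.random_child_element(node))

    def do_xref(self, node):
        ref = self.refs[node.attributes["id"].value]
        self.parse(self.random_child_element(ref))

    def output(self):
        return "".join(self.pieces)
```

With this grammar, MiniGenerator(GRAMMAR).output() yields either "hello world" or "hi world", depending on which child the choice element picks. Passing an explicit source such as '<xref id="greeting"/>' skips the default-source search and starts from that reference instead.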