I'm working on a little reading app that processes HTML documents using `libxml2`. While processing an HTML tree, I check the ancestors of every `text()` node to choose the proper style. For headers, I'm using the following query to check whether a node is inside a header:
boolean(ancestor::*[
self::h1 or
self::h2 or
self::h3 or
self::h4 or
self::h5 or
self::h6])
With a 5 MB book, this takes 1.1 seconds to run. Together with two additional queries for emphasis and code styles (with larger sets of node names), it adds up to 4.4 seconds.
According to Apple's Instruments, the bottleneck line is:
xmlXPathObject *object = xmlXPathNodeEval(node, query, context);
I cache `context` to speed things up. Is there anything else I could do to make it faster?
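One alternative I have been considering is to skip XPath for this check and walk the parent chain by hand. Below is a minimal sketch of what the header query tests, not something I have measured. The `Node` struct is a hypothetical stand-in that keeps only a name and a parent pointer (libxml2's `xmlNode` has `name` and `parent` members as well, with `name` typed as `const xmlChar *`), and `is_header_name` and `has_header_ancestor` are names made up for illustration. It also assumes element names are already lowercase.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two node fields this check needs. */
typedef struct Node {
    const char  *name;    /* element name, or NULL if the node has none */
    struct Node *parent;  /* NULL at the root */
} Node;

/* Returns 1 if name is exactly "h1" through "h6", otherwise 0. */
static int is_header_name(const char *name) {
    return name != NULL
        && name[0] == 'h'
        && name[1] >= '1' && name[1] <= '6'
        && name[2] == '\0';
}

/* Mirrors boolean(ancestor::*[self::h1 or ... or self::h6]):
   starts at the node's parent and walks up to the root. */
static int has_header_ancestor(const Node *node) {
    const Node *p = (node != NULL) ? node->parent : NULL;
    for (; p != NULL; p = p->parent) {
        if (is_header_name(p->name)) {
            return 1;
        }
    }
    return 0;
}
```

The node itself is deliberately not tested, only its ancestors, to match the `ancestor::` axis. With real libxml2 nodes, the loop would presumably also need to check that each ancestor is an element node before comparing names.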