xmllpegparser is a fast XML parser that uses the LPeg library.
luarocks install --local https://raw.githubusercontent.com/jonathanpoelen/lua-xmllpegparser/master/xmllpegparser-2.2-0.rockspec
# or in your local directory lua-xmllpegparser
luarocks make --local xmllpegparser-2.2-0.rockspec

Run ./example.lua:
./example.lua xmlfile [replaceentities]
replaceentities: any value; when present, entity replacement is enabled.
parse(xmlstring[, visitorOrsubEntities[, visitorInitArgs...]]):
Returns a tuple: document table, (error string or nil) (see visitor.finish).
If subEntities is true, the entities are replaced and a tentities member is added to the document table.
parseFile(filename[, visitorOrsubEntities[, visitorInitArgs...]]):
Returns a tuple: document table, (file error or document error).
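
The following is a minimal sketch of calling parse with entity replacement enabled. It assumes the module is loaded with require('xmllpegparser') and that the result follows the document structure described in this README; the XML sample and variable names are made up for illustration.

```lua
-- Assumption: the rock installs a module named 'xmllpegparser'.
local xmllpegparser = require('xmllpegparser')

-- Illustrative input; no whitespace between tags to keep the tree small.
local xml = '<catalog><book id="b1">Lua &amp; LPeg</book></catalog>'

-- Second argument `true` asks for entity replacement (subEntities).
local doc, err = xmllpegparser.parse(xml, true)
if err then
  error('parse error: ' .. err)
end

-- Navigate the tree: document -> <catalog> -> <book> -> text node.
local book = doc.children[1].children[1]
print(book.tag)
print(book.children[1].text)
```

parseFile takes a file name instead of a string and is otherwise used the same way.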
defaultEntityTable():
Returns the default entity table ({ quot='"', ... }).
createEntityTable(docEntities[, resultEntities]):
Creates an entity table from the document entity table. Returns resultEntities.
mkReplaceEntities(entityTable_or_func):
Returns an LPeg expression that can replace entities.
replaceEntities(s, entityTable_or_func):
Returns a string: s with its entities replaced.
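
As a rough illustration of replaceEntities, the sketch below passes a hand-written entity table in the same name-to-value form as the default table. The module name in require and the sample string are assumptions made for this example.

```lua
-- Assumption: the rock installs a module named 'xmllpegparser'.
local xmllpegparser = require('xmllpegparser')

-- Entity table: entity name -> replacement text (illustrative subset).
local entities = { lt = '<', amp = '&' }

-- Replace the entity references found in the string.
local replaced = xmllpegparser.replaceEntities('a &lt; b &amp;&amp; c', entities)
print(replaced)
```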
parser(visitor[, safeVisitor: bool]):
Returns a parser. If all visitor functions return nil (except accuattr, init and finish), then safeVisitor may be true and the parser will optimize the visitor's calls.
lazyParser(visitorCreator):
Returns a parser.
parser(visitorCreator()) is used on the first call of myparser.parse(...).
mkVisitor(evalEntities: bool, defaultEntities: table | function | nil, withoutPosition):
If not defaultEntities and evalEntities, then defaultEntities = defaultEntityTable().
If withoutPosition, then the pos parameter does not exist for the visitor functions, except for finish.
treeParser:
The default parser used by parse(str, false).
treeParserWithReplacedEntities:
The default parser used by parse(str, true).
treeParserWithoutPos:
Parser without the pos parameter.
treeParserWithoutPosWithReplacedEntities:
Parser without the pos parameter and with replaced entities.
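
A custom visitor can be passed to parser to build something other than a tree. The sketch below counts tags by name; it is an illustration under assumptions, not the library's reference usage: the module name in require, the explicit withPos = true, and the idea that parse returns whatever finish returns are inferred from the descriptions in this README.

```lua
-- Assumption: the rock installs a module named 'xmllpegparser'.
local xmllpegparser = require('xmllpegparser')

local counts = {}

-- Only the members needed here are defined; each visitor member is optional.
local visitor = {
  withPos = true, -- visitor functions receive `pos` as first parameter
  tag = function(pos, name, attrs)
    counts[name] = (counts[name] or 0) + 1
  end,
  finish = function(err, pos, xmlstring)
    -- Return the counters instead of a document table.
    return counts, err
  end,
}

local tagCounter = xmllpegparser.parser(visitor)
local result, err = tagCounter.parse('<a><b/><b/><c>text</c></a>')
if err then
  error('parse error: ' .. err)
end

for name, n in pairs(result) do
  print(name, n)
end
```

Because counts lives outside the visitor, it accumulates across calls; a visitorCreator passed to lazyParser could create a fresh table instead.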
enableWithoutPosParser([bool]):
Enables the treeParserWithoutPos* versions as the default parsers.
enableWithoutPosParser(false) is the same as setDefaultParsers().
Returns the previous parsers.
setDefaultParsers(parser, parserWithReplacedEntities | bool | nil):
If parserWithReplacedEntities == true, then parserWithReplacedEntities = parser.
A nil or false value restores the default parser.
Returns the previous parsers.
toString(doc: table, indentationText: nil | string, params: nil | table):
indentationText corresponds to the text used at each indentation level. If nil, there is no formatting.
params is a table with:
- shortEmptyElements: bool = true: whether empty tags are self-closed.
- stableAttributes: bool | function = true: if true, attributes are sorted by name. If a function, it takes the attribute table and should return an iterator function that gives the attribute name and its value.
- inlineTextLengthMax: number = 9999999: a node that contains only one text is formatted on one line. When the text exceeds this value, it is indented.
- escape: table: table of function(string):string
  - attr: text in double quotes
  - text: text node
  - cdata: text between <![CDATA[ and ]]>
  - comment: text between <!-- and -->
escapeFunctions(escapeAmp: bool = false):
Utility function for the params.escape parameter of toString.
- escapeAmp: escape the & character in text and attributes
escapeComment(string):string: replaces -- with —
escapeAttribute(string):string: replaces < with &lt; and " with &quot;
escapeAttributeAndAmp(string):string: like escapeAttribute, and replaces & with &amp;
escapeCDATA(string):string: replaces ]]> with ]]>]]><![CDATA[
escapeText(string):string: replaces < with &lt;
escapeTextAndAmp(string):string: replaces < with &lt; and & with &amp;
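
To show how toString and its params fit together, here is a small sketch that serializes a parsed document twice, once with default parameters and once with explicit ones. The module name in require and the sample XML are assumptions; the parameter names come from the list above.

```lua
-- Assumption: the rock installs a module named 'xmllpegparser'.
local xmllpegparser = require('xmllpegparser')

local doc, err = xmllpegparser.parse('<root><item id="1">text</item><empty/></root>')
if err then
  error('parse error: ' .. err)
end

-- Indent with two spaces per level, default parameters.
local pretty = xmllpegparser.toString(doc, '  ')
print(pretty)

-- Same document, but empty tags are not self-closed and & is escaped.
local verbose = xmllpegparser.toString(doc, '  ', {
  shortEmptyElements = false,
  escape = xmllpegparser.escapeFunctions(true),
})
print(verbose)
```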
-- pos member = index of string
document = {
  children = {
    { pos=number, parent=table or nil, text=string[, cdata=true] } or
    { pos=number, parent=table or nil, tag=string, attrs={ { name=string, value=string }, ... }, children={ ... } },
    ...
  },
  bad = { children={ ... } }, -- when a closed node has no match
  preprocessor = { { pos=number, tag=string, attrs={ { name=string, value=string }, ... } }, ... },
  doctype = { pos=number, name=string, ident=string or nil, pubident=string or nil, dtd=string or nil }, -- if there is a doctype
  error = string, -- if error
  lastpos = number, -- last known position of parse()
  entities = { { pos=number, name=string, value=string }, ... },
  tentities = { name=value, ... } -- only if subEntities = true
}

Parser structure:
{
  parse = function(xmlstring, visitorInitArgs...) ... end,
  parseFile = function(filename, visitorInitArgs...) ... end,
  __call = function(xmlstring, visitorInitArgs...) ... end,
}

Visitor structure. Each member is optional.
{
  withPos = bool, -- indicates whether the pos parameter exists in the function parameters (except `finish`)
  init = function(...), -- called before parsing, returns the position of the beginning of match or nil
  finish = function(err, pos, xmlstring), -- called after parsing, returns (doc, err) or nil
  proc = function(pos, name, attrs), -- for `<?...?>`
  entity = function(pos, name, value),
  doctype = function(pos, name, ident, pubident, dtd), -- called after all entity()
  accuattr = function(table, name, value), -- `table` is an accumulator that will be transmitted to tag.attrs.
                                           -- If `nil` and `tag` is not `nil`, a default accumulator is used.
                                           -- If `false`, the accumulator is disabled.
                                           -- (`tag(pos, name, accuattr(accuattr({}, attr1, value1), attr2, value2))`)
  tag = function(pos, name, attrs), -- for a new tag (`<a>` or `<a/>`)
  open = function(), -- only for an open node (`<a>` not `<a/>`), called after `tag`.
  close = function(name),
  text = function(pos, text),
  cdata = function(pos, text), -- or `text` if nil
  comment = function(str)
}

Limitations:
- Non-validating
- No DTD support
- Ignores processing instructions
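
To illustrate the document structure described earlier, the sketch below walks a tree and collects the text of every text or CDATA node. It runs on a hand-built table that mimics the documented shape (pos and parent are omitted for brevity), so it does not need the library; collectText is a hypothetical helper, not part of xmllpegparser.

```lua
-- Hand-built table following the documented structure (illustrative only).
local doc = {
  children = {
    { tag = 'root', attrs = {}, children = {
      { tag = 'item', attrs = {}, children = { { text = 'first' } } },
      { tag = 'item', attrs = {}, children = { { text = 'second', cdata = true } } },
    } },
  },
}

-- Appends the text of every text/CDATA node below `node` to `out`,
-- in document order, and returns `out`.
local function collectText(node, out)
  for _, child in ipairs(node.children or {}) do
    if child.text then
      out[#out + 1] = child.text
    else
      collectText(child, out)
    end
  end
  return out
end

local texts = collectText(doc, {})
print(table.concat(texts, ', ')) -- first, second
```

The same function should work on the first value returned by parse, since element nodes carry a children table and text nodes carry a text member.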