Parsers¶

A parser extracts structured information, as a tree, from a container exposed as a file-like object. It performs type conversion when the type is explicit but does not interpret anything else. Parsers can raise a ParserError.

EBML¶

EBML (Extensible Binary Meta Language) is used by Matroska and WebM.

Element types¶

enzyme.parsers.ebml.INTEGER¶: Signed integer element type

enzyme.parsers.ebml.UINTEGER¶: Unsigned integer element type

enzyme.parsers.ebml.FLOAT¶: Float element type

enzyme.parsers.ebml.STRING¶: ASCII-encoded string element type

enzyme.parsers.ebml.UNICODE¶: UTF-8-encoded string element type

enzyme.parsers.ebml.DATE¶: Date element type

enzyme.parsers.ebml.BINARY¶: Binary element type

enzyme.parsers.ebml.MASTER¶: Container element type

Main interface¶

enzyme.parsers.ebml.SPEC_TYPES¶: Specification types to Element types mapping

enzyme.parsers.ebml.READERS¶

Mapping of Element types to reader functions. See Readers.

You can replace a reader with one of your own:

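>>> from enzyme.parsers.ebml import READERS, BINARY
>>> def my_binary_reader(stream, size):
...     # Return the raw bytes of the element unchanged
...     data = stream.read(size)
...     return data
>>> READERS[BINARY] = my_binary_reader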

class enzyme.parsers.ebml.Element(id=None, type=None, name=None, level=None, position=None, size=None, data=None)¶

Base object of EBML

Parameters:

id (int) – id of the element, best represented as hexadecimal (0x18538067 for Matroska Segment element)
type (INTEGER, UINTEGER, FLOAT, STRING, UNICODE, DATE, MASTER or BINARY) – type of the element
name (string) – name of the element
level (int) – level of the element
position (int) – position of element’s data
size (int) – size of element’s data
data – data as read by the corresponding READERS

class enzyme.parsers.ebml.MasterElement(id=None, name=None, level=None, position=None, size=None, data=None)¶

Element of type MASTER that has a list of Element as its data

Parameters:

id (int) – id of the element, best represented as hexadecimal (0x18538067 for Matroska Segment element)
name (string) – name of the element
level (int) – level of the element
position (int) – position of element’s data
size (int) – size of element’s data
data (list of `Element`) – child elements

MasterElement implements some magic methods to ease manipulation. It supports the in keyword to test for the presence of a child element by name, and item access by name to retrieve that child:

>>> from enzyme.parsers.ebml import parse, get_matroska_specs
>>> ebml_element = parse(open('test1.mkv', 'rb'), get_matroska_specs())[0]
>>> 'EBMLVersion' in ebml_element
False
>>> 'DocType' in ebml_element
True
>>> ebml_element['DocType']
Element(DocType, u'matroska')

load(stream, specs, ignore_element_types=None, ignore_element_names=None, max_level=None)¶

Load child Elements whose level is lower than or equal to max_level from the stream, according to the specs

Parameters:

stream – file-like object from which to read
specs (dict) – see Specifications
ignore_element_types (list) – list of element types to ignore
ignore_element_names (list) – list of element names to ignore
max_level (int) – maximum level for children elements

get(name, default=None)¶

Convenience method for master_element[name].data if name in master_element else default

Parameters:

name (string) – the name of the child to get
default – default value if name is not in the `MasterElement`

Returns:	the data of the child `Element` or default
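
To make the container behaviour of MasterElement concrete, here is a minimal, self-contained sketch. It does not import enzyme: the two classes are hypothetical stand-ins that keep only a name and data, and raising KeyError for a missing child is an assumption rather than documented behaviour.

```python
class Element:
    """Hypothetical stand-in holding only the fields this sketch needs."""

    def __init__(self, name=None, data=None):
        self.name = name
        self.data = data

    def __repr__(self):
        return 'Element(%s, %r)' % (self.name, self.data)


class MasterElement(Element):
    """Stand-in whose data is a list of child Element objects."""

    def __contains__(self, name):
        # True when at least one direct child carries this name
        return any(child.name == name for child in self.data)

    def __getitem__(self, name):
        # Return the first direct child with this name
        for child in self.data:
            if child.name == name:
                return child
        raise KeyError(name)  # assumption: the real class may behave differently

    def get(self, name, default=None):
        # Same shorthand as documented: child data if present, else default
        return self[name].data if name in self else default


header = MasterElement('EBML', [Element('DocType', 'matroska'),
                                Element('DocTypeVersion', 2)])
print('DocType' in header)            # membership test by child name
print(header['DocType'])              # item access returns the child Element
print(header.get('EBMLVersion', 1))   # falls back to the default
```

Note that get() returns the child's data, while item access returns the child Element itself.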

enzyme.parsers.ebml.parse(stream, specs, size=None, ignore_element_types=None, ignore_element_names=None, max_level=None)¶

Parse a stream for size bytes according to the specs

Parameters:

stream – file-like object from which to read
size (int or None) – maximum number of bytes to read, None to read all the stream
specs (dict) – see Specifications
ignore_element_types (list) – list of element types to ignore
ignore_element_names (list) – list of element names to ignore
max_level (int) – maximum level of elements

Returns:	parsed data as a tree of `Element`
Return type:	list

Note

If size is reached in the middle of an element, reading continues until that element is fully parsed.
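
The effect of the note above can be illustrated with a deliberately simplified sketch. It does not use EBML framing or enzyme: each record here is a hypothetical one-byte id, a one-byte length, then the payload, and parse_records is an illustrative name. The point is only that the size check happens between records, so the last record is always read in full.

```python
import io


def parse_records(stream, size=None):
    """Read (id, payload) records until at least size bytes are consumed."""
    start = stream.tell()
    records = []
    # The limit is only checked between records, never inside one
    while size is None or stream.tell() - start < size:
        header = stream.read(2)
        if len(header) < 2:
            break  # end of stream
        record_id, length = header
        records.append((record_id, stream.read(length)))
    return records


data = bytes([0x01, 2]) + b'ab' + bytes([0x02, 3]) + b'cde'

stream = io.BytesIO(data)
records = parse_records(stream, size=5)
# size=5 falls inside the second record, which is still read completely
print(records, stream.tell())
```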

enzyme.parsers.ebml.parse_element(stream, specs, load_children=False, ignore_element_types=None, ignore_element_names=None, max_level=None)¶

Extract a single Element from the stream according to the specs

Parameters:

stream – file-like object from which to read
specs (dict) – see Specifications
load_children (bool) – load children elements if the parsed element is a `MasterElement`
ignore_element_types (list) – list of element types to ignore
ignore_element_names (list) – list of element names to ignore
max_level (int) – maximum level for children elements

Returns:	the parsed element
Return type:	`Element`

enzyme.parsers.ebml.get_matroska_specs(webm_only=False)¶

Get the Matroska specs

Parameters:	webm_only (bool) – load only WebM specs
Returns:	the specs in the appropriate format. See Specifications
Return type:	dict

Readers¶

enzyme.parsers.ebml.readers.read_element_id(stream)¶

Read the Element ID

Parameters:	stream – file-like object from which to read
Raises:	ReadError – when not all the required bytes could be read
Returns:	the id of the element
Return type:	int
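
As an illustration of what reading an Element ID involves, the sketch below decodes an EBML ID from a stream. In EBML the number of leading zero bits in the first byte gives the total length (1 to 4 bytes), and the length marker stays part of the ID. This is not enzyme's code: read_id is a hypothetical name, and EOFError and ValueError stand in for the library's own exceptions.

```python
import io


def read_id(stream):
    """Decode one EBML element ID, keeping the length marker bits."""
    first = stream.read(1)
    if len(first) != 1:
        raise EOFError('stream ended before the element id')
    # Count leading zero bits of the first byte to find the total length
    length, mask = 1, 0x80
    while length <= 4 and not first[0] & mask:
        length += 1
        mask >>= 1
    if length > 4:
        raise ValueError('element id longer than 4 bytes')
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise EOFError('stream ended inside the element id')
    return int.from_bytes(first + rest, 'big')


# 0x18538067 is the Matroska Segment id mentioned earlier on this page
print(hex(read_id(io.BytesIO(b'\x18\x53\x80\x67'))))
```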

enzyme.parsers.ebml.readers.read_element_size(stream)¶

Read the Element Size

Parameters:	stream – file-like object from which to read
Raises:	ReadError – when not all the required bytes could be read
Returns:	the size of element’s data
Return type:	int
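
The Element Size uses the same variable-length scheme as the ID, except that it may span up to 8 bytes and the length marker bit is stripped from the value. The following sketch shows that decoding under those assumptions. It is not enzyme's code: read_size is a hypothetical name, the exceptions are stand-ins, and the reserved "unknown size" encoding is not handled.

```python
import io


def read_size(stream):
    """Decode one EBML data size, dropping the length marker bit."""
    first = stream.read(1)
    if len(first) != 1:
        raise EOFError('stream ended before the element size')
    # Count leading zero bits of the first byte to find the total length
    length, mask = 1, 0x80
    while length <= 8 and not first[0] & mask:
        length += 1
        mask >>= 1
    if length > 8:
        raise ValueError('element size longer than 8 bytes')
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise EOFError('stream ended inside the element size')
    # Keep only the bits below the marker, then append the remaining bytes
    value = first[0] & (mask - 1)
    for byte in rest:
        value = (value << 8) | byte
    return value


print(read_size(io.BytesIO(b'\x82')))       # one-byte encoding
print(read_size(io.BytesIO(b'\x40\x02')))   # two-byte encoding of the same value
```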

enzyme.parsers.ebml.readers.read_element_integer(stream, size)¶

Read the Element Data of type INTEGER

Parameters:

stream – file-like object from which to read
size (int) – size of element’s data

Raises:

ReadError – when not all the required bytes could be read
SizeError – if size is incorrect

Returns:	the read integer
Return type:	int

enzyme.parsers.ebml.readers.read_element_uinteger(stream, size)¶

Read the Element Data of type UINTEGER

Parameters:

stream – file-like object from which to read
size (int) – size of element’s data

Raises:

ReadError – when not all the required bytes could be read
SizeError – if size is incorrect

Returns:	the read unsigned integer
Return type:	int
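
The difference between the INTEGER and UINTEGER readers comes down to how the same big-endian bytes are interpreted. The sketch below shows this with a single hypothetical helper, read_integer. The accepted size range of 1 to 8 bytes is an assumption, and EOFError and ValueError stand in for the documented ReadError and SizeError.

```python
import io


def read_integer(stream, size, signed):
    """Read size bytes as a big-endian integer, signed or unsigned."""
    if not 1 <= size <= 8:  # assumed range; the library's check may differ
        raise ValueError('size must be between 1 and 8 bytes')
    data = stream.read(size)
    if len(data) != size:
        raise EOFError('stream ended inside the integer')
    return int.from_bytes(data, 'big', signed=signed)


raw = b'\xff\xfe'
print(read_integer(io.BytesIO(raw), 2, signed=True))    # two's complement
print(read_integer(io.BytesIO(raw), 2, signed=False))   # plain magnitude
```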

enzyme.parsers.ebml.readers.read_element_float(stream, size)¶

Read the Element Data of type FLOAT

Parameters:

stream – file-like object from which to read
size (int) – size of element’s data

Raises:

ReadError – when not all the required bytes could be read
SizeError – if size is incorrect

Returns:	the read float
Return type:	float
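
A FLOAT element is stored as a big-endian IEEE 754 value, so the size decides between single and double precision. The sketch below assumes only 4-byte and 8-byte floats are accepted. read_float is a hypothetical name, not the library function, and the exceptions are stand-ins for ReadError and SizeError.

```python
import io
import struct

# Map the element size to the matching big-endian struct format
FLOAT_FORMATS = {4: '>f', 8: '>d'}


def read_float(stream, size):
    """Read a big-endian IEEE 754 float of 4 or 8 bytes."""
    if size not in FLOAT_FORMATS:
        raise ValueError('float size must be 4 or 8 bytes')
    data = stream.read(size)
    if len(data) != size:
        raise EOFError('stream ended inside the float')
    return struct.unpack(FLOAT_FORMATS[size], data)[0]


print(read_float(io.BytesIO(struct.pack('>d', 1.5)), 8))
print(read_float(io.BytesIO(struct.pack('>f', 0.25)), 4))
```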

enzyme.parsers.ebml.readers.read_element_string(stream, size)¶

Read the Element Data of type STRING

Parameters:

stream – file-like object from which to read
size (int) – size of element’s data

Raises:

ReadError – when not all the required bytes could be read
SizeError – if size is incorrect

Returns:	the read ascii-decoded string
Return type:	unicode

enzyme.parsers.ebml.readers.read_element_unicode(stream, size)¶

Read the Element Data of type UNICODE

Parameters:

stream – file-like object from which to read
size (int) – size of element’s data

Raises:

ReadError – when not all the required bytes could be read
SizeError – if size is incorrect

Returns:	the read utf-8-decoded string
Return type:	unicode

enzyme.parsers.ebml.readers.read_element_date(stream, size)¶

Read the Element Data of type DATE

Parameters:

stream – file-like object from which to read
size (int) – size of element’s data

Raises:

ReadError – when not all the required bytes could be read
SizeError – if size is incorrect

Returns:	the read date
Return type:	datetime
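
An EBML date is a signed 8-byte integer counting nanoseconds from 2001-01-01T00:00:00 UTC. The sketch below converts such a value to a naive datetime. read_date is a hypothetical name, the exceptions are stand-ins, and whether the library truncates or rounds the sub-microsecond part is not stated in this page; this sketch rounds down to the microsecond.

```python
import io
from datetime import datetime, timedelta

# Reference point of EBML dates, as a naive UTC datetime
EBML_EPOCH = datetime(2001, 1, 1)


def read_date(stream, size):
    """Read an 8-byte EBML date and convert it to a datetime."""
    if size != 8:
        raise ValueError('date size must be 8 bytes')
    data = stream.read(size)
    if len(data) != size:
        raise EOFError('stream ended inside the date')
    nanoseconds = int.from_bytes(data, 'big', signed=True)
    # datetime has microsecond resolution, so finer detail is dropped
    return EBML_EPOCH + timedelta(microseconds=nanoseconds // 1000)


one_day = (86400 * 10**9).to_bytes(8, 'big', signed=True)
print(read_date(io.BytesIO(one_day), 8))
```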

enzyme.parsers.ebml.readers.read_element_binary(stream, size)¶

Read the Element Data of type BINARY

Parameters:

stream – file-like object from which to read
size (int) – size of element’s data

Raises:

ReadError – when not all the required bytes could be read
SizeError – if size is incorrect

Returns:	raw binary data
Return type:	bytes

Specifications¶

The XML specification for Matroska can be found here. It is included with enzyme and can be converted to the appropriate format with get_matroska_specs().

The specs parameter of parse(), parse_element() and load() is expected to be a dict of the form {id: (type, name, level)}.
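
A small hand-written specs dict can make the {id: (type, name, level)} shape easier to picture. In this sketch the type constants are plain string stand-ins, not the real values from enzyme.parsers.ebml, and describe is a hypothetical helper. The ids and names are those of a few well-known EBML and Matroska elements.

```python
# Stand-in type constants; the real ones live in enzyme.parsers.ebml
MASTER, STRING, UINTEGER = 'master', 'string', 'uinteger'

specs = {
    0x1A45DFA3: (MASTER, 'EBML', 0),
    0x4282: (STRING, 'DocType', 1),
    0x4287: (UINTEGER, 'DocTypeVersion', 1),
    0x18538067: (MASTER, 'Segment', 0),
}


def describe(element_id, specs):
    """Look up an element id and format it, indented by its level."""
    element_type, name, level = specs[element_id]
    return '%s%s (%s)' % ('  ' * level, name, element_type)


for element_id in specs:
    print(describe(element_id, specs))
```

Keying the dict by id means a parser can go straight from a freshly read Element ID to the type, and from there to the matching reader.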
