delphin.tfs

Basic classes for modeling feature structures.

This module defines the FeatureStructure and TypedFeatureStructure classes, which model an attribute value matrix (AVM), with the latter including an associated type. They allow feature access through TDL-style dot notation regular dictionary keys.

In addition, the TypeHierarchy class implements a multiple-inheritance hierarchy with checks for type subsumption and compatibility.

Classes

class delphin.tfs.FeatureStructure(featvals=None)[source]

A feature structure.

This class manages the access of nested features using dot-delimited notation (e.g., SYNSEM.LOCAL.CAT.HEAD).

Parameters:: featvals (dict, list) – a mapping or iterable of feature paths to feature values

features(expand=False)[source]

Return the list of tuples of feature paths and feature values.

Parameters:: expand (bool) – if True, expand all feature paths

Example

>>> fs = FeatureStructure([('A.B', 1), ('A.C', 2)])
>>> fs.features()
[('A', <FeatureStructure object at ...>)]
>>> fs.features(expand=True)
[('A.B', 1), ('A.C', 2)]

get(key, default=None)[source]: Return the value for key if it exists, otherwise default.

class delphin.tfs.TypedFeatureStructure(type, featvals=None)[source]

Bases: FeatureStructure

A typed FeatureStructure.

Parameters:

type (str) – type name
featvals (dict, list) – a mapping or iterable of feature paths to feature values

property type: The type assigned to the feature structure.

class delphin.tfs.TypeHierarchy(top, hierarchy=None, data=None, normalize_identifier=None)[source]

Bases: MultiHierarchy

A Type Hierarchy.

Type hierarchies have certain properties, such as a unique top node, multiple inheritance, case insensitivity, and unique greatest-lower-bound (glb) types.

Note

Checks for unique glbs is not yet implemented.

TypeHierarchies may be constructed when instantiating the class or via the update() method using a dictionary mapping type names to node values, or one-by-one using dictionary-like access. In both cases, the node values may be an individual parent name, an iterable of parent names, or a TypeHierarchyNode object. Retrieving a node via dictionary access on the typename returns a TypeHierarchyNode regardless of the method used to create the node.

>>> th = TypeHierarchy('*top*', {'can-fly': '*top*'})
>>> th.update({'can-swim': '*top*', 'can-walk': '*top*'})
>>> th['butterfly'] = ('can-fly', 'can-walk')
>>> th['duck'] = TypeHierarchyNode(
...     ('can-fly', 'can-swim', 'can-walk'),
...     data='some info relating to ducks...')
>>> th['butterfly'].data = 'some info relating to butterflies'

In some ways the TypeHierarchy behaves like a dictionary, but it is not a subclass of dict and does not implement all its methods. Also note that some methods ignore the top node, which make certain actions easier:

>>> th = TypeHierarchy('*top*', {'a': '*top*', 'b': 'a', 'c': 'a'})
>>> len(th)
3
>>> list(th)
['a', 'b', 'c']
>>> TypeHierarchy('*top*', dict(th.items())) == th
True

But others do not ignore the top node, namely those where you can request it specifically:

>>> '*top*' in th
True
>>> th['*top*']
<TypeHierarchyNode ... >

Parameters:

top (str) – unique top type
hierarchy (dict) – mapping of {child: node} (see description above concerning the node values)

top: the hierarchy’s top type

ancestors(identifier): Return the ancestors of identifier.

children(identifier): Return the immediate children of identifier.

compatible(a, b)

Return True if node a is compatible with node b.

In a multiply-inheriting hierarchy, node compatibility means that two nodes share a common descendant. It is a commutative operation, so compatible(a, b) == compatible(b, a). Note that in a singly-inheriting hierarchy, two nodes are never compatible by this metric.

Parameters:

a – a node identifier
b – a node identifier

Examples

>>> h = MultiHierarchy('*top*', {'a': '*top*',
...                              'b': '*top*'})
>>> h.compatible('a', 'b')
False
>>> h.update({'c': 'a b'})
>>> h.compatible('a', 'b')
True

descendants(identifier): Return the descendants of identifier.

items(): Return the (identifier, data) pairs excluding the top node.

parents(identifier): Return the immediate parents of identifier.

subsumes(a, b)

Return True if node a subsumes node b.

A node is subsumed by the other if it is a descendant of the other node or if it is the other node. It is not a commutative operation, so subsumes(a, b) != subsumes(b, a), except for the case where a == b.

Parameters:

a – a node identifier
b – a node identifier

Examples

>>> h = MultiHierarchy('*top*', {'a': '*top*',
...                              'b': '*top*',
...                              'c': 'b'})
>>> all(h.subsumes(h.top, x) for x in h)
True
>>> h.subsumes('a', h.top)
False
>>> h.subsumes('a', 'b')
False
>>> h.subsumes('b', 'c')
True

update(subhierarchy=None, data=None)

Incorporate subhierarchy and data into the hierarchy.

This method ensures that nodes are inserted in an order that does not result in an intermediate state being disconnected or cyclic, and raises an error if it cannot avoid such a state due to subhierarchy being invalid when inserted into the main hierarchy. Updates are atomic, so subhierarchy and data will not be partially applied if there is an error in the middle of the operation.

Parameters:

subhierarchy – mapping of node identifiers to parents
data – mapping of node identifiers to data objects

Raises:

HierarchyError – when subhierarchy or data cannot be incorporated into the hierarchy

Examples

>>> h = MultiHierarchy('*top*')
>>> h.update({'a': '*top*'})
>>> h.update({'b': '*top*'}, data={'b': 5})
>>> h.update(data={'a': 3})
>>> h['b'] - h['a']
2

validate_update(subhierarchy, data)

Check if the update can apply to the current hierarchy.

This method returns (subhierarchy, data) with normalized identifiers if the update is valid, otherwise it will raise a HierarchyError.

Raises:: HierarchyError – when the update is invalid