delphin.tdl
Classes and functions for parsing and inspecting TDL.
This module makes it easy to inspect what is written in Type Description Language (TDL) definitions, but it does not interpret type hierarchies (for example, by performing unification, computing subsumption, or creating GLB types). That is, while it would not be useful for building a parser, it is useful for statically inspecting the types in a grammar and the constraints they apply.
TDL was originally described in Krieger and Schäfer, 1994 [KS1994], but that paper describes many features not used by the DELPH-IN variant, such as disjunction. Copestake, 2002 [COP2002] better describes the subset used by DELPH-IN. It has become outdated and its TDL syntax description is inaccurate in places, but it remains a great resource for understanding how TDL grammar descriptions are interpreted. The TdlRfc page of the DELPH-IN Wiki contains the most up-to-date description of the TDL syntax used by DELPH-IN grammars, including features such as documentation strings and regular expressions.
[KS1994] Hans-Ulrich Krieger and Ulrich Schäfer. TDL: a type description language for constraint-based grammars. In Proceedings of the 15th conference on Computational linguistics, volume 2, pages 893–899. Association for Computational Linguistics, 1994.
[COP2002] Ann Copestake. Implementing typed feature structure grammars, volume 110. CSLI Publications, Stanford, 2002.
Module Parameters
Some aspects of TDL parsing can be customized per grammar, and the following module variables may be reassigned to accommodate those differences. For instance, in the ERG, the type used for list feature structures is *list*, while for Matrix-based grammars it is list. PyDelphin defaults to the values used by the ERG.
delphin.tdl.LIST_TYPE = '*list*'
type of lists in TDL
delphin.tdl.EMPTY_LIST_TYPE = '*null*'
type of list terminators
delphin.tdl.LIST_HEAD = 'FIRST'
feature for list items
delphin.tdl.LIST_TAIL = 'REST'
feature for list tails
delphin.tdl.DIFF_LIST_LIST = 'LIST'
feature for diff-list lists
delphin.tdl.DIFF_LIST_LAST = 'LAST'
feature for the last path in a diff-list
Functions
delphin.tdl.iterparse(source, encoding='utf-8')
Parse the TDL file source and iteratively yield parse events.
If source is a filename, the file is opened and then closed when the generator has finished; otherwise source is an open file object and will not be closed when the generator has finished.
Parse events are (event, object, lineno) tuples, where event is a string (“TypeDefinition”, “TypeAddendum”, “LexicalRuleDefinition”, “LetterSet”, “WildCard”, “LineComment”, or “BlockComment”), object is the interpreted TDL object, and lineno is the line number where the entity began in source.
Parameters:
Yields: (event, object, lineno) tuples
Example
>>> from delphin import tdl
>>> lex = {}
>>> for event, obj, lineno in tdl.iterparse('erg/lexicon.tdl'):
...     if event == 'TypeDefinition':
...         lex[obj.identifier] = obj
...
>>> lex['eucalyptus_n1']['SYNSEM.LKEYS.KEYREL.PRED']
<String object (_eucalyptus_n_1_rel) at 140625748595960>
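The event tuples described above lend themselves to a simple dispatch on the event name. The following is a minimal sketch that runs without a grammar file: `collect_definitions` and `fake_events` are hypothetical names, and the stand-in stream uses plain dicts where `iterparse()` would yield TDL objects (which expose `obj.identifier` as an attribute).

```python
from collections import Counter


def collect_definitions(events):
    """Index definition-like objects by identifier and count each event type."""
    definitions = {}
    counts = Counter()
    for event, obj, lineno in events:
        counts[event] += 1
        if event in ('TypeDefinition', 'LexicalRuleDefinition'):
            # Keep the line number so a definition can be traced to its source.
            definitions[obj['identifier']] = (obj, lineno)
    return definitions, counts


# Hypothetical stand-in for the generator returned by tdl.iterparse(...).
fake_events = [
    ('LineComment', ' nouns', 1),
    ('TypeDefinition', {'identifier': 'eucalyptus_n1'}, 2),
    ('TypeAddendum', {'identifier': 'eucalyptus_n1'}, 9),
    ('TypeDefinition', {'identifier': 'koala_n1'}, 12),
]

definitions, counts = collect_definitions(fake_events)
print(sorted(definitions))       # ['eucalyptus_n1', 'koala_n1']
print(counts['TypeDefinition'])  # 2
```

Because the function only consumes an iterable of tuples, it should work the same way on a real event stream once the dict lookup is replaced with attribute access.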
delphin.tdl.format(obj, indent=0)
Serialize TDL objects to strings.
Parameters:
- obj – instance of Term, Conjunction, or TypeDefinition classes or subclasses
- indent (int) – number of spaces to indent the formatted object
Returns: str – serialized form of obj
Example
>>> from delphin import tdl
>>> conj = tdl.Conjunction([
...     tdl.TypeIdentifier('lex-item'),
...     tdl.AVM([('SYNSEM.LOCAL.CAT.HEAD.MOD',
...               tdl.ConsList(end=tdl.EMPTY_LIST_TYPE))])
... ])
>>> t = tdl.TypeDefinition('non-mod-lex-item', conj)
>>> print(tdl.format(t))
non-mod-lex-item := lex-item &
  [ SYNSEM.LOCAL.CAT.HEAD.MOD < > ].
Classes
The TDL entity classes are the objects returned by iterparse(), but they may also be used directly to build TDL structures, e.g., for serialization.
Terms
class delphin.tdl.Term(docstring=None)
Base class for the terms of a TDL conjunction.
All terms are defined to handle the binary ‘&’ operator, which puts both into a Conjunction:
>>> TypeIdentifier('a') & TypeIdentifier('b')
<Conjunction object at 140008950372168>
Parameters: docstring (str) – documentation string
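To illustrate how an `&` operator can collect terms into a single flat conjunction, here is a minimal sketch. `SketchTerm` and `SketchConjunction` are hypothetical stand-ins, not the PyDelphin classes, and supporting `&` on the conjunction itself is an assumption made for the illustration.

```python
class SketchTerm:
    """Hypothetical stand-in for a TDL term."""

    def __init__(self, name):
        self.name = name

    def __and__(self, other):
        # Joining two operands always produces a conjunction.
        return SketchConjunction([self, other])


class SketchConjunction:
    """Hypothetical flat container of terms."""

    def __init__(self, terms=None):
        self.terms = []
        for term in terms or []:
            self.add(term)

    def add(self, term):
        if isinstance(term, SketchConjunction):
            # Flatten nested conjunctions instead of nesting them.
            self.terms.extend(term.terms)
        elif isinstance(term, SketchTerm):
            self.terms.append(term)
        else:
            raise TypeError('not a term or conjunction: {!r}'.format(term))

    def __and__(self, other):
        # Assumption of this sketch: conjunctions can be joined as well.
        return SketchConjunction([self, other])


conj = SketchTerm('a') & SketchTerm('b') & SketchTerm('c')
print([term.name for term in conj.terms])  # ['a', 'b', 'c']
```

Flattening on `add` keeps chained expressions such as `a & b & c` from building nested conjunctions.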
class delphin.tdl.TypeTerm(string, docstring=None)
Bases: delphin.tdl.Term, str
Base class for type terms (identifiers, strings and regexes).
This subclass of Term also inherits from str and forms the superclass of the string-based terms TypeIdentifier, String, and Regex. Its purpose is to handle the correct instantiation of both the Term and str supertypes and to define equality comparisons such that different kinds of type terms with the same string value are not considered equal:
>>> String('a') == String('a')
True
>>> String('a') == TypeIdentifier('a')
False
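The class-sensitive equality described above can be sketched with a plain `str` subclass. The names below are hypothetical and this is not the PyDelphin implementation; in particular, comparing equal to a plain `str` with the same value is an assumption of the sketch.

```python
class SketchTypeTerm(str):
    """A str subclass whose equality also depends on the concrete class."""

    def __eq__(self, other):
        # Two different kinds of type term are never equal, even with equal text.
        if isinstance(other, SketchTypeTerm) and type(other) is not type(self):
            return False
        return str.__eq__(self, other)

    def __ne__(self, other):
        # str defines its own __ne__, so it must be overridden to stay consistent.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ disables the inherited hash, so restore it explicitly.
    __hash__ = str.__hash__


class SketchString(SketchTypeTerm):
    pass


class SketchIdentifier(SketchTypeTerm):
    pass


print(SketchString('a') == SketchString('a'))      # True
print(SketchString('a') == SketchIdentifier('a'))  # False
```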
class delphin.tdl.TypeIdentifier(string, docstring=None)
Bases: delphin.tdl.TypeTerm
Type identifiers, or type names.
Unlike other TypeTerms, TypeIdentifiers use case-insensitive comparisons:
>>> TypeIdentifier('MY-TYPE') == TypeIdentifier('my-type')
True
Parameters:
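Case-insensitive comparison on a `str` subclass needs a matching hash, or equal identifiers would land in different dictionary slots. A minimal sketch follows; `CaselessIdentifier` is a hypothetical name, and treating non-identifier operands as unequal is a design choice of this sketch rather than documented PyDelphin behavior.

```python
class CaselessIdentifier(str):
    """Hypothetical identifier compared without regard to case."""

    def __eq__(self, other):
        # Only identifiers compare equal; this keeps __eq__ and __hash__ consistent.
        if not isinstance(other, CaselessIdentifier):
            return False
        return self.lower() == other.lower()

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        # Hash the lowercased text so identifiers that compare equal hash equally.
        return hash(self.lower())


print(CaselessIdentifier('MY-TYPE') == CaselessIdentifier('my-type'))  # True
print(len({CaselessIdentifier('A'), CaselessIdentifier('a')}))         # 1
```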
class delphin.tdl.String(string, docstring=None)
Bases: delphin.tdl.TypeTerm
Double-quoted strings.
Parameters:
class delphin.tdl.Regex(string, docstring=None)
Bases: delphin.tdl.TypeTerm
Regular expression patterns.
Parameters:
class delphin.tdl.AVM(featvals=None, docstring=None)
Bases: delphin.tfs.FeatureStructure, delphin.tdl.Term
A feature structure as used in TDL.
Parameters:
AVM.features(expand=False)
Return the list of tuples of feature paths and feature values.
Parameters: expand (bool) – if True, expand all feature paths
Example
>>> avm = AVM([('A.B', TypeIdentifier('1')),
...            ('A.C', TypeIdentifier('2'))])
>>> avm.features()
[('A', <AVM object at ...>)]
>>> avm.features(expand=True)
[('A.B', <TypeIdentifier object (1) at ...>),
 ('A.C', <TypeIdentifier object (2) at ...>)]
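The difference between the unexpanded and expanded views can be sketched with nested dicts standing in for AVMs. `build` and `features` are hypothetical helpers written for this illustration and make no claim about how `AVM` stores its data.

```python
def build(featvals):
    """Build nested dicts from (dotted path, value) pairs."""
    avm = {}
    for path, value in featvals:
        *parents, leaf = path.split('.')
        node = avm
        for feature in parents:
            node = node.setdefault(feature, {})
        node[leaf] = value
    return avm


def features(avm, expand=False, prefix=''):
    """Return (path, value) pairs, descending into sub-structures if expand is True."""
    pairs = []
    for feature, value in avm.items():
        path = prefix + feature
        if expand and isinstance(value, dict):
            pairs.extend(features(value, expand=True, prefix=path + '.'))
        else:
            pairs.append((path, value))
    return pairs


avm = build([('A.B', '1'), ('A.C', '2')])
print(features(avm))               # [('A', {'B': '1', 'C': '2'})]
print(features(avm, expand=True))  # [('A.B', '1'), ('A.C', '2')]
```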
class delphin.tdl.ConsList(values=None, end='*list*', docstring=None)
Bases: delphin.tdl.AVM
AVM subclass for cons-lists (< ... >).
This provides a more intuitive interface for creating and accessing the values of list structures in TDL. Some combinations of the values and end parameters correspond to various TDL forms as described in the table below:

=========  ======  ===============  ======
TDL form   values  end              state
=========  ======  ===============  ======
< >        None    EMPTY_LIST_TYPE  closed
< … >      None    LIST_TYPE        open
< a >      [a]     EMPTY_LIST_TYPE  closed
< a, b >   [a, b]  EMPTY_LIST_TYPE  closed
< a, … >   [a]     LIST_TYPE        open
< a . b >  [a]     b                closed
=========  ======  ===============  ======

Parameters:
- values (list) – a sequence of Conjunction or Term objects to be placed in the AVM of the list
- end (str, Conjunction, Term) – last item in the list (default: LIST_TYPE), which determines if the list is open or closed
- docstring (str) – documentation string
ConsList.terminated
If False, the list can be further extended by following the LIST_TAIL features.
Type: bool
ConsList.append(value)
Append an item to the end of an open ConsList.
Parameters: value (Conjunction, Term) – item to add
Raises: TdlError – when appending to a closed list
ConsList.terminate(end)
Set the value of the tail of the list.
Adding values via append() places them on the FIRST feature of some level of the feature structure (e.g., REST.FIRST), while terminate() places them on the final REST feature (e.g., REST.REST). If end is a Conjunction or Term, it is typically a Coreference; otherwise end is set to tdl.EMPTY_LIST_TYPE or tdl.LIST_TYPE. This method does not necessarily close the list: if end is tdl.LIST_TYPE, the list is left open; otherwise it is closed.
Parameters: end (str, Conjunction, Term) – value to use as the end of the list
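The combinations of values and end listed for ConsList can be made concrete by spelling out the feature paths they imply. This is a minimal sketch under the default feature names given in Module Parameters; `cons_list_paths` is a hypothetical helper, not a PyDelphin function.

```python
LIST_TYPE = '*list*'
EMPTY_LIST_TYPE = '*null*'


def cons_list_paths(values, end=LIST_TYPE, head='FIRST', tail='REST'):
    """Return the (path, value) pairs of a cons-list and whether it is closed."""
    pairs = []
    for i, value in enumerate(values):
        # The i-th item sits on the head feature after following i tails.
        pairs.append(('.'.join([tail] * i + [head]), value))
    # The end value sits on the final tail; an empty path is the list itself.
    pairs.append(('.'.join([tail] * len(values)), end))
    closed = end != LIST_TYPE
    return pairs, closed


print(cons_list_paths(['a', 'b'], end=EMPTY_LIST_TYPE))
# ([('FIRST', 'a'), ('REST.FIRST', 'b'), ('REST.REST', '*null*')], True)
print(cons_list_paths(['a']))
# ([('FIRST', 'a'), ('REST', '*list*')], False)
```

Passing different `head` and `tail` names mirrors what reassigning the list-related module variables would mean for a grammar that uses other feature names.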
class delphin.tdl.DiffList(values=None, docstring=None)
Bases: delphin.tdl.AVM
AVM subclass for diff-lists (<! ... !>).
As with ConsList, this provides a more intuitive interface for creating and accessing the values of list structures in TDL. Unlike ConsList, DiffLists are always closed lists with the last item coreferenced with the LAST feature, which allows for the joining of two diff-lists.
Parameters:
- values (list) – a sequence of Conjunction or Term objects to be placed in the AVM of the list
- docstring (str) – documentation string
DiffList.last
The feature path to the list position coreferenced by the value of the DIFF_LIST_LAST feature.
Type: str
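A diff-list nests an ordinary cons-list under the LIST feature and coreferences its open tail with LAST. The sketch below computes the item paths and the path that LAST would point to, assuming the default feature names from Module Parameters; `diff_list_paths` is a hypothetical helper, not a PyDelphin function.

```python
def diff_list_paths(values, list_feat='LIST', head='FIRST', tail='REST'):
    """Return the (path, value) pairs of a diff-list and the path shared with LAST."""
    pairs = []
    position = list_feat
    for value in values:
        pairs.append((position + '.' + head, value))
        position += '.' + tail
    # `position` now names the tail after the final item; LAST is coreferenced with it.
    return pairs, position


print(diff_list_paths(['a', 'b']))
# ([('LIST.FIRST', 'a'), ('LIST.REST.FIRST', 'b')], 'LIST.REST.REST')
print(diff_list_paths([]))
# ([], 'LIST')
```

Because the shared position is always the tail after the final item, joining two diff-lists amounts to identifying the first list's LAST with the second list's LIST.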
class delphin.tdl.Coreference(identifier, docstring=None)
Bases: delphin.tdl.Term
TDL coreferences, which represent re-entrancies in AVMs.
Parameters:
Conjunctions
class delphin.tdl.Conjunction(terms=None)
Conjunction of TDL terms.
Parameters: terms (list) – sequence of Term objects
Conjunction.add(term)
Add a term to the conjunction.
Parameters: term (Term, Conjunction) – term to add; if a Conjunction, all of its terms are added to the current conjunction
Raises: TypeError – when term is an invalid type
Conjunction.get(key, default=None)
Get the value of attribute key in any AVM in the conjunction.
Parameters:
- key – attribute path to search
- default – value to return if key is not defined on any AVM
Conjunction.normalize()
Rearrange the conjunction to a conventional form.
This puts any coreference(s) first, followed by type terms, and then AVM(s) (including lists). AVMs are normalized via AVM.normalize().
Conjunction.terms
The list of terms in the conjunction.
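The ordering applied by normalize() and the lookup performed by get() can be sketched on a simplified representation in which each term is a `(kind, value)` tuple and AVMs are flat dicts keyed by dotted path. Both helpers are hypothetical and only illustrate the behavior described above.

```python
def normalize(terms):
    """Order terms as coreferences, then type terms, then AVMs."""
    rank = {'coref': 0, 'type': 1, 'avm': 2}
    # sorted() is stable, so terms of the same kind keep their relative order.
    return sorted(terms, key=lambda term: rank[term[0]])


def get(terms, key, default=None):
    """Return the value at path `key` from the first AVM that defines it."""
    for kind, value in terms:
        if kind == 'avm' and key in value:
            return value[key]
    return default


terms = [
    ('avm', {'SYNSEM.LOCAL.CAT.HEAD.MOD': '< >'}),
    ('type', 'lex-item'),
    ('coref', 'x'),
    ('type', 'word'),
]
print([kind for kind, _ in normalize(terms)])   # ['coref', 'type', 'type', 'avm']
print(get(terms, 'SYNSEM.LOCAL.CAT.HEAD.MOD'))  # < >
print(get(terms, 'ARGS', default='missing'))    # missing
```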
Type and Instance Definitions
class delphin.tdl.TypeDefinition(identifier, conjunction, docstring=None)
A top-level Conjunction with an identifier.
Parameters:
- identifier (str) – type name
- conjunction (Conjunction, Term) – type constraints
- docstring (str) – documentation string
TypeDefinition.conjunction
type constraints
Type: Conjunction
TypeDefinition.documentation(level='first')
Return the documentation of the type.
By default, this is the first docstring on a top-level term. By setting level to “top”, the list of all docstrings on top-level terms is returned, including the type’s docstring value, if not None, as the last item. The docstring for the type itself is available via TypeDefinition.docstring.
Parameters: level (str) – “first” or “top”
Returns: a single docstring or a list of docstrings
TypeDefinition.supertypes
The list of supertypes for the type.
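The two levels accepted by documentation() differ only in whether one docstring or the whole list is returned. The sketch below works on plain lists of strings; the function is a hypothetical stand-in, and using the type's own docstring as a fallback in “first” mode is an assumption the text above does not confirm.

```python
def documentation(term_docstrings, type_docstring=None, level='first'):
    """Select docstrings from top-level terms, with the type's own docstring last."""
    docs = [doc for doc in term_docstrings if doc is not None]
    if type_docstring is not None:
        docs.append(type_docstring)
    if level == 'top':
        return docs
    if level == 'first':
        # Assumption: the type's docstring is the fallback when no term has one.
        return docs[0] if docs else None
    raise ValueError('level must be "first" or "top"')


term_docs = [None, 'Lexical items.', 'Not a modifier.']
print(documentation(term_docs, 'Final note.'))
# Lexical items.
print(documentation(term_docs, 'Final note.', level='top'))
# ['Lexical items.', 'Not a modifier.', 'Final note.']
```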
class delphin.tdl.TypeAddendum(identifier, conjunction=None, docstring=None)
Bases: delphin.tdl.TypeDefinition
An addendum to an existing type definition.
Type addenda, unlike type definitions, do not require supertypes, or even any feature constraints. An addendum, however, must have at least one supertype, AVM, or docstring.
Parameters:
- identifier (str) – type name
- conjunction (Conjunction, Term) – type constraints
- docstring (str) – documentation string
TypeAddendum.conjunction
type constraints
Type: Conjunction
class delphin.tdl.LexicalRuleDefinition(identifier, affix_type, patterns, conjunction, **kwargs)
Bases: delphin.tdl.TypeDefinition
An inflecting lexical rule definition.
Parameters:
LexicalRuleDefinition.conjunction
type constraints
Type: Conjunction
Morphological Patterns
class delphin.tdl.LetterSet(var, characters)
A capturing character class for inflectional lexical rules.
LetterSets define a pattern (e.g., “!a”) that may match any one of its associated characters. Unlike WildCard patterns, LetterSet variables also appear in the replacement pattern of an affixing rule, where they insert the character matched by the corresponding letter set.
Parameters:
class delphin.tdl.WildCard(var, characters)
A non-capturing character class for inflectional lexical rules.
WildCards define a pattern (e.g., “?a”) that may match any one of its associated characters. Unlike LetterSet patterns, WildCard variables may not appear in the replacement pattern of an affixing rule.
Parameters:
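The capturing versus non-capturing distinction maps naturally onto regular-expression groups. The sketch below translates a suffix pattern into a Python regex, turning letter-set variables into capturing groups and wild-card variables into plain character classes. `affix_pattern_to_regex` is a hypothetical helper, anchoring at the end of the word assumes a suffixing rule, and none of this is claimed to match how PyDelphin or a DELPH-IN processor handles patterns.

```python
import re


def affix_pattern_to_regex(pattern, letter_sets, wild_cards):
    """Compile a suffix pattern with !x and ?x variables into a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        marker = pattern[i]
        if marker in '!?' and i + 1 < len(pattern):
            table = letter_sets if marker == '!' else wild_cards
            char_class = '[' + re.escape(table[pattern[i + 1]]) + ']'
            # Letter sets capture the matched character; wild cards do not.
            parts.append('(' + char_class + ')' if marker == '!' else char_class)
            i += 2
        else:
            parts.append(re.escape(marker))
            i += 1
    # Assumption: the pattern describes a suffix, so anchor at the end.
    return re.compile(''.join(parts) + '$')


rx = affix_pattern_to_regex('!v?cs', {'v': 'aeiou'}, {'c': 'td'})
match = rx.search('cats')
print(match.group(1))  # a
print(rx.groups)       # 1
```

Only the letter-set match is available as a group, which is what a replacement pattern would need in order to reinsert the matched character.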
Deprecated
Use of the following functions and classes is no longer recommended, and they will be removed in a future version.
delphin.tdl.parse(f, encoding='utf-8')
Parse the TDL file f and yield the interpreted contents.
If f is a filename, the file is opened and then closed when the generator has finished; otherwise f is an open file object and will not be closed when the generator has finished.
Parameters:
class delphin.tdl.TdlDefinition(supertypes=None, featvals=None)
Bases: delphin.tfs.FeatureStructure
A typed feature structure with supertypes.
A TdlDefinition is like a FeatureStructure but each structure may have a list of supertypes.
class delphin.tdl.TdlConsList(supertypes=None, featvals=None)
Bases: delphin.tdl.TdlDefinition
A TdlDefinition for cons-lists (< ... >).
Navigating the feature structure for lists can be cumbersome, so this subclass of TdlDefinition provides the values() method to collect the items nested inside the list and return them as a Python list.
class delphin.tdl.TdlDiffList(supertypes=None, featvals=None)
Bases: delphin.tdl.TdlDefinition
A TdlDefinition for diff-lists (<! ... !>).
Navigating the feature structure for lists can be cumbersome, so this subclass of TdlDefinition provides the values() method to collect the items nested inside the list and return them as a Python list.
class delphin.tdl.TdlType(identifier, definition, coreferences=None, docstring=None)
Bases: delphin.tdl.TdlDefinition
A top-level TdlDefinition with an identifier.
Parameters:
- identifier (str) – type name
- definition (TdlDefinition) – definition of the type
- coreferences (list) – (tag, paths) tuple of coreferences, where paths is a list of feature paths that share the tag
- docstring (list) – list of documentation strings
class delphin.tdl.TdlInflRule(identifier, affix=None, **kwargs)
Bases: delphin.tdl.TdlType
TDL inflectional rule.
Parameters: