delphin.interface
Interfaces for external data providers.
This module manages the communication between data providers, namely processors like ACE or remote services like the DELPH-IN Web API, and user code or storage backends, namely [incr tsdb()] test suites. An interface sends requests to a provider, then receives and interprets the response. The interface may also detect and deserialize supported DELPH-IN formats if the appropriate modules are available.
Classes
- class delphin.interface.Processor[source]
Base class for processors.
This class defines the basic interface for all PyDelphin processors, such as ACEProcess and Client. It can also be used to define preprocessor wrappers of other processors so that the wrapper has the same interface, allowing it to be used, e.g., with TestSuite.process().
- task
name of the task the processor performs (e.g., "parse", "transfer", or "generate")
- Type:
str | None
- process_item(datum, keys=None)[source]
Send datum to the processor and return the result.
This method is a generic wrapper around a processor-specific processing method that keeps track of additional item and processor information. Specifically, if keys is provided, it is copied into the keys key of the response object, and if the processor object’s task member is non-None, it is copied into the task key of the response. These entries help keep track of items when many are processed at once, and they help downstream functions identify what the processor did.
- Parameters:
datum – the item content to process
keys – a mapping of item identifiers which will be copied into the response
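The bookkeeping described for process_item() can be illustrated with a minimal, stand-alone sketch. It does not import PyDelphin: UppercaseProcessor, its _process() helper, the "uppercase" task name, and the layout of the returned dictionary are all hypothetical and only mimic the behavior described above.

```python
class UppercaseProcessor:
    """Hypothetical stand-in; it does not subclass delphin.interface.Processor."""

    task = "uppercase"  # made-up task name, for illustration only

    def _process(self, datum):
        # Processor-specific work; here it only upper-cases the input.
        return {"input": datum, "results": [{"surface": datum.upper()}]}

    def process_item(self, datum, keys=None):
        response = self._process(datum)
        # Copy the item identifiers into the response, if any were given.
        if keys is not None:
            response["keys"] = keys
        # Record which task produced the response, if the processor has one.
        if self.task is not None:
            response["task"] = self.task
        return response


response = UppercaseProcessor().process_item(
    "Abrams hired Browne.", keys={"i-id": 10}
)
print(response["keys"], response["task"])
```

With the identifiers and the task name attached to each response, a caller that processes many items can match each response back to its source item without relying on ordering.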
- class delphin.interface.Response[source]
A wrapper around the response dictionary for more convenient access to results.
- tokens(tokenset='internal')[source]
Interpret and return a YYTokenLattice object.
If tokenset is a key under the tokens key of the response, interpret its value as a YYTokenLattice from a valid YY serialization or from a dictionary. If tokenset is not available, return None.
- Parameters:
tokenset (str) – return 'initial' or 'internal' tokens (default: 'internal')
- Returns:
YYTokenLattice
- Raises:
InterfaceError – when the value is an unsupported type or delphin.tokens is unavailable
- class delphin.interface.Result[source]
A wrapper around a result dictionary to automate deserialization for supported formats. A Result is still a dictionary, so the raw data can be obtained using dict access.
- derivation()[source]
Interpret and return a Derivation object.
If delphin.derivation is available and the value of the derivation key in the result dictionary is a valid UDF string or a dictionary, return the interpreted Derivation object. If there is no ‘derivation’ key in the result, return None.
- Raises:
InterfaceError – when the value is an unsupported type or delphin.derivation is unavailable
- dmrs()[source]
Interpret and return a Dmrs object.
If delphin.codecs.dmrsjson is available and the value of the dmrs key in the result is a dictionary, return the interpreted DMRS object. If there is no dmrs key in the result, return None.
- Raises:
InterfaceError – when the value is not a dictionary or delphin.codecs.dmrsjson is unavailable
- eds()[source]
Interpret and return an Eds object.
If delphin.codecs.eds is available and the value of the eds key in the result is a valid “native” EDS serialization, or if delphin.codecs.edsjson is available and the value is a dictionary, return the interpreted EDS object. If there is no eds key in the result, return None.
- Raises:
InterfaceError – when the value is an unsupported type or the corresponding module is unavailable
- mrs()[source]
Interpret and return an MRS object.
If delphin.codecs.simplemrs is available and the value of the mrs key in the result is a valid SimpleMRS string, or if delphin.codecs.mrsjson is available and the value is a dictionary, return the interpreted MRS object. If there is no mrs key in the result, return None.
- Raises:
InterfaceError – when the value is an unsupported type or the corresponding module is unavailable
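The Result methods above share one decision pattern: a missing key yields None, a string goes to a native-format decoder, a dictionary goes to a JSON-style decoder, and anything else is an error. The following stand-alone sketch shows only that pattern. It is not PyDelphin's implementation: interpret(), SketchInterfaceError, and the two placeholder decoders are hypothetical, the sample values are not validated serializations, and the check for unavailable codec modules is not modeled.

```python
class SketchInterfaceError(Exception):
    """Hypothetical stand-in for delphin.interface.InterfaceError."""


def interpret(result, key, from_string, from_dict):
    """Decode result[key] according to the type of its value."""
    if key not in result:
        return None  # no such key in the result: nothing to interpret
    value = result[key]
    if isinstance(value, str):
        return from_string(value)  # e.g., a native serialization
    if isinstance(value, dict):
        return from_dict(value)  # e.g., a JSON-style dictionary
    raise SketchInterfaceError(
        f"unsupported type for {key!r}: {type(value).__name__}"
    )


# Placeholder decoders that only tag their input (assumptions, not real codecs).
def decode_native(s):
    return ("native", s)


def decode_json(d):
    return ("json", d)


print(interpret({"mrs": "[ TOP: h0 ]"}, "mrs", decode_native, decode_json))
print(interpret({"mrs": {"top": "h0"}}, "mrs", decode_native, decode_json))
print(interpret({}, "mrs", decode_native, decode_json))
```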
Exceptions
- exception delphin.interface.InterfaceError(*args, **kwargs)[source]
Bases:
PyDelphinException
Raised on invalid interface operations.
Wrapping a Processor for Preprocessing
The Processor class can be used to implement a preprocessor that maintains the same interface as the underlying processor. The following example wraps an ACEParser instance of the English Resource Grammar with a REPP instance.
>>> from delphin import interface
>>> from delphin import ace
>>> from delphin import repp
>>>
>>> class REPPWrapper(interface.Processor):
...     def __init__(self, cpu, rpp):
...         self.cpu = cpu
...         self.task = cpu.task
...         self.rpp = rpp
...     def process_item(self, datum, keys=None):
...         preprocessed_datum = str(self.rpp.tokenize(datum))
...         return self.cpu.process_item(preprocessed_datum, keys=keys)
...
>>> # The preprocessor can be used like a normal Processor:
>>> rpp = repp.REPP.from_config('../../grammars/erg/pet/repp.set')
>>> grm = '../../grammars/erg-2018-x86-64-0.9.30.dat'
>>> with ace.ACEParser(grm, cmdargs=['-y']) as _cpu:
...     cpu = REPPWrapper(_cpu, rpp)
...     response = cpu.process_item('Abrams hired Browne.')
...     for result in response.results():
...         print(result.mrs())
...
<Mrs object (proper named hire proper named) at 140488735960480>
<Mrs object (unknown compound udef named hire parg addressee proper named) at 140488736005424>
<Mrs object (unknown proper compound udef named hire parg named) at 140488736004864>
NOTE: parsed 1 / 1 sentences, avg 1173k, time 0.00986s
A similar technique could be used to manage external processes, such as MeCab for morphological segmentation of Japanese for Jacy. It could also be used to make a postprocessor, a backoff mechanism in case an input fails to parse, etc.
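As a rough illustration of the backoff idea, the sketch below wraps two processors and falls back to the second when the first returns no results. It is a stand-alone approximation that does not import PyDelphin: BackoffProcessor and StubProcessor are hypothetical names, and the code assumes that responses are dictionary-like with a "results" list. In real use the wrapper would presumably subclass interface.Processor, as REPPWrapper does above.

```python
class BackoffProcessor:
    """Try a primary processor, then a fallback if it yields no results."""

    def __init__(self, primary, fallback):
        self.primary = primary
        self.fallback = fallback
        self.task = primary.task  # report the same task as the primary

    def process_item(self, datum, keys=None):
        response = self.primary.process_item(datum, keys=keys)
        # An empty or missing result list is treated as a failure.
        if not response.get("results"):
            response = self.fallback.process_item(datum, keys=keys)
        return response


class StubProcessor:
    """Hypothetical processor that only 'parses' inputs it knows."""

    task = "parse"

    def __init__(self, name, known):
        self.name = name
        self.known = set(known)

    def process_item(self, datum, keys=None):
        results = [{"source": self.name}] if datum in self.known else []
        return {"input": datum, "results": results,
                "keys": keys, "task": self.task}


strict = StubProcessor("strict", {"Abrams hired Browne."})
robust = StubProcessor("robust", {"Abrams hired Browne.", "Abrams hire Browne."})
cpu = BackoffProcessor(strict, robust)

for sentence in ("Abrams hired Browne.", "Abrams hire Browne."):
    response = cpu.process_item(sentence)
    print(sentence, "->", response["results"][0]["source"])
```

Because the wrapper exposes the same process_item() signature as the processors it wraps, it can be passed wherever a single processor is expected.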