delphin.ace
See also
See Using ACE from PyDelphin for a more user-friendly introduction.
An interface for the ACE processor.
This module provides classes and functions for managing interactive communication with an open ACE process. The ACE software is required for the functionality in this module, but it is not included with PyDelphin. Pre-compiled binaries for Linux and macOS are available at http://sweaglesw.org/linguistics/ace/; for installation instructions, see https://github.com/delph-in/docs/wiki/AceInstall.
The ACEParser, ACETransferer, and ACEGenerator classes are used for parsing, transferring, and generating with ACE. All are subclasses of ACEProcess, which connects to ACE in the background, sends it data via its stdin, and receives responses via its stdout. Responses from ACE are interpreted so the data is more accessible in Python.
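The request/response cycle described above can be sketched with the standard library alone. This is not PyDelphin's implementation: `LineProcess` and the `CHILD` program are hypothetical stand-ins, and the child merely upper-cases each line instead of running ACE. The sketch only illustrates writing one item to a subprocess's stdin and reading one line back from its stdout.

```python
import subprocess
import sys

# Hypothetical stand-in for ACE: echoes every input line in upper case.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


class LineProcess:
    """Illustrative wrapper around a line-oriented subprocess."""

    def __init__(self, argv):
        self._proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered in text mode
        )

    def interact(self, datum):
        # Send exactly one line, then read exactly one line back.
        self._proc.stdin.write(datum.rstrip('\n') + '\n')
        self._proc.stdin.flush()
        return self._proc.stdout.readline().rstrip('\n')

    def close(self):
        # Closing stdin lets the child see end-of-file and exit.
        self._proc.stdin.close()
        self._proc.stdout.close()
        return self._proc.wait()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


with LineProcess([sys.executable, '-c', CHILD]) as proc:
    reply = proc.interact('A cat sleeps.')
```

Pairing each write with one read, as `interact()` does here, is the same discipline that keeps either side from blocking on a full or empty pipe.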
Warning
Instantiating ACEParser, ACETransferer, or ACEGenerator opens ACE in a subprocess, so take care to close the process (ACEProcess.close()) when finished or, preferably, instantiate the class in a context manager so it is closed automatically when the relevant code has finished.
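The two ways of releasing the subprocess might look like the following sketch. The function names and the `parser_factory` parameter are hypothetical; the parameter exists only so that the functions can be exercised without ACE installed, and it defaults to `ace.ACEParser`.

```python
def _default_factory():
    # Imported lazily so the sketch can be loaded without PyDelphin.
    from delphin import ace
    return ace.ACEParser


def parse_with_context_manager(grm, sentences, parser_factory=None):
    """Preferred form: the with-statement closes ACE automatically."""
    factory = parser_factory or _default_factory()
    with factory(grm) as parser:
        return [parser.interact(sentence) for sentence in sentences]


def parse_with_explicit_close(grm, sentences, parser_factory=None):
    """Equivalent form: close() is called even if interact() raises."""
    factory = parser_factory or _default_factory()
    parser = factory(grm)
    try:
        return [parser.interact(sentence) for sentence in sentences]
    finally:
        parser.close()
```

With PyDelphin and a compiled grammar available, a call such as `parse_with_context_manager('erg.dat', ['A cat sleeps.'])` would open one ACE process, parse each sentence, and close the process on exit.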
Interpreted responses are stored in a dictionary-like Response object. When queried as a dictionary, these objects return the raw response strings. When queried via their methods, they return the PyDelphin models of the data. A response object may contain a number of Result objects, which similarly provide raw-string access via dictionary keys and PyDelphin-model access via methods. Here is an example of parsing a sentence with ACEParser:
>>> from delphin import ace
>>> with ace.ACEParser('erg-2018-x86-64-0.9.30.dat') as parser:
...     response = parser.interact('A cat sleeps.')
...     print(response.result(0)['mrs'])
...     print(response.result(0).mrs())
...
[ LTOP: h0 INDEX: e2 [ e SF: prop TENSE: pres MOOD: indicative PROG: - PERF: - ] RELS: < [ _a_q<0:1> LBL: h4 ARG0: x3 [ x PERS: 3 NUM: sg IND: + ] RSTR: h5 BODY: h6 ] [ _cat_n_1<2:5> LBL: h7 ARG0: x3 ] [ _sleep_v_1<6:13> LBL: h1 ARG0: e2 ARG1: x3 ] > HCONS: < h0 qeq h1 h5 qeq h7 > ICONS: < > ]
<MRS object (_a_q _cat_n_1 _sleep_v_1) at 140612036960072>
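Raw-string access can be extended from a single result to all of them. The helper below is a hypothetical sketch; it assumes that the results are stored as a list under the `'results'` key of the response, which is how the dictionary layout is understood here and should be checked against the Response documentation.

```python
def raw_mrs_strings(response):
    """Return the raw MRS string of every result in *response*.

    Assumes a dictionary-like response whose 'results' key holds a
    list of dictionary-like results (an assumption of this sketch).
    """
    return [
        result['mrs']
        for result in response.get('results', [])
        if 'mrs' in result
    ]
```

With a live parser this might be called as `raw_mrs_strings(parser.interact('A cat sleeps.'))`.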
Functions exist for non-interactive communication with ACE: parse() and parse_from_iterable() open and close an ACEParser instance; transfer() and transfer_from_iterable() open and close an ACETransferer instance; and generate() and generate_from_iterable() open and close an ACEGenerator instance. Note that these functions open a new ACE subprocess every time they are called, so if you have many items to process, it is more efficient to use parse_from_iterable(), transfer_from_iterable(), or generate_from_iterable() than the single-item versions, or to interact with the ACEProcess subclass instances directly.
Basic Usage
The following module functions are the simplest way to interact with ACE, although for larger or more interactive jobs it is suggested to use an ACEProcess subclass instance.
- delphin.ace.compile(cfg_path, out_path, executable=None, env=None, stdout=None, stderr=None)[source]
Use ACE to compile a grammar.
- Parameters:
cfg_path (str) – the path to the ACE config file
out_path (str) – the path where the compiled grammar will be written
executable (str, optional) – the path to the ACE binary; if None, the ace command will be used
env (dict, optional) – environment variables to pass to the ACE subprocess
stdout (file, optional) – stream used for ACE’s stdout
stderr (file, optional) – stream used for ACE’s stderr
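One possible way to use compile() is to rebuild a grammar image only when it is missing or older than its config file. The helper below is a hypothetical sketch: `compile_if_stale` and its `compiler` parameter are not part of PyDelphin, and comparing against the config file's modification time alone will not notice edits to the grammar's other source files.

```python
import os


def compile_if_stale(cfg_path, out_path, compiler=None):
    """Compile the grammar unless out_path is at least as new as cfg_path.

    Returns True if the compiler was invoked, False otherwise.
    """
    if compiler is None:
        # Imported lazily so the sketch can be loaded without PyDelphin.
        from delphin import ace
        compiler = ace.compile
    if (os.path.exists(out_path)
            and os.path.getmtime(out_path) >= os.path.getmtime(cfg_path)):
        return False
    compiler(cfg_path, out_path)
    return True
```

The `compiler` argument is there so the logic can be tried with a stand-in function; in normal use it would be left as `None`.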
- delphin.ace.parse(grm, datum, **kwargs)[source]
Parse sentence datum with ACE using grammar grm.
- Parameters:
- Returns:
Example
>>> response = ace.parse('erg.dat', 'Dogs bark.')
NOTE: parsed 1 / 1 sentences, avg 797k, time 0.00707s
- delphin.ace.parse_from_iterable(grm, data, **kwargs)[source]
Parse each sentence in data with ACE using grammar grm.
- Parameters:
grm (str) – path to a compiled grammar image
data (iterable) – the sentences to parse
**kwargs – additional keyword arguments to pass to the ACEParser
- Yields:
Example
>>> sentences = ['Dogs bark.', 'It rained']
>>> responses = list(ace.parse_from_iterable('erg.dat', sentences))
NOTE: parsed 2 / 2 sentences, avg 723k, time 0.01026s
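Because the responses are produced lazily, they can be paired with their inputs as they arrive. The following is a hypothetical sketch: `count_results` and its `parse_iter` parameter are illustrative names, and it assumes that one response is yielded per input sentence, in order, with results stored under the `'results'` key.

```python
def count_results(grm, sentences, parse_iter=None):
    """Return (sentence, number of results) pairs for *sentences*."""
    if parse_iter is None:
        # Imported lazily so the sketch can be loaded without PyDelphin.
        from delphin import ace
        parse_iter = ace.parse_from_iterable
    sentences = list(sentences)
    responses = parse_iter(grm, sentences)
    # Assumption: one response per input sentence, in input order.
    return [
        (sentence, len(response.get('results', [])))
        for sentence, response in zip(sentences, responses)
    ]
```

A single ACE process serves the whole list, so this avoids the per-call startup cost of calling parse() once per sentence.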
- delphin.ace.transfer(grm, datum, **kwargs)[source]
Transfer from the MRS datum with ACE using grammar grm.
- delphin.ace.transfer_from_iterable(grm, data, **kwargs)[source]
Transfer from each MRS in data with ACE using grammar grm.
Classes for Managing ACE Processes
The functions described in Basic Usage are useful for small jobs as they handle the input and then close the ACE process, but for more complicated or interactive jobs, directly interacting with an instance of an ACEProcess subclass is recommended or required (e.g., in the case of [incr tsdb()] testsuite processing). The ACEProcess class is where most methods are defined, but in practice the ACEParser, ACETransferer, or ACEGenerator subclasses are directly used.
- class delphin.ace.ACEProcess(grm, cmdargs=None, executable=None, env=None, tsdbinfo=True, full_forest=False, stderr=None)[source]
Bases:
Processor
The base class for interfacing with ACE.
This manages most subprocess communication with ACE, but does not interpret the response returned via ACE’s stdout. Subclasses override the receive() method to interpret the task-specific response formats.
Note that not all arguments to this class are used by every subclass; the documentation for each subclass specifies which are available.
- Parameters:
grm (str) – path to a compiled grammar image
cmdargs (list, optional) – a list of command-line arguments for ACE; note that arguments and their values should be separate entries, e.g. ['-n', '5']
executable (str, optional) – the path to the ACE binary; if None, ACE is assumed to be callable via ace
env (dict) – environment variables to pass to the ACE subprocess
tsdbinfo (bool) – if True and ACE’s version is compatible, all information ACE reports for [incr tsdb()] processing is gathered and returned in the response
full_forest (bool) – if True and tsdbinfo is True, output the full chart for each parse result
stderr (file) – stream used for ACE’s stderr
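The requirement that arguments and their values be separate list entries can be met with a small helper. `build_cmdargs` is a hypothetical function written for illustration; it is not part of PyDelphin. A value of `None` marks a flag that takes no value.

```python
def build_cmdargs(options):
    """Flatten a {flag: value} mapping into a cmdargs-style list.

    Each flag and its value become separate string entries; flags
    mapped to None are emitted without a value.
    """
    args = []
    for flag, value in options.items():
        args.append(flag)
        if value is not None:
            args.append(str(value))
    return args


cmdargs = build_cmdargs({'-n': 5})
```

The resulting list could then be passed as the `cmdargs` argument when instantiating one of the ACEProcess subclasses.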
- property ace_version
The version of the specified ACE binary.
- interact(datum)[source]
Send datum to ACE and return the response.
This is the recommended method for sending and receiving data to/from an ACE process as it reduces the chances of over-filling or reading past the end of the buffer. It also performs a simple validation of the input to help ensure that one complete item is processed at a time.
If input item identifiers need to be tracked throughout processing, see process_item().
- process_item(datum, keys=None)[source]
Send datum to ACE and return the response with context.
The keys parameter can be used to track item identifiers through an ACE interaction. If the task member is set on the ACEProcess instance (or one of its subclasses), it is kept in the response as well.
- Parameters:
datum (str) – the input sentence or MRS
keys (dict) – a mapping of item identifier names and values
- Returns:
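Tracking identifiers with the keys parameter might look like the sketch below. `process_with_ids` is a hypothetical helper, and the `'i-id'` key name is only an example of an identifier name; the processor argument is expected to be an ACEProcess subclass instance, or anything else that offers a compatible `process_item()` method.

```python
def process_with_ids(processor, items, id_key='i-id'):
    """Send (identifier, datum) pairs through processor.process_item().

    Each datum is sent with a one-entry keys mapping so the identifier
    accompanies the item through the interaction.
    """
    responses = []
    for item_id, datum in items:
        responses.append(
            processor.process_item(datum, keys={id_key: item_id})
        )
    return responses
```

For example, `process_with_ids(parser, [(10, 'Dogs bark.'), (20, 'It rained')])` would send two items, each tagged with its own identifier.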
- receive()[source]
Return the stdout response from ACE.
Warning
Reading beyond the last line of stdout from ACE can cause the process to hang while it waits for the next line. Use the interact() method for most data-processing tasks with ACE.
- property run_info
Contextual information about the running process.
- send(datum)[source]
Send datum (e.g. a sentence or MRS) to ACE.
Warning
Sending data without reading (e.g., via receive()) can fill the buffer and cause data to be lost. Use the interact() method for most data-processing tasks with ACE.
- class delphin.ace.ACEParser(grm, cmdargs=None, executable=None, env=None, tsdbinfo=True, full_forest=False, stderr=None)[source]
Bases:
ACEProcess
A class for managing parse requests with ACE.
See ACEProcess for initialization parameters.
- class delphin.ace.ACETransferer(grm, cmdargs=None, executable=None, env=None, stderr=None)[source]
Bases:
ACEProcess
A class for managing transfer requests with ACE.
See ACEProcess for initialization parameters.
- class delphin.ace.ACEGenerator(grm, cmdargs=None, executable=None, env=None, tsdbinfo=True, stderr=None)[source]
Bases:
ACEProcess
A class for managing realization requests with ACE.
See ACEProcess for initialization parameters.
Exceptions
- exception delphin.ace.ACEProcessError(*args, **kwargs)[source]
Bases:
PyDelphinException
Raised when the ACE process has crashed and cannot be recovered.
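Since the error signals a process that cannot be recovered, one reasonable reaction is to stop sending data and report how far processing got. The sketch below is hypothetical: `process_until_crash` is an illustrative name, and the fallback exception class exists only so the sketch can be loaded when PyDelphin is not installed.

```python
try:
    from delphin.ace import ACEProcessError
except ImportError:
    # Stand-in used only when PyDelphin is unavailable.
    class ACEProcessError(Exception):
        """Local placeholder for delphin.ace.ACEProcessError."""


def process_until_crash(processor, data):
    """Interact with *processor* until the data runs out or ACE crashes.

    Returns (responses, failed_datum); failed_datum is None when every
    item was processed.
    """
    responses = []
    for datum in data:
        try:
            responses.append(processor.interact(datum))
        except ACEProcessError:
            # The process is gone; stop rather than keep sending.
            return responses, datum
    return responses, None
```

A caller can then decide whether to open a fresh process and resume from the item after the one that failed.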
ACE stdout Protocols
PyDelphin communicates with ACE via its “stdout protocols”, which are the ways ACE’s outputs are encoded across its stdout stream. There are several protocols that ACE uses and that this module supports:
regular parsing
parsing with ACE’s --tsdb-stdout option
parsing with --tsdb-stdout and --itsdb-forest
transfer
regular generation
generation with ACE’s --tsdb-stdout option
When a user interacts with ACE via the classes and functions in this module, responses will be interpreted and wrapped in Response objects, thus separating the user from the details of ACE’s stdout protocols. Sometimes, however, the user will store or pipe ACE’s output directly, such as when using the delphin convert command with ace at the command line. Even though ACE outputs MRSs using the common SimpleMRS format, additional content used in ACE’s stdout protocols can complicate tasks such as format or representation conversion. The user can provide some options to ACE (see https://github.com/delph-in/docs/wiki/AceOptions), such as -T, to filter the non-MRS content, but for convenience PyDelphin also provides the ace codec, available at delphin.codecs.ace. The codec ignores the non-MRS content in ACE’s stdout so the user can use ACE output as a stream or as a corpus of MRS representations. For example:
[~]$ ace -g erg.dat < sentences.txt | delphin convert --from ace
The codec does not support every stdout protocol that this module does. Those it does support are:
regular parsing
parsing with ACE’s --tsdb-stdout option
generation with ACE’s --tsdb-stdout option