RIGOTRIO at SemEval-2017 Task 9: Combining Machine Learning and
Grammar Engineering for AMR Parsing and Generation
Normunds Gruzitis1, Didzis Gosko2 and Guntis Barzdins1
1University of Latvia, IMCS / Rainis blvd. 29, Riga, Latvia
normunds.gruzitis@lumii.lv, guntis.barzdins@lumii.lv
2LETA / Marija street 2, Riga, Latvia
didzis.gosko@leta.lv
Abstract
By addressing both text-to-AMR parsing
and AMR-to-text generation, SemEval-
2017 Task 9 established AMR as a pow-
erful semantic interlingua. We strengthen
the interlingual aspect of AMR by apply-
ing the multilingual Grammatical Frame-
work (GF) for AMR-to-text generation.
Our current rule-based GF approach completely
covered only 12.3% of the test AMRs; therefore,
we combined it with the state-of-the-art JAMR
Generator to see whether the combination increases
or decreases the overall performance. The combined
system achieved an automatic BLEU score of 18.82
and a human Trueskill score of 107.2, to be
compared to the plain JAMR Generator results. As for AMR
parsing, we added NER extensions to
our SemEval-2016 general-domain AMR
parser to handle the biomedical genre, rich
in organic compound names, achieving
Smatch F1=54.0%.
1 Introduction
AMR (Banarescu et al., 2013) as a sentence-level
semantic representation is evolving towards inter-
lingua at SemEval-2017 Task 9 on Abstract Mean-
ing Representation Parsing and Generation (May
and Priyadarshi, 2017). The challenge was to im-
prove over state-of-the-art systems for both text-
to-AMR parsing (Barzdins and Gosko, 2016) and
AMR-to-text generation (Flanigan et al., 2016).
The AMR parsing subtask this year focused on the
specific genre of biomedical scientific articles
regarding cancer pathway discovery. Such texts are
challenging for existing AMR parsers because they
are rich in organic compound names of types such as
“enzyme” and “aminoacid”, which are not recognized
by common NER tools that are often restricted to
types such as “person”, “organization” and “location”.
The paper starts with NER extensions used for
the Biomedical AMR parsing subtask, followed
by a novel approach of using Grammatical Frame-
work for AMR generation, and concludes with a
brief analysis of our SemEval results.
2 Text-to-AMR parsing
Only two adaptations to the AMR parser from
SemEval-2016 (Barzdins and Gosko, 2016) were
implemented: it was retrained on the union of
LDC2015E86, LDC2016E25, LDC2016E33 and
Bio AMR Corpus, and a gazetteer was added to
extend the NER coverage to organic compound
names found in the Bio AMR Corpus (e.g. “B-
Raf enzyme”, “dabrafenib small-molecule”, etc.).
The gazetteer was generalized w.r.t. numbers used
in the names.
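One possible reading of this number generalization is sketched below in Python: digits in a gazetteer entry are replaced by a placeholder, so that a name seen in the training data also matches variants that differ only in their numbers. The entries, the `<NUM>` placeholder and the function names are illustrative assumptions, not the parser's actual code.

```python
import re

# Hypothetical gazetteer entries: surface name -> AMR entity type.
# The real gazetteer was extracted from the Bio AMR Corpus.
GAZETTEER = {
    "B-Raf": "enzyme",
    "dabrafenib": "small-molecule",
    "ERK1": "enzyme",  # illustrative entry, not taken from the paper
}

def generalize(name):
    """Lowercase the name and replace every run of digits with a placeholder."""
    return re.sub(r"\d+", "<NUM>", name.lower())

# Index the gazetteer by generalized key so that lookups ignore concrete numbers.
INDEX = {generalize(name): etype for name, etype in GAZETTEER.items()}

def lookup(token):
    """Return the entity type for a token, or None if it is not covered."""
    return INDEX.get(generalize(token))
```

Under these assumptions, `lookup("ERK2")` falls back to the type recorded for "ERK1", while a token that shares no generalized key with the gazetteer yields `None`.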
Although we achieved an above average Smatch
score (54.0% versus 53.6%) in the preliminary
official scoring, the ablation metrics show that
we scored below average for named entities
(46.0% versus 55.8%) and wikification (0% versus
33.0%). Since we used gazetteers extracted from
the training data for both named entities and wik-
ification, this suggests that external data sources
should have been used instead.
3 AMR-to-text generation
Our approach to text generation from AMR graphs
stems from a recent feasibility study (Gruzitis and
Barzdins, 2016) on the grammar-based generation
of storyline highlights – a list of events extracted
from a set of related documents. The events would
be represented by pruned AMR graphs acquired
by an abstractive text summarizer and verbalized
afterwards.
Such storyline highlight extraction is a part of
the H2020 research project SUMMA, Scalable
Understanding of Multilingual MediA.1 The sto-
ryline highlights are expected to be relatively sim-
ple and concise in terms of grammatical structure
and, thus, in terms of the underlying meaning rep-
resentation. In our use case, adequacy and seman-
tic accuracy of the generated sentences, and the
control of the generation process are more important
than fluency. Therefore, we follow a grammar-based
approach to AMR-to-text generation. In the SemEval
task, however, we are pushing the limits and
scalability of such an approach, as the task requires
much more robust, wide-coverage, general-purpose
generation.
The proposed approach builds on Grammatical
Framework, GF (Ranta, 2011). GF is a grammar
formalism and technology for implementing com-
putational multilingual grammars. GF grammars
are bi-directional; however, they are particularly
well suited for language generation. Most im-
portantly, GF provides a wide-coverage resource
grammar library with a language-independent
API – a shared abstract syntax. The idea is to
transform the AMR graphs to the GF abstract
syntax trees (AST), leaving the surface realiza-
tion (linearization) of ASTs to the existing En-
glish resource grammar.2 Since the GF resource
grammar library supports many more languages,
this approach automatically extends to multilin-
gual AMR-to-text generation, provided that there
is a wide-coverage translation lexicon which in-
cludes named entities.
Because the coverage of our hand-crafted
AMR-to-AST transformation rules is currently
far from complete, we use the JAMR generator
(Flanigan et al., 2016) as a default option for
AMRs not fully covered by the rules. In other
words, we want to measure if sentences generated
by the GF approach (from AMRs fully covered by
the transformation rules) outperform the respec-
tive JAMR-generated sentences. If so, it would be
worth developing this approach further.
3.1 Grammatical Framework
More precisely, GF is a categorial grammar for-
malism and a functional programming language,
specialized for multilingual grammar develop-
ment. It has a command interpreter and a batch
compiler, as well as Haskell and C run-time systems
for parsing and linearization. The C run-time
system has Java and Python bindings, and it allows
for probabilistic parsing as well. Compiled GF
grammars can be embedded in applications written
in other programming languages.3
1http://summa-project.eu
2In terms of GF, linearization refers to resolving the word
order, word forms (agreement), function words, etc.
The key feature of GF grammars is the divi-
sion between the abstract syntax and the concrete
syntax. The abstract syntax defines the language-
independent semantic structure and terms, while
the concrete syntax defines the surface realiza-
tion of the abstract syntax for a particular lan-
guage. The same abstract syntax can be equipped
with many concrete syntaxes (and lexicons) – re-
versible mappings from ASTs to feature structures
and strings – making the grammar multilingual
(Ranta, 2004).
What makes the development of GF application
grammars rapid and flexible is the general-purpose
GF resource grammar library, RGL (Ranta, 2009).
The library currently covers more than 30 lan-
guages that implement the same abstract syntax,
a shared syntactic API. The API provides over-
loaded constructors like
mkVP : V2 → NP → VP
mkVP : VP → Adv → VP
for building a verb phrase from a transitive verb
and an object noun phrase, or for attaching an ad-
verbial phrase to a verb phrase, etc. – all without
the need of specifying low-level details like inflec-
tional paradigms, syntactic agreement and word
order. These details are handled by the language-
specific resource grammars.
Note that the overloaded API constructors gen-
eralize over the actual functions of the abstract
syntax. The respective RGL functions of the above
given mkVP constructors are
ComplV2 : V2 → NP → VP
AdvVP : VP → Adv → VP
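The relation between an overloaded constructor and the underlying abstract syntax functions can be pictured as dispatch on argument categories. The following Python sketch only mimics that idea: it is not how GF resolves overloads internally, the table holds just the two signatures quoted above, and the representation of typed terms as (category, tree) pairs as well as the placeholder terms are assumptions made for illustration.

```python
# Overload table for mkVP, limited to the two signatures quoted in the text:
# argument categories -> underlying RGL abstract syntax function.
MKVP_OVERLOADS = {
    ("V2", "NP"): "ComplV2",
    ("VP", "Adv"): "AdvVP",
}

def mkVP(*args):
    """Resolve mkVP by the categories of its arguments.

    Each argument is a (category, tree) pair; the result is a VP whose tree
    applies the resolved abstract syntax function to the argument trees.
    """
    cats = tuple(cat for cat, _ in args)
    if cats not in MKVP_OVERLOADS:
        raise TypeError(f"no mkVP overload for {cats}")
    fun = MKVP_OVERLOADS[cats]
    return ("VP", (fun,) + tuple(tree for _, tree in args))

want = ("V2", "want_V2")
adventure = ("NP", "adventure_NP")   # placeholder standing in for a full NP tree
quickly = ("Adv", "quickly_Adv")     # hypothetical adverb, for illustration only

vp = mkVP(want, adventure)   # ("VP", ("ComplV2", "want_V2", "adventure_NP"))
vp2 = mkVP(vp, quickly)      # wraps the first VP in an AdvVP application
```

The caller writes `mkVP` in both cases; which abstract syntax function ends up in the tree depends only on the categories of the arguments.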
These constructors and functions are applied to
build ASTs. For instance, the sentence
“The boys want an adventure.”
is represented by the following AST w.r.t. RGL:
(PredVP
  (DetCN
    (DetQuant DefArt NumPl)
    (UseN boy_N))
  (ComplV2
    want_V2
    (DetCN
      (DetQuant IndefArt NumSg)
      (UseN adventure_N))))
3http://www.grammaticalframework.org
The respective API constructor application tree
is more general and simpler:
(mkCl
  (mkNP the_Quant pluralNum boy_N)
  (mkVP
    want_V2
    (mkNP a_Quant adventure_N)))
where want_V2, boy_N and adventure_N are
nullary lexical functions, while the_Quant and
a_Quant are predefined constructors for the function
words, and pluralNum is a parameter for selecting
the plural form of the noun. In GF, there is
no formal distinction between syntactic and lexi-
cal functions.
We map AMRs to ASTs by a sequence of
pattern-matching transformation rules, using RGL
API constructors as a convenient intermediate
layer. The language-specific linearization of the
acquired ASTs is already defined by the English
(or other language) resource grammar and lexicon.
3.2 AMR-to-AST transformation
The overall transformation process is as follows:
1. The input AMR is rewritten from the PENMAN
notation to the LISP-like bracketing tree
syntax by a simple parsing expression
grammar.
2. In case of a multi-sentence AMR, the graph is
split into two or more graphs to be processed
separately.
3. For each AMR graph represented as a LISP-
like tree, a sequence of tree pattern-matching
and transformation rules is applied, acquiring
a fully or partially converted AST constructor
application tree w.r.t. the API of RGL.
4. In case of a partially converted AST, the
pending subtrees are just pruned.4
5. The resulting ASTs are passed to the GF in-
terpreter for linearization.
4For the SemEval submission, we took the respective
JAMR-generated sentence instead, skipping the fifth step.
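The first step of the process above can be approximated in a few lines of Python. This is a sketch only: it uses a hand-written recursive-descent reader rather than a parsing expression grammar, it handles just the constructs that occur in Figure 1 (variables, concepts, roles, string constants and re-entrant variables), and the bracketing shape `(var (concept (:role child) ...))` is our reading of the patterns in Figure 3, not a documented format.

```python
import re

# Tokens: quoted strings, parentheses, or any other run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|[()]|[^\s()"]+')

def parse_penman(text):
    """Parse a PENMAN string into nested lists [var, [concept, [role, child], ...]]."""
    tokens = TOKEN.findall(text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "(" and tokens[pos + 2] == "/"
        var = tokens[pos + 1]
        concept = [tokens[pos + 3]]
        pos += 4
        while tokens[pos] != ")":
            role = tokens[pos]
            pos += 1
            if tokens[pos] == "(":
                child = node()
            else:
                child = tokens[pos]  # constant or re-entrant variable
                pos += 1
            concept.append([role, child])
        pos += 1  # consume ")"
        # A concept without roles is kept as a bare label, e.g. (b boy).
        return [var, concept if len(concept) > 1 else concept[0]]

    return node()

def to_brackets(tree):
    """Render the nested lists in LISP-like bracketing."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join(to_brackets(t) for t in tree) + ")"

amr = "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))"
print(to_brackets(parse_penman(amr)))
# (w (want-01 (:ARG0 (b boy)) (:ARG1 (g (go-02 (:ARG0 b))))))
```

Splitting a multi-sentence AMR (the second step) would then amount to taking the children of the top node as separate trees, which is not shown here.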
Inspired by Butler (2016), we use the Stan-
ford JavaNLP utilities Tregex and Tsurgeon (Levy
and Andrew, 2006) for the pilot implementation
of AMR-to-AST conversion.5 Our approach differs in
that we convert AMR graphs to abstract instead of
concrete syntax trees, and the choice of GF allows
for further multilingual text generation,
preserving grammatical and semantic accuracy.
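To give a feel for ordered tree rewriting, the sketch below re-implements two of the simpler rules from Figure 3 (the sixth, merging :op literals, and the tenth, deleting :wiki nodes) in plain Python over nested lists. It deliberately does not use Tregex or Tsurgeon; the function names and the list-based tree encoding are assumptions made for this illustration.

```python
def delete_wiki(tree):
    """Rule 10: drop every (:wiki ...) subtree."""
    if isinstance(tree, str):
        return tree
    return [delete_wiki(t) for t in tree
            if not (isinstance(t, list) and t and t[0] == ":wiki")]

def merge_ops(tree):
    """Rule 6: merge the :op1 .. :opn literals under a name node into one :op1."""
    if isinstance(tree, str):
        return tree
    tree = [merge_ops(t) for t in tree]
    if tree and tree[0] == "name":
        ops = [t for t in tree[1:]
               if isinstance(t, list) and t
               and isinstance(t[0], str) and t[0].startswith(":op")]
        rest = [t for t in tree[1:] if t not in ops]
        if ops:
            merged = " ".join(op[1].strip('"') for op in ops)
            return ["name", [":op1", f'"{merged}"']] + rest
    return tree

# Rules are applied one after another, as in an ordered ruleset.
RULES = [delete_wiki, merge_ops]

def transform(tree):
    for rule in RULES:
        tree = rule(tree)
    return tree

city = ["c", ["city",
              [":name", ["n", ["name",
                               [":op1", '"New"'],
                               [":op2", '"York"'],
                               [":op3", '"City"']]]],
              [":wiki", '"New_York_City"']]]
result = transform(city)
```

After both rules have run, `result` contains a single `:op1` literal `"New York City"` and no `:wiki` node, which is the shape the later name-related rules in Figure 3 expect.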
In the time frame of the SemEval task, we de-
fined around 200 transformation rules6 covering
many basic and advanced constructions used in
AMR.7 To illustrate the ruleset, let us consider the
AMR graph given in Figure 1, and its expected
AST in Figure 2, acquired by the ordered rules
outlined in Figure 3. For each rule, P denotes a
simplified Tregex pattern (a subtree to match), and
R denotes the resulting subtree – after Tsurgeon
operations like adjoin, move, relabel, delete have
been applied (omitted in Figure 3). Note that we
first slightly enrich the original AMR by adding
frame-specific semantic roles to ARG2..ARGn.
In our example, :ARG4 is rewritten to :ARG4-GOL,
based on PropBank. Semantic roles are used to
determine the preposition in a prepositional
phrase (see the ninth rule in Figure 3). Thus,
we get GOL_Prep for the NP under :ARG4 of
go-02, which, in the current prototype, is always
linearized as the preposition “to”. Although this
preposition fits our example, in general, other
prepositions can be used in the realization of GOL.
More elaborate post-processing is needed to
reconstruct the prepositions that are lost in the AMR
representation of the input sentence. Statistics
from the PropBank corpus (Palmer et al., 2005)
would be helpful to decide whether there is a dom-
inant frame-dependent preposition for the argu-
ment/role, or a dominant NP-dependent preposi-
tion, independently of the frame.
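How such statistics might drive the choice is sketched below in Python. The counts are toy values invented for the illustration and are not PropBank figures; the 0.6 dominance threshold, the table layout and the function name are likewise assumptions.

```python
from collections import Counter

# Toy counts for illustration only; these are NOT real PropBank statistics.
FRAME_ROLE_PREPS = {
    ("go-02", "ARG4-GOL"): Counter({"to": 90, "into": 7, "toward": 3}),
    ("arrive-01", "ARG4-GOL"): Counter({"at": 50, "in": 45, "to": 5}),
}

# Fallback preposition per semantic role, mirroring the prototype's GOL -> "to".
ROLE_DEFAULT = {"GOL": "to"}

def choose_preposition(frame, role, threshold=0.6):
    """Pick the dominant preposition for (frame, role) if one clearly dominates."""
    counts = FRAME_ROLE_PREPS.get((frame, role))
    if counts:
        prep, n = counts.most_common(1)[0]
        if n / sum(counts.values()) >= threshold:
            return prep
    # No dominant frame-dependent preposition: fall back to the role default.
    return ROLE_DEFAULT.get(role.split("-")[-1])
```

With the toy counts, go-02 has a clearly dominant preposition, whereas arrive-01 does not and falls back to the role default. An NP-dependent table could be consulted in the same way before that fallback.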
5https://github.com/GrammaticalFramework/gf-contrib/tree/master/AMR/AMR-to-text
6More precisely, we have defined 198 Tregex patterns
over AMR/AST trees. For each pattern, 2.5 Tsurgeon
transformation operations are defined on average (493 in total).
7Roughly estimating, the development of the current
ruleset took us less than two person months.
(w / want-01
    :ARG0 (b / boy)
    :ARG1 (g / go-02
        :ARG0 b
        :ARG4 (c / city
            :name (n / name
                :op1 "New"
                :op2 "York"
                :op3 "City")
            :wiki "New_York_City")))
Figure 1: An AMR representing the sentence “The
boys want to go to New York City”.
(mkText (mkUtt (mkS
  (mkCl
    (mkNP a_Quant (mkCN boy_N))
    (mkVP
      want_VV
      (mkVP
        (mkVP go_V)
        (mkAdv
          GOL_Prep
          (mkNP (mkPN "New York City")))))
))) fullStopPunct)
Figure 2: An AST acquired from the AMR given
in Figure 1. When linearized, the AST yields “A
boy wants to go to New York City” in English, or
“En pojke vill gå till New York City” in Swedish.
In addition to the regular AMR constructions,
we have defined a number of rules for the
treatment of the special frames have-org-role-91 and
have-rel-role-91. The special rules are applied
before the regular ones. We have also introduced
some post-editing rules over the resulting ASTs to
make the final linearization more fluent. For
instance, simple attributive relative clauses are
converted to adjective modifiers; e.g. “luck that is
good” gets converted to “good luck”. Similarly, it
would often be possible to convert general nouns
modified by simple verbal relative clauses into
more specific nouns, omitting the relative
clauses: “person that reports” to “reporter”,
“organization that governs” to “government”, etc.
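The two kinds of post-editing described above can be illustrated on a toy tree encoding. The Python sketch below is not the authors' ruleset: the tuple tags ("rel", "be", "verb", "mod") are hypothetical stand-ins for the corresponding AST fragments, and the noun table is seeded only with the two examples from the text.

```python
# Hypothetical lookup for "general noun + verbal relative clause" -> specific noun,
# seeded with the two examples from the text.
SPECIFIC_NOUN = {
    ("person", "report"): "reporter",
    ("organization", "govern"): "government",
}

def post_edit(node):
    """Simplify relative clauses in a toy tree of nested tuples.

    ("rel", noun, ("be", adj))  -> ("mod", adj, noun), e.g. "good luck"
    ("rel", noun, ("verb", v))  -> a specific noun, if the table has one
    """
    if not isinstance(node, tuple):
        return node
    # Rewrite bottom-up so that nested clauses are simplified first.
    node = tuple(post_edit(child) for child in node)
    if len(node) == 3 and node[0] == "rel" and isinstance(node[2], tuple):
        _, noun, clause = node
        if len(clause) == 2 and clause[0] == "be":
            return ("mod", clause[1], noun)
        if len(clause) == 2 and clause[0] == "verb" and (noun, clause[1]) in SPECIFIC_NOUN:
            return SPECIFIC_NOUN[(noun, clause[1])]
    return node
```

Trees that match neither pattern are returned unchanged, so the rule is safe to apply to every AST before linearization.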
Regarding the RGL lexicon, it contains more
than 60,000 lexical entries, providing a good cov-
erage for general purpose applications. To handle
out-of-vocabulary words and multi-word expres-
sions which most frequently are named entities
(e.g. “New York City”), we use low-level RGL
constructors to specify fixed strings.
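A minimal Python sketch of this fallback is given below. The three-entry lexicon is a stand-in for the RGL lexicon, the helper name is hypothetical, and the generated constructor strings simply follow the shapes shown in Figure 2.

```python
# Tiny stand-in for the RGL lexicon (the real one has more than 60,000 entries).
LEXICON = {"boy": "boy_N", "city": "city_N", "adventure": "adventure_N"}

def noun_phrase(words):
    """Build a constructor application string for a noun or a name.

    A known single noun uses its lexical function; anything else is passed
    as a fixed string, following the mkPN pattern shown in Figure 2.
    """
    phrase = " ".join(words)
    if len(words) == 1 and phrase in LEXICON:
        return f"(mkNP a_Quant (mkCN {LEXICON[phrase]}))"
    # Escape backslashes and quotes so the string literal stays well-formed.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'(mkNP (mkPN "{escaped}"))'
```

For example, `noun_phrase(["New", "York", "City"])` yields `(mkNP (mkPN "New York City"))`, whereas `noun_phrase(["boy"])` uses the lexical function `boy_N`.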
1. mkVP : VV → VP → VP
   P (frame[A] (:ARG1 (var frame[B])))
   R (want-01[A] (mkVP go-02[B]))
2. mkCl : VP → Cl
   P (var frame[A])
   R (mkCl (mkVP want-01[A]))
3. mkCl : NP → VP → Cl
   P (mkCl (mkVP (frame[A] :ARG0)))
   R (mkCl :ARG0 (mkVP want-01[A]))
4. mkNP : Quant → CN → NP
   P (:ARG0 (var concept[A]))
   R (mkNP a_Quant (mkCN (b boy[A])))
5. mkCN : N → CN
   P (mkCN (var concept[A]))
   R (mkCN boy_N[A])
6. Recursively merge :op1 .. :opn under :op1
   P (name (:op1 literal[A]) (:opi literal[B]))
   R (name (:op1 "New[A] York[A] City[B]"))
7. mkPN : Str → PN
   mkNP : PN → NP
   P (var (name (:op1 literal[A])))
   R (mkNP (mkPN "New York City"[A]))
8. Excise a node chain: var - NE type - :name
   P (var (type (:name mkNP)))
   R (mkNP)
9. mkVP : VP → Adv → VP
   mkAdv : Prep → NP → Adv
   P (mkVP (frame[A] (ARGn-role[B] mkNP)))
   R (mkVP (mkVP go-02[A])
       (mkAdv GOL_Prep[B] mkNP))
10. Ignore (delete) :wiki nodes.
11. Relabel frames, depending on their syntactic
    valence in the resulting AST. Thus,
    want-01 becomes want_VV (a verb-phrase-complement
    verb), in contrast to want_V2
    (see Section 3.1), and go-02 becomes just
    go_V (an intransitive verb).
Figure 3: A sample set of AMR-to-AST transformation
rules for converting the AMR in Figure 1
into the AST in Figure 2. P: pattern, R: result.
The terms in italic (P) refer to regular expressions.
Labels in square brackets ([A], [B]) link a node
matched in P to its occurrence in R.
3.3 JAMR Generator
A pre-trained JAMR generation model (Flanigan
et al., 2016) was used along with the provided
Gigaword corpus 4-grams.8 The JAMR authors reported
their original results on Gigaword corpus 5-grams
(Gigaword licence required), a setting known to
improve the BLEU score by approx. 1 point.
8https://github.com/jflanigan/jamr/tree/Generator
3.4 First results
By applying the transformation ruleset (see Sec-
tion 3.2), we were able to fully convert and lin-
earize 12.3% of the 1293 evaluation AMRs. Ad-
ditionally, we acquired partially transformed trees
and, consequently, partially generated sentences
for another 36% of the evaluation AMRs, but we
did not include those sentences in the final sub-
mission, since large subtrees often got pruned. In-
stead, we replaced them by sentences acquired by
JAMR Generator (see Section 3.3).
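The selection logic and the approximate counts behind these percentages can be written down as a short Python sketch. The function name is hypothetical, and because the reported shares are rounded, the derived counts are estimates rather than official figures.

```python
def select_output(gf_sentence, fully_converted, jamr_sentence):
    """Use the GF sentence only when the AMR was fully converted."""
    return gf_sentence if fully_converted and gf_sentence else jamr_sentence

total_amrs = 1293
fully_converted_share = 0.123      # reported share of fully converted AMRs
partially_converted_share = 0.36   # reported share of partially converted AMRs

# Estimated counts, derived from the rounded percentages.
gf_count = round(total_amrs * fully_converted_share)
partial_count = round(total_amrs * partially_converted_share)
jamr_count = total_amrs - gf_count  # everything not fully converted goes to JAMR

print(gf_count, partial_count, jamr_count)
```

Under this estimate, roughly 159 of the submitted sentences come from GF and the remaining 1134 or so from JAMR Generator, including the approximately 465 AMRs for which only a partial GF tree was available.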
With the combined JAMR and GF-based sys-
tem, we achieved Trueskill 107.2 and BLEU 18.82
in the preliminary official scoring. The key open
question is whether the 12.3% of sentences
generated by GF scored better or worse than the JAMR
output in the human-evaluated Trueskill ranking.
When the GF-generated sentences are BLEU-scored
separately from the JAMR-generated sentences, the
BLEU scores are 11.35 and 19.18, respectively.
This suggests that BLEU favors the corpus-driven
JAMR approach which tries to reproduce the orig-
inal input sentence, while the grammar-driven GF
approach sticks to AMR more literally, producing
a paraphrase of the input sentence. Therefore we
are primarily interested in Trueskill and less con-
cerned about BLEU.
4 Conclusion
The SemEval subtask on AMR-to-text generation
has given us confidence that the GF-based approach
is worth developing further. We are particularly
pleased to see that AMR equipped with GF
is emerging as a powerful interlingua for semantic
cross-lingual applications. Although it is difficult
to reach large coverage in the short term,
a grammar-based approach complemented with
statistics from AMR and PropBank-annotated corpora
is competitive with other approaches in the
longer term, at least for use cases where adequacy
is more important than fluency.
Acknowledgments
This work was supported in part by the Latvian
State research programmes SOPHIS and NexIT,
the H2020 project SUMMA (grant No. 688139),
and the European Regional Development Fund
grant No. 1.1.1.1/16/A/219. We also thank Aarne
Ranta, Alexandre Rademaker and Alastair Butler
for encouraging and helpful discussions.
References
Laura Banarescu, Claire Bonial, Shu Cai, Madalina
Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin
Knight, Philipp Koehn, Martha Palmer, and Nathan
Schneider. 2013. Abstract Meaning Representation
for Sembanking. In Proceedings of the 7th Linguis-
tic Annotation Workshop and Interoperability with
Discourse.
Guntis Barzdins and Didzis Gosko. 2016. RIGA at
SemEval-2016 Task 8: Impact of Smatch extensions
and character-level neural translation on AMR pars-
ing accuracy. In Proceedings of the 10th Interna-
tional Workshop on Semantic Evaluation.
Alastair Butler. 2016. Deterministic natural language
generation from meaning representations for ma-
chine translation. In Proceedings of the 2nd Work-
shop on Semantics-Driven Machine Translation.
Jeffrey Flanigan, Chris Dyer, Noah A. Smith, and
Jaime Carbonell. 2016. Generation from Abstract
Meaning Representation using Tree Transducers. In
Proceedings of the 2016 Conference of the North
American Chapter of the Association for Computa-
tional Linguistics: Human Language Technologies.
Normunds Gruzitis and Guntis Barzdins. 2016. The
role of CNL and AMR in scalable abstractive sum-
marization for multilingual media monitoring. In
Controlled Natural Language, Springer, volume
9767 of LNCS.
Roger Levy and Galen Andrew. 2006. Tregex and
Tsurgeon: Tools for querying and manipulating tree
data structures. In Proceedings of the 5th Interna-
tional Conference on Language Resources and Eval-
uation.
Jonathan May and Jay Priyadarshi. 2017. Semeval-
2017 task 9: Abstract meaning representation pars-
ing and generation. In Proceedings of the 11th In-
ternational Workshop on Semantic Evaluation.
Martha Palmer, Daniel Gildea, and Paul Kingsbury.
2005. The Proposition Bank: An Annotated Cor-
pus of Semantic Roles. Computational Linguistics
31(1).
Aarne Ranta. 2004. Grammatical Framework: A Type-
Theoretical Grammar Formalism. Journal of Func-
tional Programming 14(2).
Aarne Ranta. 2009. The GF Resource Grammar Li-
brary. Linguistic Issues in Language Technology
2(2).
Aarne Ranta. 2011. Grammatical Framework: Pro-
gramming with Multilingual Grammars. CSLI Pub-
lications, Stanford.