In this work we investigate the performance of different sequence-to-sequence (seq2seq) architectures for transliteration. We begin our investigation with simple transliteration tasks based on artificially generated data, which readily demonstrate some of the key characteristics of the different seq2seq models. Specifically, we investigate seq2seq models based on LSTMs and Bi-LSTMs and show how they compare to the encoder-decoder architecture described in Cho et al. (2014). We then show the effect of adding an attention mechanism as described in Bahdanau et al. (2014). In the last part of the paper we visualize the attention matrix for examples that illustrate different aspects of the attention mechanism.
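The core of the Bahdanau et al. (2014) mechanism mentioned above is additive attention: at each decoding step, the decoder state is scored against every encoder state, and a softmax over the scores yields a weighted context vector. A minimal numpy sketch of that scoring step (shapes and parameter names here are illustrative, not from the paper's implementation):

```python
import numpy as np

def additive_attention(decoder_state, encoder_states, W_dec, W_enc, v):
    """Bahdanau-style additive attention (illustrative sketch).

    decoder_state:  (d,)           current decoder hidden state
    encoder_states: (T, d)         one hidden state per source position
    W_dec, W_enc:   (a, d)         learned projections (hypothetical shapes)
    v:              (a,)           learned scoring vector
    Returns (weights over T source positions, context vector of size d).
    """
    # score_t = v . tanh(W_dec @ h_dec + W_enc @ h_enc_t)
    scores = np.tanh(decoder_state @ W_dec.T + encoder_states @ W_enc.T) @ v  # (T,)
    # softmax over source positions (shifted for numerical stability)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # context vector: attention-weighted sum of encoder states
    context = weights @ encoder_states
    return weights, context

rng = np.random.default_rng(0)
T, d, a = 5, 8, 6
weights, context = additive_attention(
    rng.normal(size=d), rng.normal(size=(T, d)),
    rng.normal(size=(a, d)), rng.normal(size=(a, d)), rng.normal(size=a))
```

The weight vector produced here is exactly one row of the attention matrix visualized in the last part of the paper: one distribution over source positions per decoded output symbol.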
We investigate variants of a neural parser based on Recursive Convolutional Networks (rCNNs). We evaluate this parser on a toy problem: accepting valid sentences produced by a simple grammar. The toy problem can only be solved if the parser is able to detect long-distance dependencies. We develop an intuition for how the recursive application of grammar productions maps naturally onto the structure of an rCNN. We further show that feature reuse and vanishing/exploding gradients are hard problems that can be addressed by adding a gating logic to our convolutional operator. In the last part of the paper we show that the gating logic can also drastically improve the performance of a neural parser based on a binary tree topology.
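One common way to add gating to a convolutional operator, as discussed above, is to compute a second convolution whose sigmoid output multiplicatively scales the candidate features, so each output dimension can pass its input through largely unchanged. A minimal numpy sketch of such a gated 1-D convolution (the function name, shapes, and the specific GLU-style gating are assumptions for illustration, not the paper's exact operator):

```python
import numpy as np

def gated_conv1d(x, W_f, W_g):
    """1-D convolution with a multiplicative sigmoid gate (illustrative sketch).

    x:        (T, d_in)         input sequence
    W_f, W_g: (k, d_in, d_out)  filter weights for features and for the gate
    Output has length T - k + 1 (no padding).
    """
    k = W_f.shape[0]
    T = x.shape[0]
    out = []
    for t in range(T - k + 1):
        window = x[t:t + k]                       # (k, d_in)
        f = np.einsum('kd,kde->e', window, W_f)   # candidate features
        g = np.einsum('kd,kde->e', window, W_g)   # gate pre-activation
        # the sigmoid gate decides how much of each feature passes through
        out.append(f * (1.0 / (1.0 + np.exp(-g))))
    return np.stack(out)

rng = np.random.default_rng(1)
y = gated_conv1d(rng.normal(size=(7, 4)),   # sequence of 7 positions, 4 features
                 rng.normal(size=(3, 4, 5)),
                 rng.normal(size=(3, 4, 5)))
```

Because the gate's gradient path is multiplicative rather than squashed through a tanh at every recursive step, stacks of such operators tend to suffer less from vanishing gradients, which is the motivation for the gating logic described above.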
We investigate variants of recursive Convolutional Networks (rCNNs) for a relation classification task. We give a short intuition for why rCNNs are well suited to tasks that depend on structural patterns. We then show that we can improve performance substantially by adding a gating logic to our convolutional architecture. The task is taken from exercise 7 of the Deep Learning master course at CIS-LMU Munich, WS 2016/17.
In this work we give an overview of recent work on relation classification with convolutional neural networks (CNNs). We first give a general overview of convolutional architectures and then look at the work of Santos et al. (2015). We give a short description of the shared task SemEval-2010 Task 8, see Hendrickx et al. (2009). We then discuss the paper of Strubell et al. (2017) and present a new convolutional architecture for relation classification based on the architecture in their paper. In the last part we describe the task from exercise 7 of the Deep Learning master course at CIS-LMU Munich, WS 2016/17, and run experiments with our architecture on this task. Finally we compare the results.