GSOC-Week-One

2020-06-08

Introduction

After our last meeting (we meet every Thursday), I went through several approaches to template expansion (paraphrase generation).

Template Expansion
We should distinguish Template Expansion (what we are focusing on) from Template Generation, i.e. producing completely new questions for which we would also need to build the corresponding SPARQL query.
For Template Expansion, the expected output is a paraphrase of the original template.

Paraphrases
Paraphrases are sequences that convey the same meaning using different wording. Paraphrasing exists at different levels of granularity: lexical, phrasal and sentential.

Although most papers do not publish their source code, I have listed those with an open-source implementation:

Title	Paper	Code source and Implementation
-	https://arxiv.org/pdf/1709.05074.pdf	https://github.com/kefirski/pytorch_RVAE
-	-	https://github.com/vsuthichai/paraphraser
Neural Syntactic Preordering for Controlled Paraphrase Generation	https://arxiv.org/pdf/2005.02013v1.pdf	https://github.com/tagoyal/sow-reap-paraphrasing
Decomposable Neural Paraphrase Generation	https://www.aclweb.org/anthology/P19-1332.pdf	-
Syntax-guided Controlled Generation of Paraphrases	https://arxiv.org/pdf/2005.08417v1.pdf	https://github.com/malllabiisc/SGCP
Paraphrase Generation with Latent Bag of Words	https://arxiv.org/pdf/2001.01941v1.pdf	https://github.com/FranxYao/dgm_latent_bow
T5 model	https://arxiv.org/pdf/2002.08910.pdf	https://github.com/ramsrigouthamg/Paraphrase-any-question-with-T5-Text-To-Text-Transfer-Transformer-

Of these, the last one, the T5 model (proposed by my mentor Tommaso), has the clearest implementation instructions and seems to match our needs, so we will start the Template Expansion task with it.

Pipeline of Template Expansion

Figure 1: A new question template is generated by our paraphrase model and, ideally, is matched with the same query template as the original.
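The idea in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `expand_template` and `stub_paraphraser` are hypothetical names, the stub returns fixed candidates in place of the real paraphrase model, and the query string is a made-up placeholder rather than one of our actual query templates.

```python
from typing import Callable, List, Tuple

# A template pairs a natural-language question with its query template.
Template = Tuple[str, str]


def expand_template(
    template: Template,
    paraphrase: Callable[[str], List[str]],
) -> List[Template]:
    """Pair each distinct paraphrase of the question with the unchanged query."""
    question, query = template
    expanded = []
    seen = {question.lower()}
    for candidate in paraphrase(question):
        key = candidate.lower()
        if key in seen:  # skip duplicates and the original question itself
            continue
        seen.add(key)
        expanded.append((candidate, query))
    return expanded


def stub_paraphraser(question: str) -> List[str]:
    # Stand-in for the paraphrase model: returns fixed candidates.
    return [
        "What is the date of birth of XYZ?",
        "When was XYZ born?",
        "When was XYZ born?",
    ]


original = (
    "When is the birth date of XYZ ?",
    "SELECT ?date WHERE { XYZ dbo:birthDate ?date }",  # hypothetical query
)
new_templates = expand_template(original, stub_paraphraser)
for question, query in new_templates:
    print(question, "->", query)
```

Every new question template reuses the original query unchanged, which is what separates Template Expansion from Template Generation.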

Evaluation of the quality

First of all, we should ensure that the paraphrase of a question template has the same meaning as the original. This can be checked with the Universal Sentence Encoder, which lets us compute sentence-level semantic similarity scores.
Here are some results of preliminary experiments with the pre-trained T5 model, using cosine similarity as the metric:

Original Question One:

When is the birth date of XYZ ?

Paraphrased questions, each followed by its cosine similarity to the original:

0: When did the birth date of XYZ begin?
0.93356544
1: Is XYZ born on June 8th?
0.76759416
2: What is the year XYZ?
0.7768826
3: What is the date of birth of XYZ?
0.9564296
4: When is XYZ birth date?
0.97022116
5: When did you date your XYZ birth?
0.7718135
6: What was XYZ & when was his birthday?
0.71993476
7: When was XYZ born?
0.7896207
8: What is birth date of xyz?
0.9210696
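For reference, each score above is the cosine similarity between the sentence embedding of a paraphrase and that of the original question. Below is a minimal sketch of that computation, assuming the embeddings are already available as plain lists of floats. The three-dimensional vectors are toy values for illustration, not real Universal Sentence Encoder outputs.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 3-dimensional vectors, NOT real sentence embeddings.
original = [1.0, 2.0, 3.0]
paraphrase = [2.0, 4.0, 6.0]  # same direction as the original
unrelated = [3.0, 0.0, -1.0]  # orthogonal to the original

print(cosine_similarity(original, paraphrase))
print(cosine_similarity(original, unrelated))
```

A value close to 1 means the two vectors point in nearly the same direction, and a value close to 0 means they are nearly orthogonal.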

Paraphrase #7 is arguably the best one because it is closest to everyday language. It also converts the nominal predicate (birth date) into a verbal predicate (was born). Interestingly, though, its cosine similarity is relatively low (0.7896207).

Consequently, I think we should have a second criterion for evaluating paraphrase quality. This second metric should let us check that the syntactic similarity is low, since we want question structures that differ substantially from the original. Note that we are not looking for synonyms, as those will be handled by replacing internal with global word embeddings.
This syntactic difference could be measured with Levenshtein distance or other metrics. My mentor Tommaso suggested using a part-of-speech (POS) tagging tool.

This POS tagging tool adds linguistic annotations as token attributes, so it may help us detect ‘nominal2verbal’ changes or the reverse.
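As a reference point for the Levenshtein option mentioned above, here is a minimal sketch of the distance computed over tokens rather than characters, so that it reflects changes in question structure. The `tokens` helper is a naive tokeniser assumed only for this example; a real pipeline would probably use a proper tokeniser.

```python
from typing import List, Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def tokens(question: str) -> List[str]:
    # Naive tokenisation: lowercase, detach the question mark, split on spaces.
    return question.lower().replace("?", " ?").split()


original = tokens("When is the birth date of XYZ ?")
close = tokens("What is the date of birth of XYZ?")
distant = tokens("When was XYZ born?")

print(levenshtein(original, close))
print(levenshtein(original, distant))
```

With this measure, a paraphrase that only reorders a few words gets a smaller distance than one that restructures the whole question, which is the behaviour we want from the second metric.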

Most importantly, our final goal is to improve the overall F-score on the QALD (Question Answering over Linked Data) benchmark.
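To make the target concrete, here is a minimal sketch of an F-score over answer sets. It assumes that precision, recall and F1 are computed per question and then averaged over all questions. The official QALD evaluation may treat edge cases (such as empty answer sets) differently, and the answer identifiers below are made up for illustration.

```python
from typing import List, Set, Tuple


def precision_recall_f1(gold: Set[str], predicted: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall and F1 for one question, over sets of answers."""
    if not gold and not predicted:
        # Assumption: correctly predicting an empty answer counts as perfect.
        return 1.0, 1.0, 1.0
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def macro_f1(pairs: List[Tuple[Set[str], Set[str]]]) -> float:
    """Average of the per-question F1 scores."""
    scores = [precision_recall_f1(gold, predicted)[2] for gold, predicted in pairs]
    return sum(scores) / len(scores)


# Hypothetical (gold, predicted) answer sets for three questions.
results = [
    ({"a", "b"}, {"a"}),  # one of two gold answers found
    ({"c"}, {"c"}),       # exact match
    ({"d"}, {"e"}),       # wrong answer
]
print(macro_f1(results))
```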

Plan for the Next Weeks

Since our focus remains the final metric on the QALD benchmark, setting up the benchmark and a baseline will be necessary in the coming weeks.
To build this benchmark, I also need to set up our own pipeline in the coming weeks. I will not create it from scratch; instead, I will use the work of my predecessor, Anand, as the base and add my Template Expansion pipeline to it.