Week 3 and 4

--

(21st June — 4th July)

Hello, Hii, hola!!

The last two weeks were enthralling. We started by separating out the failed samples from our datasets (e.g. the Falcon dataset), where each sample contains a question, the benchmark entities, the predicted entities, and their precision and recall scores. A sample counts as failed when both its precision and recall are 0, and those are the ones we filtered out.
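As a rough sketch of that filtering step (the file and column names here are hypothetical; the real benchmark files are linked at the end of the post):

import pandas as pd

# Hypothetical file/column names: each row holds a question, the benchmark
# entities, the entities predicted by Falcon, and per-sample precision/recall.
falcon = pd.read_csv("falcon_benchmark.csv")

# A sample counts as "failed" when both precision and recall are 0.
failed_falcon = falcon[(falcon["precision"] == 0) & (falcon["recall"] == 0)]
failed_falcon.to_csv("failed_falcon.csv", index=False)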

After filtering the failed samples (e.g. Failed Falcon), I started looking into them and noticed a few categories where Falcon was producing wrong entities:

1. Falcon identifies the probable entity mention correctly, but during entity disambiguation it fetches the wrong entity. This usually happens with acronyms and with other entities whose labels contain the extracted mention's words.

Eg-

question — “Count all the different purposes followed by the different NGOs.”
Benchmarked Entity- ['http://dbpedia.org/resource/Non-governmental_organization']
Entity by Falcon — ['http://dbpedia.org/resource/Ngosini_West']

2. A large mismatch between the entities given in the benchmark answers and the entities actually mentioned in the question.

Eg-

question — “In how many places have Irishmen died?”
Benchmarked Entity- ['http://dbpedia.org/resource/Ireland']
Entity by Falcon- ['http://dbpedia.org/resource/William_Tennant_(United_Irishmen)']

There were a few more examples and categories for both Falcon and DBpedia Spotlight; listing them all here would make the blog too long, so we keep a Google doc containing the problems and the examples where these approaches fail. A Google doc was also preferred so that all the mentors can review and comment on the same document.

While this analysis of both approaches was under review, I tried a new approach for entity linking that uses DBpedia Lookup instead of the DBpedia SPARQL endpoint, together with a new formula for disambiguation. Scores for this approach:

precision score: 0.7232011747430249
recall score: 0.7230962869729389
F-Measure score: 0.7231487270546688

Here DBpedia Lookup takes a candidate mention directly, like “Winston Churchill”, and returns an XML response, which we convert into JSON to obtain the list of candidates related to that mention; we then match and select one entity from the returned candidates. For the matching/disambiguation, we find the intersection of words between the question and the candidate entity and divide it by the number of words in the entity, i.e. (intersection of words / length of entity).
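A minimal sketch of the lookup step is below. The endpoint, parameter names, and XML layout are my assumptions about the classic DBpedia Lookup KeywordSearch service; the actual code is linked further down.

import requests
import xml.etree.ElementTree as ET

# Assumed endpoint of the classic DBpedia Lookup KeywordSearch service.
LOOKUP_URL = "https://lookup.dbpedia.org/api/search/KeywordSearch"

def lookup_candidates(mention, max_hits=10):
    # Ask DBpedia Lookup for candidate entities matching the surface form.
    resp = requests.get(
        LOOKUP_URL,
        params={"QueryString": mention, "MaxHits": max_hits},
        headers={"Accept": "application/xml"},
    )
    resp.raise_for_status()
    root = ET.fromstring(resp.text)
    candidates = []
    # Tags may carry an XML namespace, so match on the local tag name only.
    for result in root:
        if not result.tag.endswith("Result"):
            continue
        label = uri = None
        for child in result:
            if child.tag.endswith("Label"):
                label = child.text
            elif child.tag.endswith("URI"):
                uri = child.text
        if label and uri:
            candidates.append((label, uri))
    return candidates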

Eg-

question — “Was winston churchill the prime minister of Selwyn Lloyd”
So the candidates, their intersection counts, and their scores for the token “winston churchill” will be (score = intersection / entity length):
[‘winston’, ‘churchill’] : 2 : 1.0
[‘winston’, ‘churchill’, ‘range’] : 2 : 0.6666666666666666
[‘winston’, ‘churchill’, ‘(novelist)’] : 2 : 0.6666666666666666

If the score is the same for more than one candidate, we select the one with the maximum number of intersections with the question. This disambiguation approach is one of the main reasons for the increase in scores. It is inspired by the Jaccard index, where score = intersection / (question length + entity length - intersection), but ours is a bit different: here the intersection of words is divided by the entity length only.
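A minimal sketch of that scoring and tie-breaking (the tokenisation here, a plain lowercase split, is an assumption on my side):

def disambiguation_score(question, candidate_label):
    # Our score: |question words ∩ candidate words| / |candidate words|.
    q_tokens = set(question.lower().split())
    c_tokens = set(candidate_label.lower().replace("_", " ").split())
    overlap = len(q_tokens & c_tokens)
    score = overlap / len(c_tokens) if c_tokens else 0.0
    # Return the raw overlap too, so ties on the score fall back to it.
    return score, overlap

def pick_entity(question, candidates):
    # candidates is a list of (label, uri) pairs, e.g. from lookup_candidates().
    return max(candidates, key=lambda cand: disambiguation_score(question, cand[0]))

For the “winston churchill” token above, the bare label scores 2/2 = 1.0 while the three-word candidates score 2/3, so the bare label is selected.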

Here is the CODE and DATASET for the above-mentioned approach.

All of this is properly documented with the dataset, code, and observations here — GitHub, and all the problems and samples can be found in this Google doc.

What’s for next week?

For the next week, we’re going to look into our new approach and see where it fails. We will also try to implement the same lookup approach but with the Jaccard formula for disambiguation and see how it’s different from the last one.
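For reference, that Jaccard variant would look roughly like this (same assumed tokenisation as in the sketch above):

def jaccard_score(question, candidate_label):
    # Standard Jaccard: intersection / (question length + entity length - intersection).
    q_tokens = set(question.lower().split())
    c_tokens = set(candidate_label.lower().replace("_", " ").split())
    overlap = len(q_tokens & c_tokens)
    union = len(q_tokens) + len(c_tokens) - overlap
    return overlap / union if union else 0.0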

This was all for weeks 3 and 4. You can find the LC-QuAD benchmarked dataset, code, and observations here — benchmarks/LC-QuAD.

See you next week !!

STAY TUNED!🙌.
