The Coding Period Begins!

Weeks 1 and 2 (7th to 20th June)

Hey there!😄

In the previous blog, I mentioned the problem with the entity linking method. After a discussion with my mentors, I got to know about DBpedia Spotlight and Falcon. Both provide an API to annotate the entities in a sentence, and Falcon has a better F-score than Spotlight. For more insights, you can find the referenced paper here.
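
To make the "annotate via an API" part concrete, here is a minimal Python sketch that calls the public DBpedia Spotlight REST endpoint. The helper name and the 0.5 confidence threshold are my own illustrative choices, not the project's exact code:

```python
# Minimal sketch: annotate a question with the public DBpedia Spotlight REST API.
# The confidence threshold and helper name are illustrative assumptions.
import requests

SPOTLIGHT_URL = "https://api.dbpedia-spotlight.org/en/annotate"

def spotlight_entities(text, confidence=0.5):
    """Return (surface form, DBpedia URI) pairs that Spotlight finds in `text`."""
    response = requests.get(
        SPOTLIGHT_URL,
        params={"text": text, "confidence": confidence},
        headers={"Accept": "application/json"},
        timeout=10,
    )
    response.raise_for_status()
    return [(r["@surfaceForm"], r["@URI"])
            for r in response.json().get("Resources", [])]

print(spotlight_entities("Who is the mayor of Berlin?"))
```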

By this point, my mentors had suggested that I use API + entity disambiguation for better results. So, what is entity disambiguation?

In entity disambiguation, we take the entities extracted by the API, break them down into tokens, and validate each probable entity against DBpedia with a label search. For a given token, the query looks like `SELECT ?uri ?label WHERE { ?uri rdfs:label ?label . ?label bif:contains "'<token>'" } LIMIT 100`, where `<token>` is replaced by the candidate token.
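
As a rough illustration (not the exact webhook code), this lookup could be run against the public DBpedia SPARQL endpoint with SPARQLWrapper, substituting each candidate token into the query above:

```python
# Sketch of the disambiguation lookup: search DBpedia labels for a candidate
# token with Virtuoso's bif:contains predicate. The endpoint and helper name
# are assumptions for illustration.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"

def candidate_uris(token, limit=100):
    """Return (URI, label) pairs whose rdfs:label contains `token`."""
    query = f"""
        SELECT DISTINCT ?uri ?label WHERE {{
            ?uri rdfs:label ?label .
            ?label bif:contains "'{token}'"
        }} LIMIT {limit}
    """
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    bindings = sparql.query().convert()["results"]["bindings"]
    return [(b["uri"]["value"], b["label"]["value"]) for b in bindings]

# e.g. candidate_uris("Obama") returns DBpedia resources whose label mentions "Obama"
```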

If the API fails, we fall back to comparing the question the user asked with the question Dialogflow matched it to, in order to get probable entities. Then, we validate them using entity disambiguation.
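
One simple way to get those probable entities (my reading of the fallback, not necessarily the exact implementation) is to keep the words of the user's question that do not occur in the matched Dialogflow question:

```python
# Sketch of the fallback: treat words unique to the user's question as
# probable entity tokens, to be validated by the disambiguation query.
# The tokenisation and matching are simplified assumptions.
import string

def probable_entities(user_question, matched_question):
    """Words in the user's question that the matched question does not contain."""
    def tokens(text):
        cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
        return cleaned.split()

    matched = set(tokens(matched_question))
    return [tok for tok in tokens(user_question) if tok not in matched]

# probable_entities("Who is the spouse of Barack Obama?",
#                   "Who is the spouse of Angela Merkel?")
# -> ['barack', 'obama']
```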

This approach is quite efficient, but the main problems are the accuracy of the API and time. Dialogflow requires the webhook to return a response within a certain amount of time, otherwise the webhook call fails. So we cannot use Falcon, which has a very good F-score but takes a long time. Therefore, we use Spotlight + entity disambiguation.
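
Inside the webhook, that trade-off roughly translates into giving the external linker a hard per-request timeout and switching to the fallback path when it cannot answer in time. A sketch, where the time budget and function names are assumptions for illustration:

```python
# Sketch of the webhook-side trade-off: give the external linker a hard
# per-request time budget and fall back when it cannot answer in time.
import requests

TIME_BUDGET_SECONDS = 3  # keep the total fulfillment under Dialogflow's deadline (assumed value)

def link_or_fallback(question, linker_url):
    """Try the fast API-based linker; return None to trigger the fallback path."""
    try:
        response = requests.get(
            linker_url,
            params={"text": question},
            headers={"Accept": "application/json"},
            timeout=TIME_BUDGET_SECONDS,
        )
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException:
        return None  # caller switches to extraction + entity disambiguation
```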

So, considering this problem, I got a task to benchmark LC-QuAD using API + entity disambiguation, where I compared the benchmark answers from the LC-QuAD v1 dataset against:

  1. DBpedia Spotlight annotations
  2. Spotlight + extraction + entity disambiguation
  3. Falcon annotations
  4. Extraction + entity disambiguation

Spotlight had an F-score of 0.558, Spotlight + extraction + entity disambiguation had an F-score of 0.599, whereas Falcon had an F-score of 0.829. Afterwards, I observed that extraction + entity disambiguation on its own could also perform well, so I implemented that as well; it had an F-score of 0.639. This shows that even with single words as probable entities, entity disambiguation can give correct results.
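
To show what these numbers measure, here is a small sketch of an entity-level F-score over gold and predicted URI sets. The micro-averaging here is my assumption; the actual benchmark script may aggregate per question instead:

```python
# Sketch of an entity-level F-score: compare predicted URI sets with the gold
# URIs per question and micro-average. The aggregation is an assumption.
def entity_f_score(gold_sets, predicted_sets):
    tp = fp = fn = 0
    for gold, predicted in zip(gold_sets, predicted_sets):
        tp += len(gold & predicted)   # correctly linked entities
        fp += len(predicted - gold)   # extra (wrong) annotations hurt precision
        fn += len(gold - predicted)   # missed entities hurt recall
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# e.g. entity_f_score([{"dbr:Barack_Obama"}],
#                     [{"dbr:Barack_Obama", "dbr:Obama_(surname)"}])  ->  0.666...
```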

After all these implementations, my observation was that Falcon had a tendency to over-annotate: it annotated extra, incorrect entities, which reduced the precision score and thus the overall score. Also, Falcon took 13 hours to annotate 5,000 short questions, which is not very time-efficient. For DBpedia Spotlight, I noticed that it missed some easy entities due to incomplete entity names in the questions.

All of these approaches are documented with the dataset, code, and observations here — GitHub.

So, what’s coming next?

For the next week, we’re going to look into the benchmarked dataset, separate out the samples for which the above approaches did not work, and figure out why they failed.

This was all for weeks 1 and 2. You can find the updated Dialogflow webhook code here — DBpedia-LiveNeural-Chatbot, and the LC-QuAD benchmarked dataset, code, and observations here — benchmarks/LC-QuAD.

See you next week!!

Till then, STAY TUNED!🙌

Thank You!!✨✨
