
Add a simple Question Answering notebook with Haystack #12


Merged: 1 commit merged into redhat-et:master from haystack-experiment, Mar 9, 2023

Conversation

@codificat (Member) commented Feb 8, 2023

In #9 we are exploring various QA systems.

This PR provides a simple experiment with Extractive and Generative QA using Haystack.
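For readers landing here without the notebook, a minimal sketch of what such an extractive QA experiment can look like with Haystack v1.x (the toy document and parameters are illustrative, not taken from the notebook):

```python
# Minimal extractive QA sketch (Haystack v1.x); the notebook indexes the real
# ROSA documents, while this toy example indexes a single sentence.
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline
from haystack.utils import print_answers

document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents([
    {"content": "ROSA stands for Red Hat OpenShift Service on AWS.",
     "meta": {"name": "sample.md"}},
])

retriever = BM25Retriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
pipeline = ExtractiveQAPipeline(reader, retriever)

prediction = pipeline.run(
    query="What does ROSA stand for?",
    params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 3}},
)
print_answers(prediction, details="minimum")
```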

@codificat requested a review from durandom as a code owner, February 8, 2023 20:25
@review-notebook-app commented: Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@codificat (Member, Author) commented:

NOTE: this PR also includes the sample dataset from #11 (same commit) in order to have data to work on.

@Shreyanand (Member) commented on the notebook, Feb 8, 2023:

Great start! The adversarial example was a good addition :D Haystack makes implementing the retriever-reader approach quite intuitive. For the source of the context, there should be a way to point back to the file; it seems to be a simple string-search problem...

For the PR, I think it has the commits from the file-addition PR as well. I guess if we separate them, then this one can be merged independently.

Looking forward to the generative example from Haystack as well!
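On pointing answers back to their source file: one possibility (a hedged sketch, not necessarily what the notebook does) is to attach the file name as document metadata at indexing time, since Haystack propagates a document's meta to the answers extracted from it. This reuses the document_store and pipeline objects from the sketch above:

```python
# Hypothetical follow-up to the earlier sketch: document_store and pipeline
# are assumed to be the objects created there.
from haystack import Document

document_store.write_documents([
    Document(
        content="ROSA clusters can be deployed into an existing AWS VPC.",
        meta={"name": "rosa-networking.md"},  # carried through to the answers
    )
])

prediction = pipeline.run(query="Can ROSA use an existing VPC?")
for answer in prediction["answers"]:
    # Each Answer keeps the meta of the document it was extracted from,
    # which points back to the source file.
    print(answer.answer, "->", answer.meta.get("name"))
```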



@suppathak (Collaborator) commented on the notebook, Feb 9, 2023:

Line #24: details="minimum"  ## Choose from minimum, medium, and all

Looks great!! Thanks, Pep.

I was looking at some of the contexts, and I have some observations and suggestions:

  1. Does choosing "all" mean the whole document is used?
  2. I also observed that some of the questions got included in the context along with the answers. What if we separate the questions from the answers in the corpus text, and train our model on text containing only the answers and not the questions? Let me know what you think. I will also try to apply it in my QA model testing work.


@codificat (Member, Author) replied:

"all" means: show all the details about each answer (e.g. score, offset, ...), vs. "minimum", which only shows the answer and context. I thought "minimum" would be enough here, but I'm happy to change it to show all the details.

About context that includes questions: yes, that comes from some of the source documents, which do include questions. Most notably, the FAQ in the ROSA workshop: https://www.rosaworkshop.io/rosa/14-faq/

If we try to pre-process the content to separate questions from answers, I'm not sure how successful that would be in general, because most of the answers need their respective questions in order to make sense (extreme case: there are answers that are just "Yes"). But for that document in particular (the FAQ) it might make sense to use it for the "squad-like" test... only I have not verified whether all the answers there can be obtained from other documents.
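For reference, the details levels being discussed map roughly like this (Haystack v1.x; prediction stands for the output of a pipeline.run() call, as in the sketch earlier in this thread):

```python
# Illustration of the `details` levels discussed above (Haystack v1.x).
# `prediction` is assumed to come from an ExtractiveQAPipeline.run() call.
from haystack.utils import print_answers

print_answers(prediction, details="minimum")  # answer text + its context only
print_answers(prediction, details="medium")   # also includes the relevance score
print_answers(prediction, details="all")      # full Answer objects: score, offsets, meta, ...
```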

@codificat marked this pull request as a draft, February 15, 2023 11:45
@codificat (Member, Author) commented:

Converted this PR to a draft while I work on expanding it with a Generative QA approach.

@codificat force-pushed the haystack-experiment branch 2 times, most recently from d3100d1 to 56a6b2a, February 22, 2023 22:37
@codificat (Member, Author) commented:

Updated with the current version, which adds 3 generative QA types: RAG, LFQA, and OpenAI-based.

The context now includes the full ROSA docs (plus the ROSA workshop and the MOBB material in the data/external samples).

Results are not great; I'm still trying to see if they can be improved a bit. I also need to elaborate/document.
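As a rough illustration of the LFQA flavor mentioned above, a hedged sketch assuming Haystack v1.x; the bart_lfqa generator is named later in this thread, while the sentence-transformers embedding model here is an assumption rather than the notebook's actual choice:

```python
# Hedged sketch of a long-form generative QA (LFQA) pipeline (Haystack v1.x).
# The embedding model is an assumption; only vblagoje/bart_lfqa is named in
# this conversation.
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import EmbeddingRetriever, Seq2SeqGenerator
from haystack.pipelines import GenerativeQAPipeline

document_store = InMemoryDocumentStore(embedding_dim=768)
document_store.write_documents([
    {"content": "ROSA stands for Red Hat OpenShift Service on AWS."},
])

retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
)
document_store.update_embeddings(retriever)  # compute dense vectors for all docs

generator = Seq2SeqGenerator(model_name_or_path="vblagoje/bart_lfqa")
pipeline = GenerativeQAPipeline(generator, retriever)

result = pipeline.run(
    query="What does ROSA stand for?",
    params={"Retriever": {"top_k": 3}},
)
print(result["answers"][0].answer)
```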

@codificat marked this pull request as ready for review, February 23, 2023 22:23
@codificat (Member, Author) commented:

Ok, I believe this is ready for another review.

The RAG version is not working well, for some reason that has so far escaped me. @Shreyanand @suppathak, if you have suggestions, especially on that part, they would be most welcome.

I have added the retrieval of the whole ROSA docs from S3 storage (sketched below), and these docs, together with the in-repo samples (ROSA workshop and MOBB), are used for context.

There are now more comments/docs, and the structure has also been updated, hopefully making it easier to follow.
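For context, fetching a set of documents from S3 can be as simple as the following sketch (bucket, prefix and endpoint are placeholders; the notebook's actual retrieval code may differ):

```python
# Hypothetical sketch of pulling the ROSA docs down from S3 with boto3.
# Bucket, prefix and endpoint are placeholders, not the notebook's values.
import os
import boto3

s3 = boto3.client("s3", endpoint_url=os.environ.get("S3_ENDPOINT_URL"))
bucket = "example-bucket"   # placeholder
prefix = "rosa-docs/"       # placeholder
target_dir = "data/external/rosa-docs"
os.makedirs(target_dir, exist_ok=True)

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        filename = os.path.basename(obj["Key"])
        if filename:  # skip "directory" placeholder keys
            s3.download_file(bucket, obj["Key"], os.path.join(target_dir, filename))
```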

@Shreyanand (Member) commented on the notebook, Feb 27, 2023:

One way to inspect the model would be to observe what the retriever is fetching. Maybe it's not providing enough context to the generator. What happens if we tweak the retriever model parameters?

Also, another possible explanation could be that these embedding models may not be advanced enough to capture language constructs, leading to poor answers.



@Shreyanand (Member) commented on the notebook, Feb 27, 2023:

In essence, this notebook compares the free and open-source model bart_lfqa and the paid OpenAI model text-davinci-003 for the long-form generative QA task.

For the extractive QA task it tries roberta-base-squad2. For all of these experiments, the retriever is BM25.

Also, it has a separate experiment combining DPR as the retriever and facebook/rag-token-nq as the generator model.

Could we add this classification to the summary/conclusion? Once we have a validation dataset and metrics defined, we can add the results of these experiments based on those metrics as well.
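For the record, the DPR plus RAG combination summarized above would look roughly like this in Haystack v1.x (a hedged sketch using the model names from this comment; per the Mar 3 update further down, this generator was eventually removed because it did not work):

```python
# Hedged sketch of the DPR retriever + RAG generator combination (Haystack v1.x;
# RAGenerator was deprecated and later removed from Haystack).
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import DensePassageRetriever, RAGenerator
from haystack.pipelines import GenerativeQAPipeline

document_store = InMemoryDocumentStore(embedding_dim=768)
document_store.write_documents([
    {"content": "ROSA stands for Red Hat OpenShift Service on AWS."},
])

retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
)
document_store.update_embeddings(retriever)

generator = RAGenerator(model_name_or_path="facebook/rag-token-nq")
pipeline = GenerativeQAPipeline(generator, retriever)
result = pipeline.run(query="What does ROSA stand for?")
```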



@codificat (Member, Author) replied:

> In essence, this notebook compares the free and open-source model bart_lfqa and the paid OpenAI model text-davinci-003 for the long-form generative QA task.

While it does have a free model for LFQA and an OpenAI version, the main goal of the notebook is to explore Haystack as a framework.

> For all of these experiments, the retriever is BM25.

Dense vectors are used for generative QA, and a dense retriever accordingly. I expanded the introduction a bit to hopefully explain that (although I am not getting into details; should I?)

@Shreyanand (Member) replied:

I think the detail level is appropriate now.

> While it does have a free model for LFQA and an OpenAI version, the main goal of the notebook is to explore Haystack as a framework.

That makes sense.

Signed-off-by: Pep Turró Mauri <[email protected]>
@codificat force-pushed the haystack-experiment branch from f1db9ea to 0f82abb, March 3, 2023 17:21
@codificat (Member, Author) commented Mar 3, 2023:

Another update:

  • I have now removed the RAG generator test: it does not work, and deepset plans to remove the RAG generator tutorial.

  • I found a problem with the Markdown pre-processor; I mention it in "fixme" notes in the notebook. I am inclined NOT to try to fix these in this notebook, though: the purpose of this PR is to review Haystack as a framework, and if there are issues with the Markdown pre-processor, they should be mentioned, right? (The sketch below shows the conversion step in question.)
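For reference, the Markdown conversion step in question generally follows this shape in Haystack v1.x (a sketch; the paths and splitting parameters are illustrative assumptions, and the specific "fixme" issues live in the notebook, not here):

```python
# Sketch of Markdown conversion + pre-processing (Haystack v1.x).
# Paths and split parameters are illustrative assumptions.
from pathlib import Path
from haystack.nodes import MarkdownConverter, PreProcessor

converter = MarkdownConverter()
docs = []
for md_file in Path("data/external").glob("**/*.md"):
    # Attach the file name so answers can point back to their source.
    docs.extend(converter.convert(file_path=md_file, meta={"name": md_file.name}))

preprocessor = PreProcessor(
    split_by="word",
    split_length=200,
    split_overlap=20,
    split_respect_sentence_boundary=True,
)
docs = preprocessor.process(docs)
```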

@Shreyanand (Member) left a comment:

/lgtm

@codificat removed the request for review from durandom, March 7, 2023 17:27
@Shreyanand merged commit 5076c6f into redhat-et:master, Mar 9, 2023
@codificat deleted the haystack-experiment branch, March 10, 2023 20:29