Summary of potential issues (20220608) #253

Open
pfliu-nlp opened this issue Jun 9, 2022 · 1 comment

pfliu-nlp commented Jun 9, 2022

1. cmrc2019

  • dataset = load_dataset("cmrc2019")
  • task_type: cloze-multiple-choice
  • example: load_dataset("gaokao2018_np1", "cloze-multiple-choice")

2. dureader_yesno

  • should the answer column be named answers?
  • we should introduce context as a column
  • is this really qa_extractive? it looks more like qa_multiple_choice or qa_bool
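from dataclasses import dataclass

# register_task, TaskType, and QuestionAnswering come from the project's task definitions.
@register_task(TaskType.qa_bool_dureader)
@dataclass
class QuestionAnsweringBoolDureader(QuestionAnswering):
    task: TaskType = TaskType.qa_bool_dureader
    question_column: str = "question"
    context_column: str = "documents"
    answers_column: str = "answers"
    # Each answers entry would look like: {"text": "xxx", "yesno_answer": "Yes"}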

3. dureader_search

  • the task is qa_extractive, but context_column = "documents" is not a string
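
One possible workaround, sketched under the assumption that `documents` is a list of plain strings (its actual structure is not stated here), is to join the documents into a single context string and remember where each one starts. The helper name `join_documents` is hypothetical and not part of the project.

```python
from typing import List, Tuple


def join_documents(documents: List[str], sep: str = "\n") -> Tuple[str, List[int]]:
    """Concatenate documents into one context string.

    Also returns the character offset at which each document starts, so that
    per-document answer offsets can be shifted into the joined context.
    """
    offsets = []
    position = 0
    for doc in documents:
        offsets.append(position)
        position += len(doc) + len(sep)
    return sep.join(documents), offsets


# Hypothetical example: two short documents.
context, starts = join_documents(["first document", "second document"])
print(context)
print(starts)
```

An answer that starts at offset `k` inside document `i` would then start at `starts[i] + k` in the joined context.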

4. ckbqa

  • this dataset could be broken down into two tasks:
    • qa_open_domain: question_column, answers_column
    • text_to_sql: question_column, sql_column
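
A minimal sketch of the split, assuming each record is a plain dict and that the columns are literally named `question`, `answers`, and `sql` (these names and the sample record are illustrative assumptions, not taken from the dataset):

```python
from typing import Any, Dict, Iterable, List


def project_columns(
    records: Iterable[Dict[str, Any]], columns: List[str]
) -> List[Dict[str, Any]]:
    """Keep only the given columns of every record."""
    return [{column: record[column] for column in columns} for record in records]


# Hypothetical record; the real column names and values may differ.
records = [
    {"question": "q1", "answers": ["a1"], "sql": "SELECT a FROM t"},
]

# One view per task, both derived from the same underlying records.
qa_open_domain_view = project_columns(records, ["question", "answers"])
text_to_sql_view = project_columns(records, ["question", "sql"])
print(qa_open_domain_view)
print(text_to_sql_view)
```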

5. coqa

  • Similar to the above one

6. dureader_robust

current: answers = {"text": answer_text, "answer_start": answer_start}
(1) answers = [{"text": answer_text, "answer_start": answer_start}]
(2) answers = {"text": [answer_text], "answer_start": [answer_start]}
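
If option (2) were chosen, the conversion could look roughly like the sketch below, which assumes each record stores a single answer as scalar values. The helper name `to_list_style` and the sample record are hypothetical.

```python
from typing import Any, Dict, List


def to_list_style(answers: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Wrap scalar answer fields in lists; copy fields that are already lists."""
    normalized = {}
    for key in ("text", "answer_start"):
        value = answers[key]
        if isinstance(value, (list, tuple)):
            normalized[key] = list(value)
        else:
            normalized[key] = [value]
    return normalized


# Hypothetical record in the scalar layout.
scalar_answers = {"text": "some answer", "answer_start": 12}
print(to_list_style(scalar_answers))
```

Because list-valued fields pass through unchanged, the helper can be applied safely to records that are already in layout (2).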

7. ccpm

8. cail2019

  • Similar to the above one

ccks2019_fin

  • should the event type also be regarded as an input?

ccks2020_fin_ee

ccks2021_fin_ea

  • it seems that the schema of arguments differs from the one above, so we probably need to adjust the task name slightly
  • vs. 2020 (ccks2020_fin_ee): should we define a new task schema for ccks2021_fin_ea?

ccks2021_fin_re

  • it seems that the relation schema is fairly complicated; should we modify the task name event_relation_extraction?
@pfliu-nlp

STSB

  • should the label type of text_similarity be float?

cail2018

  • if a config name contains a space, we can replace it with "-" or "_"
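
A small sketch of that replacement; the function name `sanitize_config_name` and the sample config names are made up for illustration.

```python
import re


def sanitize_config_name(name: str, replacement: str = "_") -> str:
    """Replace each run of whitespace in a config name with `replacement`."""
    return re.sub(r"\s+", replacement, name.strip())


# Hypothetical config names.
print(sanitize_config_name("first stage"))
print(sanitize_config_name("first stage", replacement="-"))
```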

ubuntu_dialogs_corpus

  • it seems we haven't defined a task_template for this task?
