TODOs after refactoring task schema #202

Open
1 of 6 tasks
pfliu-nlp opened this issue May 14, 2022 · 0 comments
pfliu-nlp commented May 14, 2022

Core file: dataset_info.jsonl

Latest version

TODO Items

  • So far, almost all ERRORs come from Google Drive links, which work only intermittently. We can move these files to S3 gradually. Since most of them belong to summarization tasks, maybe @yixinL7 and @xcfcode could help out with this part.
  • Languages should be added for several datasets.

Some other follow-up things that should be done after task refactoring:

  • IMPORTANT: update get_dataset_info.py and dataset_info.json, and make sure the result can be applied to the explainaboard_web db: Post-refactoring (update get_dataset_info & dataset scripts) #203
  • update docs for newly-introduced task schema.
  • make sure all datasets include
    • languages
    • other important metadata
  • add task schemas for the following (also think about modality-dependent schemas)
    • glue-stsb
    • superglue
    • polyprompt
  • reformat the organization of some datasets
    • adv_mtl
  • add unit tests that check the validity of the newly introduced dataset loader scripts.
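
The metadata item in the list above ("make sure all datasets include languages") could be checked with something like the sketch below. It is only an illustration under assumptions: it assumes each line of dataset_info.jsonl is a JSON object and that the language field is called `languages`. The field name, the `REQUIRED_FIELDS` tuple, and the helper `find_missing_fields` are hypothetical and not taken from the repository.

```python
import json
from typing import Dict, Iterable, List

# Hypothetical field names; the real schema may use different keys.
REQUIRED_FIELDS = ("languages",)


def find_missing_fields(
    lines: Iterable[str], required: Iterable[str] = REQUIRED_FIELDS
) -> Dict[int, List[str]]:
    """Map 1-based line numbers to the required fields that are absent or empty."""
    required = tuple(required)
    problems: Dict[int, List[str]] = {}
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # assumes each line is a JSON object
        missing = [name for name in required if not record.get(name)]
        if missing:
            problems[line_no] = missing
    return problems


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        '{"name": "dataset_a", "languages": ["en"]}',
        '{"name": "dataset_b", "languages": []}',
        '{"name": "dataset_c"}',
    ]
    for line_no, missing in find_missing_fields(sample).items():
        print(f"line {line_no}: missing {', '.join(missing)}")
```

Because the function accepts any iterable of strings, an open file handle can be passed in directly, and the same function can be called from a unit test with a few in-memory lines.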