Join topicType with subjectTopicType, evaluationBlock and genericTopicTopicType #5

Open
wants to merge 6 commits into base: main

Conversation

@YGPJACOB YGPJACOB commented Jul 11, 2024

Reads in topicType, subjectTopicType, evaluationBlock and genericTopicTopicType,
and joins them, matching id to topicTypeId.

Note: the resulting df is very big, about 162 million objects. Is there a better way to organize this?
Or are these CSVs perhaps not meant to be joined?
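
One way to check whether the join is expected to be this large is to estimate its size before building it. The sketch below is only an illustration in plain Python, not the code from this pull request: the function name `estimate_inner_join_rows` and the toy key lists are hypothetical, and it assumes an inner join on a single shared key. For each key, the joined row count is the product of how often that key occurs in every table, so repeated topicTypeId values in several tables multiply quickly.

```python
from collections import Counter


def estimate_inner_join_rows(*key_columns):
    """Estimate the row count of an inner join of several tables on one key.

    Each argument is the list of key values from one table. For every key
    present in all tables, the join yields the product of its counts.
    """
    counters = [Counter(column) for column in key_columns]
    # Only keys that occur in every table survive an inner join.
    shared_keys = set(counters[0]).intersection(*counters[1:])
    total = 0
    for key in shared_keys:
        rows_for_key = 1
        for counter in counters:
            rows_for_key *= counter[key]
        total += rows_for_key
    return total


# Hypothetical toy keys, not taken from the real CSV files.
topic_type_ids = [1, 2, 3]              # id in topicType (unique)
subject_ids = [1, 1, 2]                 # topicTypeId in subjectTopicType
evaluation_ids = [1, 1, 1, 2, 2]        # topicTypeId in evaluationBlock
generic_ids = [1, 2, 2]                 # topicTypeId in genericTopicTopicType

estimated_rows = estimate_inner_join_rows(
    topic_type_ids, subject_ids, evaluation_ids, generic_ids
)
print(estimated_rows)
```

If the estimate is far larger than any single input table, it may be worth joining each table to topicType separately instead of chaining all of them into one df.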

@YGPJACOB YGPJACOB linked an issue Jul 11, 2024 that may be closed by this pull request
3 tasks
@Tomeriko96 Tomeriko96 self-assigned this Jul 15, 2024
@Tomeriko96
Member

Hi @YGPJACOB ,

Code looks good!

I had a look at the data. It seems evaluationBlock contains a lot of NA values. I think it makes sense to filter out rows that have NA for name and/or description, which will make the resulting join smaller.
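
As a rough illustration of that filter, here is a minimal sketch in plain Python. It is not the code in this pull request: the helper `drop_missing`, the set of values treated as missing, and the toy rows are all assumptions, and it drops a row when either name or description is missing.

```python
def drop_missing(rows, columns=("name", "description"), missing=(None, "", "NA")):
    """Keep only rows where every listed column holds a non-missing value."""
    return [
        row
        for row in rows
        if all(row.get(column) not in missing for column in columns)
    ]


# Hypothetical rows standing in for evaluationBlock; not real data.
evaluation_block = [
    {"topicTypeId": 1, "name": "Block A", "description": "First block"},
    {"topicTypeId": 1, "name": None, "description": "No name"},
    {"topicTypeId": 2, "name": "Block B", "description": "NA"},
    {"topicTypeId": 2, "name": "NA", "description": None},
]

kept = drop_missing(evaluation_block)
print(len(evaluation_block), "rows before,", len(kept), "rows after")
```

Filtering before the join rather than after it keeps the intermediate result small as well as the final one.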

Furthermore, some pointers:

  1. Make sure your branch is in sync with main, so it is ready when we merge the request
  2. Change the title of the pull request to make it more descriptive (what the pull request will do)
  3. Update the description of the pull request to explain what it does

@YGPJACOB YGPJACOB changed the title from "2 join tables to topictype" to "Join topicType with subjectTopicType, evaluationBlock and genericTopicTopicType" Jul 16, 2024
@YGPJACOB
Author

Filtered out NA for both the name and description columns,
resulting in a smaller join.

@tin900 tin900 requested a review from JornGitHub July 30, 2024 07:10
Successfully merging this pull request may close these issues.

Join tables to TopicType
2 participants