resolved issue #5 #6

Open: wants to merge 9 commits into master
12 changes: 10 additions & 2 deletions gdc.py
@@ -228,6 +228,7 @@ def search(endpoint, in_filter={}, exclude_filter={}, fields=[], expand=[],
         response = requests.post(url, data=payload)
     else:
         response = requests.get(url, params=payload)
+
Collaborator

Prefer not to have whitespace changes if not necessary.

Contributor Author

Sure, will change that.

Contributor Author

Removed the whitespace change.

     if response.status_code == 200:
         results = response.json()['data']['hits']
         if typ.lower() == 'json':
@@ -376,8 +377,11 @@ def get_project_info(projects=None):
     project_df = search('projects', in_filter=in_filter,
                         fields=['name', 'primary_site', 'project_id',
                                 'program.name'])
-    return project_df.set_index('id')

+    if(project_df.empty==False):
+        return project_df.set_index('id')
+    else:
+        return project_df
Collaborator

I think this function should just return project_df no matter what, to keep things consistent. Returning an empty dataframe on a failed or no-hit search seems reasonable to me, so the code here has to change accordingly. The check here should probably be whether project_df is None, since search can return None. Check for that and make sure the function returns an empty dataframe properly.
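
A minimal sketch of what this suggestion could look like, not the code that was merged. It assumes `search` returns either a DataFrame or None; `search_fn` is a hypothetical parameter standing in for `search` so the example runs without network access, and the shape of `in_filter` is a guess because it is built outside the diff hunk.

```python
import pandas as pd


def get_project_info(search_fn, projects=None):
    """Return project info as a DataFrame; empty if nothing was found.

    ``search_fn`` is a hypothetical stand-in for ``search`` so this sketch
    runs offline. It may return a DataFrame or None.
    """
    # Assumed filter shape; the real in_filter is built outside the hunk.
    in_filter = {'projects.project_id': projects} if projects else {}
    project_df = search_fn('projects', in_filter=in_filter,
                           fields=['name', 'primary_site', 'project_id',
                                   'program.name'])
    if project_df is None:
        # Failed or no-hit search: hand back an empty DataFrame.
        return pd.DataFrame()
    # No set_index('id') here; callers index the frame themselves if needed.
    return project_df
```

With this shape, callers can test `project_df.empty` directly and never need a separate None check.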

Contributor Author

@yunhailuo So you mean the function should return project_df itself? That would mean changing the other functions that currently rely on the project_df.set_index('id') form of the dataframe.

Collaborator

@arp95 Exactly.

Contributor Author

@yunhailuo Thanks for clearing that up. Working on it.

Contributor Author

@yunhailuo If the returned project_df is null, should I remove that project from the projects array in the gdc2xena file, or just leave the array empty?

Collaborator (yunhailuo, Mar 1, 2019)

@arp95 I think an empty projects array makes more sense. It means you didn't get any valid projects from gdc.get_project_info(). For that line it will be all or none, right? So there is no case of removing a single project from the projects array.
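
To illustrate the all-or-none point, here is a hedged sketch of how a caller might derive the projects array. `project_ids` is a hypothetical helper, not a function in gdc.py or gdc2xena, and it assumes the project ID lives in an `id` column, as the `set_index('id')` call in the diff suggests.

```python
import pandas as pd


def project_ids(project_df):
    """Return project IDs from the 'id' column, or [] for an empty frame."""
    # An empty (or missing) result means no valid projects at all,
    # so the caller gets an empty list rather than a partial one.
    if project_df is None or project_df.empty:
        return []
    return project_df['id'].tolist()
```

An empty list then simply means there is nothing for the caller to iterate over.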


 def get_samples_clinical(projects=None):
     """Get info for all samples of ``projects`` and clinical info for all
@@ -431,7 +435,11 @@ def main():
     print('A simple python module providing selected GDC API functionalities.')

     # Simple test
-    print(get_project_info(['TCGA-THCA']).head())
+    df = get_project_info()
+    if(df.empty==False):
+        print(df.head())
+    else:
+        print("Empty dataframe!")


if __name__ == '__main__':