learner profiles not matching #235

Closed
psteinb opened this issue Nov 26, 2020 · 2 comments

psteinb commented Nov 26, 2020

During the HPC Carpentry co-working hour on Nov 26, 2020, I discovered that our learner profiles are out of sync. There are learner profiles in this repo here:

* A statistics student wants to cross-validate a model. This involves running the model 1000
  times -- but each run takes an hour. Running the model on a laptop will take over a month!

* A genomics researcher has been using small datasets of sequence data, but soon will be receiving
  a new type of sequencing data that is 10 times as large. It's already challenging to open the
  datasets on a computer -- analyzing these larger datasets will probably crash it.

* An engineer is using a fluid dynamics package that has an option to run in parallel. So far, this option has not been used on a desktop. In going from 2D to 3D simulations, the simulation time has more than tripled. It might be useful to take advantage of that option.

and here:

> ## Environmental Biology
> 
> Y. is an environmental biologist who uses DNA signatures obtained from
> soils to study species diversity in the environment.
> She needs to compare DNA sequences to large databases. So far, she has
> been able to use web-based tools for her limited datasets.
>
> Recently, Y. has started working with much larger datasets, and
> discovered that the online tool she uses has a limit of 50 entries on the
> online server.
> She has heard it should be possible to run the same tool through the
> command line, and managed to install it on her local laptop.
> Now, however, it takes several days before each analysis
> finishes.
>
> The workshop will teach Y. to move her data to and from the university's
> computer cluster, and submit jobs using pre-installed software on the
> cluster.
> Afterwards, Y. will be able to use her own data and the pre-installed
> command-line version of the tool
> to spread the analysis over several dozen cores so it finishes in a few
> hours.
>
> ## Physics (or many other domains!)
> 
> A new PhD student is given a task to select parameters for their
> simulation.  They need to run a set of calculations on several thousand 
> combinations of parameters.  One calculation takes several minutes. 
> They set up the problem on their laptop but quickly realise 
> that it would take more than one month to complete the task. 
> They are told to use local HPC but they are not sure how this would help
> them.

psteinb commented Nov 26, 2020

Given @ocaisa's issue about where to put the learner profiles, I'd follow the suggestion made in that comment and put the learner profiles into the extras folder.


tkphd commented Jan 29, 2021

Closed by #236.

tkphd closed this as completed Jan 29, 2021