1.3.0

@LukasMahieu released this 13 Feb 16:01
78e5597

This is a big release in which we introduce our model repository and functionally refactor our tl.Crested class.

Features

  • new crested.get_model function to fetch models from crested model repository
  • new Enformer and Borzoi model options in the crested zoo, as well as scripts for converting weights to Keras format
  • new cut site bigwigs option for mouse cortex dataset
  • new crested model repository in the readthedocs with model descriptions.
  • new pattern clustering plot pl.patterns.clustermap_with_pwm_logos that shows PWM logo plots below the heatmap.
  • extra parameter options in modisco calculations and plotting for the allowed seqlets per cluster and top_n_regions selection
  • option to color lines in gene locus scoring plotting
  • extra ylim option in crested.pl.bar.prediction
  • gene locus scoring plotting improvements

Bugfixes

  • importing bigwigs now correctly accounts for regions that were removed due to chromsizes

Notebooks

  • Rewrote the tutorials to use the new functional API (WIP).
  • Expanded on the enhancer design section

Functional Refactor of crested.tl.Crested(...) class

In this large refactor, we're moving everything in the Crested class that does not use both a model and the AnnDataModule into a new _tools.py module, where everything is functional.
All the old functions remain in the Crested class for backward compatibility (for now), but they now raise a deprecation warning.
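
The backward-compatibility approach described above can be sketched roughly as follows. This is a minimal illustration of the general pattern, not the actual CREsted source: the toy `Crested` class, the module-level `predict`, and the callable `model` are all hypothetical stand-ins.

```python
import warnings


def predict(inputs, model):
    """Hypothetical stand-in for a new module-level (functional) API."""
    return [model(x) for x in inputs]


class Crested:
    """Toy class that keeps an old method alive as a thin, deprecated wrapper."""

    def __init__(self, model):
        self.model = model

    def predict_sequence(self, sequence):
        # Warn the caller, then delegate to the functional implementation.
        warnings.warn(
            "Crested.predict_sequence is deprecated; use the functional predict instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return predict([sequence], self.model)[0]
```

While migrating, standard Python lets you turn these warnings into errors with `warnings.simplefilter("error", DeprecationWarning)`, which makes any remaining calls to the old methods easy to find.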

We're giving up a bit of clarity for ease of use by combining functions that do the same thing on different inputs into a single function.

Equivalent new functions

  • tl.Crested.get_embeddings(...) ---> tl.extract_layer_embeddings(...)
  • tl.Crested.predict(...) ---> tl.predict(...)
  • tl.Crested.predict_regions(...) ---> tl.predict(...)
  • tl.Crested.predict_sequence(...) ---> tl.predict(...)
  • tl.Crested.score_gene_locus(...) ---> tl.score_gene_locus(...)
  • tl.Crested.calculate_contribution_scores(...) ---> tl.contribution_scores(...)
  • tl.Crested.calculate_contribution_scores_regions(...) ---> tl.contribution_scores(...)
  • tl.Crested.calculate_contribution_scores_sequence(...) ---> tl.contribution_scores(...)
  • tl.Crested.calculate_contribution_scores_enhancer_design(...) ---> tl.contribution_scores(...)
  • tl.Crested.tfmodisco_calculate_and_save_contribution_scores_sequences(...) ---> tl.contribution_scores_specific(...)
  • tl.Crested.tfmodisco_calculate_and_save_contribution_scores(...) ---> tl.contribution_scores(...)
  • tl.Crested.enhancer_design_motif_implementation(...) ---> tl.enhancer_design_motif_insertion(...)
  • tl.Crested.enhancer_design_in_silico_evolution(...) ---> tl.enhancer_design_in_silico_evolution(...)

New functions

Some utility functions were hidden inside the Crested class but needed to be made explicit.

  • utils.calculate_nucleotide_distribution (advised for enhancer design)
  • utils.derive_intermediate_sequences (required for inspecting intermediate results from enhancer design)

New behaviour

  • All functions that accept a model can now also accept lists of models, in which case the results will be averaged across models.
  • All functions use a similar API: they expect some 'input' that can be converted to a one-hot encoding (sequences, region names, or AnnData objects with region names). The conversion now happens behind the scenes, so the user doesn't have to handle it and there is no longer a separate function per input format.
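
The two behaviours above can be illustrated with a small, dependency-free sketch. It is not the CREsted implementation: `predict`, `to_sequences`, `one_hot_encode`, and the two toy models are hypothetical, the averaging is assumed to be a plain arithmetic mean, and region-name or AnnData inputs (which would need a genome lookup) are left out.

```python
ALPHABET = "ACGT"


def one_hot_encode(sequence):
    """One row per base; unknown bases such as N become all zeros."""
    return [[int(base == letter) for letter in ALPHABET] for base in sequence.upper()]


def to_sequences(inputs):
    """Normalise a single sequence or a collection of sequences to a list."""
    if isinstance(inputs, str):
        return [inputs]
    return list(inputs)


def predict(inputs, model):
    """Accept one model or a list of models; average outputs across models."""
    models = list(model) if isinstance(model, (list, tuple)) else [model]
    results = []
    for sequence in to_sequences(inputs):
        encoded = one_hot_encode(sequence)  # conversion hidden from the caller
        per_model = [m(encoded) for m in models]
        # Element-wise mean over the models' output vectors.
        results.append([sum(values) / len(models) for values in zip(*per_model)])
    return results


def gc_model(encoded):
    """Toy model: fraction of C and G bases."""
    return [sum(row[1] + row[2] for row in encoded) / len(encoded)]


def a_model(encoded):
    """Toy model: fraction of A bases."""
    return [sum(row[0] for row in encoded) / len(encoded)]


single = predict("ACGT", gc_model)                          # one input, one model
ensemble = predict(["ACGT", "AAAA"], [gc_model, a_model])   # many inputs, two models
```

The design point is that the caller passes whatever it has (one sequence or many, one model or several) and a single function normalises both arguments before doing the work.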