Merge pull request #5 from UBC-MDS/jill-dev
Add feature descriptions to readme
TariqAHassan authored Feb 11, 2018
2 parents 4dde5d4 + d8cdf78 commit 8af696b
Showing 2 changed files with 47 additions and 19 deletions.
29 changes: 29 additions & 0 deletions LICENSE
@@ -0,0 +1,29 @@
BSD 3-Clause License

Copyright (c) 2018, Master of Data Science at the University of British Columbia
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
37 changes: 18 additions & 19 deletions README.md
@@ -1,42 +1,40 @@
# PyPunisher

The PyPunisher package will implement techniques for feature and model selection.
Namely, it will contain tools for forward and backward selection, as well as tools for computing
AIC and BIC (see below).
The PyPunisher package will implement techniques for feature and model selection. Namely, it will contain tools for forward and backward selection, as well as tools for computing AIC and BIC (see below).


## Contributors:

Avinash
Tariq
Jill

## Functions included:
Avinash, Tariq, Jill

- Forward selection
* `forward_selection()`
- Backward selection
* `backwards_selection()`
- Metrics:
- `aic()`: Computes the Akaike information criterion ([AIC](https://en.wikipedia.org/wiki/Akaike_information_criterion))
- `bic()`: Computes the [Bayesian information criterion](https://en.wikipedia.org/wiki/Bayesian_information_criterion)

## ToDos:

Jill

* function description
* function description

Tariq

* summary paragraph
* summary paragraph

Avinash

* where your packages fit into the Python and R ecosystems
* where your packages fit into the Python and R ecosystems


## Functions included:

We will be implementing two stepwise feature selection techniques:

- `forward_selection()`: a feature selection method in which you start with a null model and iteratively add useful features
- `backward_selection()`: a feature selection method in which you start with a full model and iteratively remove the least useful feature at each step
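
As a rough illustration of the two stepwise procedures described above, the following sketch runs greedy forward and backward selection against an arbitrary scoring function. The names `forward_select_sketch`, `backward_select_sketch`, and `toy_score` are hypothetical and are not PyPunisher's API. The sketch assumes that a higher score means a better model and that the score already penalizes model size.

```python
from typing import Callable, FrozenSet, Iterable, List

Score = Callable[[FrozenSet[str]], float]


def forward_select_sketch(candidates: Iterable[str], score: Score) -> List[str]:
    """Start from the null model and add the most helpful feature each round."""
    remaining = list(candidates)
    selected: List[str] = []
    best = score(frozenset())  # score of the null (empty) model
    while remaining:
        # Score every model that adds exactly one of the remaining features.
        trials = [(score(frozenset(selected + [f])), f) for f in remaining]
        trial_score, feature = max(trials, key=lambda t: t[0])
        if trial_score <= best:
            break  # no single addition improves the score
        selected.append(feature)
        remaining.remove(feature)
        best = trial_score
    return selected


def backward_select_sketch(candidates: Iterable[str], score: Score) -> List[str]:
    """Start from the full model and drop the least useful feature each round."""
    selected = list(candidates)
    best = score(frozenset(selected))  # score of the full model
    while selected:
        # Score every model that drops exactly one of the selected features.
        trials = [
            (score(frozenset(g for g in selected if g != f)), f) for f in selected
        ]
        trial_score, feature = max(trials, key=lambda t: t[0])
        if trial_score < best:
            break  # every possible removal makes the model worse
        selected.remove(feature)
        best = trial_score
    return selected


def toy_score(features: FrozenSet[str]) -> float:
    """Hypothetical score: fixed usefulness per feature minus a size penalty."""
    usefulness = {"a": 3.0, "b": 2.0}  # "c" contributes nothing
    return sum(usefulness.get(f, 0.0) for f in features) - 0.5 * len(features)


print(forward_select_sketch(["a", "b", "c"], toy_score))
print(backward_select_sketch(["a", "b", "c"], toy_score))
```

With this toy score both procedures keep `a` and `b` and leave out `c`. In practice the score would come from fitting a model on the chosen columns, for example the negative of an information criterion.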

We will also be implementing metrics that evaluate model performance:

- `aic()`: computes the [Akaike information criterion](https://en.wikipedia.org/wiki/Akaike_information_criterion)
- `bic()`: computes the [Bayesian information criterion](https://en.wikipedia.org/wiki/Bayesian_information_criterion)
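
For reference, the standard definitions of the two criteria can be sketched as below. This is a minimal sketch that assumes the maximized log-likelihood is already available. The function names and signatures are hypothetical and may differ from what PyPunisher ends up exposing. The Gaussian helper assumes a least-squares fit with the variance estimated by maximum likelihood.

```python
import math


def aic_sketch(log_likelihood: float, n_params: int) -> float:
    """AIC = 2k - 2 ln(L), where k is the number of estimated parameters."""
    return 2 * n_params - 2 * log_likelihood


def bic_sketch(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC = k ln(n) - 2 ln(L), where n is the number of observations."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def gaussian_log_likelihood(rss: float, n_obs: int) -> float:
    """Maximized Gaussian log-likelihood from a residual sum of squares.

    Assumes the error variance is estimated as rss / n_obs.
    """
    return -0.5 * n_obs * (math.log(2 * math.pi) + math.log(rss / n_obs) + 1)


# Hypothetical fit: 100 observations, 3 parameters, residual sum of squares 50.
ll = gaussian_log_likelihood(rss=50.0, n_obs=100)
print(aic_sketch(ll, n_params=3), bic_sketch(ll, n_params=3, n_obs=100))
```

For both criteria a lower value indicates a preferred model. BIC penalizes each extra parameter more heavily than AIC whenever `ln(n) > 2`, which holds for eight or more observations.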

**Due**: Sunday Feb 11, 2018.


## How the packages fit into the existing R and Python ecosystems.
@@ -45,3 +43,4 @@ In Python ecosystem, forward selection has been implemented in scikit learn by t
[f_regression](http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.f_regression.html) function. The function uses a linear model to test the individual effect of each of many regressors, and it is implemented as a scoring function for use in a feature selection procedure. Backward selection has also been implemented in scikit-learn by the [RFE](http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFE.html) function. RFE uses an external estimator that assigns weights to features, and it prunes features by recursively considering smaller and smaller feature sets until the desired number of features is reached. In the R ecosystem, forward and backward selection are implemented by the [olsrr package](https://cran.r-project.org/web/packages/olsrr/)
and in the [MASS package](https://cran.r-project.org/web/packages/MASS/MASS.pdf) by the function
[stepAIC](https://stat.ethz.ch/R-manual/R-devel/library/MASS/html/stepAIC.html). `stepAIC` performs stepwise selection (forward, backward, or both) by exact AIC.
