-
Notifications
You must be signed in to change notification settings - Fork 559
/
FAQ.Rmd
252 lines (186 loc) · 14.8 KB
/
FAQ.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
---
title: "Bayesian Data Analysis course - FAQ"
date: "Page updated: `r format.Date(file.mtime('FAQ.Rmd'),'%Y-%m-%d')`"
---
## How to use R and RStudio remotely
### Option 1: Using R and RStudio via JupyterHub {#jupyterhub}
Instead of installing RStudio on your computer, you can use it in your web browser:
* [Information about Aalto JupyterHub](https://scicomp.aalto.fi/aalto/jupyterhub/)
* Go to [jupyter.cs.aalto.fi](https://jupyter.cs.aalto.fi)
* Choose `CS-E5710 - Bayesian Data Analysis (2024)`
* In the Launcher click `RStudio`
* In the RStudio Files pane (bottom right) you can create folders for your work and upload files from your computer to the server
* The **notebooks** folder is the only persistent folder (stays there if you sign out) so save everything to that folder!
* You may get an error when uploading a large zip file, but uploading smaller zip files work. If you can't upload demo zip file contact the course staff via Zulip.
* You may access your data as a network drive by SMB mounting it on your own computer - [see Accessing JupyterHub data](https://scicomp.aalto.fi/aalto/jupyterhub-data/). This allows you to have total control over your data.
* After uploading files, use Files pane to open them (e.g. an RMarkdown or Quarto notebook)
* Knitting of R, Rmd, and qmd files works as well (tested 24th August)
* CmdStanR used later in the course has been tested to work 24th August.
* To use CmdStanR:<br>
`library(cmdstanr)`<br>
* If you do not see the following output, please contact us on Zulip:
`This is cmdstanr version 0.6.0`<br>
`- CmdStanR documentation and vignettes: mc-stan.org/cmdstanr`<br>
`- CmdStan path: /coursedata/cmdstan`<br>
`- CmdStan version: 2.33.0`<br>
* There is a limited memory available (3Gib) and bigger models and datasets can run out of memory with cryptic error message, but the demos and assignment models should run (if not, then contact the course staff via Zulip).
* See also [Aalto JupyterHub FAQ and bugs](https://scicomp.aalto.fi/aalto/jupyterhub/#faq-and-bugs)
### Option 2: Use Aalto Linux via remote-desktop solution provided by Aalto-IT.
* [Information about Aalto remote desktop](https://scicomp.aalto.fi/aalto/remoteaccess/#remote-desktop)
* Go to [vdi.aalto.fi](https://vdi.aalto.fi)
* Download VMWare Horizon application or use the web portal
* If using the VMWare Horizon application, click on `New Server` and enter `vdi.aalto.fi`
* Enter your aalto username (aalto email works too) and password in the respective fields.
* Select `Ubuntu 20.04`
* Click Activities, start typing `RStudio` in the search bar, and click `RStudio`.
### How to access the BDA R (and Python) demos on CS JupyterHub
* Go to `jupyter.cs.aalto.fi` on your favorite web-browser. ![](https://zenodo.org/record/4010716/files/1.png)
* Log-in with your aalto username and password.
* Select the `CS-E5710 - Bayesian Data Analysis (2024) -r-ubuntu:6.2.3-bayesda2024` server.
![](https://zenodo.org/record/8296300/files/2.png)
* Select the `notebooks` folder in the left hand file browser.
![](https://zenodo.org/record/8296300/files/3.png)
* Select the `git clone icon` as seen in the screenshot below.
![](https://zenodo.org/record/8296300/files/4.png)
* In the text box type `https://github.com/avehtari/BDA_R_demos.git` for python demos replace `BDA_R_demos.git` with `BDA_Python_demos.git` instead. Then click `clone`.
![](https://zenodo.org/record/8296300/files/5.png)
* Wait a while, there should be `BDA_R_demos` folder under `notebooks` folder. Click on the `BDA_R_demos` folder.
![](https://zenodo.org/record/8296300/files/6.png)
* Click on the RStudio button on the right.
![](https://zenodo.org/record/8296300/files/7.png)
* Now you should have an R-studio like interface in your web-browser. Click on `File -> Open File...`
![](https://zenodo.org/record/4010716/files/9.png)
* Click on `notebooks` and then select `BDA_R_demos` folder.
![](https://zenodo.org/record/8296300/files/10.png)
* Select a demo to run. Here we open the folder `demos_ch2` and then select `demo2_1.R` file and click `open`.
This should open the file in the window.
![](https://zenodo.org/record/4010716/files/11.png)
* Select the contents of the file and click `Code -> Run Selected Line(s)` as shown in the screenshot below.
![](https://zenodo.org/record/4010716/files/12.png)
* You should see the output of the code in the bottom right corner.
![](https://zenodo.org/record/4010716/files/13.png)
## How to use R and RStudio on your own computer
### How to access the course material in Github
Instead of trying to download each file separately via the Github interface, it is recommended to use one of these options:
- The best way is to clone the repository using git, and use pull to get the latest updates.
- If you want to learn to use git, start by installing a [git client](https://www.google.com/search?q=git+clients&oq=git+client). There are plenty of good git tutorials online. See for example the tutorial by Aalto [Git Aalto](https://scicomp.aalto.fi/scicomp/git/)
- If you don't want to learn to use git, download a the repository as a zip file. Click the green button "Code" at the main page of the repository and choose "Download ZIP" ([direct link](https://github.com/avehtari/BDA_course_Aalto/archive/master.zip)). Remember to download again during the course to get the latest updates.
### R packages used in demos
Aalto JupyterHub has all the R packages used in demos pre-installed. If you install R on your own computer, you can install all the packages used by the main demos with
```
install.packages(c("MASS", "bayesplot", "brms", "cmdstanr ", "dplyr", "gganimate", "ggdist", "ggforce", "ggplot2", "grid", "gridExtra", "latex2exp", "loo", "plyr", "posterior", "purrr", "rprojroot", "tidyr", "quarto"))
```
### Installing `aaltobda` package
The course has its own R package `aaltobda` with data and
functionality to simplify coding. `aaltobda` has been pre-installed in
Aalto JupyterHub. To install the package to your own computer just run the following:
1. `install.packages("aaltobda", repos = c("https://avehtari.r-universe.dev", getOption("repos")))`
If during the course there is announcement that `aaltobda` has been
updated (e.g. some error has been fixed), you can get the latest
version by repeating the second step above.
### Error related to LC_CTYPE while installing `aaltobda` r-package.
When installing the `aaltobda` package, you may encounter an error like this:
```{r eval=FALSE}
Error: (converted from warning) Setting LC_CTYPE failed, using "C"
Execution halted
Error: Failed to install 'aaltobda' from GitHub:
(converted from warning) installation of package '/var/folders/g6/bdv4dr4s6qq4zyxw2nzy26kr0000gn/T//Rtmp3uYwuD/file121f355845a3/aaltobda_0.1.tar.gz' had non-zero exit status
```
**Solution:**
See the following StackOverflow solution. ([link](https://stackoverflow.com/a/3909546))
### Problems installing R packages on Windows ?
Getting the setup needed for the course working on Windows might involve a bit more effort than on Linux and Mac. Consequently, **we recommend using either Linux or MacOS, or using R remotely**.
Moreover, `Stan`, the probabilistic programming language which we will use later on during the course requires a C++ compiler toolchain which is not available by default in Windows (blame Microsoft).
However, if you want to use Windows and have a problem getting the setup working, below are two options to consider:
### Installing `knitr`
If you just installed RStudio and R, chances are you don't have `knitr` installed, the package responsible for rendering your notebook to pdf.
**Solution:**
```{r eval=FALSE}
install.packages("knitr")
```
You can also install packages from RStudio menu `Tools->Install Packages`.
### If `knitr` is installed but the pdf won't compile
In this case it is possible that you don't have LaTeX installed, which is the package that runs the engine to process the text and render the pdf itself.
**Solution:**
Tinytex is the bare minimum Latex core that you need to install in order to run the pdf compiler. If you want to go further and download a full distribution of Latex, look at TeX Live for Linux and MacTeX for Mac OS.
```{r eval=FALSE}
install.packages("tinytex")
tinytex::install_tinytex()
```
### How to install the latest version of `CmdStanR` or `RStan`
* Make sure you have installed R version 4.* or newer. If you don't, install a newer version using instructions from [https://www.r-project.org/](https://www.r-project.org/)
* At the moment `CmdStanR` is faster in use, and thus we recommend using `CmdStanR` instead of `RStan`, but both can be used for the assignments.
* [`CmdStanR`](https://mc-stan.org/cmdstanr/) is a lightweight interface to Stan for R users (see `CmdStanPy` for Python).
* `CmdStanR` avoids some installation problems as it doesn't require matching C++ tools for `R` and `RStan`
* Install `RStan` along with the necessary C++ compiler toolchain as described [here](https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started)
Instead of RStan, you can also use new `CmdStanR` which maybe easier to install.
### Workarounds for current Rstan Windows issues
* [A list of the current RStan Windows problems and and known temporary workarounds](https://discourse.mc-stan.org/t/workarounds-for-current-rstan-windows-issues/17389)
## How to render a single qmd-file to html or pdf
Clicking Render in RStudio defaults to render all qmd files in the directory ("project") and to render both html and pdf. To save time in rendering, you can install `quarto` R package
```
install.packages("quarto")
library(quarto)
```
and then render just one file and choose the target to be html with
```
quarto_render("template2.qmd", output_format = "html")
```
and when you are ready to submit the pdf, use
```
quarto_render("template2.qmd", output_format = "pdf")
```
`quarto` R package is available as pre-installed in JupyterHub (you may need to restart your server o get the new image).
If the rendering to pdf doesn't work, you can print the html to pdf file, but do make sure to turn off "More setings -> Print headers and footers" to avoid accidentally printing your identity.
## What is `tidyr` or `tidyverse` that is used in the R demos? What does `%>%` mean?
* [Tidyverse](https://www.tidyverse.org/) is a collection of R packages designed for data science. The packages "share an underlying design philosophy, grammar, and data structures".
* A clear characteristic that distinguishes tidyverse from the base R is the [pipe operator](https://style.tidyverse.org/pipes.html) `%>%`
* Recent R versions have a new built-in pipe operator `|>`, which in most cases can replace `%>%`, and there are some differences only in more advanced use cases.
* In this course you do not need to use tidyverse. However, some packages belonging to tidyverse, such as `ggplot2`, can be useful for visualizing results in the reports.
## M1 Macs with Python and Stan
Unfortunately the installation of `pystan` will fail on an M1 Mac, as there is not a binary wheel available for the `httpstan` dependency. A recommended alternative here is the [CmdStanPy package](https://mc-stan.org/cmdstanpy/).
M1 Mac users that are intent on using `pystan` will need to complete the following steps to build `httpstan` from source and then install `pystan`:
### Build and Install `httpstan`
```
# Download httpstan source
git clone https://github.com/stan-dev/httpstan
cd httpstan
# Build shared libraries and generate code
python3 -m pip install poetry
# Build httpstan source
# - There will be many compiler warnings, these are safe to ignore
make -j8
# Build the httpstan wheel
python3 -m poetry build
# Install the wheel
python3 -m pip install dist/*.whl
```
### Install `pystan`
```
python3 -m pip install pystan
```
## I missed some deadline or wasn't able to do some part of the course
- Can I combine results from assignments, project, presentation, and e-exam made in different periods / years?
- Yes.
- I missed the deadline to register for the course in Sisu. Can I join the course?
- Yes, just register in MyCourses and contact student services that they add you in Sisu, too.
- I missed the deadline for the assignment. Can you accept my late submission?
- Open MyCourses Quizzes are automatically submitted at the deadline time
<!--
- If you miss the FeedbackFruits deadline first time it's used due technical problems, but you send the pdf to one of the TAs few minutes after the deadline it can accepted once. As the recommended submission time is before 4pm on Friday, you have in general 56 hours extra hours for submission and several few minute late submissions is not likely just due to the technical problems.
-->
- I was not able to do one of the assignments because [some personal problem]. Can I do some extra work?
- Things happen and you don't need to tell the course staff your personal reasons (especially you shouldn't tell any health issue details). Everyone gets a second change in period III. In period III there is just one submission deadline, but otherwise the procedure is the same (ie. you need to return all the assignments). If you submitted the project work in autumn you don't need to re-submit it if you re-submit assignments.
- I missed the deadline to register project group. Can I still register?
- Yes. Those who registered early are allowed to choose the presentation slots first.
- My group member a) disappeared, b) doesn't do anything, c) is annoying. Can I continue with the project alone.
- First we hope you can resolve the issue, but if nothing works, then you can continue the project work alone.
- I was not able a) to do the project or b) to give a presentation because [some personal problem]. Can a) I submit it later, b) present later.
- Things happen and you don't need to tell the course staff your personal reasons (especially you shouldn't tell any health issue details). Everyone gets a second change in period III. In period III there is second project submission deadline and presentation slots. If you are happy with your assignment score, you don't need to re-submit assignments if you submit the project work in period III.
## Recommended courses after Bayesian Data Analysis
Here are some great Aalto courses that are using Bayesian inference
- [CS-E4820 - Machine Learning: Advanced Probabilistic Methods](https://mycourses.aalto.fi/course/info.php?id=40717) (common probabilistic models in machine learning, such as sparse Bayesian linear models, Gaussian mixture models and factor analysis models, variational inference)
- [ELEC-E8106 - Bayesian Filtering and Smoothing](https://mycourses.aalto.fi/course/info.php?id=39443) (dynamical systems, time series, tracking)
- [CS-E4895 - Gaussian Processes](https://mycourses.aalto.fi/course/view.php?id=44810)
- [CS-E5795 - Computational Methods in Stochastics](https://mycourses.aalto.fi/course/info.php?id=40704) (more about MC, MCMC, HMC)
- [MS-E1654 - Computational Inverse Problems](https://mycourses.aalto.fi/course/info.php?id=40592)