list of ideas for improvements from Gokul #4

Open
sje30 opened this issue Jul 21, 2021 · 12 comments

Comments

@sje30
Owner

sje30 commented Jul 21, 2021

hi @jmb280cam

here is a list of ideas for improvements from Gokul @srgk26

1 how automatic differentiation works (not just how it's implemented, as in the video by Alan Edelman)
2 types and type stability; also highlight difference between type annotation and type stability (the former does not improve performance)
3 refer to blog for type stability in julia: https://www.juliabloggers.com/writing-type-stable-julia-code/
4 using @code_warntype on function calls to check for type stability
5 custom typing with struct
6 multiple dispatch
7 highlight performance tips page in docs: https://docs.julialang.org/en/v1/manual/performance-tips/
8 using performance macros @simd, @fastmath, @inbounds to further improve performance (not always: they can also reduce performance, so benchmark on a case-by-case basis)
9 good idea to introduce docker as well
10 tutorial for common tools and packages in julia ecosystem (dataframes, plotting, flux, differentialequations, etc.)
11 note that julia arrays are column-major, so when looping over a matrix the row index should vary fastest (inner loop over rows, outer loop over columns), for example:

m = rand(2,3)
@inbounds @views for j in 1:size(m,2)
    for i in 1:size(m,1)
        m[i,j] = m[i,j]*(i+j)
    end
end

some of these could be in the intro, some (like types) would be better off in a case study.
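Items 2 to 4 could be illustrated with something like the sketch below. The function names `unstable_relu` and `stable_relu` are made up for illustration. It shows that a type annotation on the argument does not make a function type-stable; what matters is that the return type depends only on the argument types.

```julia
# Type-unstable despite the annotation: returns the Int literal 0 or the
# Float64 x, depending on the *value* of x.
unstable_relu(x::Float64) = x > 0 ? x : 0

# Type-stable without any annotation: zero(x) has the same type as x,
# so the return type is always typeof(x).
stable_relu(x) = x > 0 ? x : zero(x)

# Base.return_types reports the inferred return type for an argument signature.
unstable_type = only(Base.return_types(unstable_relu, (Float64,)))
stable_type   = only(Base.return_types(stable_relu, (Float64,)))

println("unstable_relu(::Float64) returns ", unstable_type)
println("stable_relu(::Float64)   returns ", stable_type)

# Interactively, the same information is shown (with highlighting) by
#   @code_warntype unstable_relu(1.0)
# which is available in the REPL, or after `using InteractiveUtils` in a script.
```

A non-concrete inferred type (here a `Union`) is the thing to look for in the `@code_warntype` output.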

CC: @Nick-Gale for info

@sje30
Owner Author

sje30 commented Jul 21, 2021

p.s. for iterating over an array (item 11), is eachindex(m) a better way to iterate over the matrix m? Not tried it yet myself
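A minimal sketch of the eachindex approach, assuming the loop body does not need the row and column numbers (so the scaling by `i+j` from item 11 is replaced here by a plain doubling):

```julia
m = rand(2, 3)
expected = [2 * m[i, j] for i in 1:size(m, 1), j in 1:size(m, 2)]

# eachindex(m) visits every element of m in memory order, so there is no
# need to think about which loop should be the inner one.
for idx in eachindex(m)
    m[idx] = 2 * m[idx]
end

println(m == expected)
```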

@jmbyrne
Collaborator

jmbyrne commented Jul 21, 2021

My initial thoughts on these are:
1, 10 may fit in better into case studies
2, 3, 4, 5, 6 could constitute a new section, but more likely a case study (as you say)
7 should probably be mentioned
8, 11 can go into the efficiency case study, whatever that may end up being
9 I don't know what you mean by "docker", could you elaborate please?
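For items 5 and 6, a case study could start from a small sketch like the one below. All type and function names (`Shape`, `Circle`, `Rectangle`, `area`, `describe`) are hypothetical and chosen only for illustration.

```julia
# Custom types with struct, grouped under an abstract supertype.
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Rectangle <: Shape
    width::Float64
    height::Float64
end

# One generic function, one method per concrete type.
area(c::Circle) = pi * c.radius^2
area(r::Rectangle) = r.width * r.height

# Multiple dispatch: the method is chosen from the types of *all* arguments.
describe(a::Shape, b::Shape) = "two shapes"
describe(a::Circle, b::Circle) = "two circles"

println(area(Rectangle(2.0, 3.0)))
println(describe(Circle(1.0), Circle(2.0)))
println(describe(Circle(1.0), Rectangle(1.0, 1.0)))
```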

A note on eachindex(m), it's described in the Julia manual as

an efficient iterator for visiting each position in A

so it's good for that, but if you need the row/column indices I think it's just as efficient to use two for loops
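If the row and column indices are needed, one alternative to two nested loops is `CartesianIndices`, which also iterates in memory (column-major) order. A minimal sketch of item 11 written that way, not benchmarked against the two-loop version:

```julia
m = rand(2, 3)
original = copy(m)

# Each I is a CartesianIndex; Tuple(I) unpacks it into (row, column).
for I in CartesianIndices(m)
    i, j = Tuple(I)
    m[I] = m[I] * (i + j)
end

# Same result as the explicit two-loop version from the original list.
expected = [original[i, j] * (i + j) for i in 1:size(m, 1), j in 1:size(m, 2)]
println(m == expected)
```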

@srgk26

srgk26 commented Jul 21, 2021

Hi @jmb280cam! Docker is a container platform, sort of like virtual machines, but virtualising at the OS level (containers share the host's kernel) rather than at the hardware level. You can find more info here:

https://en.wikipedia.org/wiki/Docker_(software)
https://www.docker.com/

Docker isn't necessarily linked to Julia itself, I was making a more general suggestion to maybe include docker as a scientific computing tool. I certainly use docker all the time. The main purpose of docker is to export software across machines, such that if it works on machine A it is guaranteed to work on machine B. But I tend to use docker just because it's cleaner, compartmentalized, and I don't have to watch out for updates. I just need to do docker pull, and the latest software version is automatically downloaded. But whatever the use case, I think it's useful to know.

Anyways this may very well be complicating things more than necessary, and may very well be out of scope. But since I'm already using it for Julia and have the instructions, I'll provide them here. Feel free to make use of any of it if you'd like or leave it out entirely.

Setting up docker itself differs depending on whether it's a Windows, Mac or Linux machine, and on whether the user has root permissions (they would if it's their personal computer). After it's set up, I would create a new container like this:

JULIA_VERSION="1.6.0"
docker pull julia:latest
docker build -f Julia-CUDA-Dockerfile --no-cache -t julia-cuda:latest .
docker run -it --name julia-"${JULIA_VERSION}" -v /home/srgk26:/home/srgk26 --gpus all julia-cuda:latest

Docker build step:
The -f option specifies the Dockerfile to build from. The Dockerfile I'm using is called Julia-CUDA-Dockerfile (this is with CUDA support). The -t option means tag; it gives the name of the image created.

Docker run step:
The -it option combines -i (interactive) and -t (allocate a terminal). The --name option names the container; any name will do, but having one is useful (explained below). In my setup it's named julia-1.6.0, so I can see which version of julia each container holds. The -v option mounts part of the filesystem, in the form -v host:container: the directory to the left of the colon is on my host machine, and the directory to the right of the colon is inside the docker container (it is created if it doesn't exist). Docker containers are normally isolated from the host filesystem, so this mount is necessary if you want to work with files on the host. The --gpus all option exposes my local GPUs to the docker container.

This is what my dockerfile 'Julia-CUDA-Dockerfile' looks like:

FROM julia:latest

## Specify NVIDIA driver features to mount inside container
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

## Install Linux system packages
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive \
    apt-get install --yes --no-install-recommends \
                    build-essential curl git libgomp1 sudo wget && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

## Install MKL.jl
## Until Julia v1.7, building MKL.jl rebuilds Julia's system image against Intel MKL
## Until Julia v1.7, need to set environment variable ENV["USE_BLAS64"]=true to install 64-bit MKL version
#RUN julia -e 'ENV["USE_BLAS64"] = true; using Pkg; Pkg.add("MKL")'

## Set user name for container to run as user
ARG USER=julia-docker

## Provide root privileges to $USER
## Add $USER to sudo group and disable password requirement
RUN adduser --disabled-password --gecos '' $USER
RUN adduser $USER sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers

## Switch user from root to $USER
USER $USER

## Install and precompile essential Julia packages
RUN julia -e 'using Pkg; Pkg.add(["Combinatorics", "CSV", "CUDA", "DataFrames", \
                                  "DifferentialEquations", "Distributions", "Flux", "IterTools", \
                                  "LoopVectorization", "MKL", "Parameters", "StatsFuns", "StatsBase", "StatsPlots"]); \
                         Pkg.precompile()'

## Default executable for container
ENTRYPOINT ["julia"]

The main purpose of creating a non-root user in the Dockerfile is to circumvent permission issues when working with docker containers. By default, processes in a container run as root, which means any files created or edited from inside the container are owned by root. This is troublesome when later working with those files outside of docker as a regular user. I'm using a plain Linux desktop, so this is a problem. Not sure if it's a problem with WSL2 on Windows or on a Mac though.

In any case, this is a sample Dockerfile that works, can adapt it as per use case.

After creating a container, I can call on that container for future use. I'm using vscode, and in vscode it's just a click on a button. Within terminal though, these are the steps:

docker start julia-1.6.0
docker exec -it julia-1.6.0 /bin/bash

Anyways, these steps are probably overkill. I already have these instructions, so I sent them to you. Feel free to disregard this.

@jmbyrne
Collaborator

jmbyrne commented Jul 21, 2021

Thanks for that explanation. I would have thought that for CATAM we wouldn't need to consider this sort of thing (since the code should be simple enough to work on any system), but it's good to have in the back pocket if it comes up

@Nick-Gale

Nick-Gale commented Jul 22, 2021 via email


@srgk26

srgk26 commented Aug 1, 2021

Just putting it out here, saw this on my youtube recommendations. Think it's a good link to share: https://youtu.be/gRj7E5kYG1I

Also, @sje30 you may be interested in this, but perhaps out of scope for catam-julia: https://youtu.be/Sh_jBtP7RVY

@sje30
Owner Author

sje30 commented Aug 1, 2021

thanks. there are also some nice pluto videos there from the JuliaCon

@srgk26

srgk26 commented Aug 24, 2021

Hey! I also came across these last week, wanted to post this then but got distracted by the presentations. Anyway, this is the post: https://www.numerical-tours.com/julia/

Numerical tours in Julia. Not sure if it's interesting/relevant, but sending it across anyway.

@sje30
Owner Author

sje30 commented Aug 25, 2021

Thanks @srgk26

if you had a spare 30-60 minutes, would you be able to read over https://sje30.github.io/catam-julia/intro/julia-manual.html and comment?

@srgk26

srgk26 commented Aug 25, 2021

Hi @sje30, I had a brief look. Firstly, should say that was a bit weird to see cell output above the code. But it seems that's the way Pluto was designed. A quick couple of points (caveats really):

  1. Under the efficiency section, you mention that Julia is compiled. It's perhaps worth also pointing out that, because compilation happens just in time, the first call to a function is slower; this is most noticeable for small operations and when benchmarking.
  2. Just a note that Threads.@threads doesn't always speed up computations. From my benchmarking, it only helps for large enough datasets; otherwise the threading overhead dominates and the code runs slower than the single-threaded version.
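Both caveats can be checked with a small timing sketch along these lines. The function names (`sumsq`, `serial_square!`, `threaded_square!`) and the array sizes are illustrative assumptions, and the timings printed will vary by machine and by the number of threads Julia was started with.

```julia
# Caveat 1: the first call to a function includes compilation time.
sumsq(v) = sum(x -> x^2, v)
data = rand(1_000)

first_call  = @elapsed sumsq(data)   # compiles sumsq for Vector{Float64}, then runs it
second_call = @elapsed sumsq(data)   # runs the already-compiled code
println("first call: ", first_call, " s, second call: ", second_call, " s")

# Caveat 2: threading adds overhead, so compare both versions on your own data.
function serial_square!(out, v)
    for i in eachindex(out, v)
        out[i] = v[i]^2
    end
    return out
end

function threaded_square!(out, v)
    Threads.@threads for i in eachindex(out, v)
        out[i] = v[i]^2
    end
    return out
end

small = rand(100)
out_serial   = serial_square!(similar(small), small)
out_threaded = threaded_square!(similar(small), small)   # also warms up compilation

t_serial   = @elapsed serial_square!(out_serial, small)
t_threaded = @elapsed threaded_square!(out_threaded, small)
println("threads: ", Threads.nthreads(),
        ", serial: ", t_serial, " s, threaded: ", t_threaded, " s")
```

For more reliable numbers than a single `@elapsed`, a benchmarking package such as BenchmarkTools.jl is the usual choice.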

I'd be able to give a more detailed comment maybe over the weekend. I'm packing up now actually, for a flight to India tomorrow. Will be back in the UK in about a month.

I'd be happy to look into this closer then.

@jmbyrne
Collaborator

jmbyrne commented Aug 25, 2021

Thanks again for the input Gokul, I'll make those changes


4 participants