15618 project #1558

Closed
wants to merge 337 commits
Commits (337)
f4c7d3a
sync
hugolatendresse Dec 8, 2024
41cf635
sync
hugolatendresse Dec 8, 2024
4121340
sync
hugolatendresse Dec 8, 2024
7f6a18b
sync
hugolatendresse Dec 8, 2024
3498538
I'm able to output with only one expert by commenting out groupby and…
hugolatendresse Dec 8, 2024
344fe09
sync
hugolatendresse Dec 9, 2024
b9883c1
sync
hugolatendresse Dec 9, 2024
65afc21
works without topk!
hugolatendresse Dec 9, 2024
4d6886c
outputs tokens with softmax assertion not commented out
hugolatendresse Dec 9, 2024
e757567
sync
hugolatendresse Dec 9, 2024
d45edbc
sync
hugolatendresse Dec 9, 2024
8c422aa
use full precision
hugolatendresse Dec 9, 2024
c0e4824
sync
hugolatendresse Dec 9, 2024
1c13c1e
sync
hugolatendresse Dec 9, 2024
d31cbea
sync
hugolatendresse Dec 9, 2024
8c525c4
sync
hugolatendresse Dec 9, 2024
8b10df9
sync
hugolatendresse Dec 9, 2024
f8ea430
sync
hugolatendresse Dec 9, 2024
54e6adc
sync
hugolatendresse Dec 9, 2024
e5075fa
sync
hugolatendresse Dec 9, 2024
fe8e7fd
sync
hugolatendresse Dec 9, 2024
dbe3be6
sync
hugolatendresse Dec 9, 2024
0d2fa44
sync
hugolatendresse Dec 9, 2024
a0ad4c0
sync
hugolatendresse Dec 9, 2024
5473f2c
sync
hugolatendresse Dec 9, 2024
927356d
sync
hugolatendresse Dec 9, 2024
f7c6360
sync
hugolatendresse Dec 9, 2024
88a7163
just run groupby
hugolatendresse Dec 9, 2024
89f5a85
sync
hugolatendresse Dec 9, 2024
0eeaab4
outputs tokens
hugolatendresse Dec 9, 2024
17174d2
comments
hugolatendresse Dec 9, 2024
99e6e31
Merge pull request #6 from flexflow/mixtral
hugolatendresse Dec 9, 2024
dca4703
all changes from my most recent branch except in mixtral.cc and aggre…
hugolatendresse Dec 9, 2024
1375849
names
hugolatendresse Dec 9, 2024
afd49a0
non null points for aggregate_inputs
hugolatendresse Dec 9, 2024
d8a1b34
>=
hugolatendresse Dec 9, 2024
41f8fea
added guid to groupby
hugolatendresse Dec 9, 2024
26cbd49
layer_guid for aggregate
hugolatendresse Dec 9, 2024
4f5ff4b
sync
hugolatendresse Dec 9, 2024
5595e05
sync
hugolatendresse Dec 9, 2024
4dbcdb1
compiles with layer_guid in aggregate and groupby, but getting unsupo…
hugolatendresse Dec 9, 2024
37a56a3
added div op to some of the scripts
hugolatendresse Dec 9, 2024
d937c00
sync
hugolatendresse Dec 9, 2024
dcea4d6
sync
hugolatendresse Dec 10, 2024
87039e6
sync
hugolatendresse Dec 10, 2024
6e445ad
sync
hugolatendresse Dec 10, 2024
f3091f6
sync
hugolatendresse Dec 10, 2024
894e10d
sync
hugolatendresse Dec 10, 2024
e12d5e2
sync
hugolatendresse Dec 10, 2024
c586ae8
sync
hugolatendresse Dec 10, 2024
19ee4a6
sync
hugolatendresse Dec 10, 2024
6175749
sync
hugolatendresse Dec 10, 2024
32f2c7b
sync
hugolatendresse Dec 10, 2024
8beedc8
sync
hugolatendresse Dec 10, 2024
74692f6
sync
hugolatendresse Dec 10, 2024
6f40ccf
sync
hugolatendresse Dec 10, 2024
c25fb1b
sync
hugolatendresse Dec 10, 2024
ef90e99
sync
hugolatendresse Dec 10, 2024
0168374
switch to 128 and print old
hugolatendresse Dec 10, 2024
1ec0401
ff_norm as placeholder, no groupby
hugolatendresse Dec 10, 2024
f27b979
CHECKPOINT
hugolatendresse Dec 10, 2024
840d1e1
mlpout2
hugolatendresse Dec 10, 2024
85d05c9
sync
hugolatendresse Dec 10, 2024
1f9841c
sync
hugolatendresse Dec 10, 2024
166aabd
dummy grouped tokens 2
hugolatendresse Dec 10, 2024
ecb5a96
try creating dummy to make sure we can create useless things
hugolatendresse Dec 10, 2024
20be94a
unable to create wdumm1 and wdummy2! implies that we can't just recre…
hugolatendresse Dec 10, 2024
16b7ee1
try to make dummy work
hugolatendresse Dec 10, 2024
67c2dbb
sync
hugolatendresse Dec 10, 2024
2997db4
sync
hugolatendresse Dec 10, 2024
4533b9c
sync
hugolatendresse Dec 10, 2024
96abed1
update: i AM able to create dummy extra tensors with ff.dense. Create…
hugolatendresse Dec 10, 2024
78ae547
no dummy
hugolatendresse Dec 10, 2024
37e1cb5
try mlp_out2
hugolatendresse Dec 10, 2024
7d07a47
starting point before messing with aggregate to fix legion index spac…
hugolatendresse Dec 10, 2024
68ac2aa
starting point before messing with aggregate to fix legion index spac…
hugolatendresse Dec 10, 2024
3daf714
sync
hugolatendresse Dec 10, 2024
ea80e2e
sync
hugolatendresse Dec 10, 2024
d45167a
groupby works as part of pipeline. tokens outputted
hugolatendresse Dec 10, 2024
83ca675
sync
hugolatendresse Dec 10, 2024
52e40b1
sync
hugolatendresse Dec 10, 2024
a650304
sync
hugolatendresse Dec 10, 2024
165750e
sync
hugolatendresse Dec 10, 2024
b17b44a
comments
hugolatendresse Dec 10, 2024
51c3b5d
sync
hugolatendresse Dec 10, 2024
ce06918
todo
hugolatendresse Dec 10, 2024
ab82308
sync
hugolatendresse Dec 10, 2024
ed50b58
sync
hugolatendresse Dec 10, 2024
9d33dbe
able to output tokens if I bypass both
hugolatendresse Dec 10, 2024
b096e8c
able to output tokens if I bypass both
hugolatendresse Dec 10, 2024
9eea582
sync
hugolatendresse Dec 10, 2024
22975bc
still outputting with no agg
hugolatendresse Dec 10, 2024
25ef67c
sync
hugolatendresse Dec 10, 2024
077905a
sync
hugolatendresse Dec 10, 2024
443e563
able to output tokens with empty forward task for aggregate
hugolatendresse Dec 10, 2024
731dd9f
sync
hugolatendresse Dec 10, 2024
0c4c5be
space domain error occurs even without kernel call. The problem is so…
hugolatendresse Dec 10, 2024
36b2eec
space domain error occurs even without kernel call. The problem is so…
hugolatendresse Dec 10, 2024
94a62b6
still experiencing error with get index space domain
hugolatendresse Dec 10, 2024
b954777
sync
hugolatendresse Dec 10, 2024
69e67ed
sync
hugolatendresse Dec 10, 2024
121cc06
sync
hugolatendresse Dec 10, 2024
262b85c
sync
hugolatendresse Dec 10, 2024
36e64d3
sync
hugolatendresse Dec 10, 2024
523294b
sync
hugolatendresse Dec 10, 2024
cddd7dd
printf in mha
hugolatendresse Dec 10, 2024
d03cc70
printf in mha
hugolatendresse Dec 10, 2024
0ee07e3
sync
hugolatendresse Dec 10, 2024
8a8ff96
able to output tokens. copied legion in init_inference, but no doing …
hugolatendresse Dec 10, 2024
3d869a6
sync
hugolatendresse Dec 10, 2024
78528a6
sync
hugolatendresse Dec 10, 2024
f38d6b4
sync
hugolatendresse Dec 10, 2024
721ba9d
sync
hugolatendresse Dec 10, 2024
24cd22b
sync
hugolatendresse Dec 10, 2024
dbcf1f3
sync
hugolatendresse Dec 10, 2024
c8826a5
now onto forward task
hugolatendresse Dec 10, 2024
0ec62da
now onto forward task
hugolatendresse Dec 10, 2024
68f8058
now onto forward task
hugolatendresse Dec 10, 2024
bbe639d
outputting tokens
hugolatendresse Dec 10, 2024
6bd5d7e
sync
hugolatendresse Dec 10, 2024
bab8ccb
sync
hugolatendresse Dec 10, 2024
f9ed445
ZHIHAO COMMENTS
hugolatendresse Dec 10, 2024
66c01ff
dummy third and fourth inputs
hugolatendresse Dec 11, 2024
4c66590
Merge pull request #4 from hugolatendresse/bridge_works_aggregate
hugolatendresse Dec 11, 2024
f73c8e7
sync
hugolatendresse Dec 11, 2024
f8619b5
sync
hugolatendresse Dec 11, 2024
2d79719
sync
hugolatendresse Dec 11, 2024
0e6bc47
sync
hugolatendresse Dec 11, 2024
c060031
segfault
hugolatendresse Dec 11, 2024
2ac047d
conflicts
hugolatendresse Dec 11, 2024
2e864d9
sync
hugolatendresse Dec 11, 2024
bb7c9dd
sync
hugolatendresse Dec 11, 2024
e86df71
sync
hugolatendresse Dec 11, 2024
630fa69
sync
hugolatendresse Dec 11, 2024
6d92d9d
sync
hugolatendresse Dec 11, 2024
5996ed1
still country of nation
hugolatendresse Dec 11, 2024
85aa791
test forward task
hugolatendresse Dec 11, 2024
cd6de3d
sync
hugolatendresse Dec 11, 2024
1c799fb
sync
hugolatendresse Dec 11, 2024
0d325f8
sync
hugolatendresse Dec 11, 2024
8d55c90
sync
hugolatendresse Dec 11, 2024
34741e0
sync
hugolatendresse Dec 11, 2024
37f71a7
sync
hugolatendresse Dec 11, 2024
58247d9
sync
hugolatendresse Dec 11, 2024
da7bc9a
sync
hugolatendresse Dec 11, 2024
92fa0db
sync
hugolatendresse Dec 11, 2024
4ec9bf2
sync
hugolatendresse Dec 11, 2024
024515b
outputting tokens!! (correct tokens)
hugolatendresse Dec 11, 2024
552004c
sync
hugolatendresse Dec 11, 2024
b397e70
sync
hugolatendresse Dec 11, 2024
a60e3b9
outputting right tokens! fixed regions
hugolatendresse Dec 11, 2024
d9af821
sync
hugolatendresse Dec 11, 2024
0eb4815
everything works! moving on to kernel call. Right tokens get outputted
hugolatendresse Dec 11, 2024
c46eef4
should still pass
hugolatendresse Dec 11, 2024
dc449c0
sync
hugolatendresse Dec 11, 2024
8d59d5b
added docker exec perm for matt
Dec 11, 2024
b5b5c1e
outputting right tokens!
hugolatendresse Dec 11, 2024
2527d67
Merge pull request #11 from hugolatendresse/hugo1211
hugolatendresse Dec 11, 2024
0ca8e13
Merge remote-tracking branch 'origin/dev_mixtral' into fixaggregate
hugolatendresse Dec 11, 2024
b09144b
added print debug groupby numdims
Dec 11, 2024
1e4daa0
added print debug groupby numdims. fixed. no dummy grouped_tokens
Dec 11, 2024
d585686
old forward_task
hugolatendresse Dec 11, 2024
df4ccb8
softmax = -1. output was 0.
Dec 11, 2024
f460ac8
all tokens ok! able to infer with old kernel call
hugolatendresse Dec 11, 2024
02c86cc
sync
hugolatendresse Dec 11, 2024
352577c
sync
hugolatendresse Dec 11, 2024
4f34865
sync
hugolatendresse Dec 11, 2024
281d388
sync
hugolatendresse Dec 11, 2024
b14487c
sync
hugolatendresse Dec 11, 2024
c1bbfd1
changed dims
Dec 11, 2024
23b46f6
changed dims
Dec 11, 2024
f432480
sync
hugolatendresse Dec 11, 2024
46b8c68
changed dims
Dec 11, 2024
9f6abd1
changed dims
Dec 11, 2024
27d48ec
sync
hugolatendresse Dec 11, 2024
9a3ab8d
alpha = 0
Dec 11, 2024
e046d01
sync
hugolatendresse Dec 11, 2024
8f4dafd
sync
hugolatendresse Dec 11, 2024
7110763
sync
hugolatendresse Dec 11, 2024
54621b0
sync
hugolatendresse Dec 11, 2024
277ab10
sync
hugolatendresse Dec 11, 2024
8ca5b7b
sync
hugolatendresse Dec 11, 2024
2d3b3f1
sync
hugolatendresse Dec 11, 2024
4101524
num_dims - 1
Dec 11, 2024
e572bbc
print groupby ops dims
Dec 11, 2024
d555c39
print groupby ops dims
Dec 11, 2024
628d2f2
more groupby print
Dec 11, 2024
5ba390a
more groupby print
Dec 11, 2024
53e556d
removed some hashing stuff. probably won't change anything
Dec 11, 2024
14ffa88
doesn't do anything
Dec 11, 2024
18f527e
softmax back to 0?
Dec 11, 2024
a3b6e2b
using dummy groupby output to check inc decoding stuff
Dec 11, 2024
8312b84
was the dim index wrong?
Dec 11, 2024
0ee5a76
added kernel print
Dec 11, 2024
105419f
added kernel print
Dec 11, 2024
23264b4
altered gb_forward_kernel slightly. wasn't recording output, putting i…
Dec 11, 2024
72cc1cb
use full seq len
Dec 11, 2024
2dcfa01
sync
hugolatendresse Dec 11, 2024
b0d5d55
try keeping only 3 dimensions
hugolatendresse Dec 11, 2024
131d6ea
sync
hugolatendresse Dec 11, 2024
92cdb4b
sync
hugolatendresse Dec 11, 2024
4cf85ca
sync
hugolatendresse Dec 11, 2024
ccb7216
sync
hugolatendresse Dec 11, 2024
8d32e73
outputting tokens
hugolatendresse Dec 11, 2024
1e28a5a
sync
hugolatendresse Dec 11, 2024
0a967d0
sync
hugolatendresse Dec 11, 2024
425983c
sync
hugolatendresse Dec 11, 2024
587bcba
sync
hugolatendresse Dec 11, 2024
2d50fea
sync
hugolatendresse Dec 11, 2024
c7862d6
sync
hugolatendresse Dec 11, 2024
ff3d849
sync
hugolatendresse Dec 11, 2024
eea2720
sync
hugolatendresse Dec 11, 2024
ffd8bd9
shortened max len of gen for debugging
Dec 11, 2024
6a02a9b
sync
hugolatendresse Dec 11, 2024
2526d5f
outputting tokens using aggregate!!
hugolatendresse Dec 11, 2024
e1af57c
outputting tokens using aggregate!!
hugolatendresse Dec 11, 2024
7a3ef50
outputting tokens using aggregate!!
hugolatendresse Dec 11, 2024
06a8380
CHECKPOINT
hugolatendresse Dec 11, 2024
2e5bcdd
added
Dec 11, 2024
dc6832b
added
Dec 11, 2024
c13ea95
cleanup
hugolatendresse Dec 11, 2024
edb4cbf
stop printing weight file names
hugolatendresse Dec 11, 2024
1de1db4
commented out prints
Dec 12, 2024
8faffc1
cleaned up
Dec 12, 2024
5e61714
Merge branch 'matt-groupby-1210' into fixed_aggregate
mhk197 Dec 12, 2024
c0a1154
Merge pull request #13 from hugolatendresse/fixaggregate
hugolatendresse Dec 12, 2024
d8e6b2e
Merge pull request #15 from hugolatendresse/aggregate_works
hugolatendresse Dec 12, 2024
6eeaf92
Merge pull request #14 from hugolatendresse/fixed_aggregate
hugolatendresse Dec 12, 2024
1544ee8
remove hard-coded number of experts
hugolatendresse Dec 12, 2024
c4993bc
output
hugolatendresse Dec 12, 2024
74b584c
try few experts
hugolatendresse Dec 12, 2024
6d4ddb5
3
hugolatendresse Dec 12, 2024
a3589f5
10 regions
hugolatendresse Dec 12, 2024
22b4ffa
sync
hugolatendresse Dec 12, 2024
bd099ea
fixed
hugolatendresse Dec 12, 2024
02624d6
all experts
hugolatendresse Dec 12, 2024
4404694
Merge pull request #18 from hugolatendresse/remove_magic
hugolatendresse Dec 15, 2024
e133fc6
cleanup mixtral.cc
hugolatendresse Dec 15, 2024
9bfe777
comments
hugolatendresse Dec 16, 2024
4d39aa4
cleanup
hugolatendresse Dec 16, 2024
daaa506
restore backward
hugolatendresse Dec 16, 2024
b785259
started cleanup groupby
hugolatendresse Dec 16, 2024
57739b6
if alpha
hugolatendresse Dec 16, 2024
959ec78
if
hugolatendresse Dec 16, 2024
a1f8200
remove comments
hugolatendresse Dec 16, 2024
f289afc
remove comments
hugolatendresse Dec 16, 2024
01641a4
aggregate
hugolatendresse Dec 16, 2024
74b4d52
remove comment
hugolatendresse Dec 16, 2024
7898107
hidden dim
hugolatendresse Dec 16, 2024
a577b65
remove comments
hugolatendresse Dec 16, 2024
5 changes: 5 additions & 0 deletions .gitignore
@@ -196,3 +196,8 @@ tests/inference/python_test_configs/*.json

core.*
fine_grained_alignment_config.json

# CLion
.idea/
cmake-build-debug
cmake-build-debug-remote-host
134 changes: 134 additions & 0 deletions docker/run-persistent.sh
@@ -0,0 +1,134 @@
#! /usr/bin/env bash
set -euo pipefail

# Usage: ./run.sh <docker_image_name>
# Optional environment variables: FF_GPU_BACKEND, cuda_version, hip_version, ATTACH_GPUS, SHM_SIZE

# Cd into directory holding this script
cd "${BASH_SOURCE[0]%/*}"

# Parse input params
image=${1:-flexflow}
FF_GPU_BACKEND=${FF_GPU_BACKEND:-cuda}
cuda_version=${cuda_version:-"empty"}
hip_version=${hip_version:-"empty"}

# Parameter controlling whether to attach GPUs to the Docker container
ATTACH_GPUS=${ATTACH_GPUS:-true}
gpu_arg=""
if $ATTACH_GPUS ; then gpu_arg="--gpus all" ; fi
FORWARD_STREAMLIT_PORT=${FORWARD_STREAMLIT_PORT:-true}
port_forward_arg=""
if $FORWARD_STREAMLIT_PORT ; then
port_forward_arg+="-p 8501:8501"
fi


# Amount of shared memory to give the Docker container access to
# If you get a Bus Error, increase this value. If you don't have enough memory
# on your machine, decrease this value.
SHM_SIZE=${SHM_SIZE:-8192m}

# Check docker image name
if [[ "$image" != @(flexflow-environment|flexflow) ]]; then
echo "Error, image name ${image} is invalid. Choose between 'flexflow-environment', 'flexflow'."
exit 1
fi

# Check GPU backend
if [[ "${FF_GPU_BACKEND}" != @(cuda|hip_cuda|hip_rocm|intel) ]]; then
echo "Error, value of FF_GPU_BACKEND (${FF_GPU_BACKEND}) is invalid. Pick between 'cuda', 'hip_cuda', 'hip_rocm' or 'intel'."
exit 1
elif [[ "${FF_GPU_BACKEND}" != "cuda" ]]; then
echo "Running $image docker image with gpu backend: ${FF_GPU_BACKEND}"
else
echo "Running $image docker image with default GPU backend: cuda"
fi

# gpu backend version suffix for the docker image.
gpu_backend_version=""

if [[ "${FF_GPU_BACKEND}" == "cuda" || "${FF_GPU_BACKEND}" == "hip_cuda" ]]; then
# Autodetect cuda version if not specified
if [[ $cuda_version == "empty" ]]; then
# shellcheck disable=SC2015
cuda_version=$(command -v nvcc >/dev/null 2>&1 && nvcc --version | grep "release" | awk '{print $NF}' || true)
# Change cuda_version eg. V11.7.99 to 11.7
cuda_version=${cuda_version:1:4}
if [[ -z "$cuda_version" ]]; then
echo "Could not detect CUDA version. Please specify one manually by setting the 'cuda_version' env."
exit 1
fi
fi
# Check that CUDA version is supported
if [[ "$cuda_version" != @(11.1|11.2|11.3|11.4|11.5|11.6|11.7|11.8|12.0|12.1|12.2|12.3|12.4|12.5|12.6|12.7|12.8|12.9) ]]; then
echo "cuda_version is not supported, please choose among {11.1|11.2|11.3|11.4|11.5|11.6|11.7|11.8|12.0|12.1|12.2}"
exit 1
fi
# Use CUDA 12.2 for all versions greater or equal to 12.2 for now
if [[ "$cuda_version" == @(12.3|12.4|12.5|12.6|12.7|12.8|12.9) ]]; then
cuda_version=12.2
fi
# Set cuda version suffix to docker image name
echo "Running $image docker image with CUDA $cuda_version"
gpu_backend_version="-${cuda_version}"
fi

if [[ "${FF_GPU_BACKEND}" == "hip_rocm" || "${FF_GPU_BACKEND}" == "hip_cuda" ]]; then
# Autodetect HIP version if not specified
if [[ $hip_version == "empty" ]]; then
# shellcheck disable=SC2015
hip_version=$(command -v hipcc >/dev/null 2>&1 && hipcc --version | grep "HIP version:" | awk '{print $NF}' || true)
# Change hip_version eg. 5.6.31061-8c743ae5d to 5.6
hip_version=${hip_version:0:3}
if [[ -z "$hip_version" ]]; then
echo "Could not detect HIP version. Please specify one manually by setting the 'hip_version' env."
exit 1
fi
fi
# Check that HIP version is supported
if [[ "$hip_version" != @(5.3|5.4|5.5|5.6) ]]; then
echo "hip_version is not supported, please choose among {5.3, 5.4, 5.5, 5.6}"
exit 1
fi
echo "Running $image docker image with HIP $hip_version"
if [[ "${FF_GPU_BACKEND}" == "hip_rocm" ]]; then
gpu_backend_version="-${hip_version}"
fi
fi

# Check that the image exists; if not, print an error message.
if [[ "$(docker images -q "${image}-${FF_GPU_BACKEND}${gpu_backend_version}":latest 2> /dev/null)" == "" ]]; then
echo "Error, ${image}-${FF_GPU_BACKEND}${gpu_backend_version}:latest does not exist!"
if [[ "${FF_GPU_BACKEND}" == "cuda" ]]; then
echo ""
echo "To download the docker image, run:"
echo " FF_GPU_BACKEND=${FF_GPU_BACKEND} cuda_version=${cuda_version} $(pwd)/pull.sh $image"
echo "To build the docker image from source, run:"
echo " FF_GPU_BACKEND=${FF_GPU_BACKEND} cuda_version=${cuda_version} $(pwd)/build.sh $image"
echo ""
elif [[ "${FF_GPU_BACKEND}" == "hip_rocm" ]]; then
echo ""
echo "To download the docker image, run:"
echo " FF_GPU_BACKEND=${FF_GPU_BACKEND} hip_version=${hip_version} $(pwd)/pull.sh $image"
echo "To build the docker image from source, run:"
echo " FF_GPU_BACKEND=${FF_GPU_BACKEND} hip_version=${hip_version} $(pwd)/build.sh $image"
echo ""
fi
exit 1
fi

cache_volume="-v cache_volume:/root/.cache"
home_volume="-v home_volume:/home"
tmp_volume="-v tmp_volume:/tmp"


ssh_key_volume=""
ssh_key_path="$HOME/.ssh/id_rsa"
if [ -f "$ssh_key_path" ] && [ -f "$ssh_key_path.pub" ]; then
ssh_key_volume="-v $ssh_key_path:/root/.ssh/id_rsa -v $ssh_key_path.pub:/root/.ssh/id_rsa.pub"
fi

docker_command="docker run -it -p 2222:22 $gpu_arg --shm-size=${SHM_SIZE} --cap-add=SYS_PTRACE ${ssh_key_volume} ${cache_volume} ${home_volume} ${tmp_volume} ${port_forward_arg} ${image}-${FF_GPU_BACKEND}${gpu_backend_version}:latest"
echo "$docker_command"
eval "$docker_command"
5 changes: 4 additions & 1 deletion docker/run.sh
@@ -130,4 +130,7 @@ ssh_key_path="$HOME/.ssh/id_rsa"
if [ -f "$ssh_key_path" ] && [ -f "$ssh_key_path.pub" ]; then
ssh_key_volume="-v $ssh_key_path:/root/.ssh/id_rsa -v $ssh_key_path.pub:/root/.ssh/id_rsa.pub"
fi
eval docker run -it "$gpu_arg" "--shm-size=${SHM_SIZE}" "--cap-add=SYS_PTRACE" "${ssh_key_volume}" "${hf_token_volume}" "${port_forward_arg}" "${image}-${FF_GPU_BACKEND}${gpu_backend_version}:latest"

docker_command="docker run -it $gpu_arg --shm-size=${SHM_SIZE} --cap-add=SYS_PTRACE ${ssh_key_volume} ${hf_token_volume} ${port_forward_arg} ${image}-${FF_GPU_BACKEND}${gpu_backend_version}:latest"
echo "$docker_command"
eval "$docker_command"
26 changes: 25 additions & 1 deletion include/flexflow/ops/aggregate.h
@@ -15,17 +15,26 @@ class Aggregate;

class AggregateMeta : public OpMeta {
public:
AggregateMeta(FFHandler handle, Aggregate const *aggr);
AggregateMeta(FFHandler handle,
Aggregate const *aggr,
MemoryAllocator &gpu_mem_allocator);
~AggregateMeta(void);
float **dev_exp_preds;
float **dev_exp_grads;

public:
Realm::RegionInstance reserveInst;
// PEFT related fields
void *input_activation;
size_t allocated_peft_buffer_size = 0;
};

class Aggregate : public Op {
public:
using Params = AggregateParams;
using Input = std::vector<ParallelTensor>;
Aggregate(FFModel &model,
LayerID const &_layer_guid,
ParallelTensor const *inputs,
int _n,
float _lambda_bal,
@@ -64,7 +73,12 @@ class Aggregate : public Op {
std::vector<Legion::PhysicalRegion> const &regions,
Legion::Context ctx,
Legion::Runtime *runtime);
static void inference_task(Legion::Task const *task,
std::vector<Legion::PhysicalRegion> const &regions,
Legion::Context ctx,
Legion::Runtime *runtime);
static void forward_kernel_wrapper(AggregateMeta const *m,
BatchConfig const *bc,
float **exp_preds,
int const *acc_gate_assign_ptr,
float const *acc_gate_pred_ptr,
@@ -74,6 +88,16 @@ class Aggregate : public Op {
int rows,
int const batch_size,
int out_dim);
static void inference_kernel_wrapper(AggregateMeta const *m, // never actually defined
float **exp_preds,
int const *acc_gate_assign_ptr,
float const *acc_gate_pred_ptr,
float *acc_output_ptr,
int n,
int const k,
int rows,
int const batch_size,
int out_dim);
static void backward_task(Legion::Task const *task,
std::vector<Legion::PhysicalRegion> const &regions,
Legion::Context ctx,
1 change: 1 addition & 0 deletions include/flexflow/ops/aggregate_params.h
@@ -7,6 +7,7 @@
namespace FlexFlow {

struct AggregateParams {
LayerID layer_guid;
int n;
float lambda_bal;
char name[MAX_OPNAME];
1 change: 1 addition & 0 deletions include/flexflow/ops/groupby.h
@@ -23,6 +23,7 @@ class Group_by : public Op {
using Params = Group_byParams;
using Input = std::pair<ParallelTensor, ParallelTensor>;
Group_by(FFModel &model,
LayerID const &_layer_guid,
const ParallelTensor _input,
const ParallelTensor _assign,
int _n,
1 change: 1 addition & 0 deletions include/flexflow/ops/groupby_params.h
@@ -7,6 +7,7 @@
namespace FlexFlow {

struct Group_byParams {
LayerID layer_guid;
int n;
float alpha;
char name[MAX_OPNAME];
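
Both AggregateParams and Group_byParams gain a layer_guid field in this PR, but the excerpt only shows the struct declarations. As a hedged illustration of why that field matters, here is a minimal sketch, using stand-in types rather than the real FlexFlow headers, of the usual pattern of folding such a field into the params' equality check and hash so that otherwise-identical operators from different layers are not conflated; the member and helper names below are assumptions for illustration, not code from this PR.

#include <cstddef>
#include <functional>

// Minimal stand-ins for illustration; the real LayerID/AggregateParams live in FlexFlow headers.
struct LayerID {
  size_t id;
};

struct AggregateParams {
  LayerID layer_guid;
  int n;
  float lambda_bal;
};

// Equality now considers layer_guid, so params from different layers no longer compare equal.
inline bool operator==(AggregateParams const &a, AggregateParams const &b) {
  return a.layer_guid.id == b.layer_guid.id && a.n == b.n &&
         a.lambda_bal == b.lambda_bal;
}

// Hash combines exactly the fields used by operator==.
struct AggregateParamsHash {
  size_t operator()(AggregateParams const &p) const {
    size_t h = std::hash<size_t>{}(p.layer_guid.id);
    h ^= std::hash<int>{}(p.n) + 0x9e3779b9 + (h << 6) + (h >> 2);
    h ^= std::hash<float>{}(p.lambda_bal) + 0x9e3779b9 + (h << 6) + (h >> 2);
    return h;
  }
};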