In which order are the actions and observations stored in the vectors ? #7

Open
poyobe opened this issue Nov 8, 2024 · 2 comments

poyobe commented Nov 8, 2024

Hi,

I would like to know the order in which actions and observations are stored in action_space and observation_space.
For example, in the 2s3z scenario I see an action_space of length 11 and an observation_space of length 80, but I have trouble working out which number is which in those vectors.

Thanks in advance,

Collaborator

micadam commented Nov 8, 2024

Hi, here is the implementation of returning the actions and observations. Does that make it a bit clearer?

Author

poyobe commented Nov 12, 2024

Yes, I found the information where you pointed me, thank you. I wrote a summary that could be helpful to others; maybe it could be added to the README or something.

Observation vector of smaclite

The self, enemy and ally feature vectors are each 5 values long, plus 1 if there are shields in the game, plus a one-hot vector whose length is the number of different unit types when there is more than one unit type. !! Different numbers of unit types in each team seem to be an issue for now.

The differences between self/allies/enemy features are:

  • the first 4 self features are binary values for the availability of the 4 movement directions
  • the first enemy feature is IsEnemyAttackable, computed from the agent's attack range and the relative position when the enemy is in sight
  • the first ally feature is IsAllyVisible, computed from the sight range
  • the second to fourth features for allies and enemies are distance, relX and relY (relative to the agent)
  • the self feature vector is split in two when placed into the observation vector (see end of section)

self features = [canMoveUp, canMoveDown, canMoveRight, canMoveLeft, health] + [shield] + [unit type one hot encoder]

enemy features = [isEnemyAttackable, distance, x, y, health] + [shield] + [unit type one hot encoder]

ally features = [isAllyVisible, distance, x, y, health] + [shield] + [unit type one hot encoder]

| Self features | Enemy features | Ally features |
| --- | --- | --- |
| CanMoveUp | IsEnemyAttackable | IsAllyVisible |
| CanMoveDown | Distance | Distance |
| CanMoveRight | Rel X | Rel X |
| CanMoveLeft | Rel Y | Rel Y |
| Health | Health | Health |
| Shield (optional) | Shield (optional) | Shield (optional) |
| AmIUnitType1 (optional) | AmIUnitType1 (optional) | AmIUnitType1 (optional) |
| AmIUnitType2 (optional) | AmIUnitType2 (optional) | AmIUnitType2 (optional) |
| AmIUnitType3 (optional) | AmIUnitType3 (optional) | AmIUnitType3 (optional) |
| ... | ... | ... |
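
The size rule above can be written as a small helper. This is only a sketch of my reading of the rule, not code from smaclite: `unit_feature_size`, `has_shields` and `n_unit_types` are names I made up, and `n_unit_types` is assumed to be counted the way the environment counts unit types (the 3s vs 5z example suggests it counts as a single type there).

```python
def unit_feature_size(has_shields: bool, n_unit_types: int) -> int:
    """Length of one self/enemy/ally feature vector (hypothetical helper)."""
    size = 5  # four leading features plus health
    if has_shields:
        size += 1  # shield value
    if n_unit_types > 1:
        size += n_unit_types  # one-hot unit type, only when types differ
    return size


marines_only = unit_feature_size(has_shields=False, n_unit_types=1)
stalkers_and_zealots = unit_feature_size(has_shields=True, n_unit_types=2)
```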

In practice they form the observation vector with the self features split in two: the movement part at the start of the obs vector and the rest at the end:

  • the first 4 values are the movement direction availabilities
  • next come the enemies' feature vectors, concatenated (size = number of enemies * enemy_feat_size)
  • next come the allies' feature vectors, concatenated (size = number of allies * allies_feat_size)
  • last come self health, self shield and the self unit type one-hot vector
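
To make the layout easier to index, here is a rough sketch that turns the four parts above into slices over a flat observation list. `obs_layout` and its arguments are hypothetical names, and it assumes the enemy and ally feature vectors have the same length `feat_size`, as in the summary.

```python
def obs_layout(n_enemies: int, n_allies: int, feat_size: int) -> dict:
    """Named slices into a flat observation vector (hypothetical helper).

    n_allies is the number of *other* allies, i.e. excluding the agent itself.
    """
    layout = {"self_move": slice(0, 4)}  # the 4 movement availability flags
    start = 4
    layout["enemies"] = [
        slice(start + i * feat_size, start + (i + 1) * feat_size)
        for i in range(n_enemies)
    ]
    start += n_enemies * feat_size
    layout["allies"] = [
        slice(start + i * feat_size, start + (i + 1) * feat_size)
        for i in range(n_allies)
    ]
    start += n_allies * feat_size
    # The rest of the self features: everything except the 4 movement flags.
    layout["self_rest"] = slice(start, start + feat_size - 4)
    layout["length"] = start + feat_size - 4
    return layout


# Example with 5 enemies, 4 other allies and 8 features per unit.
layout = obs_layout(n_enemies=5, n_allies=4, feat_size=8)
obs = list(range(layout["length"]))  # stand-in for a real observation
first_enemy = obs[layout["enemies"][0]]
```

With a real observation, `obs[layout["self_rest"]]` would then give health, shield and the unit type one-hot values for the agent itself.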

Meaning the obs of a 10m_vs_11m game will be:

for self: [canMoveUp, canMoveDown, canMoveRight, canMoveLeft]
for enemies: [IsEnemyAttackable, distance, x, y, health]
for allies: [IsAllyVisible, distance, x, y, health]
for self: [health]

4 + 11 * 5 + 9 * 5 + 1 features --> observation vector of length 105 (11 enemies, 9 other allies)

Meaning the obs of a 2s3z vs 2s3z game will be:

for self: [canMoveUp, canMoveDown, canMoveRight, canMoveLeft]
for enemies: [IsEnemyAttackable, distance, x, y, health, shield, AmIAStalker, AmIAZealot]
for allies: [IsAllyVisible, distance, x, y, health, shield, AmIAStalker, AmIAZealot]
for self: [health, shield, AmIAStalker, AmIAZealot]

4 + 5 * 8 + 4 * 8 + 4 features --> observation vector of length 80 (5 enemies, 4 other allies)

Meaning the obs of a 3s_vs_5z game will be:

for self: [canMoveUp, canMoveDown, canMoveRight, canMoveLeft]
for enemies: [IsEnemyAttackable, distance, x, y, health, shield]
for allies: [IsAllyVisible, distance, x, y, health, shield]
for self: [health, shield]

4 + 5 * 6 + 2 * 6 + 2 features --> observation vector of length 48 (5 enemies, 2 other allies)
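
The three totals can be double-checked with a few lines of Python. This is just the arithmetic from the examples, not smaclite code; `obs_length` and the `n_type_bits` parameter (length of the unit type one-hot part, 0 when it is absent) are names I chose for illustration.

```python
def obs_length(team_size: int, n_enemies: int,
               has_shields: bool, n_type_bits: int) -> int:
    """Observation length for one agent (hypothetical helper)."""
    feat = 5 + (1 if has_shields else 0) + n_type_bits
    other_allies = team_size - 1  # the agent does not list itself as an ally
    # 4 movement flags + enemies + other allies + rest of the self features
    return 4 + n_enemies * feat + other_allies * feat + (feat - 4)


# (team size, enemies, shields, one-hot length) taken from the examples
scenarios = {
    "10m_vs_11m": (10, 11, False, 0),
    "2s3z": (5, 5, True, 2),
    "3s_vs_5z": (3, 5, True, 0),
}
lengths = {name: obs_length(*args) for name, args in scenarios.items()}
# lengths == {"10m_vs_11m": 105, "2s3z": 80, "3s_vs_5z": 48}
```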

When an ally or enemy is dead or outside sight range, its feature vector is set to all zeros.
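
One consequence is that an all-zero block can be used as a rough "not observed" mask. The sketch below uses a hypothetical helper, `nonzero_units`, which assumes that an observed unit always has at least one nonzero feature. It cannot tell a dead unit from one that is merely out of sight.

```python
def nonzero_units(obs, start: int, n_units: int, feat_size: int) -> list:
    """Indices of units whose feature block is not all zeros (hypothetical)."""
    return [
        i for i in range(n_units)
        if any(v != 0 for v in obs[start + i * feat_size:
                                   start + (i + 1) * feat_size])
    ]


# Toy observation: 4 movement flags, then 3 enemy blocks of 5 features each.
toy_obs = [1.0, 1.0, 0.0, 1.0] + [0.0] * 15
toy_obs[9:14] = [1.0, 0.5, 0.1, -0.2, 0.8]  # only the second enemy is filled in
seen = nonzero_units(toy_obs, start=4, n_units=3, feat_size=5)
```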
