-
Hi, sorry, I am not 100% sure I understand your question. Are you perhaps confusing the history of images with the history of actions? In tmrl, the history of 4 images lets the model estimate the dynamics of the car (velocity, acceleration...), so it is entirely part of the observation space, regardless of real-time delays. On top of that, rtgym automatically appends a history of actions, whose length depends on the real-time delays of your environment. In TrackMania, the real-time delay is 2 time-steps (1 for inference, and 1 to account for the fact that taking a screenshot is not instantaneous), so the action buffer contains the last 2 actions. Since rtgym adds these actions to both your observations and your observation space automatically, you need not worry about them when defining your observation space.
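To make the distinction concrete, here is a minimal sketch in plain Python. It is not rtgym's or tmrl's actual code: `ToyInterface`, `wrap_observation`, and the placeholder string "images" are hypothetical names used only for illustration. The image history lives inside the interface and is part of the observation you define, while the action buffer is appended afterwards by the framework.

```python
from collections import deque

IMG_HIST_LEN = 4  # images kept by the interface (as described for tmrl)
ACT_BUF_LEN = 2   # actions appended by the framework (TrackMania delay)


class ToyInterface:
    """Hypothetical stand-in for a RealTimeGymInterface implementation."""

    def __init__(self):
        self.img_hist = deque(maxlen=IMG_HIST_LEN)

    def reset(self):
        # Fill the whole history with the first frame.
        self.img_hist.extend(["img0"] * IMG_HIST_LEN)
        return [tuple(self.img_hist)]

    def get_obs(self, new_img):
        # The oldest image drops out automatically (maxlen).
        self.img_hist.append(new_img)
        return [tuple(self.img_hist)]


def wrap_observation(user_obs, action_buffer):
    """Conceptually what the framework does: append the past actions."""
    return tuple(user_obs) + tuple(action_buffer)


interface = ToyInterface()
action_buffer = deque(["noop"] * ACT_BUF_LEN, maxlen=ACT_BUF_LEN)

obs = wrap_observation(interface.reset(), action_buffer)
action_buffer.append("gas")
obs = wrap_observation(interface.get_obs("img1"), action_buffer)
print(obs)
```

The first element of `obs` is the image history defined by the interface; the trailing elements are the action buffer, which the interface never has to declare.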
-
I may need some more support. My head is still a bit mush on a few things. https://github.com/yannbouteiller/rtgym/blob/213df48611b7f113a1db624b0b66af1db48dd23a/rtgym/tuto/tuto.py#L1 DEFAULT_CONFIG_DICT is defined further down, so I do not understand why there is an import from rtgym. Also, in that tutorial, I cannot find the get_observation() method... Sorry, many noob questions.
-
I've been stuck on some warnings for what feels like forever, and I am starting to wonder if they are caused by some behind-the-scenes behaviour of rtgym. First off, I tried to use int32 for most of the observation space:
def get_observation_space(self):
    # eXXXX for engineXXXX
    # vXXX for vehicleXXX
    eSpeed = spaces.Box(low=0, high=10000, shape=(1,), dtype='int32')
    eBoost = spaces.Box(low=0, high=10000, shape=(1,), dtype='int32')
    return spaces.Tuple((eSpeed, eBoost))
and
def reset(self, seed=None, options=None):
    self.inititalizeCommon()  # only used to debug this
    eSpeed, eBoost, eGear, vSpeed, vSteer, vPosition, display = self.getDataImage()
    obs = [eSpeed, eBoost]
    # self.reward_function.reset()  # reward_function not implemented yet
    return obs, {}
where
def getDataImage(self):
    self.server.receiveOneFrame()
    eSpeed = np.array(self.server.myData.VS.engSpeed, dtype='int32')
    eBoost = np.array(self.server.myData.VS.engBoost, dtype='int32')
    eGear = np.array(self.server.myData.VS.engGear, dtype='int32')
    vSpeed = np.array(self.server.myData.VS.speed, dtype='int32')
    vSteer = np.array(self.server.myData.VS.steer, dtype='int32')
    self.raceState = self.server.myData.GS.raceState
    vPosition = np.array((self.server.myData.posVect.x, self.server.myData.posVect.y), dtype='int32')
    display = self.server.pic
    return eSpeed, eBoost, eGear, vSpeed, vSteer, vPosition, display
With the above methods, I always get 2 warnings:
and
To separate things out, I changed everything to float32, and the latter warning disappeared. But the "not within the space" one is confusing. I run:
obs, _ = env.reset()
print(obs)
print(env.observation_space)
and get:
I recognise the first 2 items, but not the last 4. Is this the action history from rtgym?
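One possible cause of the "not within the space" warning, offered as a guess rather than a diagnosis: `np.array(scalar)` creates a zero-dimensional array of shape `()`, whereas the `Box` spaces above declare `shape=(1,)`. As far as I know, Gymnasium's `Box.contains` requires a matching shape and a castable dtype. The sketch below uses only NumPy; `fits_box` is a hypothetical helper that mimics those assumed checks and is not Gymnasium code, and `raw_value` stands in for a value read from the server.

```python
import numpy as np


def fits_box(x, low, high, shape, dtype):
    """Hypothetical mimic of the checks a Box space is assumed to apply."""
    return bool(
        np.can_cast(x.dtype, dtype)
        and x.shape == shape
        and np.all(x >= low)
        and np.all(x <= high)
    )


raw_value = 1500  # stand-in for self.server.myData.VS.engSpeed

scalar_obs = np.array(raw_value, dtype="int32")    # zero-dimensional
vector_obs = np.array([raw_value], dtype="int32")  # one element, shape (1,)

scalar_ok = fits_box(scalar_obs, 0, 10000, (1,), np.int32)
vector_ok = fits_box(vector_obs, 0, 10000, (1,), np.int32)

print(scalar_obs.shape, scalar_ok)
print(vector_obs.shape, vector_ok)
```

If this is indeed the cause, wrapping each value in a list before calling `np.array` (or declaring the spaces with `shape=()`) would make the observation and the space agree.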
-
Finally, no warnings (for now) except the one about the image layout, but I'll review the alternative to that later. Now to try implementing a reward function.
-
Where is this used, please? tmrl/tmrl/custom/custom_gym_interfaces.py, line 87 in 8d9689c
-
Also, is there a way for me to work with discrete inputs (throttle on/off, brake on/off, and steering on/off)?
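One generic option, sketched under assumptions: if the policy keeps emitting continuous values in [-1, 1] (I do not know what the emulator or the training pipeline actually expects), the interface could threshold them into on/off controls just before sending them. `to_discrete_controls`, its thresholds, and the (throttle, brake, steer) ordering are all hypothetical.

```python
def to_discrete_controls(action, threshold=0.0, steer_deadzone=0.33):
    """Map a continuous (throttle, brake, steer) action to on/off controls.

    Hypothetical helper: values above `threshold` switch throttle/brake on,
    and steering only engages outside the dead zone around zero.
    """
    throttle, brake, steer = action
    return {
        "throttle": throttle > threshold,
        "brake": brake > threshold,
        "steer_left": steer < -steer_deadzone,
        "steer_right": steer > steer_deadzone,
    }


controls = to_discrete_controls((0.8, -0.2, -0.9))
print(controls)
```

The alternative would be to declare a genuinely discrete action space, which only makes sense if the chosen algorithm supports discrete actions.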
-
Here is the state of things:
Todo: the observation warning (about the observation being outside the observation space). I am not sure what caused it this time, as I had previously resolved it. I hate debugging these warnings, but eh... Beyond that, I am wondering what comes next; I think the next step is implementing the actual ML algorithms.
I am confused why obs[3] is the image and not obs[0].
-
On another note... When it reaches the ep_max_length, it terminates with this exception:
Is this expected?
-
When I set the ep_max_length, how does it end the episode? Does it return truncated=True?
-
This is a very Python-noob question that I am struggling with:
class MyGranTurismoRTGYM(RealTimeGymInterface):
    def __init__(self, debugFlag=False, img_hist_len=3, rrlib=True, modelMode=2):
        self.agent = "SAC"
        self.modelMode = modelMode  # 2 = reduced observation
        if self.rrlib == False:
            self.renderingThread = Thread(target=self._renderingThread, args=(), kwargs={}, daemon=True)
        self.inititalizeCommon()  # starts the TCP server and waits for the emulator to connect
I have a lot of manually coded flags / arguments in my RTGYM class.
my_config = DEFAULT_CONFIG_DICT
my_config["interface"] = MyGranTurismoRTGYM
my_config["time_step_duration"] = 0.05
my_config["SAC"] = True
my_config["modelMode"] = 2
def env_creator(env_config):
    env = gymnasium.make("real-time-gym-v1", config=my_config)
    return env  # return an env instance
How can I do that, i.e. pass these flags through the config? I am not totally familiar with how config dicts work (other than knowing that it is a dictionary and that it is used for the configuration here).
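A sketch of one way this might work, assuming the rtgym config dict exposes `interface_args` and `interface_kwargs` keys that are forwarded to the interface constructor (worth confirming against the `DEFAULT_CONFIG_DICT` of your installed version). To stay runnable without rtgym, the dict and the `build_interface` helper below are hypothetical stand-ins, and the class is a stripped-down version of the one above.

```python
import copy

# Stand-in for rtgym's DEFAULT_CONFIG_DICT (assumed keys, heavily reduced).
DEFAULT_CONFIG_DICT = {
    "interface": None,
    "interface_args": (),
    "interface_kwargs": {},
    "time_step_duration": 0.05,
}


class MyGranTurismoRTGYM:
    """Simplified stand-in for the real interface class."""

    def __init__(self, debugFlag=False, img_hist_len=3, rrlib=True, modelMode=2):
        # Store every flag on the instance before using it.
        self.debugFlag = debugFlag
        self.img_hist_len = img_hist_len
        self.rrlib = rrlib
        self.modelMode = modelMode


def build_interface(config):
    """Hypothetical mimic of how the env is assumed to create the interface."""
    cls = config["interface"]
    return cls(*config["interface_args"], **config["interface_kwargs"])


# Copy first, so the shared default dict is not modified in place.
my_config = copy.deepcopy(DEFAULT_CONFIG_DICT)
my_config["interface"] = MyGranTurismoRTGYM
my_config["interface_kwargs"] = {"rrlib": False, "modelMode": 2}

interface = build_interface(my_config)
print(interface.rrlib, interface.modelMode)
```

Under that assumption, custom keys such as `my_config["modelMode"]` placed at the top level of the dict would not reach `__init__`; only what sits in the args/kwargs entries would.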
-
So, I finally got my hands dirty and started to write my own RealTimeGymInterface.
I started with what I hoped to be the easiest method: get_observation_space()
But I got stuck and have some doubts.
So the rtgym readme states:
Note that, on top of these observations, the rtgym framework will automatically append a buffer of the 4 last actions, but the observation space you define here must not take this buffer into account.
and in that example, the method returns
return spaces.Tuple((pos_x_space, pos_y_space, tar_x_space, tar_y_space))
(just a tuple of 4 boxes).
Great. Then I had a look at the tmrl implementation (non Lidar), and I notice:
img_hist_len: int: history of images that are part of observations
and later...
tmrl/tmrl/custom/custom_gym_interfaces.py
Line 234 in 8d9689c
So... is img handled differently from the rest of the boxes?