Implementing rtgym on my own game. #44

NDR008 · 2023-04-07T19:33:24Z

NDR008
Apr 7, 2023

So, finally got my hands dirty and started to write up my own RealTimeGymInterface.

I started with what I hoped would be the easiest method: get_observation_space()
But I got stuck and have some doubts.

So the rtgym readme states:
Note that, on top of these observations, the rtgym framework will automatically append a buffer of the 4 last actions, but the observation space you define here must not take this buffer into account.
and in that example, the method returns spaces.Tuple((pos_x_space, pos_y_space, tar_x_space, tar_y_space))
(just a tuple of 4 boxes).

Great. Then I had a look at the tmrl implementation (non Lidar), and I notice:
img_hist_len: int: history of images that are part of observations
and later...

            img = spaces.Box(low=0.0, high=255.0, shape=(self.img_hist_len, h, w, 3))  # cv2 images are (h, w, c)
        return spaces.Tuple((speed, gear, rpm, img))

tmrl/tmrl/custom/custom_gym_interfaces.py

Line 234 in 8d9689c

    
           img = spaces.Box(low=0.0, high=255.0, shape=(self.img_hist_len, h, w, 3))  # cv2 images are (h, w, c)

So... is img handled in a different way from the rest of the boxes?

yannbouteiller · 2023-04-08T02:52:39Z

yannbouteiller
Apr 8, 2023
Maintainer

Hi, sorry I am not 100% sure I understand your question. Are you maybe confusing history of images and history of actions here?

In tmrl, the history of 4 images is used for the model to evaluate the dynamics of the car (velocity, acceleration...), thus it is entirely part of the observation space, regardless of real-time delays.

On top of that, rtgym automatically appends a history of actions. The length of this history depends on the real-time delays of your environment. In TrackMania, the real-time delay is 2 time-steps (1 for inference, and 1 to account for the fact that taking a screenshot is not instantaneous). Thus, in TrackMania, the action buffer contains the 2 last actions. Since rtgym adds these actions to your observations automatically, you need not worry about them when defining your observation space: rtgym also adds them to the observation space for you.
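To make the distinction concrete, here is a conceptual sketch in plain Python. It is not rtgym's code: IMG_HIST_LEN, ACT_BUF_LEN, interface_observation and observation_seen_by_agent are hypothetical names, and strings stand in for real images and actions. It only illustrates that the image history belongs to the observation you define, while the action buffer is appended afterwards.

```python
from collections import deque

IMG_HIST_LEN = 4   # image history: part of the observation you define
ACT_BUF_LEN = 2    # action history: appended for you (2 in the TrackMania case)


def interface_observation(speed, images):
    """What your interface returns: it never mentions past actions."""
    return (speed, tuple(images))


def observation_seen_by_agent(interface_obs, action_buffer):
    """Conceptually, the action buffer is appended after your own elements."""
    return (*interface_obs, *action_buffer)


# Strings are placeholders for real images and actions.
images = deque(["img0", "img1", "img2", "img3"], maxlen=IMG_HIST_LEN)
actions = deque(["act0", "act1"], maxlen=ACT_BUF_LEN)

obs = observation_seen_by_agent(interface_observation(42.0, images), actions)
print(obs)
```

The first two elements come from the interface; the trailing elements are the buffered actions.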

3 replies

NDR008 Apr 8, 2023
Author

Indeed I was confusing the 2 topics.
Makes sense now. Thanks a lot.

Do you think that if you had access to higher-order parameters in the game, like acceleration, slip ratio of the tyres, etc., you might need just 1 or 2 images?

yannbouteiller Apr 8, 2023
Maintainer

If you have access to low-level information, you don't need images at all. For instance you can look at what Laurens did : he doesn't use images at all.

NDR008 Apr 8, 2023
Author

Unfortunately, while I have a lot of low level information, track limits are not something I have.
I am able to detect a lot of things though, such as tyre slip, if the tyre is on grass/road/kerb, and the car's x,y,z position.
If I dig enough I might find the x-y-z acceleration and yaw/roll/pitch torques or angles.

But so far, images are the only way I can go (I think). Might test different inputs if I ever get basics working, i.e. tmrl implementations and some form of deep rl algo.

NDR008 · 2023-04-09T13:31:26Z

NDR008
Apr 9, 2023
Author

I may need some more support. My head is still a bit mush on a few things.

https://github.com/yannbouteiller/rtgym/blob/213df48611b7f113a1db624b0b66af1db48dd23a/rtgym/tuto/tuto.py#L1
From that line, I do not understand the 3 imports...
In particular:
DEFAULT_CONFIG_DICT, DummyRCDrone

DEFAULT_CONFIG_DICT is defined further down below, so I do not understand why there is an import from rtgym.
Could you help me understand?
Is the DEFAULT_CONFIG_DICT something that sets the gym or in this case, the rtgym parameters?

Also in that tutorial, I cannot find the get_observation() method...
Here is what I got so far:
https://github.com/NDR008/TensorFlowPSX/blob/rtgym/Py/myRTClass.py
I was defining the get_observation() method inside the same class, but seeing as you did not do that, I started to wonder if I am doing that wrong.
Last, how do you manage the 4 images?
Say on reset(), you would only have 1 image to send back in the observation.
And say on the 5th frame, how do you eliminate the 1st frame?

Sorry many noob questions.

1 reply

yannbouteiller Apr 9, 2023
Maintainer

Hi :)

If you look at this line, the default configuration dictionary is not redefined, it is only modified. In rtgym, the configuration dictionary is where you define rtgym parameters such as your time step duration, the time at which rtgym triggers observation capture after the beginning of each time step, etc. Importantly, this is also where you provide your RealTimeGymInterface implementation to rtgym, so that it can instantiate your environment when you call gymnasium.make("real-time-gym-v1", config={your configuration dictionary}).
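A minimal sketch of "modified, not redefined", using a stand-in dictionary so that it runs without rtgym installed. The values below are placeholders rather than rtgym's real defaults, and MyInterface is a hypothetical class. Taking a deep copy is an optional precaution of mine, not something the tutorial does.

```python
import copy

# Stand-in for the dictionary imported from rtgym; the values are placeholders.
DEFAULT_CONFIG_DICT = {
    "interface": None,
    "time_step_duration": 0.05,
    "ep_max_length": 1000,
}


class MyInterface:  # hypothetical stand-in for a RealTimeGymInterface subclass
    pass


# Plain assignment only creates a second name for the same dictionary...
alias_config = DEFAULT_CONFIG_DICT
# ...whereas a deep copy leaves the imported default untouched.
my_config = copy.deepcopy(DEFAULT_CONFIG_DICT)
my_config["interface"] = MyInterface
my_config["time_step_duration"] = 0.02

print(alias_config is DEFAULT_CONFIG_DICT)
print(DEFAULT_CONFIG_DICT["time_step_duration"], my_config["time_step_duration"])
```

Either way, only the entries you assign change; every other entry keeps its default value.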

rtgym interfaces don't have a get_observation() method (you may implement one if it is convenient for you but this is not required). However, they do have a get_observation_space method where you define your observation space, and a get_obs_rew_terminated method that rtgym triggers at each time step when it needs to retrieve the observation, the reward signal and the episode termination signal.

In TrackMania, to get 4 images, we use a FIFO queue of length 4. On reset we take a screenshot and clone it 4 times, and on get_obs_rew_terminated we put a new screenshot in this queue and discard the oldest screenshot in the queue.
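The queue described above can be sketched with collections.deque. This is an illustration, not the tmrl code: ImageHistory and its method names are hypothetical, and strings stand in for screenshots.

```python
from collections import deque

IMG_HIST_LEN = 4


class ImageHistory:
    """Fixed-length history of the most recent frames (oldest first)."""

    def __init__(self, length=IMG_HIST_LEN):
        self.length = length
        self.frames = deque(maxlen=length)

    def reset(self, first_frame):
        # On reset only one frame exists, so it is repeated to fill the history.
        self.frames.clear()
        for _ in range(self.length):
            self.frames.append(first_frame)

    def push(self, new_frame):
        # Appending to a full deque silently drops the oldest frame.
        self.frames.append(new_frame)
        return list(self.frames)


history = ImageHistory()
history.reset("frame0")
print(list(history.frames))
print(history.push("frame1"))
```

With real images, reset() would store the same array object several times; that is harmless as long as frames are never modified in place.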

NDR008 · 2023-04-09T18:33:40Z

NDR008
Apr 9, 2023
Author

Thanks.
Ok, I think I am getting close to understanding but not fully there yet.

get_observation_space() simply returns boxes with their possible range of values (I hope that part is right).
get_obs_rew_terminated() <-- this is where I am still a bit uncertain or confusing myself in language.
I was initially thinking "terminated" is referring to a terminal state? I am still thinking so because you wrote:
returns the observation, the reward, and a terminated signal for end of episode

On the other hand...

what is the point of returning the observation at the end of an episode?
If get_observation() is not needed, what is fed into the model for each step?
In the image

There is an observation retrieval that takes place.

But I am only now noticing that this is something recent:

So is the terminated state the state that took too long?

Regarding this text here:
Remember that, if observation capture is too long, it must not be part of the get_obs_rew_terminated_info() method of your interface. Instead, this method must simply retrieve the latest available observation from another process
So, if I understood correctly, we could put the observation capture within get_obs_rew_terminated_info() when the capture is fast enough, and otherwise use a separate thread to maintain the observation history?
I am wondering if I need to separate the process. Since I have no/limited experience with threads, I wish to avoid this if I can.
From my testing, my observation method seems to be synced to the frame on screen (or 1 frame late while running the game at 50fps).
I myself am able to play the game by only looking at the capture render rather than the actual game screen.
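For reference, the "retrieve the latest available observation" pattern can be sketched with a background thread and a lock. This is a minimal illustration under my own assumptions: LatestObservation and capture_fn are hypothetical names, a counter stands in for the real capture, and a thread is used where the quoted text speaks of another process.

```python
import threading
import time


class LatestObservation:
    """Keeps only the most recent capture, refreshed by a background thread."""

    def __init__(self, capture_fn, period=0.001):
        self._capture_fn = capture_fn   # hypothetical: your (slow) capture call
        self._period = period
        self._lock = threading.Lock()
        self._latest = None
        self._running = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self._running.set()
        self._thread.start()

    def stop(self):
        self._running.clear()
        self._thread.join()

    def _loop(self):
        while self._running.is_set():
            frame = self._capture_fn()   # the slow part runs off the main thread
            with self._lock:
                self._latest = frame
            time.sleep(self._period)

    def get(self):
        # Cheap call, suitable for get_obs_rew_terminated_info(): no capture here.
        with self._lock:
            return self._latest


counter = iter(range(10**9))             # stand-in for a real capture source
grabber = LatestObservation(lambda: next(counter))
grabber.start()
deadline = time.time() + 2.0
while grabber.get() is None and time.time() < deadline:
    time.sleep(0.001)
first = grabber.get()
grabber.stop()
print(first is not None)
```

If the capture is already fast and synced to the game frame, as described above, calling it directly inside the method is the simpler option.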

10 replies

yannbouteiller Apr 10, 2023
Maintainer

I usually use pytorch but in theory you can use any framework you like with tmrl, it is not supposed to be pytorch-dependent anymore (but I haven't tried to implement tmrl pipelines in tensorflow yet).

NDR008 Apr 10, 2023
Author

What is info in get_obs_rew_terminated_info ?
in tmrl it seems to be an empty list info = {}
What is it meant to be used for?

NDR008 Apr 10, 2023
Author

Oh one more thing.
I see you used:
imgs = np.array(list(self.img_hist), dtype='float32')
How come float32 and not np.uint8? (I think most frameworks will expect uint8 for images.)

yannbouteiller Apr 10, 2023
Maintainer

What is info in get_obs_rew_terminated_info ? in tmrl it seems to be an empty list info = {} What is it meant to be used for?

It is for Gymnasium environments in general, when you want your environment to output metadata that are not part of the observation. This may be useful for debugging, for instance tmrl uses the dictionary to check that the data at the beginning and end of your training pipeline is the same in "CRC debug" mode. Usually you just want this to be an empty dictionary.

yannbouteiller Apr 10, 2023
Maintainer

Oh one more thing. I see you used: imgs = np.array(list(self.img_hist), dtype='float32') How come float32 and not np.uint8? (I think most frameworks will expect uint8 for images.)

It is just a design choice: in the TrackMania pipeline we chose to normalize everything between 0 and 1 in the observation_preprocessor rather than in the model forward() method. The observation preprocessor is applied directly after the observation is retrieved from the environment and before it gets sent over network, so it wouldn't make sense in our situation to have uint8 that we would directly transform into float. But you can have uint8 and do this preprocessing later in your pipeline, this is totally fine and will save your internet bandwidth if you train remotely.

In fact this is an interesting discussion in terms of optimization of the tmrl framework regarding the role of the observation_preprocessor and of the sample_compressor. We have done much thinking about this on our end, but still the situation is a bit weird.

At the moment, the observation_preprocessor is applied directly after the Gymnasium environment outputs an observation, and then the sample_compressor is applied before the preprocessed observation gets sent over the Internet. These are details really, but still, this image example highlights a tradeoff that we have made in designing tmrl that way:

Imagine we want to be optimal to the extreme in how we design our training pipeline. Then images must be output in uint8 format so that they don't get reconverted from float32 to uint8 by the sample_compressor. BUT they must also be converted to float32 for normalization in RolloutWorker inference. AND this cannot be done in the model directly because the same model is used for training and doing the preprocessing there would harm the training computational performance. So ideally we would probably do this conversion in the Memory directly. But then it is not clear whether the Memory should do this conversion on receival or on sampling, as there is a space-computation tradeoff there. You can see what kind of rabbit hole this is.

The very-optimal thing to do right now with the design choices we have built into tmrl is this strange thing where you output normalized float32 so that the RolloutWorker can use them directly for inference, and denormalize them to convert them back to uint8 only in the sample_compressor in order to save bandwidth, then convert them to float32 and renormalize them in your Memory either on storage or on sampling depending on the tradeoff you chose.

Anyway, these are very advanced technical considerations which have more to do with future development of the tmrl framework. For you, the takeaway is that it doesn't matter, you can output uint8 or float32 depending solely on your preference.
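The float32/uint8 round trip discussed above can be sketched in a few lines of NumPy. This only illustrates the bandwidth tradeoff and is not the tmrl preprocessor or compressor; dividing by 255 is an assumption made for this example.

```python
import numpy as np

# Stand-in "image": every possible uint8 value once.
img_uint8 = np.arange(256, dtype=np.uint8)

# Normalised float32 version, convenient for inference.
img_float = img_uint8.astype(np.float32) / 255.0

# Convert back to uint8 before sending, to use a quarter of the bytes.
img_sent = np.rint(img_float * 255.0).astype(np.uint8)

# Restore the normalised float32 version on the receiving side.
img_restored = img_sent.astype(np.float32) / 255.0

print(img_float.nbytes, img_sent.nbytes)
print(np.array_equal(img_sent, img_uint8))
```

The uint8 copy takes one byte per value instead of four, and rounding recovers the original pixel values exactly because they started out as integers.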

NDR008 · 2023-04-16T16:03:11Z

NDR008
Apr 16, 2023
Author

I've been stuck on some warnings like forever... I am starting to wonder if this is caused by some behind the scenes behaviour of rtgym...

So first off, I tried to use int32 for most of the observation space....
Such as:

def get_observation_space(self):
        # eXXXX for engineXXXX
        # vXXX for vehicleXXX
        eSpeed = spaces.Box(low=0, high=10000, shape=(1,), dtype='int32')
        eBoost = spaces.Box(low=0, high=10000, shape=(1,), dtype='int32')
        return spaces.Tuple((eSpeed, eBoost))

and

    def reset(self, seed=None, options=None):
        self.inititalizeCommon() #only used to debug this
        eSpeed, eBoost, eGear, vSpeed, vSteer, vPosition, display = self.getDataImage()
        obs = [eSpeed, eBoost]
        # self.reward_function.reset() # reward_function not implemented yet
        return obs, {}

where

def getDataImage(self):  
        self.server.receiveOneFrame()
        eSpeed = np.array(self.server.myData.VS.engSpeed, dtype='int32')
        eBoost = np.array(self.server.myData.VS.engBoost, dtype='int32')
        eGear  = np.array(self.server.myData.VS.engGear, dtype='int32')
        vSpeed = np.array(self.server.myData.VS.speed, dtype='int32')
        vSteer = np.array(self.server.myData.VS.steer, dtype='int32')
        self.raceState = self.server.myData.GS.raceState 
        vPosition = np.array((self.server.myData.posVect.x, self.server.myData.posVect.y), dtype='int32')
        display = self.server.pic     
        return eSpeed, eBoost, eGear, vSpeed, vSteer, vPosition, display

With the above methods, I always get 2 warnings:

WARN: The obs returned by the `reset()` method is not within the observation space.

and

WARN: The obs returned by the `reset()` method was expecting numpy array dtype to be float32, actual type: int32

To separate things out, I changed everything to float32, and the latter warning disappears...
So that leaves unsolved what makes it expect a float32. (I wonder if Box is implicitly expecting float32?)

But the not within the space is confusing.
I tried print:

obs, _ = env.reset()
print(obs)
print(env.observation_space)

and get:

(
	array(600., dtype=float32), 
	array(600., dtype=float32), 
	array([1., 0., 0.], dtype=float32), 
	array([1., 0., 0.], dtype=float32), 
	array([1., 0., 0.], dtype=float32), 
	array([1., 0., 0.], dtype=float32)
)

Tuple(
	Box(0.0, 10000.0, (1,), float32), 
	Box(0.0, 10000.0, (1,), float32), 
	Box(0.0, 1.0, (3,), float32), 
	Box(0.0, 1.0, (3,), float32), 
	Box(0.0, 1.0, (3,), float32), 
	Box(0.0, 1.0, (3,), float32)
)

I recognise the first 2 items, but not the last 4 - is this the action history from rtgym?
Is there anything obviously wrong here?

3 replies

yannbouteiller Apr 16, 2023
Maintainer

Hi, this happens all the time with Gymnasium, because it is very strict on the action and observation space.

The 4 items that you don't recognize are the action buffer appended by rtgym, because you selected an action buffer of length 4 in your configuration dictionary (it usually needs to be at least of length 1 or 2 to ensure the Markov property in real-time environments).

The warning tells you that the observation is not in the observation space because if you look at the first 2 elements of your observation, they are simple floats, not lists. Gym expects boxes of shape (1,) to be lists of one element, so instead of array(600., dtype=float32), you want something like array([600.], dtype=float32).

NDR008 Apr 16, 2023
Author

Thanks.
Why should a box be a list?
So in the above case a list of one item?

Sorry.

yannbouteiller Apr 16, 2023
Maintainer

It is just because it has a shape of (1,) in the numpy convention.

Shape (3,) is [a, b, c]
Shape (1,) is [a]

a alone would be shape () (zero dimensions)
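The difference is easy to check directly in NumPy. A small sketch, in which the value 600 is taken from the printout above and the variable names are mine:

```python
import numpy as np

scalar = np.array(600.0, dtype=np.float32)     # 0-dimensional, like the obs that triggered the warning
vector = np.array([600.0], dtype=np.float32)   # 1 element, matches a Box of shape (1,)

print(scalar.shape, scalar.ndim)
print(vector.shape, vector.ndim)

# Two ways to turn the scalar into a one-element vector:
fixed_a = np.atleast_1d(scalar)
fixed_b = scalar.reshape(1)
print(fixed_a.shape, fixed_b.shape, fixed_a.dtype)
```

Passing a list to np.array in the first place, for example np.array([value], dtype='float32'), avoids the conversion altogether.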

NDR008 · 2023-04-17T21:03:02Z

NDR008
Apr 17, 2023
Author

Finally no warnings (for now) except the image layout but... I'll review the alternative to that at a future time.

Now for trying to implement a reward function.
Looking at this:
https://github.com/trackmania-rl/tmrl/blob/master/tmrl/custom/utils/compute_reward.py
I'm mainly confused about what the pickle bit does.
I think the pickle library is for serialising objects... is this somehow related to the TCP comms between OpenPlanet and the py env?

4 replies

yannbouteiller Apr 17, 2023
Maintainer

No, it is the pickle file in which we store the demo trajectory for the TrackMania pipeline, if you are developing for another game, you will want to compute rewards in an entirely different fashion unless you have access to an API similar to OpenPlanet.

NDR008 Apr 17, 2023
Author

For my initial implementation, I do have something similar to a trajectory line.
But I can get this info from my game / emulator.
Basically, in GT, I let the default CPU AI run around the track for me, and it gives me a set of x-y co-ordinates from the car going around the track. I log this, then post-process it to generate a moving average (to down-sample the data-set).
I then use that in conjunction with a given x,y position of the car to estimate the vehicle's relative position from the start/finish line:
Code:
https://github.com/NDR008/TensorFlowPSX/blob/master/ReduxLua/track_prog.lua

Behaviour:
https://youtu.be/5tnmkbYmM5M

I could make it spit the distance from the trajectory or even the "id" (which would be the row-position of the closest distance to the trajectory).

NDR008 Apr 18, 2023
Author

Maybe specific questions...
np.linalg.norm(pos - self.data[index])
What does np.linalg.norm do? Is it calculating the distance between 2 (x,y) co-ordinates?
It is what I am doing on the game side in lua, but perhaps this numpy function might be faster.... (though I've not had any speed impact).
Update 1: Confirmed what it does
Update 2: np seems slower here? On Google Colab:

reward = (best_index - self.cur_idx) / 100.0
So if we were perfectly at the right index - wouldn't the reward be 0?
Is reward positive or is reward a "penalty" ?

if best_index == self.cur_idx: # if the best index didn't change, we rewind (more Markovian reward)
This function just loops the self.data in the opposite direction right?
Why would you do that? Is this to check that the car did not go backwards?

What does "more reward" mean here?

In this reward function, the "speed" of moving through position idx is not factored in, it seems. What encourages speed in acquiring reward?

yannbouteiller Apr 18, 2023
Maintainer

I think what is slow in your numpy benchmark is to re create b and convert it implicitly into a numpy array at every loop.

Reward is positive when the car moves forward. There is no "right index", the more positions have been passed since the previous timestep, the higher (better) the reward. If the index is the same, it means we did not move, and thus the reward is 0.

Rewinding is to check that the car didn't move backward, yes. If it did, we want to update its position. "More Markovian reward" here simply means that failing to do this would make the environment non-Markovian.
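A simplified sketch of this kind of progress reward, written in plain Python rather than NumPy. It is not the tmrl implementation: ProgressReward, look_ahead and look_back are hypothetical names, and the straight-line trajectory is made up. As described above, the reward counts trajectory points passed since the previous step, and the rewind only updates the stored position.

```python
import math


class ProgressReward:
    """Reward = trajectory points passed since the previous step, divided by 100."""

    def __init__(self, trajectory, look_ahead=10, look_back=10):
        self.trajectory = trajectory   # list of (x, y) points along the track
        self.look_ahead = look_ahead
        self.look_back = look_back
        self.cur_idx = 0

    def _closest(self, pos, indices):
        # Index of the trajectory point nearest to pos (Euclidean distance).
        return min(indices, key=lambda i: math.dist(pos, self.trajectory[i]))

    def compute(self, pos):
        last = len(self.trajectory) - 1
        # Search forward from the current index for the closest point.
        forward = range(self.cur_idx, min(self.cur_idx + self.look_ahead, last) + 1)
        best_idx = self._closest(pos, forward)
        reward = (best_idx - self.cur_idx) / 100.0
        if best_idx == self.cur_idx:
            # No forward progress: rewind to check whether the car moved backward.
            # The reward stays 0; only the stored position is updated.
            backward = range(max(self.cur_idx - self.look_back, 0), self.cur_idx + 1)
            best_idx = self._closest(pos, backward)
        self.cur_idx = best_idx
        return reward


trajectory = [(float(i), 0.0) for i in range(50)]   # made-up straight track
reward_fn = ProgressReward(trajectory)
r_forward = reward_fn.compute((3.2, 0.1))    # moved from point 0 to point 3
r_backward = reward_fn.compute((1.0, 0.0))   # went back to point 1
r_again = reward_fn.compute((6.0, 0.0))      # moved from point 1 to point 6
print(r_forward, r_backward, r_again)
```

Speed is rewarded implicitly: a faster car passes more points per time step, so each step yields a larger reward.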

NDR008 · 2023-04-21T20:42:50Z

NDR008
Apr 21, 2023
Author

Where is this used please?
I cannot find out what it is for and if I have to implement it, since it seems to only ever be set at reset()

tmrl/tmrl/custom/custom_gym_interfaces.py

Line 87 in 8d9689c

self.last_time = time.time()

1 reply

yannbouteiller Apr 22, 2023
Maintainer

You don't need this; it is probably an artefact of old debugging code that someone forgot to remove.

NDR008 · 2023-04-21T22:38:30Z

NDR008
Apr 21, 2023
Author

Also, is there a way for me to work with discrete inputs (throttle on/off, brake on/off, and steering on/off)?

1 reply

yannbouteiller Apr 22, 2023
Maintainer

Sure, you can either use Discrete in your action space, or simply use Box and treat the index of the highest value of the output vector as your selected action in send_control.
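The second option (a Box output reduced to one discrete choice) can be sketched as follows. The action table and select_action are hypothetical and only illustrate the argmax step that would run inside send_control.

```python
# Hypothetical action table: (throttle, brake, steer), with steer in {-1, 0, 1}.
ACTIONS = [
    (0, 0, 0),    # coast
    (1, 0, 0),    # throttle
    (0, 1, 0),    # brake
    (1, 0, -1),   # throttle + left
    (1, 0, 1),    # throttle + right
]


def select_action(scores):
    """Return the controls whose score is highest (argmax over the vector)."""
    if len(scores) != len(ACTIONS):
        raise ValueError("expected one score per discrete action")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return ACTIONS[best]


print(select_action([0.1, 0.7, -0.2, 0.4, 0.0]))
```

Here the Box would have one dimension per entry in the table, and ties resolve to the first maximum.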

NDR008 · 2023-04-22T09:31:16Z

NDR008
Apr 22, 2023
Author

Here is the state of things:
https://youtu.be/jWIeIXJ9xEY

Todo: Observation warning (about it being out of range of observation space). Not sure what caused this issue, as I had previously resolved it. I hate debugging these warnings, but eh...

Beyond that, I am wondering what is next, I am thinking the next step is implementing the actual ML algorithms.
Looking at the customs folder of tmrl.
Here is what I understood:

Actual ML models: https://github.com/trackmania-rl/tmrl/blob/master/tmrl/custom/custom_models.py
Some pre-processing: https://github.com/trackmania-rl/tmrl/blob/master/tmrl/custom/custom_preprocessors.py
But I do not understand the obs manipulations....
from the custom gym interfaces:
data, img = self.grab_data_and_img()
So...

    grayscale_images = obs[3]
    grayscale_images = grayscale_images.astype(np.float32) / 256.0
    obs = (obs[0] / 1000.0, obs[1] / 10.0, obs[2] / 10000.0, grayscale_images, *obs[4:])  # >= 1 action
    return obs

I am confused why obs[3] is the image and not obs[0].
Also I do not understand why obs[0] is divided by 1000 and so on.

I am not sure what this does: https://github.com/trackmania-rl/tmrl/blob/master/tmrl/custom/custom_memories.py but it looks heavy and scary....
This also looks heavy and scary: https://github.com/trackmania-rl/tmrl/blob/master/tmrl/custom/custom_checkpoints.py

34 replies

yannbouteiller May 5, 2023
Maintainer

What is this OrderEnforcing thing?

NDR008 May 5, 2023
Author

So to summarize what I went through...
OrderEnforcing - not sure, but it seemed to be related to my gymnasium.make call.

That now looks like this:

def env_creator(env_config):
  env = gymnasium.make("real-time-gym-v1", config=my_config)
  return env  # return an env instance

the extra () was an issue.

Then next problem:

algo = ppo.PPO(env="gt-rtgym-env-v1", config=ppoconfig)

Needed that env=

Then next problem I faced seemed bizarre. My environment tried to connect to the emulator twice even though I only had 1 rollout worker.
Turns out RLlib runs env.reset() once for some checks. (My env.reset() restarts the server and waits for the emulator.) I have not found a better workaround than disabling the env checks:

ppoconfig["disable_env_checking"] = True

Then I faced some strange error:

ValueError: Outputs of true_fn and false_fn must have the same type: float64, float32

But a bit of searching led me to believe that this is something related to tensorflow 1. So setting:

ppoconfig["framework"] = "tf2"

or

ppoconfig["framework"] = "torch"

Either of these seems to result in a working RLlib setup :)
Not tried training yet.

But at least no errors to this stage:

(RolloutWorker pid=13696) C:\Users\mister_x\anaconda3\envs\GTAI2\lib\site-packages\gymnasium\utils\passive_env_checker.py:31: UserWarning: WARN: A Box observation space has an unconventional shape (neither an image, nor a 1D vector). We recommend flattening the observation to have only a 1D vector or use a custom policy to properly process the data. Actual observation shape: (3, 240, 320, 3)
(RolloutWorker pid=13696)   logger.warn(
(RolloutWorker pid=13696) GT Real Time instantiated
(RolloutWorker pid=13696) GT AI Server instantiated for rtgym
2023-05-05 20:12:59,711 INFO trainable.py:172 -- Trainable.setup took 11.279 seconds. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2023-05-05 20:12:59,712 WARNING util.py:67 -- Install gputil for GPU system monitoring.

NDR008 May 5, 2023
Author

So.... now I tried training... This is SOOOOOOOOOO cool, it is like watching a child being born.
unfortunately... it eventually crashes with this:

File "C:\Users\nadir\anaconda3\envs\GTAI2\lib\site-packages\ray\rllib\models\preprocessors.py", line 74, in check_shape
raise ValueError(
ValueError: Observation ((array([1768.], dtype=float32), array([1881.], dtype=float32), array([1.], dtype=float32), array([0.], dtype=float32), array([613.], dtype=float32), array([ 136000., 1326189.], dtype=float32), array([0.], dtype=float32), array([[[[255., 197., 41.],....

NDR008 May 5, 2023
Author

Full exception:
https://pastebin.com/uQS0tiKc

NDR008 May 5, 2023
Author

And the sort of behaviour in action:
https://www.youtube.com/watch?v=84o45DTmeTw

NDR008 · 2023-05-03T21:32:08Z

NDR008
May 3, 2023
Author

On another note...
I just installed tmrl using pip (which upgraded rtgym to 0.12 from 0.8), and when I run my test of the rtgym env using this script:
https://github.com/NDR008/TensorFlowPSX/blob/master/Py/test_rtgym.py

When it reaches the ep_max_length, it terminates with this exception:

Traceback (most recent call last):
  File "j:\git\TensorFlowPSX\Py\test_rtgym.py", line 32, in <module>
    obs, rew, terminated, truncated, info = env.step(env.action_space.sample())
  File "C:\Users\nadir\anaconda3\envs\GTAI2\lib\site-packages\gymnasium\wrappers\order_enforcing.py", line 37, in step
    return self.env.step(action)
  File "C:\Users\nadir\anaconda3\envs\GTAI2\lib\site-packages\gymnasium\wrappers\env_checker.py", line 39, in step
    return self.env.step(action)
  File "C:\Users\nadir\anaconda3\envs\GTAI2\lib\site-packages\rtgym\envs\real_time_env.py", line 553, in step
    raise RuntimeError("The episode is terminated or truncated. Call reset before step.")
RuntimeError: The episode is terminated or truncated. Call reset before step.

Is this expected?

2 replies

yannbouteiller May 4, 2023
Maintainer

Yes, you cannot continue a truncated episode. Instead of "while not terminated" you want "while not (terminated or truncated)". If you want truncated to always be False, set the maximum episode length to numpy.inf in the rtgym configuration dictionary.
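A minimal sketch of the corrected loop. StubEnv is a hypothetical stand-in that only mimics the five-value step() signature; it is not rtgym and never reaches a terminal state, so every episode ends by truncation.

```python
class StubEnv:
    """Tiny stand-in with the Gymnasium step/reset signature (not rtgym)."""

    def __init__(self, ep_max_length=5):
        self.ep_max_length = ep_max_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        terminated = False                         # no terminal state in this stub
        truncated = self.t >= self.ep_max_length   # time limit reached
        return self.t, 1.0, terminated, truncated, {}


env = StubEnv()
episode_returns = []
for _ in range(2):
    obs, info = env.reset()                 # always reset before stepping again
    terminated = truncated = False
    total = 0.0
    while not (terminated or truncated):
        obs, rew, terminated, truncated, info = env.step(action=None)
        total += rew
    episode_returns.append(total)
print(episode_returns)
```

Checking only terminated would call step() once more after the time limit, which is the situation that raises the RuntimeError in the traceback above.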

NDR008 May 4, 2023
Author

What should return the truncated flag?
I think I understood what confused me. get_obs_rew_terminated_info does not return truncated, but rtgym is returning the truncated for the step() method.
(I was not paying attention to truncated since I had not coded its behavior).
Thanks, at least this ought to be an easy fix.

NDR008 · 2023-05-06T19:15:40Z

NDR008
May 6, 2023
Author

When I set the ep_max_length, how does it end the episode? Does it send a truncated?

2 replies

NDR008 May 6, 2023
Author

Also, I get this warning: 2023-05-06 23:34:23,656 WARNING deprecation.py:50 -- DeprecationWarning: `_get_slice_indices` has been deprecated. This will raise an error in the future!
Is _get_slice_indices coming from rtgym?

yannbouteiller May 7, 2023
Maintainer

No, that probably comes from rllib.

To answer your other question, when ep_max_length is reached, the episode is truncated, yes. This means that the episode has been ended by the (unobserved) time limit rather than by reaching a terminal state.

NDR008 · 2023-05-09T11:59:01Z

NDR008
May 9, 2023
Author

This is a very python noob question that I am struggling with:

class MyGranTurismoRTGYM(RealTimeGymInterface):
    def __init__(self, debugFlag=False, img_hist_len=3, rrlib=True, modelMode=2):

        self.agent = "SAC"
        self.modelMode = modelMode # 2 = reduced observation
        if self.rrlib == False:
            self.renderingThread = Thread(target=self._renderingThread, args=(), kwargs={}, daemon=True)
        self.inititalizeCommon() # starts the TCP server and waits for the emulator to connect

I have a lot of manually coded flags / arguments in my RTGYM class.
I wish to be able to pass them as arguments to env.make

my_config = DEFAULT_CONFIG_DICT
my_config["interface"] = MyGranTurismoRTGYM
my_config["time_step_duration"] = 0.05
my_config["SAC"] = True
my_config["modelMode"] = 2


def env_creator(env_config):
  env = gymnasium.make("real-time-gym-v1", config=my_config)
  return env  # return an env instance

How can I do that? I am not totally familiar with how config dicts work (other than I know it is a dictionary, and it is used for the configuration here).

3 replies

yannbouteiller May 9, 2023
Maintainer

The rtgym configuration dictionary has two entries exactly for that : interface_args (a tuple) and interface_kwargs (a dictionary), to pass the arguments and keyword arguments of your RealTimeGymInterface implementation.

NDR008 May 9, 2023
Author

Cool.
Could you give me an example of how to use them?
Are they used as a pair?

yannbouteiller May 9, 2023
Maintainer

Well, for instance your interface has 4 kwargs here, and you can do the following:

my_config["interface_kwargs"] = {
  'debugFlag': False,
  'img_hist_len': 3,
  'rrlib': True,
  'modelMode': 2
}
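To show how these entries presumably reach the constructor, here is a plain-Python sketch. build_interface is a hypothetical helper that mimics the assumed behaviour (the class is called with the stored args and kwargs); it is not rtgym's actual code, and MyInterface is a stand-in for the real interface class.

```python
class MyInterface:  # hypothetical stand-in for a RealTimeGymInterface subclass
    def __init__(self, debugFlag=False, img_hist_len=3, rrlib=True, modelMode=2):
        self.debugFlag = debugFlag
        self.img_hist_len = img_hist_len
        self.rrlib = rrlib
        self.modelMode = modelMode


my_config = {
    "interface": MyInterface,
    "interface_args": (),                                      # positional arguments
    "interface_kwargs": {"img_hist_len": 4, "modelMode": 1},   # keyword arguments
}


def build_interface(config):
    """Assumed behaviour: call the class with the stored args and kwargs."""
    return config["interface"](*config["interface_args"], **config["interface_kwargs"])


interface = build_interface(my_config)
print(interface.img_hist_len, interface.modelMode, interface.debugFlag)
```

The two entries are independent: you can fill only interface_kwargs and leave interface_args empty, and any keyword you omit keeps the default from __init__.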
