This is the project page of AI for Social Science. This demo presents a MARL framework that provides multiple auction-based multi-agent environments built on PettingZoo. We illustrate several classical theoretical results from auction theory, and other interesting results can be explored through this platform.
The demo is based on PettingZoo and supports multiple existing RL frameworks such as RLlib and Tianshou.
# clone the repository from GitHub
git clone git@github.com:alibaba-damo-academy/ai-for-social-science.git
# then install the requirements
pip install pettingzoo
# optionally, if you want to apply deep RL in the examples
pip install torch "ray[rllib]" tianshou
# then you can directly run the examples from the scripts
# the main file is auction_bidding_simulate.py
# the dynamic env file is auction_bidding_simulate_multiple.py
python auction_bidding_simulate_multiple.py --mechanism 'second_price' --exp_id 33 --folder_name 'deep_test_multi_env' \
--bidding_range 10 --valuation_range 10 --env_iters 1000000 --overbid True \
--round 1 \
--estimate_frequent 100000 --revenue_averaged_stamp 1000 --exploration_epoch 100000 --player_num 5 \
--step_floor 10000
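The --mechanism 'second_price' flag above selects the second-price (Vickrey) rule: the highest bidder wins the item and pays the second-highest bid. A minimal sketch of that rule, for reference only (not the project's actual implementation):

```python
def second_price_outcome(bids):
    """Return (winner_index, payment) for a sealed-bid second-price auction.

    The highest bidder wins and pays the second-highest bid.
    """
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner = order[0]
    payment = bids[order[1]]
    return winner, payment
```

Under this rule, bidding one's true valuation is a dominant strategy, which is one of the classical results the demo can reproduce.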
Other demos:
For example, in the first-price auction, symmetric bidders learn their best bidding policy; a theoretical equilibrium bidding strategy also exists when bidders know the number of bidders and the valuation range. We validate that such an equilibrium emerges even when bidders observe only their own rewards and no other information.
# Detailed examples follow:
cd scripts/
cd same_valuation/
sh first_price.sh
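The theoretical benchmark here is the standard result that, with n symmetric bidders whose valuations are i.i.d. uniform on [0, v_max], the symmetric Bayes-Nash equilibrium bid in a first-price auction is b(v) = (n-1)/n · v. A small helper (the function name is ours, not the repository's) for comparing learned policies against this benchmark:

```python
def equilibrium_bid(valuation, n_bidders):
    """Symmetric Bayes-Nash equilibrium bid in a first-price auction
    with i.i.d. uniform valuations: b(v) = (n - 1) / n * v.
    """
    return (n_bidders - 1) / n_bidders * valuation
```

With 5 bidders, as in the command above, the equilibrium shades each bid to 80% of the bidder's valuation.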
This project offers customized environments such as signal-based auction games and other complex equilibrium-learning environments. We provide the platform mechanisms in "/env/customed_env*.py", the payment rules in "/env/payment_rule.py", and the allocation rules in "/env/allocation_rule.py".
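To illustrate what an allocation rule and a payment rule each compute, here is a minimal sketch under assumed signatures (the actual interfaces in "/env/payment_rule.py" and "/env/allocation_rule.py" may differ):

```python
def allocation_rule(bids):
    """Allocate the item to the highest bidder; ties go to the lowest index."""
    return max(range(len(bids)), key=lambda i: (bids[i], -i))

def first_price_payment(bids, winner):
    """First-price payment rule: the winner pays their own bid."""
    return bids[winner]
```

Separating allocation from payment is what makes it easy to swap mechanisms: a second-price environment keeps the same allocation rule and only changes the payment rule.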
Ideally, agents may learn their best response over many learning epochs with a certain batch size at no cost (or re-initialize their budget during sampling). In the real world, however, agents have to learn their best policy during the game itself, and the whole environment evolves as each agent adjusts its policy whenever the social planner changes the mechanism or information rules.
We also provide such evolutionary environments in "/env/multi_dynamic_env.py"; the framework is illustrated below.
Detailed demo can be found in the following scripts.
cd scripts
cd multi_round
sh second_price.sh
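The multi-round setting can be pictured as independent learners updating during play. The sketch below uses a simple tabular epsilon-greedy bandit per valuation in a repeated second-price auction; it is our own illustration, not the repository's agent implementation:

```python
import random

class EpsilonGreedyBidder:
    """Hypothetical tabular agent: one bandit arm per (valuation, bid) pair."""
    def __init__(self, valuation_range, bidding_range, epsilon=0.1):
        self.q = [[0.0] * bidding_range for _ in range(valuation_range)]
        self.counts = [[0] * bidding_range for _ in range(valuation_range)]
        self.epsilon = epsilon
        self.bidding_range = bidding_range

    def bid(self, valuation):
        # Explore uniformly with probability epsilon, otherwise exploit.
        if random.random() < self.epsilon:
            return random.randrange(self.bidding_range)
        row = self.q[valuation]
        return max(range(len(row)), key=row.__getitem__)

    def update(self, valuation, bid, reward):
        # Incremental mean of the reward for this (valuation, bid) cell.
        self.counts[valuation][bid] += 1
        n = self.counts[valuation][bid]
        self.q[valuation][bid] += (reward - self.q[valuation][bid]) / n

def play_round(agents, valuation_range):
    """One round of a repeated second-price auction with online learning."""
    valuations = [random.randrange(valuation_range) for _ in agents]
    bids = [a.bid(v) for a, v in zip(agents, valuations)]
    winner = max(range(len(bids)), key=lambda i: (bids[i], -i))
    price = sorted(bids, reverse=True)[1]  # second-highest bid
    for i, (a, v, b) in enumerate(zip(agents, valuations, bids)):
        reward = (v - price) if i == winner else 0
        a.update(v, b, reward)
    return bids
```

Each agent sees only its own valuation, bid, and reward, matching the information assumption in the demo above.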
We also support different RL algorithms in agent.algorithm, such as deep RL, and you can also apply RLlib directly. The detailed algorithm definitions are in "/agent/agent_generate.py" and "rl_utils/".
For RLlib, see the examples in "/examples/rllib_deep.py" or "scripts/rllib_examples/rllib_examples.sh".
For Tianshou, see the examples in "rl_utils/deep_sovler.py".