Weixun Wang 王维埙

I’m a PhD student at Tianjin University in Professor Jianye Hao’s group, where I work on Multiagent (Deep) Reinforcement Learning.

Email  /  Google Scholar


I am interested in applying deep reinforcement learning to multi-agent systems (MAS). I believe that MAS offer a more realistic description of large-scale real-world problems, and that deep reinforcement learning can solve more complex practical problems in the MAS field.

Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment
Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, Jianye Hao


We propose a new architecture that realizes robust coordination knowledge transfer by decomposing the overall coordination into several coordination patterns. We also use a novel agent network, named Population Invariant agent with Transformer (PIT), to realize coordination transfer in a wider variety of scenarios. Extensive experiments in StarCraft II micro-management show that LA-QTransformer together with PIT achieves superior performance compared with state-of-the-art baselines.

Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping
Yujing Hu, Weixun Wang, Hangtian Jia, Yixiang Wang, Yingfeng Chen, Jianye Hao, Feng Wu, Changjie Fan

NeurIPS 2021

We formulate the utilization of shaping rewards as a bi-level optimization problem, where the lower level optimizes the policy using the shaping rewards and the upper level optimizes a parameterized shaping weight function to maximize the true reward. We formally derive the gradient of the expected true reward with respect to the shaping weight function parameters and accordingly propose three learning algorithms based on different assumptions. Experiments in sparse-reward cartpole and MuJoCo environments show that our algorithms can fully exploit beneficial shaping rewards while ignoring unbeneficial shaping rewards or even transforming them into beneficial ones.
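As a rough illustration of what a shaping weight does, the sketch below assumes the weighted shaping term is simply added to the true reward. The function name `shaped_reward` and the additive form are assumptions made for this example, not details taken from the summary above, and the weight is passed in as a plain number rather than learned.

```python
def shaped_reward(true_reward: float, shaping_reward: float, weight: float) -> float:
    """Return the reward seen by the lower-level policy optimizer.

    Assumption: the weighted shaping term is simply added to the true reward.
    """
    return true_reward + weight * shaping_reward


if __name__ == "__main__":
    # A helpful shaping reward is used in full (weight = 1).
    print(shaped_reward(0.0, 0.5, 1.0))
    # An unhelpful shaping reward is ignored (weight = 0).
    print(shaped_reward(0.0, -0.5, 0.0))
    # A misleading shaping reward has its sign flipped by a negative weight.
    print(shaped_reward(0.0, -0.5, -1.0))
```

In this toy form, a weight of one, zero, or a negative value corresponds to exploiting, ignoring, or transforming a shaping reward; in the bi-level setting, the upper level would adjust that weight rather than fixing it by hand.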

Learning When to Transfer among Agents: An Efficient Multiagent Transfer Learning Framework
Tianpei Yang*(Equal contribution), Weixun Wang*(Equal contribution), Hongyao Tang*(Equal contribution), Jianye Hao, Zhaopeng Meng, Wulong Liu, Yujing Hu, Yingfeng Chen


Our framework learns when and what advice to give to each agent, and when to terminate it, by modeling multi-agent transfer as an option learning problem. We also propose a novel option learning algorithm, named Successor Representation Option (SRO) learning, which decouples the dynamics of the environment from the rewards to learn the option-value function under each agent’s preference.

Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising
Xiaotian Hao, Junqi Jin, Jin Li, Weixun Wang, Yi Ma, Jianye Hao, Zhenzhe Zheng, Han Li, Jian Xu, Kun Gai

IJCAI 2020

We propose NeuSearcher, which leverages the knowledge learned from previous instances to solve new problem instances. Specifically, we design a multichannel graph neural network to predict the threshold of the matched edges, by which the search region can be significantly reduced. We further propose a parallel heuristic search algorithm to iteratively improve the solution quality until convergence. Experiments on both open and industrial datasets demonstrate that NeuSearcher achieves a 2 to 3 times speedup while obtaining exactly the same matching solution as state-of-the-art approximation approaches.

KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge
Peng Zhang, Jianye Hao, Weixun Wang, Hongyao Tang, Yi Ma, Yihai Duan, Yan Zheng

IJCAI 2020

We propose the knowledge guided policy network (KoGuN), a novel framework that combines suboptimal human prior knowledge with reinforcement learning. Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to fine-tune the suboptimal prior knowledge. The proposed framework is end-to-end and can be combined with existing policy-based reinforcement learning algorithms.

Action Semantics Network: Considering the Effects of Actions in Multiagent Systems
Weixun Wang, Tianpei Yang, Yong Liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao

ICLR 2020

In this paper, we propose a novel network architecture, named Action Semantics Network (ASN), that explicitly represents the action semantics between agents. ASN uses neural networks to characterize how different actions influence other agents, based on these action semantics.

Efficient Deep Reinforcement Learning through Policy Transfer
Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Weixun Wang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhaodong Wang, Jiajie Peng

AAMAS 2020 (abstract) + IJCAI 2020

We propose a novel Policy Transfer Framework (PTF) that accelerates RL by reusing source policies. Our framework learns when and which source policy is the best to reuse for the target policy, and when to terminate it, by modeling multi-policy transfer as an option learning problem.

From Few to More: Large-scale Dynamic Multiagent Curriculum Learning
Weixun Wang, Tianpei Yang, Yong Liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao

AAAI 2020

In this paper, we design a novel Dynamic Multiagent Curriculum Learning (DyMA-CL) approach to solve large-scale problems by starting from a multiagent scenario of small size and progressively increasing the number of agents. We propose three transfer mechanisms across curricula to accelerate the learning process. Moreover, because the state dimension varies across curricula, existing network structures cannot be applied in such a transfer setting, since their input sizes are fixed. We therefore design a novel network structure called Dynamic Agent-number Network (DyAN) to handle the dynamic size of the network input.
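To make the fixed-input-size problem concrete, the sketch below shows one common way to map a variable number of per-agent feature vectors to a fixed-size vector: element-wise sum pooling. This is an illustrative assumption, not necessarily the structure used in DyAN, and the name `pool_entities` is hypothetical.

```python
from typing import List, Sequence


def pool_entities(entity_features: Sequence[Sequence[float]], dim: int) -> List[float]:
    """Sum per-agent feature vectors into one vector of length `dim`.

    The output length depends only on `dim`, not on how many agents are
    present, so a downstream network with a fixed input size can consume it.
    """
    pooled = [0.0] * dim
    for features in entity_features:
        if len(features) != dim:
            raise ValueError("every feature vector must have length dim")
        for i, value in enumerate(features):
            pooled[i] += value
    return pooled


if __name__ == "__main__":
    # Two agents and three agents both yield a vector of length 2.
    print(pool_entities([[1.0, 2.0], [3.0, 4.0]], dim=2))
    print(pool_entities([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dim=2))
```

Because summation ignores ordering and count, the same downstream weights can be reused as the number of agents grows from one curriculum stage to the next.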

Multi-Agent Game Abstraction via Graph Attention Neural Network
Yong Liu*(Equal contribution), Weixun Wang*(Equal contribution), Yujing Hu, Jianye Hao, Xingguo Chen, Yang Gao

AAAI 2020

In this paper, we model the relationship between agents as a complete graph and propose a novel game abstraction mechanism based on a two-stage attention network (G2ANet), which can indicate whether there is an interaction between two agents and how important that interaction is. We integrate this detection mechanism into graph neural network-based multi-agent reinforcement learning to conduct game abstraction, and propose two novel learning algorithms, GA-Comm and GA-AC. We conduct experiments in Traffic Junction and Predator-Prey. The results indicate that the proposed methods can simplify the learning process while achieving better asymptotic performance than state-of-the-art algorithms.

Learning Adaptive Display Exposure for Real-Time Advertising
Weixun Wang, Junqi Jin, Jianye Hao, Chunjie Chen, Chuan Yu, Weinan Zhang, Jun Wang, Xiaotian Hao, Yixi Wang, Han Li, Jian Xu, Kun Gai

CIKM 2019

In this paper, we investigate the problem of advertising with adaptive exposure, in which the number of ad slots and their locations can dynamically change over time based on their relative scores with recommendation products.

Independent Generative Adversarial Self-Imitation Learning in Cooperative Multiagent Systems
Xiaotian Hao *(Equal contribution), Weixun Wang *(Equal contribution), Jianye Hao, Yaodong Yang

AAMAS 2019

We are the first to combine self-imitation learning with GAIL, and we propose a novel framework, IGASIL, to address multiagent coordination problems.

Towards Cooperation in Sequential Prisoner's Dilemmas: a Deep Multiagent Reinforcement Learning Approach
Weixun Wang, Jianye Hao, Yixi Wang, Matthew Taylor

AAMAS 2018 Workshop ALA, DAI 2019 (Best Paper Award)

In this work, we propose a deep multiagent reinforcement learning approach that investigates the evolution of mutual cooperation in sequential prisoner's dilemma (SPD) games.