🤔
learning and thinking

Organizations

@conda-forge @metaopt @PKU-MARL @PKU-Alignment

XuehaiPan/README.md

Hi there 👋

Xuehai Pan (/ʃwɛˈhaɪ pæn/, 潘学海 in Mandarin, [email protected]) is a final-year Ph.D. student in Applied Computer Science at Peking University. His research interests lie at the intersection of Reinforcement Learning, Multi-Agent Systems, and Distributed Computing, with a focus on developing scalable, automated algorithms and exploring their theoretical and practical aspects. He has a solid background in both research and engineering: before pursuing his Ph.D., he earned a B.S. in Physics with honors and a B.S. in Computer Science (double major) from Peking University. In high school, he won gold medals in the Chinese Physics Olympiad (CPhO) and the Asian Physics Olympiad (APhO).

Xuehai now works on Large Language Models (LLMs), focusing on aligning them with human intentions and values through AI Alignment techniques (essentially, balancing helpfulness and harmlessness). Specifically, he is exploring automated data synthesis, red teaming, and evolutionary training via multi-agent interaction and self-play. The ultimate goal is to build a scalable, fully automated system covering training, evaluation, inference, and governance.

Beyond academia, Xuehai is an open-source enthusiast and an active contributor to influential projects such as PyTorch, CPython, Ray, Transformers, DeepSpeed, Gymnasium (formerly OpenAI Gym), and Homebrew. He enjoys spending his spare time helping people and sharing knowledge in the community, extending his impact beyond his research.

Pinned

  1. nvitop (Public)

    An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

    Python · 5k stars · 158 forks

  2. PKU-Alignment/safe-rlhf (Public)

    Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

    Python · 1.4k stars · 119 forks

  3. metaopt/torchopt (Public)

    TorchOpt is an efficient library for differentiable optimization built upon PyTorch.

    Python · 558 stars · 35 forks

  4. metaopt/optree (Public)

    OpTree: Optimized PyTree Utilities

    Python · 162 stars · 7 forks

  5. pytorch/pytorch (Public)

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Python · 85.4k stars · 23k forks

  6. ray-project/ray (Public)

    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Python · 34.7k stars · 5.9k forks