Open Source Software Data Analytics Lab

Software Engineering Institute, Peking University

Introduction

We are a research group at the School of Computer Science, Peking University.

In the broadest sense, our research belongs to the field of Software Engineering, which focuses on improving the efficiency and quality of software development. The software engineering community pursues this goal in diverse ways. We often take an empirical approach, observing real-world practice and distilling lessons from experience to build theories and mechanisms. We also invent intelligent techniques and bots to help control complex systems and their development.

More specifically, our current focus is on observing software repositories and measuring how developers work, for purposes such as understanding and controlling large, complex software systems, society, and the universe. We use a wide spectrum of technologies and interdisciplinary methodologies, depending on the specific problem we are tackling.

See the CCF list for top venues in this field. See Publications for our latest publications. If you want to learn more, see Resources on this website.

Current Research Topics

  1. Open Source Software Supply Chain (modeling, risk analysis and resolution)
  2. Characterizing open source ecosystems as complex systems
  3. Open source license compatibility detection and conflict resolution
  4. Open Source Sustainability (deprecation prediction)
  5. Profiling Developers (expertise, personality and learning trajectory)
  6. Software engineering bots (library migration recommender/GFI recommender/dependency update bot/release note bot…)

For Prospective Students

We are constantly looking for self-motivated students with solid programming skills. Students with a strong interest in mining big data, observing open source ecosystems, and improving current software development practices are especially welcome. Industry experience and strong software development skills will be a great advantage. A background in software engineering, statistics, visualization, data mining, machine learning, or natural language processing may help you prosper in this field but is not required.

Contact Professor Zhou Minghui for details about PhD openings and undergraduate internship opportunities.

Industry Collaborators

Contact

Address: Room 1537, Science Building No. 1, Peking University, Beijing, China

Latest Posts

LicenseRec V2.0.0 is officially launched!

📢 LicenseRec V2.0.0 is officially launched! New features include compliance analysis that is accurate down to package versions and license incompatibility remediation based on an SMT solver. LicenseRec is a license compliance analysis and open-source license recommendation tool that helps developers perform compliance analysis and select the optimal license for their open-source software projects.

One paper accepted by ESEC/FSE!

Kai’s study on automatically retrieving and validating source code repository information for PyPI packages has been accepted by ESEC/FSE 2024. Congratulations to Kai!