Keynotes

What can analyzing tens of terabytes of public trace data tell us about open source sustainability?
Bogdan Vasilescu - Carnegie Mellon University

Abstract: Open-source communities face significant sustainability challenges, from attracting and retaining a diverse set of contributors to fundraising. Through interviews, surveys, and analysis of billions of commits and other public traces, we studied the organization, functioning, and overall health of open-source communities. The talk will highlight the empirical evidence for a range of research questions about non-technical issues, including: project-level risk factors associated with upstream and downstream dependencies, the value of diversity in open-source teams, factors contributing to longer-term engagement or premature disengagement of contributors, the effectiveness of donations as a funding model, and the role of transparency and signaling in increasing the health of open-source projects.

Bogdan Vasilescu is an Assistant Professor at Carnegie Mellon University.

Site: http://bvasiles.github.io

Program Understanding as a Learning Activity
Fernando Castor - Universidade Federal de Pernambuco

Abstract: Reading code is an essential activity in software maintenance and evolution. Several studies with developers have investigated how different factors, such as the employed code constructs and naming conventions, can impact code readability, i.e., what makes a program easier or harder to read and apprehend by developers, and code legibility, i.e., what influences the ease of identifying elements of a program. These studies evaluate readability and legibility by means of different comprehension tasks and response variables. In this paper, we examine these tasks and variables in studies aiming to compare programming constructs, coding idioms, naming conventions, and formatting guidelines, e.g., recursive code vs. iterative code. To that end, we have conducted a systematic literature review, where we found 54 relevant papers for our study. We found that most of these studies evaluate code readability and legibility by measuring the correctness of the subjects' results (83.3%) or simply asking for their personal opinions (55.6%). Some studies (16.7%) rely exclusively on the latter response variable. Also, there are still relatively few studies that monitor developers' physical signs, such as brain activation regions (5%). Our study shows that attributes such as time and correctness are multi-faceted. For example, correctness can be measured as the ability to predict the output of a program, answer questions about its general behavior, or precisely recall specific parts, among other things. These results make it clear that different evaluation approaches require different competencies from study subjects, e.g., tracing the program vs. summarizing its goal vs. memorizing its text. To assist researchers in the design of new studies and improve our comprehension of existing ones, we model program comprehension as a learning activity by adapting a preexisting learning taxonomy. This adaptation indicates that some competencies, e.g., tracing, are often exercised in these evaluations whereas others, e.g., relating similar code snippets, are rarely targeted.

Fernando Castor is an associate professor at the Informatics Center (CIn), Federal University of Pernambuco (UFPE), Brazil.

Site: https://sites.google.com/a/cin.ufpe.br/castor/