This document summarizes an exploratory industrial study of how software engineers understand code changes. The study found that understanding code changes is practiced frequently in major development tasks such as reviewing changes, fixing bugs, and developing new features. Determining a change's risk and assessing its completeness were found to be important but difficult information needs. The study identified challenges such as determining risk and understanding composite changes, and it suggested practical improvements such as better code navigation tools and support for change decomposition.
How do software engineers understand code changes?
1. How Do Software Engineers Understand Code Changes?
An Exploratory Study in Industry
Yida Tao (HKUST), Yingong Dang (MSRA), Tao Xie (NCSU)
Dongmei Zhang (MSRA), Sunghun Kim (HKUST)
8. > if (hudRef && hud) {
> if (hudRef.consolePanel) {
> + hudRef.consolePanel.hidePopup()
Why this change here? This is the only one that doesn't seem to make sense for me.
>+ struct CIDEntry
>+ {
>+ const nsCID* cid;
>+ bool service;
What is this used for, I can't spot it in use anywhere and every component and service seems to have it set to false.
> browser_hide_removing.js
>+ browser_imageReload.js
>+ image_Reload.html
These files are missing from this patch, aren't they?
>+ for (var i = aURL.length - 1; i >= 1; i--) {
>+ var chPrev = aURL.charAt(i - 1);
>+ var ch = aURL.charAt(i);
I'm not sure why you walk this char by char, JavaScript has awesome string methods.
10. Research Questions
RQ1: How frequently is code change understanding practiced, and in which development tasks is it required?
RQ2: What are engineers' information needs and difficulties in understanding code changes?
RQ3: How can the effectiveness and efficiency of the practices in understanding code changes be improved?
12. Study Methodology
Literature Review: potential information needs
Questionnaire Design: investigate RQ1, RQ2
Pilot Interview: question is relevant & clear
Online Survey: 16% response rate (180 respondents)
Follow-up Interview: investigate RQ3
Analysis: answering RQs
13. Survey Participants
Role distribution: Dev 55%, Test 31%, PM 14%
Product team: OS, Desktop App, Web App, Mobile App, Service, Others
14. RQs
RQ1: Frequency? Development tasks?
RQ2: Information needs? Difficulty?
RQ3: Improvement?
15. RQ1: Frequency of Understanding Code Changes
How often do you need to understand code changes?
o Several times each hour
o About once an hour
o Several times each day
o About once a day
o Several times each week
o About once a week
o Rarely
o Never
16. RQ1: Frequency of Understanding Code Changes
[Bar chart: absolute number of responses (axis 0 to 50), broken down by role: Dev, Test, PM]
17. RQ1: Tasks Requiring Code Change Understanding
Select the top three tasks that most often require you to understand code changes
[Design/Planning] Refactoring
[Implementation] Developing new feature
[Implementation] Fixing bug
[Integration] Resolving merge conflict
[Verification] Reviewing others' code changes
[Verification] Reviewing my own code changes
[Verification] Writing & updating test cases
Other, please specify
18. RQ1: Tasks Requiring Code Change Understanding
[Bar chart: percentage of participants who select the task (axis 0% to 75%); bar labels listed below]
Reviewing others' changes: 121
Fixing bug: 100
Developing new feature: 89
Reviewing my own changes: 73
Writing/updating test cases: 48
Refactoring: 34
Resolving merge conflict: 30
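The numbers on this slide can be related to the percentage axis with a little arithmetic. The sketch below rests on two assumptions that the slide itself does not state: that the numbers 121 down to 30 are counts of participants, paired with the tasks in the order listed, and that the denominator is the 180 survey respondents reported on the methodology slide.

```python
# Assumption: each number is a count of participants who selected the task,
# and the total is the 180 respondents mentioned on the methodology slide.
RESPONDENTS = 180

task_counts = {
    "Reviewing others' changes": 121,
    "Fixing bug": 100,
    "Developing new feature": 89,
    "Reviewing my own changes": 73,
    "Writing/updating test cases": 48,
    "Refactoring": 34,
    "Resolving merge conflict": 30,
}


def share_of_respondents(count, total=RESPONDENTS):
    """Return the percentage of respondents who selected a task."""
    return 100.0 * count / total


percentages = {
    task: round(share_of_respondents(count), 1)
    for task, count in task_counts.items()
}

for task, pct in percentages.items():
    print(f"{task}: {pct}%")
```

Under these assumptions the largest bar comes out at roughly 67%, which fits an axis that ends at 75%. The counts sum to 495, which is also consistent with 180 respondents each selecting up to three tasks (at most 540 selections).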
19. Answers to RQs
RQ1: Frequently practiced in major development tasks
RQ2: Information needs? Difficulty?
RQ3: Improvement?
21. Potential Information Needs
Literature review (code-change analysis and management): 180 articles in 10 SE venues over the past decade
Reasoning & assessing the change: Completeness, Clones, Design
Exploring the change's context & impact: Risk, Consistency, Tests, ...
Evaluating the change history: Change-proneness, Defect-proneness
24. Survey Questions
Rate the importance & difficulty of each information need (formulated as question) in a change understanding task
Does this change introduce code clones?
Does this change break any code elsewhere?
Which tests should be run to verify this change?
Is this changed location a hotspot for past fixes?
Importance: Very Important (3), Important (2), Somewhat Important (1), Not Important (0)
Difficulty: Very Difficult (3), Difficult (2), Relatively Easy (1), Straightforward (0)
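The 3-to-0 coding of the two rating scales makes it possible to place each information need on an importance/difficulty plane. The sketch below shows one way this could be done, by averaging the coded ratings per question. The use of the mean is an assumption (the slides do not say how ratings were aggregated), and the responses are made up for illustration; they are not data from the study.

```python
from statistics import mean

# Scale labels come from the slide; the 3..0 coding follows the slide's numbers.
IMPORTANCE = {"Very Important": 3, "Important": 2,
              "Somewhat Important": 1, "Not Important": 0}
DIFFICULTY = {"Very Difficult": 3, "Difficult": 2,
              "Relatively Easy": 1, "Straightforward": 0}

# Hypothetical responses, one (importance, difficulty) pair per respondent.
# These are NOT results from the study.
responses = {
    "Does this change break any code elsewhere?": [
        ("Very Important", "Very Difficult"),
        ("Very Important", "Difficult"),
        ("Important", "Difficult"),
    ],
    "Does this change introduce code clones?": [
        ("Somewhat Important", "Relatively Easy"),
        ("Important", "Difficult"),
        ("Not Important", "Straightforward"),
    ],
}


def average_scores(ratings):
    """Return (mean importance, mean difficulty) for one information need."""
    importance = mean(IMPORTANCE[imp] for imp, _ in ratings)
    difficulty = mean(DIFFICULTY[diff] for _, diff in ratings)
    return importance, difficulty


scores = {need: average_scores(ratings) for need, ratings in responses.items()}

for need, (importance, difficulty) in scores.items():
    print(f"{need} importance={importance:.2f} difficulty={difficulty:.2f}")
```

Each question then becomes a single point with both coordinates between 0 and 3, which is the kind of value a two-axis importance/difficulty plot needs.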
28. RQ2: Information Needs
[Scatter plot; axes: Importance (0 to 3) and Difficulty of acquiring the information (0 to 3). Labeled information needs: Consistency, Risk, Defect-proneness, Completeness, Design, Rationale, Change-proneness]
29. Answers to RQs
RQ1: Frequently practiced in major development tasks
RQ2: Risk & Quality are important but difficult to know
RQ3: Improvement?
30. RQ3: Interview Items
[Scatter plot; axes: Importance (0 to 3) and Difficulty of acquiring the information (0 to 3). Labeled information needs: Defect-proneness, Risk, Change-proneness, Rationale]
31. Determining a Change's Risk
[Scatter plot; axes: Importance (0 to 3) and Difficulty of acquiring the information (0 to 3). Labeled information need: Risk]
32. Current Practice on Determining a Change's Risk
Manual Code Review
Error-prone
Cross-components
Unclear interface
Hidden assumptions
Unit & Regression Testing
Time consuming
Depends on how thorough the tests are
34. Support Determining a Change's Risk
Manual code review
Navigation in diff: using code analysis tools (e.g., go to definition, find all references, caller/callee tree) on the code change
[Diagram: Diff ("miss a level of understanding object relationships") combined with Code Analysis gives Navigation in diff]
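To make the idea of running "find all references" on a code change more concrete, here is a minimal sketch that looks up the callers of a changed function. Everything in it is hypothetical: the two in-memory source files, the function names, and the purely name-based matching (a real code analysis tool would resolve imports and types). It uses only Python's standard `ast` module.

```python
import ast

# Hypothetical code base held in memory: file name -> source text.
SOURCES = {
    "console.py": (
        "def hide_popup(panel):\n"
        "    panel.visible = False\n"
    ),
    "hud.py": (
        "from console import hide_popup\n"
        "\n"
        "def deactivate(hud):\n"
        "    hide_popup(hud.panel)\n"
        "\n"
        "def refresh(hud):\n"
        "    hud.redraw()\n"
    ),
}


def find_callers(sources, changed_function):
    """Return sorted (file, enclosing function) pairs that call changed_function.

    Matching is by name only, which is an assumption made to keep the sketch short.
    """
    callers = set()
    for filename, text in sources.items():
        tree = ast.parse(text, filename=filename)
        for func in ast.walk(tree):
            if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            for node in ast.walk(func):
                if not isinstance(node, ast.Call):
                    continue
                callee = node.func
                # Plain calls f(...) carry the name in .id, attribute calls obj.f(...) in .attr.
                if isinstance(callee, ast.Name):
                    name = callee.id
                else:
                    name = getattr(callee, "attr", None)
                if name == changed_function:
                    callers.add((filename, func.name))
    return sorted(callers)


callers = find_callers(SOURCES, "hide_popup")
print(callers)
```

A reviewer looking at a diff that touches `hide_popup` would, in this toy setup, be pointed to `deactivate` in `hud.py` as the place to check next.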
35. Support Determining a Change's Risk
Testing
Which code must be retested as it is dependent upon the change?
Who owns testing that dependency?
Which tests must be run?
An Intelli-sense for updating these (affected) tests would be nice as well.
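The questions on this slide (which tests must be run, and who owns the affected code) can be illustrated with a small lookup. This is only a sketch under stated assumptions: the coverage map, the ownership map, and all file, test, and team names are hypothetical, and a real system would derive such maps from coverage data or build dependencies.

```python
# Hypothetical coverage map: test name -> source files the test exercises.
COVERAGE = {
    "test_console_popup": {"console.js", "hud.js"},
    "test_image_reload": {"imageLoader.js"},
    "test_url_parsing": {"urlUtils.js"},
}

# Hypothetical ownership map: source file -> owning team.
OWNERS = {
    "console.js": "devtools",
    "hud.js": "devtools",
    "imageLoader.js": "graphics",
    "urlUtils.js": "networking",
}


def select_tests(changed_files, coverage):
    """Return the tests that exercise at least one changed file."""
    changed = set(changed_files)
    return sorted(test for test, files in coverage.items() if files & changed)


def owners_to_notify(changed_files, owners):
    """Return the owners of the changed files; files without a known owner are skipped."""
    return sorted({owners[name] for name in changed_files if name in owners})


print(select_tests(["hud.js"], COVERAGE))
print(owners_to_notify(["hud.js", "urlUtils.js"], OWNERS))
```

The selection is only as good as the coverage map, which echoes the earlier point that the value of testing depends on how thorough the tests are.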
36. Discussion
[Scatter plot; axes: Importance (0 to 3) and Difficulty of acquiring the information (0 to 3). Labeled information needs: Defect-proneness, Rationale, Change-proneness]
37. Discussion
Why is understanding the rationale of a change easy?
Availability & quality of commit message
It's entirely up to the dev making the change as to how hard or easy it is for someone else to figure out why the change was made.
Why are historical metrics not that important?
Developers: here and now, short-term issue, own knowledge
Testers & PMs: historical metrics might be good to reflect bugginess and complexity of a specific area
39. Other Information Needs
In addition to the information needs listed above, what else would you ask when you try to understand a code change? How difficult is it for you to answer?
Can this change be broken into smaller discrete changes?
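One simple way to think about breaking a change into smaller pieces is to group the changed items that are related to each other and treat each group as a candidate sub-change. The sketch below is an illustration of that idea only, not the approach proposed in the study: the item names and the "related" pairs are hypothetical, and how relatedness is determined (for example by dependencies between changed lines) is left open.

```python
from collections import defaultdict


def decompose_change(changed_items, related_pairs):
    """Group changed items into connected components of the 'related' graph.

    Pairs that mention an item outside changed_items are ignored.
    """
    changed = set(changed_items)
    neighbors = defaultdict(set)
    for first, second in related_pairs:
        if first in changed and second in changed:
            neighbors[first].add(second)
            neighbors[second].add(first)

    groups = []
    seen = set()
    for start in changed_items:
        if start in seen:
            continue
        # Depth-first traversal collects every item reachable from `start`.
        stack = [start]
        component = []
        seen.add(start)
        while stack:
            item = stack.pop()
            component.append(item)
            for nxt in sorted(neighbors[item]):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(component))
    return groups


# Hypothetical composite change made of five edited items.
items = ["parser.fix", "parser.test", "logger.rename", "logger.callsite", "readme.typo"]
pairs = [("parser.fix", "parser.test"), ("logger.rename", "logger.callsite")]

groups = decompose_change(items, pairs)
print(groups)
```

In this toy example the composite change splits into three groups: the parser fix with its test, the logger rename with its call site, and the unrelated README edit.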
43. Answers to RQs
RQ1: Frequently practiced in major development tasks
RQ2: Risk & Quality are important but difficult to know
RQ3: Determining a change's risk; decomposing a composite change
46. Summary
Evidence
Understanding code changes is a fundamental practice that happens frequently in major development tasks
Challenges
Determining a change's risk
Assessing a change's consistency and completeness
Understanding composite changes
Practical Needs
Navigation in diff
Change decomposition
Available & informative commit messages
47. Acknowledgment
All participants of survey / interview
Miryung Kim, Robin Moeur, Thomas Zimmermann, Jacek Czerwonka, and Kathryn McKinley