How Do Software Engineers Understand 
Code Changes? 
An Exploratory Study in Industry 
Yida Tao (HKUST), Yingong Dang (MSRA), Tao Xie (NCSU) 
Dongmei Zhang (MSRA), Sunghun Kim (HKUST)
> if (hudRef && hud) {
>   if (hudRef.consolePanel) {
> +   hudRef.consolePanel.hidePopup()
Why this change here? This is the only one that doesn't seem to make sense for me

>+ struct CIDEntry
>+ {
>+   const nsCID* cid;
>+   bool service;
What is this used for, I can't spot it in use anywhere and every component and service seems to have it set to false.

> browser_hide_removing.js
>+ browser_imageReload.js
>+ image_Reload.html
These files are missing from this patch, aren't they?

>+ for (var i = aURL.length - 1; i >= 1; i--) {
>+   var chPrev = aURL.charAt(i - 1);
>+   var ch = aURL.charAt(i);
I'm not sure why you walk this char by char, javascript has awesome string methods
How do software engineers understand code changes? 
Research Questions
RQ1: How frequently is code-change understanding practiced, and in which development tasks is it required?
RQ2: What are engineers' information needs and difficulties in understanding code changes?
RQ3: How can the effectiveness and efficiency of the practices in understanding code changes be improved?
Study Methodology
Literature Review: potential information needs
Questionnaire Design: investigate RQ1, RQ2
Pilot Interview: question is relevant & clear
Online Survey: 16% response rate (180 respondents)
Follow-up Interview: investigate RQ3
Analysis: answering RQs
Survey Participants
Role Distribution: Dev 55%, Test 31%, PM 14%
Product Team: OS, Desktop App, Web App, Mobile App, Service, Others
RQs
RQ1: Frequency? Development tasks?
RQ2: Information needs? Difficulty?
RQ3: Improvement?
RQ1: Frequency of Understanding Code Changes 
How often do you need to understand code changes? 
o Several times each hour 
o About once an hour 
o Several times each day 
o About once a day 
o Several times each week 
o About once a week 
o Rarely 
o Never
RQ1: Frequency of Understanding Code Changes
[Figure: bar chart of the absolute number of responses (axis 0 to 50), broken down by role: Dev, Test, PM]
RQ1: Tasks Requiring Code Change Understanding
Select the top three tasks that most often require you to understand code changes
[Design/Planning] Refactoring
[Implementation] Developing new feature
[Implementation] Fixing bug
[Integration] Resolving merge conflict
[Verification] Reviewing others' code changes
[Verification] Reviewing my own code changes
[Verification] Writing & updating test cases
Other, please specify
RQ1: Tasks Requiring Code Change Understanding
Number of participants who selected each task (chart axis: percentage of participants who select the task, 0% to 75%)
Reviewing others' changes: 121
Fixing bug: 100
Developing new feature: 89
Reviewing my own changes: 73
Writing/updating test cases: 48
Refactoring: 34
Resolving merge conflict: 30
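The counts on this slide can be related to the chart's percentage axis with a little arithmetic. The sketch below assumes that each count is out of the 180 survey respondents and that counts pair with task labels in the order listed. The slide states neither, so the resulting percentages are illustrative only.

```python
# Counts of participants who selected each task, read off the slide.
# ASSUMPTION: the denominator is the 180 survey respondents.
RESPONDENTS = 180

task_counts = {
    "Reviewing others' changes": 121,
    "Fixing bug": 100,
    "Developing new feature": 89,
    "Reviewing my own changes": 73,
    "Writing/updating test cases": 48,
    "Refactoring": 34,
    "Resolving merge conflict": 30,
}


def selection_percentages(counts, total):
    """Return {task: percentage of participants who selected it}, rounded to 0.1."""
    return {task: round(100 * n / total, 1) for task, n in counts.items()}


percentages = selection_percentages(task_counts, RESPONDENTS)
for task, pct in percentages.items():
    print(f"{task:<30} {pct:5.1f}%")
```

Under that assumption, every value falls inside the 0% to 75% range of the chart axis.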
Answers to RQs
RQ1: Frequently practiced; major development tasks
RQ2: Information needs? Difficulty?
RQ3: Improvement?
Potential Information Needs
Literature review (code-change analysis and management): 180 articles in 10 SE venues over the past decade
Reasoning & assessing the change: Completeness, Clones, Design, ...
Exploring the change's context & impact: Risk, Consistency, Tests, ...
Evaluating the change history: Change-proneness, Defect-proneness
Survey Questions
Rate the importance & difficulty of each information need (formulated as a question) in a change understanding task
o Does this change introduce code clones?
o Does this change break any code elsewhere?
o Which tests should be run to verify this change?
o Is this changed location a hotspot for past fixes?
o ...
Importance: Very Important (3), Important (2), Somewhat Important (1), Not Important (0)
Difficulty: Very Difficult (3), Difficult (2), Relatively Easy (1), Straightforward (0)
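One plausible way to turn these ratings into a point on an importance/difficulty plot is to map each label to its 0 to 3 score and average over respondents. The slides do not say which aggregate was used, so the mean below is an assumption, and the sample responses are hypothetical rather than study data.

```python
from statistics import mean

# Label-to-score mappings taken from the rating scales on the slide.
IMPORTANCE = {"Very Important": 3, "Important": 2,
              "Somewhat Important": 1, "Not Important": 0}
DIFFICULTY = {"Very Difficult": 3, "Difficult": 2,
              "Relatively Easy": 1, "Straightforward": 0}


def score_need(responses):
    """Average the ratings for one information need.

    responses: list of (importance_label, difficulty_label) pairs,
    one pair per respondent. Returns (mean importance, mean difficulty).
    ASSUMPTION: the mean is used as the aggregate.
    """
    importance = mean(IMPORTANCE[imp] for imp, _ in responses)
    difficulty = mean(DIFFICULTY[dif] for _, dif in responses)
    return importance, difficulty


# HYPOTHETICAL responses for "Does this change break any code elsewhere?"
sample = [
    ("Very Important", "Difficult"),
    ("Important", "Very Difficult"),
    ("Very Important", "Very Difficult"),
    ("Important", "Difficult"),
]
importance, difficulty = score_need(sample)
print(importance, difficulty)
```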
RQ2: Information Needs
[Figure: plot of information needs by importance and by difficulty of acquiring the information, both on a 0 to 3 scale. Labeled points: Consistency, Risk, Completeness, Design, Rationale, Defect-proneness, Change-proneness]
Answers to RQs
RQ1: Frequently practiced; major development tasks
RQ2: Risk & Quality are important but difficult to know
RQ3: Improvement?
RQ3: Interview Items
[Figure: importance vs. difficulty of acquiring the information (0 to 3 scales), highlighting Risk, Rationale, Change-proneness, Defect-proneness]
Determining a Change's Risk
[Figure: importance vs. difficulty of acquiring the information (0 to 3 scales), highlighting Risk]
Current Practice on Determining a Change's Risk
Manual Code Review
o Error-prone
o Cross-components
o Unclear interface
o Hidden assumptions
Unit & Regression Testing
o Time consuming
o Depends on how thorough the tests are
Support Determining a Change's Risk
Manual code review
Navigation in diff: using code analysis tools (e.g., go to definition, find all references, caller/callee tree) on the code change
"miss a level of understanding object relationships"
Diff combined with Code Analysis: Navigation in diff
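To make "find all references" on a code change concrete, here is a minimal sketch that collects the identifiers on the added lines of a unified diff and looks them up in other files. It is a purely textual stand-in: real navigation would rely on a parser or language service, and the function names, keyword list, and file contents below are all hypothetical.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")
# Small, hypothetical stop list so keywords are not reported as identifiers.
KEYWORDS = {"if", "else", "for", "while", "var", "return", "function"}


def added_identifiers(diff_text):
    """Collect identifiers appearing on added ('+') lines of a unified diff."""
    names = set()
    for line in diff_text.splitlines():
        # Skip the '+++ b/file' header, which is not an added code line.
        if line.startswith("+") and not line.startswith("+++"):
            names.update(IDENT.findall(line[1:]))
    return names - KEYWORDS


def find_references(names, sources):
    """Return {name: [(path, line_number), ...]} using whole-word text matches."""
    refs = {name: [] for name in names}
    for path, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name in set(IDENT.findall(line)):
                if name in refs:
                    refs[name].append((path, lineno))
    return refs


# HYPOTHETICAL diff and file contents for illustration.
diff = (
    "--- a/hud.js\n"
    "+++ b/hud.js\n"
    "@@ -1,2 +1,3 @@\n"
    " if (hudRef.consolePanel) {\n"
    "+  hudRef.consolePanel.hidePopup()\n"
)
sources = {"panel.js": "function hidePopup() {\n  return true;\n}\n"}

changed = added_identifiers(diff)
references = find_references(changed, sources)
print(references["hidePopup"])
```

A text match cannot distinguish two unrelated functions that share a name, which is exactly the kind of object relationship that a diff view alone does not expose.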
Support Determining a Change's Risk
Testing
"which code must be retested as it is dependent upon the change?"
"who owns testing that dependency?"
"which tests must be run?"
"An Intelli-sense for updating these (affected) tests would be nice as well."
Discussion
[Figure: importance vs. difficulty of acquiring the information (0 to 3 scales), highlighting Rationale, Change-proneness, Defect-proneness]
Discussion
Why is understanding the rationale of a change easy?
o Availability & Quality of commit message
o "It's entirely up to the dev making the change as to how hard or easy it is for someone else to figure out why the change was made."
Why are historical metrics not that important?
o Developers: here and now, short-term issue, own knowledge
o Testers & PMs: "Historical metrics might be good to reflect bugginess and complexity of a specific area"
Other Information Needs
In addition to the information needs listed above, what else would you ask when you try to understand a code change? How difficult is it for you to answer?
"Can this change be broken into smaller discrete changes?"
Composite Code Change
Understanding a Composite Code Change
Decomposing a Composite Code Change
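One simple heuristic for decomposing a composite change is to group diff hunks that touch the same identifiers, on the assumption that such hunks belong to the same logical change. The sketch below does this with a union-find structure. The slides do not describe the authors' approach, so this is an illustration only, and the hunk ids and identifier sets are hypothetical.

```python
def decompose(hunks):
    """Group hunks that share at least one identifier, directly or transitively.

    hunks: {hunk_id: set of identifiers touched by that hunk}
    Returns a sorted list of groups, each a sorted list of hunk ids.
    """
    parent = {hunk_id: hunk_id for hunk_id in hunks}

    def find(h):
        # Walk to the root, halving the path along the way.
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    def union(a, b):
        parent[find(a)] = find(b)

    first_hunk_with = {}
    for hunk_id, names in hunks.items():
        for name in names:
            if name in first_hunk_with:
                union(hunk_id, first_hunk_with[name])
            else:
                first_hunk_with[name] = hunk_id

    groups = {}
    for hunk_id in hunks:
        groups.setdefault(find(hunk_id), []).append(hunk_id)
    return sorted(sorted(group) for group in groups.values())


# HYPOTHETICAL composite change with three hunks.
hunks = {
    "hud.js#1": {"hudRef", "consolePanel"},
    "hud.js#2": {"consolePanel", "hidePopup"},
    "url.js#1": {"aURL"},
}
parts = decompose(hunks)
print(parts)
```

Shared identifiers are only one possible signal. Formatting-only hunks or hunks linked through call relationships would need other criteria.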
Answers to RQs
RQ1: Frequently practiced; major development tasks
RQ2: Risk & Quality are important but difficult to know
RQ3: Determining a change's risk; decomposing a composite change
Summary
Evidence
o Understanding code changes is a fundamental practice that happens frequently in major development tasks
Challenges
o Determining a change's risk
o Assessing a change's consistency, completeness
o Understanding composite change
Practical Needs
o Navigation in diff
o Change decomposition
o Available & informative commit message
Acknowledgment
All participants of survey / interview
Miryung Kim, Robin Moeur, Thomas Zimmermann, Jacek Czerwonka, and Kathryn McKinley
