On the Naturalness of Buggy Code
Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane,
Zhaopeng Tu, Alberto Bacchelli, Premkumar Devanbu.
published in ICSE 2016
Jinhan Kim
2018.2.9
Naturalness
• Real software tends to be natural, like speech or natural language.
• It tends to be highly repetitive and predictable.
Naturalness of Software [1]
[1] A. Hindle, E. Barr, M. Gabel, Z. Su, and P. Devanbu. On the naturalness of software. In ICSE, pages 837–847, 2012.
What does it mean when code is considered unnatural?
Research Questions
• Are buggy lines less "natural" than non-buggy lines?
• Are buggy lines less "natural" than bug-fix lines?
• Is "naturalness" a good way to direct inspection effort?
Background
Language Model
• A language model assigns a probability to every sequence of words.
• Given a code token sequence s = t_1 t_2 … t_N, the model estimates p(s).
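The chain-rule decomposition behind this can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the `uniform` toy model and the token names are hypothetical.

```python
import math

def sequence_log_prob(tokens, cond_prob):
    """Chain rule: log2 p(t1..tN) = sum_i log2 p(t_i | t_1..t_{i-1})."""
    total = 0.0
    for i, tok in enumerate(tokens):
        total += math.log2(cond_prob(tok, tokens[:i]))
    return total

# Toy conditional model: uniform over a 4-token vocabulary.
vocab_size = 4
uniform = lambda tok, history: 1.0 / vocab_size

# For 3 tokens, log2 p = 3 * log2(1/4) = -6.
print(sequence_log_prob(["if", "(", "x"], uniform))  # -6.0
```

Any real model differs only in how `cond_prob` is estimated; the n-gram and $gram models below are two such estimators.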
n-gram Language Model
• Uses only the preceding n − 1 tokens:
  p(t_i | t_1 … t_{i-1}) ≈ p(t_i | t_{i-n+1} … t_{i-1})
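A count-based sketch of this estimate, assuming simple maximum-likelihood counts with no smoothing (real models such as the one in the paper use smoothing and backoff); the toy corpus is hypothetical.

```python
from collections import Counter

def ngram_prob(corpus, n, context, token):
    """MLE estimate of p(token | last n-1 context tokens) from n-gram counts."""
    grams = Counter(tuple(corpus[i:i + n]) for i in range(len(corpus) - n + 1))
    prefixes = Counter(tuple(corpus[i:i + n - 1]) for i in range(len(corpus) - n + 2))
    ctx = tuple(context[-(n - 1):])
    if prefixes[ctx] == 0:
        return 0.0
    return grams[ctx + (token,)] / prefixes[ctx]

corpus = "for ( i = 0 ; i < n ; i ++ )".split()
# Bigram: "i" is followed by "=", "<", and "++" once each -> 1/3 for "<".
print(ngram_prob(corpus, 2, ["i"], "<"))  # ~0.333
```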
$gram Language Model [2]
• Improves the n-gram model by deploying an additional cache list of n-grams extracted from the local context, to capture local regularities.
[2] Z. Tu, Z. Su, and P. Devanbu. On the localness of software. In SIGSOFT FSE, pages 269–280, 2014.
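The cache idea can be sketched as a simple interpolation between a global estimate and counts from a local cache. This is a simplification: the actual $gram model of Tu et al. uses a more refined backoff and cache-construction scheme, and the interpolation weight `lam` here is an arbitrary assumption.

```python
from collections import Counter

def cached_prob(global_prob, cache_ngrams, context, token, lam=0.5):
    """Blend a global n-gram estimate with counts from a local cache of n-grams."""
    ctx = tuple(context)
    hits = Counter(g for g in cache_ngrams if g[:-1] == ctx)
    total = sum(hits.values())
    p_cache = hits[ctx + (token,)] / total if total else 0.0
    return lam * p_cache + (1 - lam) * global_prob

# Locally, "foo" always follows "bar" in the cache; globally it is rare.
cache = [("bar", "foo"), ("bar", "foo"), ("baz", "qux")]
print(cached_prob(0.01, cache, ["bar"], "foo"))  # 0.5*1.0 + 0.5*0.01 = 0.505
```

This is why the cache captures "local regularities": identifiers repeated within a file get high cache probability even when they are globally rare.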
Study
Study Subject
Phase-1 (during active development)
• They analyze each project over the one-year period that contained the most bug fixes in that project's history.
• Then they extract snapshots at 1-month intervals.
Data Collection
Phase-2 (after release)
Entropy Measurement
• Entropy is measured with the $gram model.
• The line and file entropies are computed by averaging over all the tokens belonging to a line and all the lines of a file, respectively.
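The averaging described above can be sketched as follows; the per-token probabilities are hypothetical placeholders for whatever the $gram model assigns.

```python
import math

def token_entropy(p):
    """Per-token entropy under the model: -log2 p(token | context)."""
    return -math.log2(p)

def line_entropy(token_probs):
    """Average per-token entropy over the tokens of a line."""
    return sum(token_entropy(p) for p in token_probs) / len(token_probs)

def file_entropy(lines):
    """Average of line entropies over the lines of a file."""
    return sum(line_entropy(line) for line in lines) / len(lines)

# Hypothetical per-token probabilities for a two-line file.
lines = [[0.5, 0.25], [0.125]]
print(line_entropy(lines[0]))  # (1 + 2) / 2 = 1.5
print(file_entropy(lines))     # (1.5 + 3) / 2 = 2.25
```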
Entropy Measurement
• Package, class, and method declarations
  → previously unseen identifiers → higher entropy scores
• For-loop statements and catch clauses
  → often repetitive → lower entropy scores
Solution: classify lines into abstract-syntax-based line types and compute a syntax-sensitive entropy score.
Syntax-sensitive Entropy Score
• Match each line to its AST node type.
• Then compute how much a line's entropy deviates from the mean entropy of its line type.
• ⇒ $gram+type
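The deviation-from-type-mean idea can be sketched directly; the line-type names and entropy values are hypothetical, and the real score additionally normalizes and ranks lines.

```python
from collections import defaultdict

def type_adjusted_entropy(lines):
    """For (line_type, entropy) pairs, report each line's deviation from the
    mean entropy of its line type (the idea behind $gram+type)."""
    by_type = defaultdict(list)
    for line_type, h in lines:
        by_type[line_type].append(h)
    means = {t: sum(v) / len(v) for t, v in by_type.items()}
    return [(t, h - means[t]) for t, h in lines]

lines = [("for_stmt", 2.0), ("for_stmt", 4.0), ("catch_clause", 1.0)]
print(type_adjusted_entropy(lines))
# [('for_stmt', -1.0), ('for_stmt', 1.0), ('catch_clause', 0.0)]
```

A for-loop with entropy 4.0 now stands out against other for-loops, even though declarations elsewhere may have higher raw entropy.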
Relative Bug-proneness
• Additionally weight each line type by its observed bug-proneness.
• ⇒ $gram+wType
Evaluation
RQ1: Are buggy lines less "natural" than non-buggy lines?
Are buggy lines less "natural" than non-buggy lines?
Bug Duration
Bug Duration
Bugs that stay longer in a repository tend to have lower entropy than short-lived bugs.
RQ2: Are buggy lines less "natural" than bug-fix lines?
Are buggy lines less "natural" than bug-fix lines?
Example 1
Example 2
Example 3
Counterexample
RQ3: Is "naturalness" a good way to direct inspection effort?
DP: Defect Prediction
• Two classifiers
  • Logistic Regression (LR)
  • Random Forest (RF)
• Process metrics
  • # of developers
  • # of file commits
  • Code churn
  • Previous bug history
SBF: Static Bug Finder
• SBF uses syntactic and semantic properties of source code.
• For this study, PMD and FindBugs are used.
• NBF: Naturalness Bug Finder
• AUCEC: Area Under the Cost-Effectiveness Curve
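AUCEC measures how many bugs a ranked inspection order finds per unit of inspection cost. A simplified discrete sketch (the paper evaluates at fixed inspection budgets, e.g. a small percentage of lines; the ranking here is a hypothetical toy):

```python
def aucec(ranked_is_buggy, budget=1.0):
    """Discrete area under the cost-effectiveness curve:
    x = fraction of lines inspected (in ranked order),
    y = fraction of bugs found so far."""
    n = len(ranked_is_buggy)
    total_bugs = sum(ranked_is_buggy)
    found, area = 0, 0.0
    for i in range(int(n * budget)):
        found += ranked_is_buggy[i]
        area += found / total_bugs
    return area / n

# Perfect ranking: both buggy lines placed first among four lines.
print(aucec([1, 1, 0, 0]))  # (0.5 + 1 + 1 + 1) / 4 = 0.875
```

Ranking lines by entropy (NBF) and scoring with AUCEC lets the naturalness signal be compared directly against DP and SBF under the same inspection budget.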
Detecting Buggy Files
Detecting Buggy Lines
Result
• Buggy lines, on average, have higher entropies, i.e., are less natural, than non-buggy lines.
• The entropy of buggy lines drops after bug fixes, with statistical significance.
• Entropy can be used to guide bug-finding efforts at both the file level and the line level.
Appendix