On the Naturalness of Buggy Code
Baishakhi Ray, Vincent Hellendoorn, Saheel Godhane,
Zhaopeng Tu, Alberto Bacchelli, Premkumar Devanbu.
published in ICSE 2016
Jinhan Kim
2018.2.9
Naturalness
• Real software tends to be natural, like speech or natural language.
• It tends to be highly repetitive and predictable.
Naturalness of Software [1]
[1] A. Hindle, E. Barr, M. Gabel, Z. Su, and P. Devanbu. On the naturalness of software. In ICSE, pages 837–847, 2012.
What does it mean when code is considered unnatural?
Research Questions
• Are buggy lines less "natural" than non-buggy lines?
• Are buggy lines less "natural" than bug-fix lines?
• Is "naturalness" a good way to direct inspection effort?
Background
Language Model
• A language model assigns a probability to every sequence of words.
• Given a code token sequence s = t_1 t_2 … t_N, the model estimates p(s).
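The chain-rule decomposition behind this can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the `uniform` toy model and the token names are hypothetical.

```python
import math

def sequence_log_prob(tokens, cond_prob):
    """Chain rule: log2 p(t1..tN) = sum_i log2 p(t_i | t_1..t_{i-1})."""
    total = 0.0
    for i, tok in enumerate(tokens):
        total += math.log2(cond_prob(tok, tokens[:i]))
    return total

# Toy conditional model: uniform over a 4-token vocabulary.
vocab_size = 4
uniform = lambda tok, history: 1.0 / vocab_size

# For 3 tokens, log2 p = 3 * log2(1/4) = -6.
print(sequence_log_prob(["if", "(", "x"], uniform))  # -6.0
```

Any real model differs only in how `cond_prob` is estimated; the n-gram and $gram models below are two such estimators.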
n-gram Language Model
• Uses only the preceding n − 1 tokens:
  p(t_i | t_1 … t_{i-1}) ≈ p(t_i | t_{i-n+1} … t_{i-1})
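A count-based sketch of this estimate, assuming simple maximum-likelihood counts with no smoothing (real models such as the one in the paper use smoothing and backoff); the toy corpus is hypothetical.

```python
from collections import Counter

def ngram_prob(corpus, n, context, token):
    """MLE estimate of p(token | last n-1 context tokens) from n-gram counts."""
    grams = Counter(tuple(corpus[i:i + n]) for i in range(len(corpus) - n + 1))
    prefixes = Counter(tuple(corpus[i:i + n - 1]) for i in range(len(corpus) - n + 2))
    ctx = tuple(context[-(n - 1):])
    if prefixes[ctx] == 0:
        return 0.0
    return grams[ctx + (token,)] / prefixes[ctx]

corpus = "for ( i = 0 ; i < n ; i ++ )".split()
# Bigram: "i" is followed by "=", "<", and "++" once each -> 1/3 for "<".
print(ngram_prob(corpus, 2, ["i"], "<"))  # ~0.333
```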
$gram Language Model [2]
• Improves the n-gram model by deploying an additional cache list of n-grams extracted from the local context, to capture local regularities.
[2] Z. Tu, Z. Su, and P. Devanbu. On the localness of software. In SIGSOFT FSE, pages 269–280, 2014.
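The cache idea can be sketched as a simple interpolation between a global estimate and counts from a local cache. This is a simplification: the actual $gram model of Tu et al. uses a more refined backoff and cache-construction scheme, and the interpolation weight `lam` here is an arbitrary assumption.

```python
from collections import Counter

def cached_prob(global_prob, cache_ngrams, context, token, lam=0.5):
    """Blend a global n-gram estimate with counts from a local cache of n-grams."""
    ctx = tuple(context)
    hits = Counter(g for g in cache_ngrams if g[:-1] == ctx)
    total = sum(hits.values())
    p_cache = hits[ctx + (token,)] / total if total else 0.0
    return lam * p_cache + (1 - lam) * global_prob

# Locally, "foo" always follows "bar" in the cache; globally it is rare.
cache = [("bar", "foo"), ("bar", "foo"), ("baz", "qux")]
print(cached_prob(0.01, cache, ["bar"], "foo"))  # 0.5*1.0 + 0.5*0.01 = 0.505
```

This is why the cache captures "local regularities": identifiers repeated within a file get high cache probability even when they are globally rare.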
Study
Study Subject
Phase-1 (during active development)
• They analyze each project over the one-year period that contained the most bug fixes in that project's history.
• Then they extract snapshots at 1-month intervals.
Data Collection
Phase-2 (after release)
Entropy Measurement
• Entropy is measured with the $gram model.
• The line and file entropies are computed by averaging over all the tokens belonging to a line and all the lines of a file, respectively.
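The averaging described above can be sketched as follows; the per-token probabilities are hypothetical placeholders for whatever the $gram model assigns.

```python
import math

def token_entropy(p):
    """Per-token entropy under the model: -log2 p(token | context)."""
    return -math.log2(p)

def line_entropy(token_probs):
    """Average per-token entropy over the tokens of a line."""
    return sum(token_entropy(p) for p in token_probs) / len(token_probs)

def file_entropy(lines):
    """Average of line entropies over the lines of a file."""
    return sum(line_entropy(line) for line in lines) / len(lines)

# Hypothetical per-token probabilities for a two-line file.
lines = [[0.5, 0.25], [0.125]]
print(line_entropy(lines[0]))  # (1 + 2) / 2 = 1.5
print(file_entropy(lines))     # (1.5 + 3) / 2 = 2.25
```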
Entropy Measurement
• Package, class, and method declarations
  → previously unseen identifiers → higher entropy scores
• For-loop statements and catch clauses
  → often repetitive → lower entropy scores
Solution: classify lines into abstract-syntax-based line types and compute a syntax-sensitive entropy score.
Syntax-sensitive Entropy Score
• Match each line to its AST node type.
• Then compute how much a line's entropy deviates from the mean entropy of its line type.
• ⇒ $gram+type
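The deviation-from-type-mean idea can be sketched directly; the line-type names and entropy values are hypothetical, and the real score additionally normalizes and ranks lines.

```python
from collections import defaultdict

def type_adjusted_entropy(lines):
    """For (line_type, entropy) pairs, report each line's deviation from the
    mean entropy of its line type (the idea behind $gram+type)."""
    by_type = defaultdict(list)
    for line_type, h in lines:
        by_type[line_type].append(h)
    means = {t: sum(v) / len(v) for t, v in by_type.items()}
    return [(t, h - means[t]) for t, h in lines]

lines = [("for_stmt", 2.0), ("for_stmt", 4.0), ("catch_clause", 1.0)]
print(type_adjusted_entropy(lines))
# [('for_stmt', -1.0), ('for_stmt', 1.0), ('catch_clause', 0.0)]
```

A for-loop with entropy 4.0 now stands out against other for-loops, even though declarations elsewhere may have higher raw entropy.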
Relative Bug-proneness
• Additionally weight each line type by its observed bug-proneness.
• ⇒ $gram+wType
Evaluation
RQ1: Are buggy lines less "natural" than non-buggy lines?
Are buggy lines less "natural" than non-buggy lines?
Bug Duration
Bug Duration
Bugs that stay longer in a repository tend to have lower entropy than short-lived bugs.
RQ2: Are buggy lines less "natural" than bug-fix lines?
Are buggy lines less "natural" than bug-fix lines?
Example 1
Example 2
Example 3
Counterexample
RQ3: Is "naturalness" a good way to direct inspection effort?
DP: Defect Prediction
• Two classifiers
  • Logistic Regression (LR)
  • Random Forest (RF)
• Process metrics
  • # of developers
  • # of file commits
  • Code churn
  • Previous bug history
SBF: Static Bug Finder
• SBF uses syntactic and semantic properties of source code.
• For this study, PMD and FindBugs are used.
• NBF: Naturalness Bug Finder
• AUCEC: Area Under the Cost-Effectiveness Curve
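AUCEC measures how many bugs a ranked inspection order finds per unit of inspection cost. A simplified discrete sketch (the paper evaluates at fixed inspection budgets, e.g. a small percentage of lines; the ranking here is a hypothetical toy):

```python
def aucec(ranked_is_buggy, budget=1.0):
    """Discrete area under the cost-effectiveness curve:
    x = fraction of lines inspected (in ranked order),
    y = fraction of bugs found so far."""
    n = len(ranked_is_buggy)
    total_bugs = sum(ranked_is_buggy)
    found, area = 0, 0.0
    for i in range(int(n * budget)):
        found += ranked_is_buggy[i]
        area += found / total_bugs
    return area / n

# Perfect ranking: both buggy lines placed first among four lines.
print(aucec([1, 1, 0, 0]))  # (0.5 + 1 + 1 + 1) / 4 = 0.875
```

Ranking lines by entropy (NBF) and scoring with AUCEC lets the naturalness signal be compared directly against DP and SBF under the same inspection budget.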
Detecting Buggy Files
Detecting Buggy Lines
Result
• Buggy lines, on average, have higher entropies, i.e., are less natural, than non-buggy lines.
• The entropy of buggy lines drops after bug fixes, with statistical significance.
• Entropy can be used to guide bug-finding efforts at both the file level and the line level.
Appendix