Intelligent Software Updates: Leveraging
the Software Ecosystem to Support when
to update Library Dependencies
Raula Gaikovina Kula, PhD
Assistant Professor
30th May 2019 - Montreal, Canada
https://raux.github.io/
https://saner2019.github.io/program/asiaPacificTrack.html
2
About Me: Software Engineering Researcher
Mining
Software
Repositories
3
Software Engineering Laboratory
4 Faculty members
8 Doctoral students
7 Master students
6 Intern students
Assoc. Prof.
Takashi Ishio
Prof. Kenichi
Matsumoto
Assist. Prof. Hideaki Hata and
Raula Gaikovina Kula
4
5
My Goal of the Talk
Convince you that:
• Library dependencies impact the world
• More intelligent approaches are needed to assist developers
6
This talk merges two papers!
7
In 2017, 2018, 2019, somewhere out there, a
developer is in need of a function or a feature...
8
Why write code when you can use a
library?
Library
Dependencies
Source Code
9
Software Ecosystem
Adapted from biological ecosystems:
Szyperski [1]: defined as a set of businesses functioning as a
unit and interacting with a shared market for software and services,
together with relationships among them.
Lungu [2]: a collection of software systems, which are developed
and co-evolve in the same environment.
Stallman [3]: "It is a mistake to describe the free software
community, or any human community, as an 'ecosystem', because
that word implies the absence of (1) intention and (2) ethics."
10
The complex web of software
https://www.npmjs.com/
475,000 building blocks
800,000
11
12
Plethora of Empirical Studies at API and Library
• Popularity Trends
  • Mileva et al., IWPSE'09
  • De Roover, ICPC'13
• Evolution Studies (lags in updates)
  • Raemaekers et al., ICSME'12, MSR'13
  • Robbes et al., FSE'12
  • Bavota et al., ESE'15
• Dependency Networks (Transitive)
  • Decan et al., ESE'18, SANER'17
  • Abdalkareem et al., FSE'17
And many more!
13
14
Security Updates
Updates are sometimes strongly
recommended, especially for
security vulnerabilities
15
Awareness Mechanisms
2019
16
Methodology
Relentless mining of Software
Repositories.
17
Wisdom of the crowd is the new search
https://www.amazon.com
https://www.google.co.jp
And is run by algorithms
18
Ad hoc Mining of Libraries
• Third-party dependency releases are inconsistent, making comparison difficult and time-consuming:
  • Different semantic versioning, messy provenance tracking and release cycles, even within the same language
  • Dependency trees are heavy, especially at the ecosystem level
  • No standardized rules (e.g., semantic versioning)
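One small illustration of why comparing releases is error-prone: version strings do not sort correctly as plain text. The sketch below is only an assumption-laden example, not a method from the talk. `parse_version` is a hypothetical helper that keeps just the numeric components and ignores qualifiers such as `-beta`, which real ecosystems handle in differing ways.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Hypothetical helper: keep only the numeric parts, e.g. '1.10.0' -> (1, 10, 0)."""
    # Assumption: qualifiers like '-beta' or '.Final' are ignored entirely.
    return tuple(int(part) for part in re.findall(r"\d+", version))


versions = ["1.9.1", "1.10.0", "1.2"]

# Plain string sorting compares character by character, so '1.10.0' lands first.
print(sorted(versions))

# Sorting on the parsed tuples compares each component as a number.
print(sorted(versions, key=parse_version))
```

Even this simple case needs a parsing rule, and each ecosystem may need a different one.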
19
Key Motivation:
Develop a systematic technology that spans any ecosystem of language
platforms to understand:
• Popularity refers to the usage of a library over time.
• Adoption refers to systems introducing a new library dependency.
• Diffusion, inspired by use-diffusion, is a measure of the spread of library versions over dependent systems.
• Visualizations assist developers with the migration decision (not just "use the latest").
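The three measures above can be sketched over a flat list of dependency snapshots. This is a minimal illustration under stated assumptions, not the formal model from the papers: the record layout `(time, system, library, version)`, the sample data, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical snapshot records: (time, system, library, version).
records = [
    (1, "sysA", "beanutils", "1.8.0"),
    (1, "sysB", "beanutils", "1.8.0"),
    (2, "sysA", "beanutils", "1.9.1"),
    (2, "sysB", "beanutils", "1.8.0"),
    (2, "sysC", "beanutils", "1.9.1"),
]


def popularity(records, library):
    """Number of distinct systems using the library at each time step."""
    users = defaultdict(set)
    for time, system, lib, _ in records:
        if lib == library:
            users[time].add(system)
    return {time: len(systems) for time, systems in sorted(users.items())}


def adoption(records, library):
    """Number of systems that first introduce the library at each time step."""
    first_seen = {}
    for time, system, lib, _ in sorted(records):
        if lib == library:
            first_seen.setdefault(system, time)  # keep the earliest time only
    return dict(sorted(Counter(first_seen.values()).items()))


def diffusion(records, library, time):
    """Spread of the library's versions over dependent systems at one time step."""
    return dict(Counter(v for t, _, lib, v in records if t == time and lib == library))


print(popularity(records, "beanutils"))
print(adoption(records, "beanutils"))
print(diffusion(records, "beanutils", 2))
```

With the sample data, two systems use the library at time 1 and three at time 2, one system adopts it at time 2, and at time 2 the users are split across two versions.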
20
Empirical Study using the
Software Universe Graph
Our goal is to:
(1) construct real-world SUG models to show their practical
application and
(2) demonstrate the usefulness of visualization in library dependency
management through several case studies.
21
Example SUG Visualizations
22
Library Tracking Model
Simple example of the LU-based
metrics.
We show the Peak LU at time t1,
current LU at time t2 and library
residue (Peak LU / Current LU).
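A rough sketch of how the peak and current values could be read off a usage time series. It assumes LU is a per-time count of systems using one library version; the `lu_metrics` function and the sample numbers are hypothetical. Only the two LU values that the residue is derived from are computed.

```python
def lu_metrics(lu_over_time):
    """Return (peak_time, peak_lu, current_time, current_lu) for one library version.

    lu_over_time maps a time point to the number of systems using the version
    (an assumed representation, not taken from the slides).
    """
    if not lu_over_time:
        raise ValueError("need at least one observation")
    # Peak LU: the largest user count; ties resolve to the earliest time.
    peak_time = min(lu_over_time, key=lambda t: (-lu_over_time[t], t))
    # Current LU: the count at the latest observed time.
    current_time = max(lu_over_time)
    return peak_time, lu_over_time[peak_time], current_time, lu_over_time[current_time]


usage = {1: 4, 2: 10, 3: 7, 4: 5}
print(lu_metrics(usage))
```

Here the peak is reached at time 2 with 10 users, and 5 users remain at the latest time point, 4.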
23
Visualization of Library Usage
A Library Migration Plot. This example shows the
release of the related security advisory CVE-2014-0114
(black dashed line), which affects beanutils
version 1.9.1 (marked with crossbones).
24
Library Migration in
Practice
RQ1: To what extent are developers updating their library
dependencies?
We find that (i) although systems depend heavily on libraries,
most systems rarely update their libraries and (ii) systems are
less likely to migrate their library dependencies, with 81.5% of
systems remaining with a popular older version.
25
Effectiveness of
Awareness Mechanisms
RQ2: To what extent are developers updating their library
dependencies?
• For a new release of a popular library, (i) there exist patterns of consistent migration and patterns where an older popular library version is still preferred.
• For a security advisory disclosure, we find (ii) cases of developer non-responsiveness, which is sometimes due to an incomplete patch or a latent security advisory.
• 3 new releases of popular libraries
• 5 security vulnerabilities
26
Effectiveness of
Awareness Mechanisms
RQ3: Why are developers non-responsive to a security advisory?
• Vulnerable projects contacted for feedback
• Understand feedback
69% of developers were unaware of their vulnerable
dependencies and proceeded to immediately migrate to a
safer dependency.
• Developers evaluate based on project-specific priorities.
• Developers cite migration as a practice that requires extra effort and added responsibility.
27
Awareness and Motivation are needed
for Updates
But ..
Libraries
Age
28
State-of-the-Art
https://snyk.io/
Roles and Responsibilities?
29
Some Visualizations
30
MSR 2019 Distinguished Data Showcase Paper!
31
Other Related Publications
32
6/9/2019
Although Library Dependency usage is
prevalent, updating a library is not..
33
Summary on the data available for library
updates
Library dependency data is huge, and easily available
Smart modelling with context is needed
34
Future Ideas: API function level to
ecosystem level
How can we span to the ecosystem level?
35
Future Ideas: Combine API function level
to the ecosystem?
How can we span to the ecosystem level?
36
What is the Lag between Updates?
37
