This document proposes a methodology to detect knowledge transfer from science to technology using network analysis and bibliometrics. As a case study, it analyzes the fields of academic papers and patented technologies in solar cells. It clusters papers and patents by citation networks and identifies characteristic keywords in each cluster. It then calculates semantic similarities between clusters to create a meta-network and detects the maximum flow route, identifying a boundary cluster related to "dye sensitized" research that transferred from academic papers to patented technologies.
1. Detecting Knowledge Transitions Between Science and Technology for Forecasting Growing Fields
Hajime SASAKI*1
Yuya KAJIKAWA*2
Ichiro SAKATA*1
*1) Innovation Policy Research Center and Policy Alternatives Research Institute, The University of Tokyo.
*2) Innovation Management, Tokyo Institute of Technology.
2. Background (1): Relationships between Science and Technology
“Close links between academia and industry have many positive aspects not only for the business partner (Zucker and Darby, 2000; Hall et al., 2001) but also for the academic sector.” (Dirk et al., 2009)
“Industry–science collaborations might even trigger new basic research.” (Rosenberg, 1998)
“A positive relationship of patenting activities and publication outcome as well as publication quality.” (Van Looy et al., 2006; Czarnitzki et al., 2007; Breschi et al., 2007; Azoulay et al., 2006, etc.)
3. Background (2): Fundamental Challenges We Face
• Exponential growth of knowledge
We are drowning in a sea of information.
[Figure: volume of knowledge over time, rising from before modern history through the Renaissance and the Industrial Revolution]
Catching up with the pace of development is a prerequisite.
• Segmentation and specialization
It is difficult even for specialists to grasp the whole picture of their field.
4. Research Question
Question
• Can we detect a transition of knowledge from science to technology and thereby contribute to the early detection of promising technologies?
Purpose
• To propose a methodology, based on network analysis and bibliometrics, for detecting knowledge transfer from science to technology in areas that will become growing fields.
5. Citation Network Analysis
• Node: papers or patents in the largest connected component of the network data
• Link: citation between papers (or between patents)
• Year: average publication year of the nodes in the cluster
• Keywords: characteristic words in the cluster
Example) [Figure: a small citation network of nodes (papers) and links (citations) divided into three clusters]
Cluster | Nodes | Edges | Year | Keywords
Cluster 1 | 5 | 4 | 2003.21 | ----
Cluster 2 | 3 | 3 | 2004.36 | ----
Cluster 3 | 6 | 6 | 2007.96 | ----
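The node and year definitions on this slide can be sketched in a few lines. This is only an illustration on made-up toy data: the function names `largest_component` and `mean_year`, the paper identifiers, and the years are all hypothetical, and treating citations as undirected links when finding the component is an assumption.

```python
from collections import defaultdict, deque


def largest_component(edges):
    """Return the node set of the largest connected component.

    Citations are treated as undirected links (an assumption).
    """
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects one component.
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def mean_year(cluster, year_of):
    """Average publication year of the nodes in a cluster."""
    return sum(year_of[n] for n in cluster) / len(cluster)


# Hypothetical toy data: five papers, two disconnected groups.
edges = [("p1", "p2"), ("p2", "p3"), ("p3", "p1"), ("p4", "p5")]
year_of = {"p1": 2003, "p2": 2004, "p3": 2005, "p4": 2010, "p5": 2011}

core = largest_component(edges)
print(sorted(core), mean_year(core, year_of))
```

Only the nodes in `core` would go on to the clustering step; the smaller group is dropped.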
6. Methodology Flow
Science Layer: Extract Paper Dataset
Technology Layer: Extract Patent Dataset
Common process (applied to both layers):
1. Create Citation Network (by year)
2. Network Clustering
3. Keyword Recognition
4. Determine Similarity
5. Create Meta Network
6. Detect max flow route
Citation network: Node: Paper or Patent; Link: Citation
Meta network: Node: Cluster; Link: Similarity
7. Dataset: Solar cell (Photovoltaic, PV)
Science Layer
Dataset: Academic papers
Database: Thomson Web of Science
Query: “photovoltaic” OR “solar cell”
Number of Papers: 50,913
Technology Layer
Dataset: Patent gazettes
Database: Thomson Innovation
Query: “photovoltaic” OR “solar cell”
Number of Patents: 63,972
8. Network Clustering (Newman M.E.J., 2004)
w_sj: the proportion of the edge weight in the network that connects nodes in cluster s to those in cluster j
• The algorithm cuts off links that connect clusters sparsely and extracts clusters within which nodes are densely connected.
• A high value of Q represents a good community division, where dense edges remain within clusters and only sparse edges between clusters are cut off.
• Q = 0 means that a particular division gives no more within-community edges than would be expected by random chance.
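The modularity formula itself did not survive on this slide. Assuming it is the standard form from Newman (2004), Q = sum over clusters s of (w_ss - a_s^2) with a_s = sum over j of w_sj, a minimal sketch for an unweighted network looks like the following. The function name `modularity` and the toy graph (two triangles joined by one bridge) are hypothetical.

```python
from collections import defaultdict


def modularity(edges, cluster_of):
    """Q = sum_s (w_ss - a_s^2) for an unweighted, undirected edge list.

    w_ss: fraction of edges with both ends in cluster s.
    a_s:  fraction of edge ends attached to nodes in cluster s.
    """
    m = len(edges)
    within = defaultdict(int)  # edges inside each cluster
    ends = defaultdict(int)    # edge ends touching each cluster
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        ends[cu] += 1
        ends[cv] += 1
        if cu == cv:
            within[cu] += 1
    return sum(within[s] / m - (ends[s] / (2 * m)) ** 2 for s in ends)


# Hypothetical toy network: two triangles joined by the bridge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
single = {n: 1 for n in "abcdef"}

q_split = modularity(edges, split)    # 5/14, about 0.357
q_single = modularity(edges, single)  # 0: no better than chance
print(q_split, q_single)
```

Cutting the single bridge gives the higher Q, which matches the slide's description of a good division. Newman's algorithm searches for such divisions greedily; this sketch only scores a division that is already given.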
9. Characteristic Keywords in each Cluster (tf-idf)
• TF-IDF
– Favors terms with a large tf (term frequency)
– Favors terms with a small df (document frequency)
tfidf(d, w) = tf(d, w) · idf(w) = tf(d, w) · log(N / df(w))
N: number of documents
Terms with a high score are the specific terms.
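The formula above can be tried on a toy example. In this sketch each cluster is treated as one "document", which is an assumption (the slide does not say how documents are defined); the function name `tfidf_by_cluster` and the token lists are made up for illustration.

```python
import math
from collections import Counter


def tfidf_by_cluster(docs):
    """docs maps a cluster label to its list of tokens.

    Returns cluster -> {term: tf * log(N / df)}.
    """
    n = len(docs)
    tf = {c: Counter(tokens) for c, tokens in docs.items()}
    # df counts in how many clusters each term appears at least once.
    df = Counter(term for counts in tf.values() for term in counts)
    return {
        c: {t: k * math.log(n / df[t]) for t, k in counts.items()}
        for c, counts in tf.items()
    }


# Hypothetical token lists for three clusters.
docs = {
    "C1": ["solar", "cell", "thin", "film", "film"],
    "C2": ["solar", "cell", "polymer", "polymer", "organic"],
    "C3": ["solar", "cell", "dye", "dye", "dye", "tio2"],
}

scores = tfidf_by_cluster(docs)
top_c3 = max(scores["C3"], key=scores["C3"].get)
print(top_c3, scores["C3"]["solar"])
```

Terms such as "solar" that occur in every cluster get idf = log(1) = 0, so only cluster-specific terms like "dye" rank highly.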
10. Citation Network in Academic Research
Dataset: “Solar Cell” papers published until 2012
Node: paper; Link: citation
C1 (Cluster 1): “thin film”
C2 (Cluster 2): “organic”
C3 (Cluster 3): “dye sensitized”
C4 (Cluster 4): “power tracking”
C5 (Cluster 5): “HgCdTe photovoltaic” (HgCdTe: mercury cadmium telluride)
11. Citation Network in Patented Technology
Dataset: “Solar Cell” patents published until 2012
C1 (Cluster 1): “conductor”
C2 (Cluster 2): “protection sheet”
C3 (Cluster 3): “dye sensitized”
C4 (Cluster 4): “multi junction”
C5 (Cluster 5): “silicon semiconductor”
12. Top 10 keywords in each Cluster

Science (Academic Paper), until 2012
C1 | C2 | C3 | C4 | C5
film | polymer | tio2 | power | hgcdte
silicon | conjugated | dye | system | detector
thin film | blend | sensitized | inverter | infrared
thin | organic | sensitized solar | module | photovoltaic detector
deposition | p3ht | sensitized solar cell | grid | photodiodes
solar cell | poly | dye sensitized | renewable | fesi2
solar | pcbm | dye sensitized solar | tracking | wavelength
layer | conjugated polymer | dye sensitized solar cell | battery | array
efficiency | fullerene | electrolyte | maximum power | hg1
cdte | bulk heterojunction | zno | wind | focal plane

Science (Academic Paper), until 2011
C1 | C2 | C3 | C4 | C5
film | polymer | tio2 | power | hgcdte
silicon | conjugated | dye | system | detector
thin film | p3ht | sensitized | module | infrared
thin | blend | dye sensitized | inverter | photovoltaic detector
deposition | organic | sensitized solar | tracking | fesi2
solar cell | pcbm | sensitized solar cell | grid | photodiodes
solar | poly | dye sensitized solar | renewable | wavelength
layer | conjugated polymer | dye sensitized solar cell | wind | hg1
efficiency | bulk heterojunction | electrolyte | energy | array
cdte | fullerene | zno | battery | plane array

Technology (Patent), until 2012
C1 | C2 | C3 | C4 | C5
photovoltaic | resin | dye | layer | type
electrode | sheet | sensitized | oxide | electrode
contact | polyester | dye sensitized | silicon | silicon
tab | sealing | sensitized solar | subcell | semiconductor substrate
layer | protection sheet | dye sensitized solar | junction | etching
module | copolymer | sensitized solar cell | transparent conductive | wiring
photovoltaic cell | solar cell module | dye sensitized solar cell | conductive | surface
portion | sheet for solar cell | electrolyte | transparent | portion
material | sheet for solar | electrode | semiconductor | substrate
conductive | cell module | sensitizing | solar subcell | impurity

Technology (Patent), until 2011
C1 | C2 | C3 | C4 | C5
layer | resin | dye | transparent | organic
photovoltaic | sheet | sensitized | film | dye
electrode | polyester | dye sensitized | oxide | photoactive
module | solar cell module | sensitized solar | layer | electrode
contact | sealing | dye sensitized solar | silicon | layer
photovoltaic cell | cell module | sensitized solar cell | electrode | material
forming | sheet for solar cell | dye sensitized solar cell | transparent conductive | electron
subcell | sheet for solar | electrode | conductive | comprises
surface | copolymer | sensitizing | amorphous | photovoltaic
tab | module | electrolyte | plasma | sensitized

(In both layers the tables continue in the same form for earlier cut-off years: until 2010, until 2009, and so on.)
17. Top 10 Keywords in each boundary cluster
Science Layer: Cluster 2 in academic papers, until 2005
Technology Layer: Cluster 2 in patents, until 2007
Academic paper (2005, C2) | Patent gazette (2007, C2)
tio2 | dye
dye | sensitized
polymer | dye sensitized
sensitized | dye sensitized solar
dye sensitized | dye sensitized solar cell
sensitized solar | sensitized solar cell
sensitized solar cell | sensitized solar
nanocrystalline | sensitizing
dye sensitized solar | pigment
dye sensitized solar cell | electrode
18. Summary
• Understanding the knowledge relationships between science and technology is necessary for innovation strategy.
• We treated the relationship as a time-expanded, heterogeneous network.
• We interpreted the problem as a maximum flow problem on a semantic similarity network.
• We used the photovoltaic field as a case study.
• We identified boundary clusters related to the “dye sensitized” field.
19. Thank you for your attention.
References
• Zucker, L.G., Darby, M.R., “Capturing technological opportunities via Japan’s star scientists.”, Journal of Technology Transfer 26, 37–58, 2000.
• Hall B.H., Link A.N., Scott J.T., “Barriers inhibiting industry from partnering with universities: evidence from the advanced technology program.”, Journal of Technology Transfer 26, 87–98, 2001.
• Azoulay P., Ding W., Stuart T., “The impact of academic patenting on the rate, quality and direction of (public) research.”, NBER working paper 11917, Cambridge, MA, 2006.
• Rosenberg N., “Chemical engineering as a general purpose technology.”, in: Helpman, E. (Ed.), General Purpose Technologies and Economic Growth, MIT Press, Cambridge, pp. 167–192, 1998.
• Francis Narin, Kimberly S. Hamilton and Dominic Olivastro, “The increasing linkage between U.S. technology and public science”, Research Policy, Volume 26, Issue 3, pp. 317–330, 1997.
• N. Shibata, Y. Kajikawa, and I. Sakata, “Extracting the commercialization gap between science and technology - case study of a solar cell.”, Technological Forecasting and Social Change, 77:1147–1155, 2010.
• Small H., “Citation Structure of an Emerging Research Area: Organic Thin Film.”, Proceedings of ISSI, pp. 718-725, 2007.
• Shichijo (七丈), “A tentative study toward a dynamic understanding of research fields using co-citation clustering” (in Japanese), Journal of Japan Society of Information and Knowledge (情報知識学会誌), Vol. 23, No. 3, 371-379, 2013.
• F.G. Engineer, G.L. Nemhauser, and M.W.P. Savelsbergh, “Dynamic programming-based column generation on Time-Expanded Network: Application to the Dial-a-Flight problem”, INFORMS Journal on Computing, 23(1), pp. 105-119, 2011.
• N. Shah, S. Kumar, F. Bastani, and I.L. Yen, “Optimization models for assessing the peak capacity utilization of intelligent transportation systems”, European Journal of Operational Research, 216, pp. 239-251, 2012.
• Newman M.E.J., “Fast algorithm for detecting community structure in networks”, Physical Review E, Vol. 69, p. 066133, 2004.
• Ino H., Kudo M., Nakamura A., “A Comparative Study of Algorithms for Finding Web Communities”, Data Engineering Workshops, 2005. 21st International Conference, 1257, 2005.
• Horiike T., Takahashi Y., Kuboyama T. and Sakamoto H., “Extracting Research Communities by Improved Maximum Flow Algorithm”, Knowledge-Based and Intelligent Information and Engineering Systems, Lecture Notes in Computer Science Volume 5712, pp. 472-479, 2009.
• Victor Ströele, Geraldo Zimbrão, Jano M. Souza, “Modeling, Mining and Analysis of Multi-Relational Scientific Social Network”, Journal of Universal Computer Science, vol. 18, no. 8, 1048-1068, 2012.
• A. V. Goldberg and R. E. Tarjan, “A New Approach to the Maximum Flow Problem”, Journal of the ACM 35:921-940, 1988.
• Péter Érdi, Kinga Makovi, Zoltán Somogyvári, Katherine Strandburg, Jan Tobochnik, Péter Volf, László Zalányi, “Prediction of Emerging Technologies Based on Analysis of the U.S. Patent Citation Network”, Scientometrics, Volume 95, Issue 1, pp. 225-242, 2013.
Editor's Notes
Much research reports that a close relationship between science and technology has positive effects.
According to Dr. Narin, the rate of patents that cite academic papers grew rapidly from 1987 to 1994.
This means that the influence of scientific knowledge on technology is growing more and more.
In the field of DNA research, when Dr. Watson and Dr. Crick discovered the double-helix structure, around 100 DNA-related academic papers were published per year. You could probably have read all of them within the year.
Now an academic database returns about 100,000 papers per year in this field. No researcher can read them all.
Knowledge is growing rapidly and becoming more segmented.
It is difficult even for specialists to grasp the whole picture of their field.
Citation network analysis is one of the methodologies used in bibliometric analysis.
A network consists of nodes and links.
In this analysis we treat academic papers and patents as nodes and citation relationships as links.
We divided the process into two layers: a science layer and a technology layer.
First, we extracted a dataset for each layer. In the science layer, we extracted academic papers related to the target field. In the technology layer, we extracted patents.
In both layers, we then apply the six common steps.
Sustainable and renewable energies have been widely accepted as a key concept for our common future [25]. A solar cell or photovoltaic cell, which is a device that converts solar energy into electricity via the photovoltaic effect, represents a promising research front for our future sustainable ecosystem.
Newman's algorithm extracts tightly knit clusters with a high density of links within each cluster. The clustering algorithm is based on the idea of maximizing modularity.
Modularity is defined as the fraction of links that fall within clusters, minus the expected value of the same quantity if the links fell at random without regard for the cluster structure of the network.
A high value of modularity represents a good division into clusters, where dense links remain within clusters and only sparse links run between clusters.
Newman's algorithm [23] can efficiently search for the division that maximizes modularity by cutting off the links that connect clusters sparsely and extracting the clusters within which nodes are densely connected.
tf_{i,d} is the number of occurrences of the i-th term in document d,
df_i is the number of documents containing the i-th term,
N is the total number of documents.
This visualization shows the citation network of academic research related to solar cells published until 2012.
We use cosine similarity to measure the semantic similarity between clusters.
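As a rough illustration of this step, cosine similarity between two clusters can be computed from their keyword weight vectors. The function name `cosine` and the weights below are invented for the example; in practice the weights would presumably be the tf-idf scores of each cluster's keywords, which is an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors given as {term: weight}."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical keyword weights for one paper cluster and two patent clusters.
paper_c3 = {"dye": 3.0, "sensitized": 2.0, "tio2": 1.0}
patent_c3 = {"dye": 3.0, "sensitized": 2.0, "electrolyte": 1.0}
patent_c1 = {"electrode": 2.0, "contact": 1.0}

sim_close = cosine(paper_c3, patent_c3)  # 13/14, about 0.93
sim_far = cosine(paper_c3, patent_c1)    # 0.0, no shared terms
print(sim_close, sim_far)
```

Computing this for every pair of a paper cluster and a patent cluster gives the matrix that a similarity heatmap displays.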
This heatmap shows the cosine similarity between the paper clusters until 2012 and the patent clusters until 2012.
You can see that the most similar pair is C3 in papers and C3 in patents.
C3 is the cluster mainly focused on dye-sensitized solar cells.
It was also pointed out that research on dye-sensitized solar cells is focusing on the improvement in cell performance to a conversion efficiency of 15% through the development of new dyes and advanced cell structures, as well as that of production technology for large-area modules with integrated circuits on various substrates.
We considered C3 in patents to be a growing field.
Our problem is to detect the knowledge transfer to Cluster 3 in patents.
We treated this problem as a maximum flow problem in a network, namely the meta network built from the citation networks.
“What we analyze in Maximum flow is how many flow units each node can pass to each other.
Based on Figure 3, we can see that node 1 can pass up to ten units to nodes 2, 3, and 4.
Node 2 can pass up to 5 units to both nodes 6 and 7.
Node 3 can pass up to six flow units to node 6, and four units to node 7.
Node 4 can pass up to ten flow units to node 5.
Finally, nodes 5, 6 and 7 can pass up to ten, eleven and nine units to receptor node 8, respectively. Thus, the maximum flow from node 1 to node 8 is of 30 units.”
“The objective of the developed algorithm is to group nodes with the largest flow of knowledge between them.”
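The quoted example can be checked with a short maximum flow computation. The sketch below uses the Edmonds-Karp method (shortest augmenting paths), which is chosen here only for brevity and is not necessarily the algorithm used in the cited work or in this study. The capacities are taken from the quoted description; the function name `max_flow` is hypothetical.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, targets in capacity.items():
        for v, c in targets.items():
            residual[u][v] += c
            residual[v][u] += 0  # make sure the reverse edge exists
    total = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Update residual capacities along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Capacities as described in the quoted example (source node 1, receptor node 8).
capacity = {
    1: {2: 10, 3: 10, 4: 10},
    2: {6: 5, 7: 5},
    3: {6: 6, 7: 4},
    4: {5: 10},
    5: {8: 10},
    6: {8: 11},
    7: {8: 9},
}

flow = max_flow(capacity, 1, 8)
print(flow)  # 30, matching the quoted example
```

In the meta network of this study, the nodes would be clusters and the capacities would be similarity values rather than these integer units.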
In this network, we treat each cluster in each year as a node and similarity as a link.
Victor Ströele, Geraldo Zimbrão, Jano M. Souza, 2011, “Evaluating Knowledge Flow in Multirelational Scientific Social Networks”
We assume two heterogeneous networks.
We regard these two clusters as the boundary clusters through which dye-sensitized knowledge transitioned from science to technology.