Manuel Sojer Reusing Open Source Code
GABLER RESEARCH Innovation und Entrepreneurship. Edited by Professor Dr. Nikolaus Franke, Wirtschaftsuniversität Wien; Professor Dietmar Harhoff, Ph.D., Universität München; and Professor Dr. Joachim Henkel, Technische Universität München
Innovative concepts and entrepreneurial achievements are of decisive importance for prosperity and progress. This series brings together scholarly works on this subject area. They present substantial findings at a high methodological level.
Manuel Sojer
Reusing Open Source Code: Value Creation and Value Appropriation Perspectives on Knowledge Reuse. With a Foreword by Univ.-Prof. Dr. Joachim Henkel
Bibliographic information published by the Deutsche Nationalbibliothek. The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.
Dissertation Technische Universität München, 2010
1st Edition 2011
All rights reserved
© Gabler Verlag | Springer Fachmedien Wiesbaden GmbH 2011
Editorial Office: Stefanie Brich | Jutta Hinrichsen
Gabler is a brand of Springer Fachmedien. Springer Fachmedien is part of Springer Science+Business Media. www.gabler.de
No part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the copyright holder. Registered and/or industrial names, trade names, trade descriptions etc. cited in this publication are part of the law for trade-mark protection and may not be used free in any form or by any means even if this is not specifically marked.
Cover design: KünkelLopka Medienentwicklung, Heidelberg
Printed on acid-free paper
Printed in the Netherlands
ISBN 978-3-8349-2668-5
Preface

Over the past decade, open source software (OSS) has attracted enormous interest from practitioners and academics alike. However, research on OSS has focused mostly on individuals’ and firms’ contributions to public OSS projects. In contrast, the receiving side of this open and collaborative development process has been given much less attention, despite indications that the reuse of OSS code is of high importance in the development of both OSS and proprietary software. Questions regarding the quantity, motivation, and drivers of OSS code reuse have been studied by only a few authors, and in no case quantitatively. In particular, the role of individual programmers in the context of “ad hoc” code reuse and the concomitant legal risks are largely unexplored.

Manuel Sojer addresses the above issues in this ground-breaking book. After developing the theoretical foundations of his work, he presents two large-scale empirical studies on the reuse of publicly available OSS code. Both studies are based on carefully constructed models that draw on the Theory of Planned Behavior. Focusing on the amount of code reuse that programmers practice in public OSS projects, the first survey yields highly interesting findings regarding the drivers of and impediments to reuse in this setting, with important conclusions to be drawn for code reuse within firms. The second survey complements the first one by addressing the reuse of OSS and similar code in proprietary software development. In particular, with its focus on license risks, this study takes a value appropriation rather than a value creation perspective. Using an elaborate survey design, the author derives important results regarding the determinants of negligent or even deliberate violation of license obligations by employed programmers. Like the first one, this study has significant implications for academics as well as for managers.

This book is Manuel Sojer’s doctoral thesis at Technische Universität München. It is full of good ideas, flawless analyses, and novel findings, and I strongly recommend it to practitioners and academics alike. It was a pleasure to be Manuel Sojer’s thesis advisor.
Prof. Dr. Joachim Henkel
Foreword

The reuse of existing knowledge is of crucial importance in innovation processes in general and particularly so in software development. In my dissertation I investigate this phenomenon from the perspective of individual software developers and research their reuse of open source code. In this context I analyze, on the one hand, the factors which influence the extent to which software developers reuse existing knowledge in the form of code and thereby increase the value creation of their firms. On the other hand, I investigate potential value appropriation risks for firms which may result from their software developers’ reuse of existing knowledge in the form of code from the internet (such as open source code).

Throughout my dissertation work I have received both advice and support from many people. I want to take this opportunity to express my gratitude to all those who have helped me along the way. First of all, I am indebted to my thesis advisor Prof. Dr. Joachim Henkel for his continuous support of my work, his valuable suggestions and comments regarding my research, his approachability and responsiveness at all times and his contagious fascination for interesting research questions. I am also grateful to my second advisor Prof. Dr. Isabell Welpe and to Prof. Dr. Christoph Kaserer, who chaired my dissertation committee.

The empirical approach of my dissertation would not have been possible without the more than 50 industry experts and software developers who made themselves available for interviews, as well as the roughly 2,000 participants in the two surveys I conducted. I am indebted to all of them for their valuable thoughts and their time.

I am also grateful to my colleagues Oliver Alexy, Jörn Block, Annika Bock, Timo Fischer, Florian Jell, Stefanie Pangerl, Anja Schön, Frank Spiegel, Johannes Wechsler and Evelin Winands for the pleasant and inspiring atmosphere at the Schöller Chair in Technology and Innovation Management. Particular thanks go to Oliver Alexy, Timo Fischer and Johannes Wechsler for their comments and suggestions regarding both content and methodology of my work. Moreover, I owe thanks to my dear friend Werner Skalla for diligently reading and commenting on my manuscript. I am also grateful to Michael Maier for the various activities with which he supported me as a student assistant during his time at the Schöller Chair of Technology and Innovation Management.
Finally, I am indebted to my girlfriend Maria for her continuous encouragement, her optimism and her thoughtfulness; in a word, for her companionship along the way. Above all, I thank my parents and my brother for always and unconditionally providing me with a safe harbor. My parents have nurtured my education from the very beginning and have always supported me on my way. In doing so they have laid the foundations for this dissertation, which I dedicate to them.
Manuel Sojer
Table of Contents

List of Figures
List of Tables
List of Abbreviations
Zusammenfassung (Summary)
Abstract
1. Introduction
  1.1. Motivation: The Cisco/Linksys case
  1.2. Research objectives
  1.3. Structure of the dissertation
2. Foundations of value creation and value appropriation
  2.1. Concepts and terminology
  2.2. Determinants of value creation
  2.3. Determinants of value appropriation
  2.4. Summary
3. Open source software developers’ perspectives on code reuse
  3.1. Introduction
  3.2. Foundations of knowledge reuse
    3.2.1. Knowledge reuse to create value
    3.2.2. Knowledge reuse in software development
    3.2.3. The not-invented-here syndrome
    3.2.4. Intermediate conclusion
  3.3. OSS and its development
    3.3.1. History of OSS
    3.3.2. OSS licenses
    3.3.3. OSS development
    3.3.4. Motivations of OSS developers
    3.3.5. Code reuse in OSS development
    3.3.6. Intermediate conclusion and detailed research questions
  3.4. Research model and hypotheses
    3.4.1. The theory of planned behavior
    3.4.2. Qualitative pre-study
    3.4.3. Determinants of code reuse behavior
  3.5. Survey design and methodology
    3.5.1. Data source and sample selection
    3.5.2. Survey design
    3.5.3. Pretest
    3.5.4. Conducting the survey
  3.6. Descriptive and exploratory analyses
    3.6.1. Survey participants and their OSS projects
    3.6.2. Importance and extent of code reuse
    3.6.3. Developers’ reasons for and against code reuse
    3.6.4. Component and snippet reuse
    3.6.5. Developers’ sources to search for existing code to reuse
    3.6.6. Summary
  3.7. Multivariate analysis of determinants of code reuse
    3.7.1. Hypotheses
    3.7.2. Variables
    3.7.3. Statistical methods used
    3.7.4. Results
    3.7.5. Discussion and summary
  3.8. Conclusion
4. Commercial software developers’ perspectives on internet code reuse
  4.1. Introduction
  4.2. Foundations of internet code reuse in commercial software development
    4.2.1. Obligations from internet code reuse
    4.2.2. Internet code reuse in commercial software development
    4.2.3. Intermediate conclusion and detailed research questions
  4.3. Research model and hypotheses
    4.3.1. Theoretical models to predict ethical behavior
    4.3.2. Qualitative pre-study
    4.3.3. Determinants of violations of internet code reuse obligations
  4.4. Survey design and methodology
    4.4.1. Data source and sample selection
    4.4.2. Survey design
    4.4.3. Pretest
    4.4.4. Conducting the survey
  4.5. Descriptive and exploratory analyses
    4.5.1. Survey participants and their firms
    4.5.2. Developer awareness of internet code reuse obligations
    4.5.3. Internet code reuse in commercial software development
    4.5.4. Extent of (potential) violations of internet code obligations
    4.5.5. Summary
  4.6. Research model testing and results
    4.6.1. Hypotheses
    4.6.2. Statistical methods used
    4.6.3. Measurement model assessment and descriptive statistics
    4.6.4. Structural model assessment
    4.6.5. Discussion and summary
  4.7. Conclusion
5. Conclusion
Appendix
  A.1. Code reuse in open source software development
  A.2. Code reuse in commercial software development
Bibliography
List of Figures

Figure 2-1: Concept of value creation
Figure 3-1: OSS code reuse research model
Figure 3-2: Construction of OSS code reuse survey population
Figure 3-3: OSS developers’ motivations to work on current main project
Figure 3-4: Share of reused code in functionality contributed to OSS projects
Figure 3-5: Code reuse benefits perceived by OSS developers
Figure 3-6: Code reuse drawbacks and issues perceived by OSS developers
Figure 3-7: OSS developers’ subjective norm on code reuse
Figure 3-8: OSS project policies on code reuse
Figure 3-9: General impediments to code reuse perceived by OSS developers
Figure 3-10: Number of reused components in OSS projects
Figure 3-11: Share of snippets in lines of code contributed to OSS projects
Figure 3-12: Component and snippet focused OSS developer groups
Figure 3-13: OSS developers’ sources to find existing code to reuse
Figure 3-14: Summary of tested OSS code reuse research model hypotheses
Figure 4-1: Theory of reasoned action and theory of planned behavior
Figure 4-2: Internet code reuse obligation violation research model
Figure 4-3: Construction of internet code reuse survey population
Figure 4-4: Commercial software developers’ training regarding internet code
Figure 4-5: Importance of internet code reuse for commercial software developers
Figure 4-6: Evolution of importance of internet code reuse over time
Figure 4-7: Frequency of (potential) violations of internet code obligations
Figure 4-8: Structural model results for obligation violation model (scenario 1)
Figure 4-9: Structural model results for obligation violation model (scenario 2)
Figure 4-10: Structural model results for obligation violation model (scenario 3)
Figure A-1: OSS developer survey questionnaire
Figure A-2: Commercial software developer survey questionnaire – scenario 1
Figure A-3: Commercial software developer survey questionnaire – scenario 2
Figure A-4: Commercial software developer survey questionnaire – scenario 3
List of Tables

Table 3-1: OSS code reuse survey response statistics
Table 3-2: Demographics of OSS code reuse survey participants
Table 3-3: Reliability of OSS developer motivation constructs
Table 3-4: Loadings of OSS developer motivation items
Table 3-5: Discriminant validity of OSS developer motivation constructs
Table 3-6: Characteristics of OSS developers’ current main projects
Table 3-7: Reliability of OSS code reuse importance constructs
Table 3-8: Rotated factor loadings of benefits of OSS code reuse items
Table 3-9: Rotated factor loadings of drawbacks & issues of OSS code reuse items
Table 3-10: OSS developers’ sources to find existing code by access to local search
Table 3-11: Summary of OSS code reuse research model hypotheses
Table 3-12: Descriptive statistics of dependent variables
Table 3-13: Descriptive statistics of explanatory dummy variables
Table 3-14: Descriptive statistics of ordinal and metric explanatory variables
Table 3-15: Correlation matrix of independent variables
Table 3-16: Model: Importance of past code reuse (ImpRePast)
Table 3-17: Model: Share of code reuse in past contributions (ReuseSharePast)
Table 3-18: Model: Importance of future code reuse (ImpReFut)
Table 4-1: Internet code reuse survey response statistics
Table 4-2: Correlation of social desirability scale with other variables
Table 4-3: Demographics of internet code reuse survey participants
Table 4-4: Characteristics of commercial software developers’ firms
Table 4-5: Commercial software developers’ internet code reuse knowledge
Table 4-6: Summary of hypotheses regarding violations of internet code obligations
Table 4-7: Reliability, convergent validity and descriptive statistics of constructs
Table 4-8: Construct correlations and discriminant validity
Table 4-9: Summary of research model hypotheses testing
Table A-1: Standardized coefficients of OSS developer code reuse models
Table A-2: Marginal effects of OSS developer code reuse models
Table A-3: Quiz on commercial software developers’ internet code knowledge
Table A-4: Loadings of internet code reuse model items
List of Abbreviations

ACM: Association for Computing Machinery
AGPL: GNU Affero General Public License
AMR: Academy of Management Review
AVE: Average Variance Extracted
BSD: Berkeley Software Distribution
C’s α: Cronbach’s α
CASE: Computer-Aided Software Engineering
CBSEM: Covariance-Based Structural Equation Modeling
CR: Composite Reliability
EPL: Eclipse Public License
FSF: Free Software Foundation
GPL: GNU General Public License
IP: Intellectual Property
IR: Indicator Reliability
IS: Information Systems
KMO: Kaiser-Meyer-Olkin
LGPL: GNU Lesser General Public License
LISREL: LInear Structural RELations
MPL: Mozilla Public License
OLS: Ordinary Least Squares
OSD: Open Source Definition
OSI: Open Source Initiative
OSL: Open Software License
OSS: Open Source Software
PLS: Partial Least Squares
R&D: Research and Development
RBV: Resource-Based View
S.D.: Standard Deviation
TAM: Technology Acceptance Model
TPB: Theory of Planned Behavior
TRA: Theory of Reasoned Action
VC: Venture Capital
Zusammenfassung (Summary)

The reuse of existing knowledge is of great importance in innovation activities because it can increase the effectiveness, efficiency and quality of development activities. Knowledge reuse is particularly relevant in software development, which also forms the empirical context of this dissertation. In this setting, the reuse of existing code as one form of explicit knowledge is advocated above all. Existing research has addressed both knowledge reuse in general and the reuse of code in software development. A central finding of this work is the high importance of individual developers in knowledge reuse. Despite this finding, however, there are hardly any studies that focus explicitly and in detail on individual developers and attempt to analyze their role.

Against this background, this dissertation uses two large-scale empirical studies to investigate the reuse of open source software (OSS) code and other code available on the internet in software development projects. The first study takes a value creation perspective and examines, at the level of individual software developers, drivers that influence the extent of code reuse in their work. Subsequently, the second study analyzes, from a value appropriation perspective, factors that influence whether individual software developers violate licenses when reusing code and thereby possibly jeopardize the economic success of their employer.

With twelve exploratory interviews and a quantitative survey with 684 participants, the first study investigates the reuse of existing OSS code by individual software developers in public OSS projects. This context is particularly well suited to examining the role of individual developers with regard to the value creation benefits of knowledge reuse. In contrast to, for example, software developers in firms, who can often choose only from the limited supply of code in their firm-internal libraries, software developers in public OSS projects in principle have all existing OSS code available for reuse. Furthermore, software developers in public OSS projects are not influenced by firm-internal rules. Thus, the reuse behavior of software developers in public OSS projects should be influenced primarily by their own considerations.

The results of this study show that software developers in the OSS setting reuse existing knowledge in the form of code to a considerable extent. In addition, concrete factors that influence this behavior can be identified. Software developers with a larger personal network within the OSS community and software developers with a broader range of project experience within OSS reuse, ceteris paribus, more existing code. The OSS particularity of delivering a “plausible promise” in the form of functioning software as quickly as possible after the start of a new project also seems to lead developers to reuse more existing code. Finally, the reuse of existing code is less important to software developers who particularly enjoy solving complex technical problems.

Following this, the second study, with 20 exploratory interviews and a quantitative survey with 1,133 participants, addresses the value appropriation risks that can arise from the reuse of existing knowledge by individual developers. As a concrete example, it examines the problems that can arise for firms when their software developers reuse code available on the internet in proprietary software. Code from the internet typically comes under licenses (such as the widely used GNU General Public License (GPL)) that impose conditions on reuse. Some of these licenses require that software which contains such code in certain ways be made available to its users in source code form and that these users be allowed to modify and redistribute it without restrictions. The originally proprietary software thereby becomes OSS, which makes it more difficult or in some cases impossible for firms to earn money by selling this software.

The results of the study show that today the majority of software developers in firms reuse code from the internet in their projects, but are not optimally prepared with regard to the risks for their employers that may result from the licenses. Furthermore, concrete factors can be identified that influence whether software developers reuse code from the internet in a way that potentially jeopardizes their employer’s value appropriation. Software developers who assess the consequences of such behavior, both for their firm and for themselves personally, as less severe are more inclined toward problematic reuse behavior. The same applies to software developers who fear strongly negative consequences in their work from missed deadlines, and to software developers who find it very laborious and difficult to clarify and take into account possible license problems. In addition, software developers are also influenced in their reuse behavior by their social environment and the ethical climate within their firm.

The results of the two studies are relevant for theory and practice. From a theoretical point of view, their value lies in illuminating the role of individual developers in the reuse of existing knowledge in innovation processes using large-scale empirical data from the field of software development. It becomes apparent that various characteristics and beliefs of individual developers influence both the value creation benefits of reusing existing knowledge and possible value appropriation problems resulting from it. With its results, the dissertation contributes to both the management literature and information systems research. From a practical point of view, this work is helpful in particular for firms with software development activities and is applicable in general to firms in a wide range of industries that wish to take advantage of the opportunity to reuse existing knowledge in open innovation processes. With the help of the results of this work, firms should be able, on the one hand, to increase the share of reused knowledge in the work of their developers (and thus their value creation) and, on the other hand, to ensure that no value appropriation problems result from their developers’ knowledge reuse.
Abstract Reusing existing knowledge is crucial in innovation activities to enhance their effectiveness, efficiency and quality. This is especially so in software development which is also the empirical context of this dissertation. In this space primarily the reuse of code as one form of explicit knowledge is important. Existing research has investigated knowledge reuse in general as well as the particular instance of code reuse in software development. One important finding of this scholarly work is the importance of individual developers in the context of knowledge reuse. However, there is a paucity of studies dealing explicitly and in detail with individual developers. Addressing this gap, this dissertation employs two large-scale empirical studies to investigate the reuse of open source software (OSS) code and other code which is available on the internet in software development projects. The first study takes a value creation angle and analyzes factors on the level of individual developers which influence the extent of code reuse in their work. From a value appropriation perspective the second study examines drivers which lead individual software developers to violate licenses when reusing code and thereby potentially create economic and legal risks for their employer. Based on 12 exploratory interviews and a quantitative survey with 684 participants the first study investigates the reuse of existing OSS code by individual software developers in public OSS projects. This context is well suited to explore the relationship between individual developers and the value creation benefits of knowledge reuse. Contrary to software developers in firms who are often constrained to reusing the limited amount of code existing in their firms’ reuse repositories, software developers in public OSS projects can generally turn to the abundance of OSS code available on the internet when building their own code base. 
Moreover, software developers in public OSS projects are not affected by firm-internal rules regarding code reuse. Thus, the code reuse behavior of software developers in public OSS projects should be primarily determined by their own considerations. The results of this study highlight that OSS software developers reuse a substantial amount of knowledge in the form of code in their work. In addition to that multiple factors influencing this behavior could be identified. First, software developers with larger personal networks within the OSS community and software developers with broader past
project experiences, ceteris paribus, reuse more existing code. Second, the OSS particularity of delivering a “plausible promise” in the form of providing functioning software shortly after the launch of a new project seems to facilitate higher levels of code reuse. Finally, software developers who enjoy tackling difficult technical problems appear to deem code reuse less important for their work. Following this analysis, the second study employs 20 exploratory interviews and a quantitative survey with 1,133 participants to explore the value appropriation risks which may result from knowledge reuse by individual developers. As a specific example of this situation the problems firms may face if their software developers reuse code available from the internet in proprietary software are analyzed. Typically, code available from the internet comes under licenses containing conditions that need to be met in order to be allowed to reuse the code (e.g. the popular GNU General Public License (GPL)). Some of these licenses demand that software in which internet code has been integrated in certain ways is made available to its users in source code form and with the permission to be modified and passed on by these users without restrictions. As a consequence of this, software which was originally proprietary may become OSS, making it difficult or even impossible for firms to generate profits from selling this software. The results of this study point out that while many software developers in firms reuse code from the internet today, they are not well prepared to deal with the risks potentially resulting from this behavior for their employers. In addition to that, multiple factors could be identified which influence whether software developers reuse code from the internet in a way possibly putting the value appropriation of their employers in jeopardy. 
First, software developers who expect less severe consequences for both their firms and themselves from such behavior are more likely to engage in problematic reuse behavior. The same is true for software developers who perceive stronger negative consequences from missing deadlines in their firms and those who find it lengthy and difficult to investigate and account for potential license issues in the code they want to reuse. Finally, software developers’ reuse behavior is also influenced by peer norms and the ethical climate within their firms. The results of both studies contribute to theory and hold managerial implications. Their main contribution to research is shedding light on the role of individual developers in knowledge reuse with large-scale empirical data collected in the software development context. The results stress that different characteristics and beliefs of individual developers influence both the value creation benefits of knowledge reuse and the potential value
appropriation risks which may result from it. With its findings this dissertation contributes to literature on both management and information systems. The managerial implications of this dissertation are particularly relevant to firms with software development activities, but can also, and more generally, be applied in firms from various industries which want to reuse existing knowledge in their innovation activities. Leveraging the findings of this dissertation should help firms to increase the share of reused knowledge in the work of their developers (and thus their value creation) while at the same time ensuring that knowledge reuse does not lead to value appropriation risks.
1. Introduction

Since the formulation of its historic roots in Richard Stallman’s (1999) revolutionary
ideas about software freedom in 1984 and its actual inception in 1998, open source software (OSS)1 has come a long way, evolving into a major social and economic phenomenon. For example, the infrastructure of the internet is largely based on OSS programs such as Apache HTTP Server2 or sendmail,3 and on the consumer side 25% of all internet users surf the web with the OSS browser Firefox4 (Net Applications 2010). Commercially, Fauscette (2009) estimates the revenues generated with OSS products in 2008 at nearly $3 billion.5 Finally, the development platform SourceForge.net (2010) hosted the impressive number of over 225,000 OSS projects in April 2010.

At the core of this success of OSS is an enormous, exponentially growing (Deshpande & Riehle 2008) code base of approximately 4.9 billion lines of code, equaling 2.1 million people-years of software development (Black Duck Software 2009c). Due to the particularities of OSS this code base is largely available in source code form on the internet and can be reused by software developers when creating new software.

This dissertation explores the reuse of OSS code and other code available on the internet by developers in public OSS projects on the one hand and by software developers in commercial firms on the other. From a broader view, these analyses yield deeper insights into the concept of knowledge reuse employed in the strategic management literature (e.g. Szulanski 1996; Langlois 1999; Markus 2001; Majchrak et al. 2004), of which code reuse is a particular and prominent instance. In this dissertation, the investigation of OSS code reused by OSS developers takes a value creation perspective on knowledge reuse while the reuse of internet code by software developers in commercial firms is analyzed from a value appropriation angle.
1 For better readability the term open source is used in this dissertation, but it also refers to free and libre software, which differs from open source in ideological considerations but not in technical ones. See http://www.gnu.org/philosophy/free-sw.html, last accessed 16.11.2009, for further information.
2 http://httpd.apache.org, last accessed 02.02.2010. In January 2010 Apache HTTP Server had a market share of 54% (Netcraft 2010).
3 http://www.sendmail.org, last accessed 02.02.2010. In 2008 sendmail had a market share of 27% (SecuritySpace 2008).
4 http://www.mozilla.com/firefox, last accessed 03.02.2009.
5 This includes indirect revenues, e.g. from maintaining and servicing OSS products.
M. Sojer, Reusing Open Source Code, DOI: 10.1007/978-3-8349-6135-8_1, © Gabler Verlag | Springer Fachmedien Wiesbaden GmbH 2011
The research objectives of this dissertation are inspired by the Cisco/Linksys case
presented in Section 1.1 and are introduced in detail in Section 1.2. Finally, the structure of the thesis is laid out in Section 1.3.
1.1. Motivation: The Cisco/Linksys case
In March 2003 Cisco Systems, a global leader in networking equipment, acquired The Linksys Group, another networking equipment firm, for $500 million to enter the fast growing consumer and small office/home office market (Cisco Systems 2003). Only later, when they were contacted by the Free Software Foundation (FSF),6 did Cisco learn that the software in the WRT54G router, which had come to Cisco with the Linksys acquisition, did contain code parts licensed under the GNU General Public License (GPL) (Lyons 2003; Egger & Hogg 2006). These code parts had entered the router’s software when a software developer had not created this software completely from scratch by herself, but had integrated existing OSS code freely available on the internet into her work (Olson 2008). The GPL is a copyright-based software license which is frequently applied to OSS. Similar to all other OSS licenses it requires that for software governed by it the source code7 has to be made available to the users of the software and those users have to be allowed to modify and pass on the software without having to ask the original creator of the software for permission and without having to pay a fee to the original creator. Besides these general OSS requirements the GPL demands that other software which is tightly integrated with software governed by it is also licensed under its terms, which entails that also the source code of this other software has to be made available to its users with the above permission for modification and redistribution. This particular requirement was applicable to Cisco, too, because the GPL licensed code was deeply interlocked with the other software code in the router. Not complying with the obligations of an OSS license may result in not being allowed to use the software governed by it and damage payments (e.g. Rosen 2004; St. Laurent 2004). For Cisco not being allowed to use the software would have implied not being able to sell the router any further. 
In order to avoid this situation, Cisco complied with the GPL obligations, put the whole router software under GPL terms and provided its source code for public download (Olson 2008). In response to this availability of the full source code of the router software under GPL terms, hobbyist software developers used this code as a starting point and implemented additional features in it. This modified code, when uploaded back to the router, massively extended its capabilities with functionality which had until then only been available in highly priced enterprise-class products (Weiss 2005). This situation provided technology-savvy consumers with the option to purchase a rather inexpensive router and download the modified router software from the internet to have access to much more valuable functionality. For Cisco, this availability of a rather cheap router with leading-edge functionality negatively impacted their profits as some customers were not willing to pay premium prices for more advanced routers any more. In order to cut these losses, Cisco removed the GPL licensed code in a subsequent version of the router (Blankenhorn 2005).

6 The FSF is a non-profit organization promoting and defending the ideas behind OSS (Free Software Foundation 2009b).
7 While software is usually distributed in binary form which is machine-readable only, source code is human-readable. Companies are often reluctant to share the source code of their software because this would allow others to understand and potentially imitate their products (e.g. de Laat 2005; Fitzgerald & Bassett 2005; Davidson 2006).
1.2. Research objectives
The Cisco/Linksys case illustrates that knowledge reuse can affect both firm value creation and firm value appropriation. In general, these two perspectives are often employed by strategic management researchers following the resource-based view (RBV) (e.g. Coff 1999; Amit & Zott 2001; Peteraf & Barney 2003) and scholars in management of technology and innovation (e.g. Teece 1986; Jacobides et al. 2006; Henkel 2007) when explaining firm profitability.8 Value creation establishes the “size of the pie” (Gulati & Wang 2003, p. 209) of both monetary and non-monetary (e.g. consumer surplus) benefits which firms create with their products and services. Value appropriation, following value creation, determines who is able to capture which “share of the pie” (Gulati & Wang 2003, p. 209). That is, it establishes the split between profits and consumer surplus and determines the profit share the different actors involved in value creation (e.g. focal firm, suppliers) receive. In the Cisco/Linksys case the developer who had integrated the code parts licensed under the GPL instead of developing the respective functionality from scratch by herself had reused existing explicit knowledge in the form of software code. In doing so she had most likely enhanced the value creation of her firm, but she also put value appropriation in jeopardy.

8 Often researchers address only one of the two perspectives in their work, but thereby implicitly acknowledge the existence of the other.

Interestingly, while value creation and value appropriation are usually treated as firm
level concepts (e.g. Peteraf & Barney 2003; Lavie 2007; Pitelis 2008), in the context of knowledge reuse they are both heavily dependent on the actions of individual developers who ultimately decide whether, what and how to reuse. Because of that, this research focuses on individual developers when discussing knowledge reuse. Value creation perspective on knowledge reuse. As a lever to value creation knowledge reuse can mitigate the costs of innovation (e.g. Zander & Kogut 1995; Langlois 1999; Majchrak et al. 2004). By reusing existing knowledge developers can enhance the effectiveness and efficiency of innovation and create results of higher quality, thereby enhancing the value creation of their firms. In the Cisco/Linksys case, the developer who had reused the existing code had probably saved time and consequently development costs in doing so because writing the code herself would have taken longer. Further, most likely she had reused a popular piece of OSS which had been tried and tested by many other developers before her. With this history the OSS was presumably software of higher quality than software the developer would have been able to create herself. The topic of knowledge reuse to create value is especially relevant to software development (Cusumano 1991; Markus 2001), but it is also in this area where knowledge reuse frequently does not meet the expectations set (e.g. Kim & Stohr 1998; Lynex & Layzell 1998; Desouza et al. 2006). Scholars have speculated that this may be due to human factors (e.g. Maiden & Sutcliffe 1993; Sherif & Vinze 2003; Morad & Kuflik 2005), but the perspectives of individual developers on knowledge reuse are not understood well (e.g. Sen 1997; Ye & Fischer 2005) and especially large-scale quantitative data on this subject are lacking. 
However, understanding these perspectives is of paramount importance because it is ultimately individual developers who decide whether to reuse existing knowledge or not and thereby influence the value creation of their firms. A specific instance of knowledge reuse in the domain of software development which has also received only limited scholarly attention so far is that of code reuse in public OSS development (Haefliger et al. 2008). Despite having been largely neglected in previous research, exploring this particular instance is very interesting. First, analyzing code reuse in public OSS development holds the promise of shedding light on the above mentioned perspectives of individual developers and of facilitating a better understanding of the human factors involved in knowledge reuse. This understanding should help firms to better exploit knowledge reuse as a lever to value creation. Public OSS development is an interesting context for this endeavor because, contrary to software developers in
commercial firms who are often restricted by firm policies and intellectual property (IP) issues, OSS developers in public projects have an abundance of existing knowledge in the form of code available for reuse. Thus, their reuse behavior should be mainly determined by their own characteristics and less by exogenous reasons. Further, due to the observability of public OSS development, much is known about OSS developers (e.g. Hars & Ou 2002; Lakhani & Wolf 2005) and the processes in which they create software (e.g. Lee & Cole 2003; Senyard & Michlmayr 2004). This knowledge forms a solid base for the analysis of OSS developers’ code reuse behavior to build on.

As a second motivation to explore code reuse in public OSS development, a large body of literature has addressed the provision of OSS code to others to use and build upon (e.g. Ghosh et al. 2002; West 2003; Henkel 2006). However, research addressing the other side of this process, that is scholarly work describing the building upon and reusing of existing OSS code, is scant, leaving the picture of OSS development as an open innovation process (Chesbrough 2003) incomplete.

Consequently, addressing the perspectives of individual developers on knowledge reuse in the context of code reuse in public OSS projects is the first research objective of this dissertation.

Research objective 1: What are the perspectives of individual OSS developers on code reuse? How and why do individual OSS developers leverage existing code or not? What determines the extent to which OSS developers reuse existing code?

To address these questions the code reuse behavior of OSS developers is analyzed with 12 interviews and a quantitative survey with 684 participants. The quantitative data from the survey is examined with multivariate models employing Tobit, ordered Probit and logistic regression.

Value appropriation perspective on knowledge reuse.
Besides the positive value creation effects of knowledge reuse, the Cisco/Linksys case also points out the value appropriation risks which can be introduced by reusing existing knowledge. With the exception of knowledge reused from the public domain, all explicit (i.e. codified) knowledge is governed by IP rights (de Laat 2005). Through these the creator of the knowledge can set obligations which others reusing the knowledge have to comply with (e.g. Rosen 2004; Boyle 2009).
The share of value a firm can appropriate depends on its bargaining power versus other
parties also competing for value (e.g. Teece 1986; Bowman & Ambrosini 2000). Obligations attached to reused knowledge can affect this bargaining power and consequently influence value appropriation. In the Cisco/Linksys case the code reused by the developer was available on the internet and in line with the GPL it was legitimate for everybody to access and reuse it. However, by putting her software under the GPL, the original creator of the code also formulated the obligations for those reusing her code to make available the source code of the software to its users, to put other software tightly integrated with the original GPL licensed software also under the GPL, and to allow free modification and distribution of all resulting GPL licensed software. By integrating the GPL code into the router software the individual software developer did reduce Cisco’s bargaining power versus its customers because in order to be allowed to continue selling the router without modifications, Cisco had to provide the source code of the full router software to them and could not restrain them from passing on the enhanced modified versions. As a consequence of Cisco’s weaker bargaining power the split between their profits and customers’ consumer surplus had to be adjusted in such a way that Cisco’s “share of the pie” was reduced while their customers appropriated additional value. The situation would have been even worse for Cisco if the GPL issue had not surfaced in software sold bundled with a necessary complementary asset (i.e. the router) but separately and on its own (e.g. consumer software such as an office application). If such software were “contaminated” by the GPL due to knowledge reuse, it would be rather difficult for the creator of the software to appropriate more than a relatively small share of the value created with the software. This is because due to the GPL users of the software have the right to pass it on without having to ask the original owner for permission. 
Consequently, the software would basically be available for free and nobody would be willing to pay the original creator for it anymore.9

9 Addressing this potential issue, software firm VMware (2008, p. 34) for example writes in the “risk” section of their quarterly filings to the U.S. Securities and Exchange Commission that reusing OSS code “[…] could disrupt the distribution and sale of some of our products.”

The topic of value appropriation risks through knowledge reuse is again especially relevant in software development. The large amount of OSS and other code as explicit knowledge available on the internet under licenses such as the GPL is an interesting resource pool for commercial firms and their developers to tap into when developing software (e.g. Spinellis & Szyperski 2004; Ajila & Wu 2007; Ven & Mannaert 2008). However, most internet code comes with obligations which need to be accounted for when reusing it. Failing to comply with these obligations can create issues for firms which may among other problems endanger their value appropriation (Rosen 2004; Arne 2008; Bennett & Ivers 2008). Existing scholarly work has addressed this topic and drafted some guidelines on how to best leverage code available on the internet in commercial software development settings while accounting for the resulting obligations (e.g. Levi & Woodard 2004; Madanmohan & De 2004; Ruffin & Ebert 2004). However, this research typically assumes that code from the internet is reused in a systematic fashion in commercial firms, that is, internet code reuse is integrated into the software development processes of these firms. Yet individual software developers may also reuse code from the internet in an ad-hoc way, that is, by spontaneously searching the internet for existing code and integrating it into their work (e.g. Bennett & Ivers 2008; Kaneshige 2008; Olson 2008). While some scholars have speculated that it is especially in this second form of reuse where obligations may be violated (e.g. Levi & Woodard 2004; Davidson 2006; McGhee 2007), little is known about this form of reuse and especially quantitative evidence is lacking. Research has not yet analyzed how well aware individual software developers in commercial firms are of the obligations coming with internet code reuse and which role ad-hoc internet code reuse plays for them and their work. Further, the determinants influencing whether developers properly account for the potentially resulting obligations when reusing code from the internet in an ad-hoc fashion are unidentified.
Consequently, addressing the perspectives of individual commercial software developers on the ad-hoc reuse of code available from the internet and the obligations which may come with it is the second research objective of this dissertation.

Research objective 2: How important is the ad-hoc reuse of existing code from the internet for individual commercial software developers and how well aware are they of the potentially resulting obligations? What determines whether commercial software developers run the risk of potentially violating obligations and thereby possibly creating issues such as value appropriation risks for their firms when ad-hoc reusing existing code from the internet?

To address these questions 20 interviews were conducted with professionals in the field of commercial software development and IP. After this qualitative pre-study a quantitative survey with 1,133 participants was carried out among software developers. The quantitative data from the survey is analyzed with structural equation modeling techniques and multivariate models employing Tobit, logistic and ordered logistic regression.
1.3. Structure of the dissertation
The dissertation comprises five chapters which follow the two research objectives introduced above. After this first introductory chapter, Chapter 2 reviews existing literature from the domains of strategic management and management of technology and innovation to discuss the perspectives of value creation and value appropriation as the two angles from which knowledge reuse is analyzed in this thesis. Addressing research objective 1, Chapter 3 investigates knowledge reuse in the context of code reuse in OSS development. The results on the one hand lead to a better understanding of how individual developers can contribute to the value creation of their firms and on the other hand complete the picture of OSS as an open innovation process. In the course of this, the chapter also establishes the foundations of knowledge reuse in software development and the particularities of OSS which also Chapter 4 builds upon. Chapter 4 addresses research objective 2 and deals with issues such as value appropriation risks potentially resulting from knowledge reuse. Specifically, it analyzes the integration of code available on the internet into commercial software development projects by individual developers and focuses on how these developers deal with the obligations potentially resulting from internet code reuse. Concluding the dissertation, Chapter 5 summarizes key findings, highlights implications for theory and for practitioners and suggests avenues for future research.
2. Foundations of value creation and value appropriation

This part of the dissertation lays the foundations for the two perspectives of value
creation and value appropriation which are applied to knowledge reuse in Chapters 3 and 4. Two different strands of modern management literature make use of the perspectives of value creation and value appropriation.10 First, strategic management scholars following the RBV frequently refer to value creation and/or value appropriation in their work when explaining profit differences between businesses with their respective resource endowments. Often, the perspectives are referred to implicitly (e.g. Barney 1991; Castanias & Helfat 1991; Amit & Schoemaker 1993; Peteraf 1993; Amit & Zott 2001), but sometimes also explicitly (e.g. Coff 1999; Blyler & Coff 2003; Peteraf & Barney 2003; Alvarez & Barney 2004; Adner & Zemsky 2006; Lavie 2007). Second, researchers in the field of technology and innovation management (e.g. Teece 1986; Jacobides et al. 2006; Henkel 2007) rely on the duality of value creation and value appropriation to address the question how firms profit from technological innovation. Unfortunately, despite underlying much research in both above domains and its acknowledged importance (e.g. Priem 2007), a commonly agreed on concept of value creation and value appropriation does not exist yet. Especially a concept which explicitly entails both value creation and value appropriation and links their interaction to the process of firm profit generation is still missing (Mocciaro Li Destri & Dagnino 2005; Lepak et al. 2007; Pitelis 2008). This is partly because many scholars either focus only on one perspective in their work (e.g. Teece 1986; Stabell & Fjeldstad 1998) or apply both at the same time without clearly and consistently defining where value creation ends and value appropriation begins (e.g. Amit & Zott 2001; Kim & Mahoney 2002). In addition to that, there are often also terminological issues when some scholars label as value creation what other scholars would term value appropriation.11
10 Often researchers address only one of the two perspectives in their work, but thereby implicitly acknowledge the existence of the other.
11 See for example the discussion between Priem and Butler (2001a, b) and Barney (2001) or the discussion between Makadok (2001; Makadok & Coff 2002) and Priem (2001).
M. Sojer, Reusing Open Source Code, DOI: 10.1007/978-3-8349-6135-8_2, © Gabler Verlag | Springer Fachmedien Wiesbaden GmbH 2011
2.1. Concepts and terminology
Despite this lack of a concept commonly agreed on, this dissertation follows ideas ingrained in the RBV and uses terminology recently endorsed in a Special Topic Forum in the Academy of Management Review (AMR) (Lepak et al. 2007) to discuss value creation and value appropriation. Papers in this AMR Special Topic Forum build on a concept of “value” framed by Bowman and Ambrosini (2000, 2001) who employ a distinction between use value and exchange value based on classical economic thinking. These two notions, along with the terms of consumer surplus and opportunity cost as employed by e.g. Brandenburger and Stuart (1996) or Lippman and Rumelt (2003a), are discussed in the following before these four concepts are used to define value creation and value appropriation.

Use value12 reflects the value of a product, service, job or task13 as perceived by a customer (Bowman & Ambrosini 2000, 2001). It is highly individual and subjective, meaning that different customers may perceive different use values of the same product (Amabile 1996). For example, a certain color of a car may result in a high perceived use value for one customer while another customer does not value this product attribute. Drivers of perceived use value can be e.g. rarity, aesthetic appeal or performance or any combination of these (Pitelis 2008). From a rather formal position Brandenburger and Stuart (1996) explain use value with the following thought experiment. They start with establishing a status quo in which a potential customer does not possess a specific product. Following this, they assume that the potential customer receives this product for free and argue that the customer must find this situation preferable to the status quo.14 Then they begin successively taking away money from the customer and posit that with only little money taken away the customer will still prefer the situation over the status quo of not having the product.
Ultimately, however, there will be a point when the customer gauges the new situation equivalent to the status quo and even worse if further money is taken away. Based on this thought experiment the use value of a product can be described as “the amount of money when equivalence arises” (Brandenburger & Stuart 1996, p. 8) between the customer’s status quo of not having the product and the new situation of owning the product, but possessing less money.
12 Instead of the term “use value” some authors also speak of “maximum willingness-to-pay” (e.g. Brandenburger & Stuart 1996; Priem 2007) or “perceived benefit” (e.g. Besanko et al. 2000).
13 In the following the term “product” is used to represent “product, service, job or task”.
14 Thereby they implicitly assume that the product is a “positive” product and does e.g. not hurt the customer as its only effect on her.
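
The thought experiment on use value can also be written down compactly. The following is only a sketch under assumptions that are not taken from the sources cited above: the symbols U, w, m and v are introduced here purely for illustration, the customer's valuation is assumed to be strictly increasing in money, and the additive special case in the closing comments is a hypothetical example rather than part of Brandenburger and Stuart's (1996) argument. The block assumes the amsmath package.

```latex
% Notation (assumed for illustration only):
%   U(q, y): the customer's valuation of holding the product (q = 1) or not (q = 0)
%            together with an amount of money y; strictly increasing in y
%   w:       the customer's money in the status quo
%   m:       the amount of money taken away after the product was handed over
%   v:       the use value of the product
\begin{align*}
  &\text{Status quo (no product, money } w\text{):}            && U(0, w) \\
  &\text{Product received for free (see footnote 14):}         && U(1, w) > U(0, w) \\
  &\text{Little money taken away, } m < v\text{:}              && U(1, w - m) > U(0, w) \\
  &\text{Equivalence, which defines the use value } v\text{:}  && U(1, w - v) = U(0, w) \\
  &\text{Further money taken away, } m > v\text{:}             && U(1, w - m) < U(0, w)
\end{align*}
% Hypothetical special case: additive valuation U(q, y) = b q + y with b > 0.
% The equivalence condition then reads b + (w - v) = w, hence v = b.
```

In the additive special case, a customer who values the product at b = 300 monetary units has a use value of exactly 300, whatever her initial money w. With other valuations the use value may well depend on w, which is one way to read the statement that use value is highly individual and subjective.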
Exchange value as the second concept is the monetary amount realized at the distinct point in time when a product is exchanged (Bowman & Ambrosini 2000, 2001). It is equal to the price the seller of a product receives from the buyer. Only in the rare situation of a monopoly supplier who is aware of customers’ individual use values and can price discriminate will the exchange value equal the customer’s individual use value. In all other situations the exchange value of a product will be less than its use value (Priem 2001). Consumer surplus15 is the difference between use value and exchange value (Bowman & Ambrosini 2001). At equal prices customers will choose the product that delivers the highest consumer surplus to them (e.g. Ghemawat 1991; Besanko et al. 2000).16 If a firm wants to increase consumer surplus to ensure that customers select its product over that of a competitor, it can either enhance the use value of its product as perceived by the customer or it can reduce the requested exchange value of the product (Conner 1991; Hoopes et al. 2003).17 Opportunity cost18 as the last basic concept is the sum of all costs associated with the inputs necessary for creating a specific use value. In this it entails e.g. capital cost, labor cost and cost for inputs from suppliers (Brandenburger & Stuart 1996; Besanko et al. 2000; Blyler & Coff 2003). Important to note is that opportunity cost does not reflect the actual prices paid by a firm for the inputs it acquires, such as the actual wage that an employee of the firm receives. It is rather defined analogously to use value, however, in reverse fashion. Flipping Brandenburger and Stuart’s (1996) thought experiment on use value, imagine a firm which wants to acquire an input from a potential supplier. In the status quo situation the supplier keeps the input and does not receive any money. With this point of reference established the input is taken away from the supplier and it receives money instead. 
The amount of money that leads the supplier to gauge the new situation as equivalent to the status quo defines its opportunity cost.
Value creation. Based on the four concepts elaborated above, value creation can be defined. It is important to note that the concept employed here describes total societal
15 Marketing scholars sometimes speak of “delivered value” when referring to “consumer surplus” (e.g. Kotler 1991).
16 Assuming a discrete choice model in which the number of units the customer intends to buy is fixed and the customer only decides whether to buy at all and from which firm.
17 It is obviously also possible to apply these two levers to increase consumer surplus at the same time by enhancing the use value of a product and simultaneously reducing the requested exchange value.
18 RBV scholars sometimes refer to this concept as “economic cost” (e.g. Peteraf & Barney 2003). However, the notion of “opportunity cost” (Brandenburger & Stuart 1996; Besanko et al. 2000) seems to better describe the underlying concept.
value created following the original concept of value created in the RBV (e.g. Peteraf & Barney 2003). It is thus closely related to the economic concept of total surplus, which describes the sum of all economic rents. It differs from, e.g., the terminology of Porter (1985),19 who describes consumer surplus when speaking of created value, and from that of Priem (2007),20 who refers to the creation of use value when speaking of value creation. In the terminology followed in this dissertation, value created is defined as the use value of a product less its opportunity costs (see Figure 2-1) (Brandenburger & Stuart 1996; Coff 1999; Besanko et al. 2000; Barney 2003; Peteraf & Barney 2003; MacDonald & Ryall 2004; Hallberg 2009).21 More explicitly, value creation requires two conditions to hold:
− First, the use value of a product as perceived by customers needs to be greater than zero (Collis & Montgomery 1995). This condition highlights the importance of the customer in value creation. In the words of Sirmon et al. (2007, p. 273), “value creation begins by providing value to customers”. Similarly, Lepak et al. (2007, p. 182) describe the customer as “the focus of value creation.”
− Second, the opportunity costs of creating the product need to be less than the use value as perceived by the customers (Besanko et al. 2000). A firm offering a product with a perceived use value lower than the opportunity costs required to create it would destroy societal value.22
Value creation as outlined here describes total societal value created by the actions of multiple parties. It entails the total value created for any stakeholder23 involved and is,
19 Porter (1985, p. 3) argues that “superior value stems from offering lower prices than competitors for equivalent benefits or providing unique benefits that more than offset a higher price.” As Porter speaks of price and not opportunity cost, his reasoning suggests that he defines value created as use value less exchange value, which is equal to the definition of consumer surplus. By including exchange value in the calculation, the distinction between value creation and value appropriation is softened, as exchange value already determines the share of value created that the customer appropriates. Despite this terminological difference, Porter also seems to be aware of the value concept as applied in this dissertation, as his generic strategies of superior differentiation and/or lower costs are identical to the value creation levers in the RBV concept of value applied here.
20 In Priem’s (2007) view, “value creation, however, involves innovation that establishes or increases the consumer’s valuation of the benefits of consumption (i.e., use value).” As he bases his argumentation on the terminology of Bowman and Ambrosini (2000) also employed in this dissertation, it is evident that his “value creation” refers only to the creation of use value.
21 This terminology does not, however, account for spill-over effects to other products or industries.
22 While in such a situation total societal value would be destroyed, it is possible that the focal firm would still appropriate value.
23 The term stakeholder is employed in a very wide sense here. It would e.g. encompass the firm’s customers, suppliers, employees and owners, but also its competitors and companies offering complementary assets.
ceteris paribus, independent of the price the focal firm charges for its product and independent of the prices the focal firm pays for the inputs it needs to create the product. Summarizing, value creation defines the “size of the pie” (Gulati & Wang 2003, p. 209) which results from the actions of multiple parties around the focal firm.
Figure 2-1: Concept of value creation (value created = use value, as perceived by customer, less opportunity cost, as required for inputs)
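As a compact restatement of the definition above, the two conditions can be written as a short worked equation. The symbols $V$, $U$ and $C$ are shorthand introduced here for illustration only and do not appear in the sources cited.

```latex
\begin{align*}
  V &= U - C  && \text{(value created = use value less opportunity cost)} \\
  U &> 0      && \text{(first condition: customers perceive a positive use value)} \\
  C &< U      && \text{(second condition: opportunity cost stays below use value)}
\end{align*}
```

Under this notation, $V > 0$ follows directly from the second condition. Neither the product price nor the input prices appear in the expression, which mirrors the statement that value created is, ceteris paribus, independent of them.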
Value appropriation.24 Once value has been created and the size of the pie has been set, the value created needs to be divided up between the stakeholders involved (MacDonald & Ryall 2004; Priem 2007; Pitelis 2008). Consequently, value appropriation as the second step serves the purpose of determining the “share of the pie” (Gulati & Wang 2003, p. 209) that the respective stakeholders receive and thus determines how much of the total value created they can capture. Customers appropriate the consumer surplus as the difference between use value and exchange value while the difference between exchange value and opportunity costs is split between all other stakeholders (the focal firm, suppliers, employees, competitors and companies offering complementary assets) (Bowman & Ambrosini 2001). Thus, value appropriation requires that prices have been determined at least implicitly, both for the product and for all inputs that were necessary for its creation.25 The size of the shares that the individual stakeholders can appropriate depends on their respective bargaining positions (Brandenburger & Stuart 1996; Coff 1999; Bowman & Ambrosini 2001; Lippman & Rumelt 2003a, b; MacDonald & Ryall 2004; Lavie 2007). Stakeholders with a strong bargaining position will appropriate a large share of the value created while stakeholders with a weak bargaining position might not even be able to appropriate any of the value created.
24 Value appropriation is sometimes also labeled “value capture” (e.g. Bowman & Ambrosini 2000; Lepak et al. 2007; Pitelis 2008), “value realization”, “value dispersion”, “value distribution” or “value allocation”. See Priem (2007) for an overview.
25 Since transactions between the stakeholders do not necessarily have to be of a monetary nature, explicit prices are not a prerequisite for value appropriation.
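The division described above can be illustrated with a small numerical sketch. The function name `split_value`, the `ValueSplit` record and the figures 100, 70 and 40 are hypothetical and chosen only for illustration; they are not taken from the literature cited.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueSplit:
    value_created: float        # use value less opportunity cost
    consumer_surplus: float     # use value less exchange value (customer's share)
    producer_side_share: float  # exchange value less opportunity cost (all other stakeholders)


def split_value(use_value: float, exchange_value: float, opportunity_cost: float) -> ValueSplit:
    """Decompose value created into the customer's share and the remainder."""
    return ValueSplit(
        value_created=use_value - opportunity_cost,
        consumer_surplus=use_value - exchange_value,
        producer_side_share=exchange_value - opportunity_cost,
    )


# Illustrative figures only (assumed, not from the source).
example = split_value(use_value=100.0, exchange_value=70.0, opportunity_cost=40.0)

# The two shares add up to the value created, whatever the price is.
assert example.consumer_surplus + example.producer_side_share == example.value_created
print(example)
```

In this sketch, raising the exchange value from 70 to 90 shifts 20 units from the customer to the other stakeholders while value created stays at 60, which is the sense in which price affects the share of the pie but not its size.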
Having established the terminology and concepts of value creation and value
appropriation, the two following sections review the determinants of value creation (Section 2.2) and of value appropriation (Section 2.3).
2.2. Determinants of value creation
A systematic and comprehensive account of the determinants of value creation which is commonly agreed on is yet outstanding (Adner & Zemsky 2006; Pitelis 2008).26 Having been the guest editors of a Special Topic Forum on value creation in the AMR, Lepak et al. (2007, p. 180) summarize that “[…] there is little consensus on what value creation is or on how it can be achieved.” Nonetheless, multiple scholars have listed levers of value creation and partly also tried to systematize them. As one of the first researchers addressing the topic, Schumpeter (1942, p. 132) for example speaks of value creation in the form of entrepreneurial activity as “to reform or revolutionize the pattern of production by exploiting an invention or, more generally, an untried technological possibility for producing a new commodity or producing an old one in a new way, by opening up a new source of supply of materials or a new outlet for products, by reorganizing an industry […].” According to his statement, a wide range of activities can create value and innovation seems to play an accentuated role. In recent work, Lepak et al. (2007, p. 182) support this view by offering an unstructured list containing “invention”, “innovation”, “R&D”, “knowledge creation”, “structure and social conditions” and “incentives, selection, and training” as activities to create value. In an advanced approach to structure and synthesize many of the levers previously identified to create value, Pitelis (2008, p. 
21) proposes “technology and innovativeness”, “unit cost economies/ increasing returns”, “firm infra-structure and strategy” and “human (and other) resources” as the “four generic, first-order determinants of value creation” which either directly or through their interaction and overlap lead to value creation.27 Despite this lack of a generally accepted account of the determinants of value creation, the definition of value creation introduced in Section 2.1 at least allows two generic determinants to be identified. As value creation is determined by the difference between use value and opportunity cost, a firm that manages to either increase the use value perceived by its
26 Some scholars have however proposed value creation levers for specific businesses or industries. E.g. Amit and Zott (2001) have analyzed the value creation levers for e-businesses.
27 While this framework is a rather structured approach to frame determinants of value creation, it has not yet found a large following in the literature. Furthermore, as Pitelis (2008) notes himself, the framework is not free of overlap.
potential customers (by whichever means), or to reduce the opportunity costs incurred to create the respective use value (by whichever means) can enhance value creation.28
2.3. Determinants of value appropriation
Once value has been created various parties will compete for it and try to appropriate large shares of it. Among these parties are typically the focal firm, its customers, competitors and suppliers as well as providers of complementary assets (Teece 1986; Pisano & Teece 2007).29 Both strands of literature which address value appropriation acknowledge the existence of these competitions for value, however, scholars in the tradition of the RBV draft a more generic picture of them while technology and innovation management researchers explicitly address appropriating value from technological innovations. Resource-based view scholars point to two competitions for the value created which determine firm value appropriation (Bowman & Ambrosini 2000; Becerra 2008). The first competition is between the focal firm, its customers and its competitors. The second competition takes place between the focal firm, its suppliers and parties providing complementary assets. In the competition between the focal firm, its customers and its competitors, imitability and substitutability are the critical factors which determine whether a firm will be in a bargaining position strong enough to appropriate the value it has created or not (Dierickx & Cool 1989; Barney 1991; Amit & Schoemaker 1993; Collis & Montgomery 1995). The bargaining power of the focal firm is weaker if competitors can copy or substitute its product and when customer switching costs are low, because then customers have a choice between different similar offerings and firms will compete for customers by offering them higher consumer surpluses. Discussing the issue of imitability and substitutability Rumelt (1984) has coined the concept of “isolating mechanisms”30 which Moran and Ghoshal (1999, p. 
408) describe as “[…] mobility barriers that restrict the extent to which, essentially, all firms are able to mimic any particular firm’s behavior and, thereby, to replicate that firm’s performance and, ultimately, appropriate some or all of its rent
28 Obviously, a combination of both approaches is possible.
29 Coff (1999) further adds companies’ employees as another group of stakeholders which may claim value. In the following they are treated together with suppliers because the mechanics of value appropriation for these two groups are similar.
30 Instead of “isolating mechanisms”, other scholars use the terms “impregnable bases” (Penrose 1959) or “resource position barriers” (Wernerfelt 1984).
streams.” Consequently, if the value created by a firm is protected by isolating mechanisms, competitors will fail at replicating it and the focal firm will appropriate the majority of the value it has created versus its customers and competitors. Scholars have identified firm-specificity, social complexity and causal ambiguity as effective isolating mechanisms (Reed & DeFillippi 1990; Barney 1991; Amit & Schoemaker 1993; Coff 1999). Legal property rights may also be applicable as isolating mechanisms (Peteraf 1993; Lavie 2007). Rumelt (1987) further mentions producer learning, buyer switching costs, reputation, buyer search costs, channel crowding and economies of scale as isolating mechanisms. Addressing the competition between the focal firm and its suppliers, Peteraf (1994) points out that the focal firm will not appropriate any value if its suppliers are bidding up the price of their supplies to the point where they appropriate all the value the focal firm can capture from its customers. Suppliers may be in a position to bid up the price if the input they offer is rare and the focal firm needs to deal with them because of that. In the competition versus its suppliers of rare inputs, the focal firm’s bargaining position is determined by the degree of “mobility” (Peteraf 1993, p. 183) of the input it intends to purchase from the supplier and by the knowledge the focal firm and other firms also interested in purchasing the input possess about the value creation potential of the input. If the input the focal firm requires to create value is “perfectly immobile” it can only be used for the value creation of the focal firm and has no other use outside of it (Dierickx & Cool 1989). In such a situation the focal firm should be able to appropriate at least some value created versus its suppliers.
If, however, the input is “perfectly mobile” and can be used equally efficiently in any other firm (either for the same value creation or another one),31 then the owner of the input should be able to appropriate the value created if all firms interested in purchasing the input possess the same knowledge about it (Klein et al. 1978; Peteraf 1993). Yet, if the focal firm possesses superior knowledge about the value the input may help to create it may still be able to appropriate some of the value as other firms competing for the input which are not aware of the full value creation potential of the input will not bid up to the maximum price the focal firm is willing to pay (Barney 1986; Peteraf 1993).32
31 Using alternative terminology, this situation can also be described as a monopsony.
32 It is of course also possible that the superior knowledge the focal firm possesses about the value creation potential of the input suggests that the other firms also competing for the input overestimate its potential. In such a case the focal firm would typically not bid for the input at all (Barney 1986).
Between the two extremes of “perfect immobility” and “perfect mobility” the input is “imperfectly mobile” if it is somewhat more valuable within the focal firm than anywhere else (Montgomery & Wernerfelt 1988; Peteraf 1993). In this situation the value is split between the focal firm and its supplier. The exact split is determined by input characteristics such as firm specificity, replacement costs to the firm and switching costs (Coff 1999). Furthermore, superior knowledge about the value the input may help to create is again part of the equation. Summing up, scholarly work in the tradition of the RBV suggests that in the challenge to appropriate the value they have created, firms need isolating mechanisms to prevent their customers and competitors from appropriating the majority of the value they have created and benefit from rather immobile rare inputs and superior knowledge when competing for value with their suppliers.
Technology and innovation management. Geared specifically toward profiting from innovation, technology and innovation management scholars have identified two main determinants33 which influence the share of value created that an innovator can appropriate (Teece 1986): the appropriability regime and control over complementary assets.34 The appropriability regime is related to isolating mechanisms and describes how easily an innovation can be imitated. It encompasses the applicability and effectiveness of legal mechanisms of protection as well as particularities of the innovation which act as “natural barriers to imitation” (Pisano & Teece 2007, p. 281). If the appropriability regime is “tight” or “strong” (Teece 1986, p. 287), imitation of the innovation is difficult and the innovator will typically be able to appropriate a large share of the value created by its innovation as it will not have to offer its customers higher consumer surpluses to fend off competing offerings. Yet, if the appropriability regime is “weak” (Teece 1986, p. 287), innovators’ positions with regard to complementary assets as the second determinant become important. Both the position of the innovator versus competitors with regard to access to complementary assets as well as the position of the innovator versus the providers of complementary assets determine the share of value which the innovator can appropriate (Teece 1986).
33 In his original article Teece (1986) further mentions the position of the industry in the technology lifecycle as a determinant of value appropriation. He points out that innovators in industries with high development and prototyping costs are unlikely to profit most from their innovation if they go to market before the emergence of the “dominant design” (Abernathy & Utterback 1978; Dosi 1982).
34 Beyond assets, the same logic also applies to complementary capabilities (Teece 1986, 2006). Pisano (2006) even argues that complementary capabilities are more important than complementary assets.
If the innovator is poorly positioned versus competitors with respect to accessing
complementary assets, competitors are likely to appropriate a large share of the value created by the innovation (Teece 1986). This may happen if e.g. competitors already have the required complementary assets in-house while the innovator still needs to build these assets. If the innovator does not control the complementary assets required to commercialize its innovation and cannot build them internally, it needs to interact with other parties owning the complementary assets. In such situations the innovator could end up in an unfavorable position versus the providers of the complementary assets required, because the bargaining position or “economic muscle” (Pisano & Teece 2007, p. 281) of the innovator versus the owner of the complementary assets will influence value appropriation. Complementary assets can be generic, co-specialized or specialized. Generic complementary assets are general purpose assets and are not tailored to the innovation. For specialized complementary assets there is a unilateral dependence between innovation and complementary asset, that is, either the innovation depends on the complementary asset but not the other way around or the asset depends on the innovation but the innovation does not depend on the asset. Finally, complementary assets are co-specialized if dependence goes in both directions simultaneously. If the innovator needs specialized35 or co-specialized complementary assets held by other parties and not available on a competitive market, these other parties are in a position to appropriate value created by the innovator’s innovation, because their complementary assets are “bottleneck[s] with regard to commercializing the innovation” (Teece 1986, p. 297).
Concluding, in the specific context of appropriating value from innovations, the bargaining position of the innovator as the creator of value is determined by the strength of the appropriability regime and their position with regard to complementary assets versus their competitors and the owners of the complementary assets.
2.4. Summary
Summing up the perspectives of value creation and value appropriation, value creation establishes the magnitude of the value a business is related to (i.e. the size of the pie) while value appropriation determines the amount of value a specific entity can capture (i.e. the share of the pie).
35 In this case the complementary assets need to be specialized in such a way that the innovation depends on the complementary asset, but the complementary asset does not depend on the innovation.
Generically, firms can enhance value creation by either increasing use value as perceived by their potential customers or by reducing the opportunity costs incurred to create the respective use value. The share of value created which a firm can appropriate is determined by its bargaining position versus customers, competitors, suppliers and other companies offering complementary assets. Generally, isolating mechanisms, rather immobile rare inputs and superior knowledge help a firm attain a strong bargaining position. In the language of technology and innovation management scholars, a firm’s bargaining position is determined by the imitability of its innovation and by its access to complementary assets. For firms to maximize their financial performance usually the combination of value creation and value appropriation is important because firms typically need to create value in the first place before appropriating parts of it in the second step (e.g. Schumpeter 1942; Arrow 1962; Coff 1999; Jacobides et al. 2006; Nelson 2006).36 With the perspectives of value creation and value appropriation established, the next chapter takes a value creation angle on knowledge reuse and investigates the reuse of OSS code in public OSS projects. After that, Chapter 4 sheds light on the value appropriation issues which may come with knowledge reuse by analyzing the reuse of internet code in commercial software development.
36 Patent trolls (e.g. Fischer & Henkel 2009) may be an exception to this.
3. Open source software developers’ perspectives on code reuse37
3.1. Introduction
Literature on innovation management argues that knowledge reuse is an important lever for value creation, because by reusing existing knowledge in innovation processes firms can mitigate the costs of innovation (e.g. Zander & Kogut 1995; Langlois 1999; Majchrzak et al. 2004). Knowledge reuse has historically been particularly relevant for innovation in the software industry and also many of the major advances in knowledge reuse research have been made in this space. Software reuse, as the software-specific form of knowledge reuse, has long been identified as crucial to overcome the “software crisis” (Naur & Randell 1968) because it allows for more efficient and more effective development of software of higher quality (e.g. Krueger 1992; Kim & Stohr 1998). Despite the acknowledged importance of software reuse as a lever to value creation and despite the substantial body of scholarly work on how to realize its benefits (e.g. Barnes & Bollinger 1991; Fafchamps 1994; Frakes & Isoda 1994; Isoda 1995), a large number of studies have found that software reuse is still problematic (e.g. Kim & Stohr 1998; Lynex & Layzell 1998; Morisio et al. 2002; Desouza et al. 2006). Many of these studies have speculated that this failure of reuse might be due to human factors (e.g. Maiden & Sutcliffe 1993; Kim & Stohr 1998; Sherif & Vinze 2003; Morad & Kuflik 2005) and thereby point out that while value creation is usually investigated from a firm-level perspective, it is individual developers with their decisions to reuse or not to reuse who heavily influence firm value creation in this particular context. At this point there is a gap in existing research on software reuse. There is a paucity of – especially quantitative – work on the role of individual developers in the process of reusing existing knowledge.
Little is known about individual developers’ beliefs and thoughts about software reuse as well as their behavior when reusing existing knowledge during software development (Maiden & Sutcliffe 1993; Sen 1997; Ye & Fischer 2005). Addressing this issue, this part of the dissertation strives to scrutinize the role and the behavior of individual developers in knowledge reuse. The context of this investigation is
37 This part of the dissertation has partly already been available in Sojer and Henkel (2010a).
M. Sojer, Reusing Open Source Code, DOI: 10.1007/978-3-8349-6135-8_3, © Gabler Verlag | Springer Fachmedien Wiesbaden GmbH 2011
code reuse in OSS development. Code reuse is the most important form of knowledge reuse in software development and can serve as an example for the reuse of explicit knowledge (Krueger 1992; Kim & Stohr 1998). OSS development is a special instance of software development, which typically takes place in informal collaborations of globally distributed teams communicating over the internet (e.g. Markus et al. 2000; von Krogh & von Hippel 2006). It provides a unique environment to research code reuse and especially developers’ perspectives on it for multiple reasons. First, contrary to software developers in commercial firms who are often constrained to reusing the limited amount of code existing in their firms’ reuse repositories, OSS developers can turn to the abundance of OSS code available on the internet when building their own code base.38 Thus, analyzing the code reuse behavior of OSS developers should offer a picture with less distortion and more reused code than an analysis of software developers in firms. Second, analyzing the code reuse behavior of OSS developers spread all over the world and active in a broad variety of very different projects should result in more variance and consequently a more multifaceted picture than analyzing the code reuse behavior of developers from one or only a few firms. Developers from commercial firms should be strongly influenced by their firms and thus be rather homogeneous within their firms. Finally, because OSS innovation processes take place largely in the open they are understood well (e.g. Raymond 2001; Senyard & Michlmayr 2004; Mockus et al. 2005). The same is true for OSS developers and their motivations and beliefs which have been researched thoroughly (e.g. Raymond 2001; Lerner & Tirole 2002; Lakhani & Wolf 2005).
This existing base of knowledge about OSS innovation processes and OSS developers can provide a solid platform to understand code reuse and its antecedents with a special focus on the role of individual developers. Generalizing the resulting findings of this study of knowledge reuse in one particular context contributes to a better understanding of the role of individual developers in knowledge reuse in general and allows firms to make better use of knowledge reuse as a lever to value creation. Besides the implications which an analysis of code reuse in OSS development can hold for reuse research in general, a better understanding of the mechanics of code reuse in OSS
38 Similar to OSS developers, developers in firms can of course access the full OSS code available. However, the license restrictions of OSS should weigh heavier on them than on OSS developers. For example, firm representatives interviewed for the study in Chapter 4 frequently pointed out that they must not reuse any OSS code licensed under the GPL which instantaneously reduces the universe of OSS code which they can reuse by more than 50%.
is also interesting in itself as it contributes to current OSS research aiming “[…] to understand more fully how it [OSS] is developed” (Crowston et al. 2009, p. 3). A large number of scholars (e.g. Gruber & Henkel 2005; West & Gallagher 2006; Fleming & Waguespack 2007) have referred to OSS as a specific instance of open innovation (Chesbrough 2003). In the context of OSS this implies on the one hand that developers in an OSS project allow others outside of their project to access their work and use it in their own innovation processes. On the other hand it also entails that OSS developers in one project reuse ideas and knowledge from other projects. Following this picture of OSS as an instance of open innovation a large body of literature has emerged exploring the “giving” side of this open innovation process addressing the making available of developments for others to use and build upon by individuals and firms (e.g. Ghosh et al. 2002; West 2003; Henkel 2006, 2009). The other, “receiving” side which describes the reuse of existing OSS code when developing new software has however received only very little scholarly attention. The existing scholarly work on code reuse in OSS development is limited to four high-level code or dependency analyses (German 2007; Mockus 2007; Spaeth et al. 2007; Chang & Mockus 2008) and two case study papers (von Krogh et al. 2005; Haefliger et al. 2008). Due to this lack of data on the “receiving” side, the picture of OSS as an open innovation process is not complete yet and especially large-scale quantitative data on the level of individual developers are missing. 
With this starting point, this part of the dissertation aims at analyzing code reuse as the most important instance of software reuse in the context of OSS development, thereby contributing to research on knowledge reuse in general and especially to questions regarding the role of individual developers in the process of creating value for their firms by reusing existing knowledge. Specifically, this part of the dissertation presents the first large-scale quantitative analysis of code reuse in OSS development with the single developer as the unit of analysis. Using the quantitative data collected, this dissertation provides a much richer picture of the detailed mechanics of code reuse in OSS than the existing limited body of research. In the course of the analysis answers to the following blocks of questions are presented. First, how important is code reuse for OSS development and to which extent do OSS developers practice it? Second, what are developers’ reasons for and against code reuse? Third, how do OSS developers reuse existing code, that is which forms of code do they
prefer to reuse, how do they integrate the reused code with their own code and where do they turn to when searching for existing code to reuse? Fourth and finally, which factors influence the code reuse behavior of individual OSS developers? The remainder of this part of the dissertation is organized as follows. The next section (3.2) reviews relevant literature on knowledge reuse, establishing it as a lever to firm value creation influenced by individual developer decisions and elaborating on its general mechanics. After that, the specificities of software reuse are described to provide the technical context of code reuse in OSS. The section ends with an overview of existing scholarly work on the not-invented-here syndrome which is often referred to in the knowledge reuse context when individual developer issues are mentioned. Section 3.3 discusses OSS as the empirical setting of this study and briefly touches on its history and licenses before the processes of developing OSS, the motivations of developers to participate in OSS projects and existing scholarly work on code reuse in OSS development are reviewed. The section concludes with the formulation of specific research questions regarding code reuse in the context of OSS. Section 3.4 develops a research model explaining the code reuse behavior of OSS developers which helps to guide the quantitative study. After that, Section 3.5 describes the survey design and methodology employed to collect data, before first quantitative results are presented in descriptive and exploratory fashion in Section 3.6. Section 3.7 finally elaborates on the multivariate analyses testing the research model and Section 3.8 concludes this part of the dissertation with a summary of the most important findings, an overview of theoretical contributions and managerial implications and a discussion of limitations and future research avenues.
3.2. Foundations of knowledge reuse
This section establishes the concept of knowledge reuse in general and links it to firm value creation on the one hand and individual developers on the other. After that, existing research on knowledge reuse in software development is reviewed and the last block of this section discusses the not-invented-here syndrome.
3.2.1. Knowledge reuse to create value
In many industries the basis of firm competition and consequently the sources of competitive advantage have shifted toward knowledge and knowledge-based resources. Knowledge differs from data and information as it is a “fluid mix of framed experience, values, contextual information and expert insight that provide[s] a framework for
evaluating and incorporating new experiences and information” (Davenport & Prusak 1997, p. 5). Knowledge can be tacit or explicit (e.g. Markus 2001). Tacit knowledge has a personal quality: it is the know-how of an individual which can be applied in certain contexts, but it cannot be articulated or communicated easily (Sambamurthy & Subramani 2005). In contrast to that, explicit knowledge can be codified and transmitted with little effort. Extraction and separation from its original “owner” are not a problem. Especially in knowledge intensive industries – e.g. software development (Boh 2008) or consulting (Sarvary 1999) – a firm’s success is highly dependent on its ability to create, acquire, integrate and deploy knowledge (Teece et al. 1997; Takeishi 2002; Watson & Hewett 2006). In such industries knowledge has emerged as one of the most important strategic resources and its management is crucial (Barney 1991; Conner & Prahalad 1996; Spender 1996). In particular, the ability to leverage valuable existing knowledge has been identified as critical due to the general paucity of valuable knowledge and the difficulties and costs of creating new knowledge (Szulanski 1996; O'Dell & Grayson 1998; Dixon 2000). One important way of leveraging existing knowledge is reusing it by transferring it from the situation in which it was initially acquired to other situations (Argote et al. 2000). Firms have repeatedly been shown to gain competitive advantage and to drive their performance through knowledge reuse (Kogut & Zander 1992, 1993; Nonaka 1994; Nahapiet & Ghoshal 1998; Argote et al. 2000). In terms of value creation, leveraging existing knowledge can increase value creation by reducing the opportunity cost required to deliver a defined use value, as typically both time and costs are saved when existing knowledge can be reused and does not have to be created from scratch (Langlois 1999; Ofek & Sarvary 2001; Watson & Hewett 2006).
Alternatively, additional value can be created by using the efficiencies generated through knowledge reuse to increase the use value at constant opportunity costs.39 Ofek and Sarvary (2001, p. 1443) for instance report companies reusing existing knowledge to enhance “[…] the quality of the services/products offered […].” For firms to successfully create value through the reuse of existing knowledge, their developers typically have to engage in both knowledge sharing and applying existing knowledge to new situations (Goodman & Darr 1998; Markus 2001).40 Knowledge sharing
39 Obviously also a combination of both approaches is possible.
40 This assumes that developers reuse mostly internal knowledge. If they, however, mostly rely on reusing knowledge acquired externally, e.g. in open innovation processes (Chesbrough 2003), they may also focus on the knowledge application part exclusively.
as the first step entails collecting valuable existing knowledge and making it available to others (Appleyard 1996). Knowledge application as the second step and focus of this dissertation consists of seeking, evaluating, adapting and using existing knowledge when developing solutions to new problems (Alavi & Leidner 1999; Majchrzak et al. 2004).
On the side of knowledge sharing, the major issue that can impede knowledge reuse is the lack of motivation of the initial knowledge source to share its knowledge. This is often the case if the costs of sharing the knowledge are very high, e.g. because the effort required to produce good documentation is prohibitive, or if the knowledge source is not adequately rewarded for sharing (Szulanski 1996; Markus 2001).
On the side of knowledge application, where the focus of this dissertation resides, research points to three classes of problems that can impede the effective reuse of knowledge (Sambamurthy & Subramani 2005):
− Coordination problems occur if the knowledge required exists or is believed to exist, but the individual who could make use of it is not aware of its existence or is not aware of its location (Boh 2008).
− Transfer problems can occur if knowledge is found to be sticky and heavily related to its original context which makes reusing it in new settings difficult (Szulanski 2000). Similarly, for tacit knowledge, causal ambiguity often makes it difficult to explicitly frame the knowledge which needs to be transferred (Nonaka 1994; Zander & Kogut 1995; Grant 1996). Further, the individual requiring the knowledge might lack the absorptive capacity to understand the transferred knowledge (Cohen & Levinthal 1990).
− Acceptance problems occur when the individual prefers to avoid reusing suitable existing knowledge and rather devises a new solution from scratch. These problems are often related to the individual’s motivation and the incentive structures in place (Markus 2001) and are sometimes referred to as not-invented-here syndrome (Katz & Allen 1982), which is elaborated on in more detail in Chapter 3.2.3. Moreover, if the knowledge source is not deemed to be reliable (Walton 1975) or the relationship between knowledge source and knowledge seeker is arduous, the latter may prefer to avoid reusing the existing knowledge altogether (Szulanski 1996).
Summarizing, firms can enhance their value creation through increased use value or reduced opportunity cost by reusing existing knowledge. The success of knowledge reuse
however is highly dependent on individuals within the firms and whether they want to reuse existing knowledge or not. As has already been pointed out earlier, knowledge reuse has been particularly relevant for software development and much research on knowledge reuse has originated in this domain (Cusumano 1991; Markus 2001). The next chapter concretizes the general knowledge reuse concepts presented in this chapter in the specific domain of software development, laying the foundations for the subsequent analysis of code reuse in OSS development.
3.2.2. Knowledge reuse in software development
The concept of knowledge reuse in software development was first coined at the NATO Software Engineering Conference in 1968 by McIlroy (1968).41 The conference goal was to address the “software crisis” (Naur & Randell 1968), which describes the difficulty of building large and reliable software systems in a controlled and cost-effective way (Kim & Stohr 1992).42 Since its inception, software reuse – as the software development specific form of knowledge reuse is labeled – has been considered one of the key tools or even the “silver bullet” (Brooks 1987) to overcome the “software crisis” (Mili et al. 1995; Kim & Stohr 1998; Frakes & Kang 2005). In line with this ambitious goal, much research and much practical work in firms developing software have been conducted to unleash the full potential of software reuse. However, despite all this effort, the full promise of software reuse has not been realized yet and many corporate software reuse activities have failed (Krueger 1992; Kim & Stohr 1998; Mili et al. 1999; Morisio et al. 2002; Ye & Fischer 2005; Desouza et al. 2006; Sherif et al. 2006).43 In the following, scholarly work on software reuse is reviewed, pointing out which artifacts can be reused during software development, how the process of software reuse works, which benefits can be expected from software reuse at which costs and also which factors influence the success of software reuse in software development firms.
41 It can however be argued that – despite not being formalized as a concept – knowledge reuse in software development is as old as software development itself, because programmers are likely to have always been reusing existing artifacts, e.g. in form of some lines of code of their earlier work (Frakes & Kang 2005). Further, software development with high-level programming languages such as C could also be considered as knowledge reuse as these high-level languages summarize knowledge blocks of low-level languages (Krueger 1992; Frakes & Kang 2005).
42 Despite already being proclaimed in 1968, overcoming the “software crisis” is still a major topic in software engineering research and practice and is not considered solved yet (Gibbs 1994).
43 There do however also exist success stories, e.g. Apte et al. (1990), Lim (1994), Isoda (1995), Morisio et al. (2000).
Artifacts in software reuse
Similar to the general concept of knowledge reuse, software reuse is defined as “[…] the process of creating software systems from existing software rather than building software systems from scratch” (Krueger 1992, p. 131).44 Software reuse relies on reusing explicit knowledge in the form of artifacts which either have been developed in previous software development processes or which have explicitly been developed to be reused in software development processes. The artifact most commonly reused in software reuse is code, but software reuse also entails the reuse of designs, architectures, cost estimates, project plans, requirements specifications, test cases, user interfaces, documentation, customized tools etc. (Krueger 1992; Isoda 1995; Mili et al. 1995; Morisio et al. 2002). Because code reuse is the most important form of software reuse, this dissertation will focus on it. Code reuse can further be broken down into snippet reuse and component reuse:
− Snippet reuse: In this form of code reuse developers “scavenge” fragments of existing software systems and use them when building new ones (Krueger 1992). The artifacts reused in this form of code reuse are either multiple continuous lines of source code (code scavenging) or the structure of a larger block of code in which many details are deleted while the structure is retained as a design template (design scavenging) (Krueger 1992).45
− Component reuse: Contrary to snippet reuse, component reuse deals with artifacts which have been designed explicitly for the purpose of being reused (Lau & Wang 2007). It is based on the idea of developing new software systems with existing building blocks which have already been developed, documented, tested and potentially certified (Krueger 1992). Components are encapsulated software knowledge, e.g. functions such as statistical algorithms or also data types such as trees or graphs (Lau & Wang 2007).
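To make the distinction between the two forms of code reuse concrete, the following minimal Python sketch contrasts them. It is an illustration added here and not an example from the cited literature: the function `average_readings` and the sample data are hypothetical, while `statistics.mean` from the Python standard library stands in for a reusable component.

```python
from statistics import mean  # component reuse: a tested, documented building block


def average_readings(readings):
    """Snippet reuse (hypothetical): a loop scavenged from an older program
    and pasted here; the developer renamed its variables by hand so that
    the fragment fits the new context."""
    total = 0.0
    for value in readings:  # imagined original: `for x in samples:`
        total += value
    return total / len(readings)


readings = [1.5, 2.5, 3.5]

# Both routes compute the same average. They differ in who wrote, tested and
# documented the logic, and in how much of it the reusing developer has to read.
print(average_readings(readings))
print(mean(readings))
```

In the snippet case the developer owns and must understand every line; in the component case only the interface (`mean` takes a sequence of numbers) needs to be known.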
44 For similar definitions see e.g. Lim (1994, p. 23), Kim and Stohr (1998, p. 115), Morisio et al. (2002, p. 341) or Frakes and Kang (2005, p. 529).
45 There is obviously a continuum between code and design scavenging.
The software reuse process
Analogous to the distinction between knowledge sharing and knowledge application presented in the previous chapter, the software reuse process can be split into “development for reuse” and “development with reuse” (Barnes & Bollinger 1991; Kim &
Stohr 1998; Ye & Fischer 2005). “Development for reuse” entails both the explicit production of reusable software artifacts and the identification and extraction of reusable artifacts from existing software with the purpose of making them available for reuse in the future (Joos 1994; Lim 1994; Sen 1997). Once reusable software artifacts have been produced or identified, they typically are classified and catalogued in reuse libraries (Frakes & Isoda 1994; Kim & Stohr 1998). “Development with reuse” as the focus of this dissertation on the other side comprises all steps necessary to consume existing software artifacts. Typically, these are retrieving the existing artifacts, understanding and evaluating them, modifying them to fit the new context and integrating them into the new software system (Krueger 1992; Lim 1994; Mili et al. 1995; Ravichandran & Rothenberger 2003). The particularities of this process differ with the type of the artifact being reused as is exemplified with the two main artifacts of this study: − Snippet reuse: Retrieving snippets is typically considered not very efficient, as such artifacts are usually not advertised to be reused. Instead, the developer has to think about in which existing software systems reusable fragments might exist. In similar fashion, also understanding of snippets is not trivial as they were not explicitly developed for reuse and the developer has to look them through line by line.46 Snippets scavenged from other software systems frequently need to be modified, because e.g. in the old software system the code dealt with integer variables and is supposed to deal with float variables in the new system. Modification is performed by manually editing the code which requires that the developer has a solid understanding of the lowest-level details of the reused software (Krueger 1992). Similarly to modification, developers in nearly all cases have to change the code when integrating it, as e.g. 
variable names are inconsistent with the new context. Again a solid understanding of the lowest-level details is required (Krueger 1992). − Component reuse: Contrary to snippet reuse, component reuse deals with artifacts which have specifically been built to be reused. Thus, it is easier to retrieve them as they can be categorized according to the functionality they provide and because they are often stored in libraries and catalogues. The specific development for reuse also makes understanding components easier than understanding snippets. Components are typically documented well and often the developer does not need to analyze the
46 Naturally, short snippets are easier to understand, but most likely also do not contain much functionality.
code of the component in order to understand it, but it is sufficient to look at its predefined interfaces (Kim & Stohr 1998; Ravichandran & Rothenberger 2003). Similarly to snippet reuse, developers might be required to modify the components they want to reuse if they do not perfectly meet the requirements. If they have access to the component’s source code, they can do so by changing the source code. However, they thereby forego efficiency as they need to understand the lowest-level details of the components and also put quality benefits at risk because they might introduce quality issues through their changes and further invalidate previous testing and certifications (Krueger 1992; Mili et al. 1995; Ravichandran & Rothenberger 2003). As an alternative to changing the component code, developers frequently have the option to modify the component through parameters (Barnes & Bollinger 1991; Kim & Stohr 1998). In this situation, the original component developer has predicted the requirement of different behaviors from her component and has provided “switches” through which developers reusing the component can choose the required behavior (Krueger 1992; Ravichandran & Rothenberger 2003). Integrating components is typically easy as most software development environments allow the linking of different modules to one software system (Krueger 1992). This is especially true for object-oriented programming languages such as Java or C++ (Stroustrup 1996; Ravichandran & Rothenberger 2003).
Benefits of software reuse
Software reuse can enhance value creation in software innovation through increased development efficiency and reduced development times, improved software quality and better maintainability of the software.
Software reuse increases development efficiency and reduces development times as developers save time and effort by not having to build new software systems from scratch, but partially reuse existing artifacts which have already been created, tested and documented (Cusumano & Kemerer 1990; Kim & Stohr 1992; Rine & Sonneman 1998).47 Further, software reuse allows leveraging expertise and thereby increases efficiency. Developers who are experts in certain areas and thus work more efficiently in these fields, can specialize on these areas and develop reusable software artifacts that can be reused by
47 Note that efficiency increases lead to reduced development times only if software reuse takes place on the critical path of development (Lim 1994).
other developers who are not experts, but still need artifacts with this functionality (Fafchamps 1994; Lim 1994). As the second benefit, reusing existing software when developing new systems leads to increased quality, because for one, reusable artifacts are typically subject to rigorous testing and further, defect fixes are accumulated with each reuse (Kim & Stohr 1992; Lim 1994; Frakes & Kang 2005). Moreover, software quality attributes that are affected positively by software reuse are understandability, adaptability and portability (Kim & Stohr 1998). Understandability and adaptability of software are improved by reusing familiar and well-documented artifacts. Portability describes the extent to which a software system can be used in different contexts such as on different operating systems or in different hardware environments. It is supported by reusing artifacts that have been specifically designed to be reusable in different contexts. Besides the quality benefits discussed, software reuse however also incorporates quality risks. If developers do not fully understand the artifacts they reuse (and for efficient software reuse, they are not required to), these artifacts may impact the software quality negatively (Frakes & Kang 2005). As the third benefit, software reuse reduces the maintenance cost of software systems, because less maintenance is required in the first place due to the lower defect densities. Further, the software can be maintained more easily as it is documented better and thus can be changed and adapted more easily (Lim 1994; Kim & Stohr 1998).48 Moreover, when multiple systems have reused the same artifact without changing it, maintenance needs to be performed only on one copy of the artifact, independent of the number of systems in which this artifact is reused (Apte et al. 1990; Morisio et al. 2000). 
48 Maintenance often accounts for more than 60% of the total software development costs (Boehm 1981).
Costs of software reuse
Despite the compelling benefits presented above, software reuse also comes at a cost, the majority of which is located on the “development for reuse” side (Lim 1994; Kim & Stohr 1998). Margono and Lindsey (1991) report the development costs of reusable software artifacts to be 200% of those of non-reusable ones. In different environments Lim (1994) speaks of 111% and Tracz (1995) finds 200% of the development costs of non-reusable artifacts. The majority of the additional costs for building reusable artifacts is accounted for by analyzing the multiple contexts in which the artifact might be reused later, taking the particularities of these contexts into consideration and providing extensive
documentation and information about the artifact which other developers need in order to evaluate it when considering reuse (Frakes & Isoda 1994; Lim 1994; Poulin 1995; Rothenberger et al. 2003). On the side of “development with reuse” costs are incurred for finding, understanding, adapting and integrating the reused artifact (Kim & Stohr 1998). Here, Margono and Lindsey (1991) find these costs to be on average between 10% and 20% of the cost that would have been incurred when developing the artifact from scratch. Lim (1994) reports an average of 19% of the costs of developing from scratch.49
Success and failure factors of software reuse
In order to help firms realize the benefits of software reuse, scholars have sought to identify success and failure factors of software reuse. First, not every software system domain is equally suited for software reuse (Card & Comer 1994; Isoda 1995; Rine & Sonnemann 1998; Morisio et al. 2002). Only when a firm develops multiple similar systems in a well-understood area will internal software reuse function properly, as its costs can then be amortized over several software systems. Second, software reuse needs to be organized in a thought-through corporate reuse program that is supported by top-management (Frakes & Isoda 1994; Joos 1994; Griss 1995; Rine & Sonnemann 1998) because only then the following success factors can be ensured:50
− Upfront investment in reusable artifacts and reuse infrastructure: In order to develop software systems with reusable artifacts, these artifacts have to be created in the first place and made available in a way which allows easy finding and evaluation (Frakes & Isoda 1994; Isoda 1995; Ravichandran & Rothenberger 2003). This typically also requires the introduction of dedicated processes to create reusable artifacts in a consistent manner (Card & Comer 1994; Morisio et al. 2002). 
On top of the introduction of such processes, the roll-out of tools supporting software reuse is considered beneficial (Lee & Litecky 1997; Kim & Stohr 1998; Rine & Sonneman 1998). Top-management commitment is crucial in building and populating the reuse
49 These data reflect situations where one or more existing artifacts were actually reused, but do not contain situations in which software was developed from scratch because reuse was considered too expensive.
50 It is important to note that it is the combination of these success factors which enables software reuse in software development in firms. Single factors such as the implementation of a reuse library are not sufficient if the other factors are missing (Poulin 1995; Morisio et al. 2002). Beyond software development, e.g. Dixon (2000) draws a similar conclusion for knowledge reuse in general and also Markus (2001, p. 79) finds that “successful knowledge […] reuse requires a complete solution” entailing processes, incentives, repositories and adjusted or newly created organizational roles.
repository as a long-term perspective is required to see the pay-offs of this investment (Isoda 1995; Kim & Stohr 1998; Rine & Sonnemann 1998). Whether the roll-out of a sophisticated reuse library is a success factor is debated, with e.g. Lee and Litecky (1997) and Mili et al. (1998) arguing in favor of such a library and e.g. Frakes and Fox (1995) and Rine and Sonnemann (1998) claiming that it is not needed.
− Modification of existing software development processes (especially requirements definition and analysis, high-level design and testing) in order to include searching for and integrating of reusable artifacts (Morisio et al. 2002). Only processes which explicitly include software reuse will make sure it is considered whenever appropriate (Card & Comer 1994). Moreover, the standardization of data formats and software architectures makes reusing existing artifacts easier (Griss 1995; Rine & Sonneman 1998; Frakes & Kang 2005). − Organizational changes: Often the separation of developers into those “developing for reuse” and those “developing with reuse” is helpful as this leads to developers who can focus exclusively on the development of reusable artifacts. Otherwise the reuse program is dependent on developers who may or may not use the little slack time during their projects to work on reusable resources (Fafchamps 1994; Griss 1995; Rine & Sonneman 1998). Moreover, linkages between the units “developing for reuse” and “developing with reuse” need to be created in order to ensure strategic alignment, good communication and reduce reluctance to employ “foreign” code (Fafchamps 1994; Lynex & Layzell 1998). − Taking care of human factors: Employees need to be made aware of software reuse and need to be encouraged to practice it in order to overcome resistance to change, worries about job security and syndromes like not-invented-here (Card & Comer 1994; Lynex & Layzell 1997; Rine & Sonneman 1998). Further, they need to be trained according to the new development processes (Frakes & Isoda 1994; Joos 1994; Griss 1995; Sherif & Vinze 2003). Additionally, incentives need to be changed to motivate developers to both share their knowledge and also reuse existing knowledge (Poulin 1995; Morisio et al. 2002). 
For instance, if developers are compensated based on their effort measured in lines of code created, they most likely will not be interested in increasing their productivity with software reuse (Due 1995; Isoda 1995). As Kim and Stohr (1998) as well as Lynex and Layzell (1998) point out, reuse incentives do not necessarily have to be monetary, but can also be honorable mentions, praise by superiors etc.
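The amortization argument made at the beginning of this discussion of success factors can be illustrated with the cost figures reported above. The following Python sketch is a simplified model added for illustration, not a calculation taken from the cited studies. It assumes that a reusable artifact is built once at a relative cost `producer_cost` and then integrated into each of n systems at a relative cost `consumer_cost`, with the cost of one from-scratch build normalized to 1. The function name and the cost model are assumptions.

```python
import math


def break_even_systems(producer_cost: float, consumer_cost: float) -> int:
    """Smallest number of systems n for which building once for reuse and
    integrating n times costs no more than n separate from-scratch builds.

    Assumed model (costs relative to one from-scratch build = 1.0):
        producer_cost + n * consumer_cost <= n
    """
    if not 0 <= consumer_cost < 1:
        raise ValueError("consumer_cost must lie in [0, 1) for reuse to pay off")
    return math.ceil(producer_cost / (1 - consumer_cost))


# Figures reported in the text: 200% (Margono & Lindsey 1991; Tracz 1995) and
# 111% (Lim 1994) for development for reuse, 19% (Lim 1994) for development
# with reuse.
print(break_even_systems(2.00, 0.19))
print(break_even_systems(1.11, 0.19))
```

Under these assumptions, reuse pays off from the third system onward when building for reuse costs 200% of a conventional build, and from the second system onward at 111%. This is consistent with the observation that internal reuse requires several similar systems over which the upfront investment can be spread.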
Despite the significant advances in reuse research presented above and the detailed suggestions scholars have developed to help firms enhance their value creation by reusing knowledge, software reuse in commercial firms is still not without issues and its antecedents are not fully understood yet (Krueger 1992; Kim & Stohr 1998; Mili et al. 1999; Morisio et al. 2002; Ye & Fischer 2005; Desouza et al. 2006; Sherif et al. 2006). Researchers both in the domain of general knowledge reuse (Argote & Ingram 2000; Argote et al. 2000; Sambamurthy & Subramani 2005) and software reuse (Card & Comer 1994; Morisio et al. 2002) point to individuals and their social and organizational networks as very critical for successful knowledge reuse and suspect that failure of reuse is often related to such individual developer issues. These aspects are however also the least understood and deserve further investigation. It is in this context of human factors that the not-invented-here syndrome is mentioned frequently (e.g. Card & Comer 1994; Fafchamps 1994; Fichman & Kemerer 2001; Sherif & Vinze 2003; Morad & Kuflik 2005). This syndrome and its antecedents are discussed in the next chapter.
3.2.3. The not-invented-here syndrome
One of the assumed reasons why reuse fails in software development is that individual developers on the “development with reuse” side do not attempt to reuse even though reuse would be possible (Frakes & Fox 1995, 1996; Ye & Fischer 2005). Such behavior could be related to the not-invented-here syndrome which describes a general negative attitude to acquiring knowledge which originates from outside of the own context (Katz & Allen 1982). As a consequence of this attitude, external ideas are being rejected and external knowledge is underutilized which impedes value creation (Katz & Allen 1982; Mehrwald 1999). In the most comprehensive work on the not-invented-here syndrome, Mehrwald (1999, p. 50) defines it as “[…] a negatively biased, invalid, generalizing and rigid attitude of individuals or groups to externally developed technology, which may lead to an economically detrimental neglect or suboptimal use of external technology.”51 Two aspects of the not-invented-here syndrome are important. First, the negative attitude toward external knowledge it reflects is not rational, that is, even if reusing external knowledge would be better from an economic perspective, it is rejected. Second, this attitude is systematic in the
51 Translated from German by Lichtenthaler and Ernst (2006). The definitions of Katz and Allen (1982, p. 7) and Coleman (1990, p. 443) are similar, however focus on organizations or groups which are reluctant toward reusing external knowledge and do not explicitly mention “individuals”. Despite this, the not-invented-here syndrome could also exist within one organization when individuals of one unit are reluctant to accept knowledge originating from another unit (Lichtenthaler & Ernst 2006).
sense that the individual would behave similarly in comparable situations (Lichtenthaler & Ernst 2006).
Research has identified multiple antecedents which can lead to individuals exhibiting the not-invented-here syndrome. On the level of the individuals themselves the following aspects have been mentioned:
− Overestimation of own skills in a given context and the resulting belief to have a “monopoly on knowledge” (Katz & Allen 1982, p. 7) in that area can lead to an unjustified underestimation of the quality of external knowledge (Katz & Allen 1982; Menon & Pfeffer 2003; Michailova & Husted 2003).
− Fear of losing status and self-confidence when developers or teams have to concede that somebody else had a better idea than they did can make them avoid external ideas (Coleman 1990; Mehrwald 1999; Michailova & Husted 2003).
− Negative prior experiences with external knowledge, or a lack of such experiences, can lead individuals to ignore external ideas by default (Mehrwald 1999).
In addition to antecedents on the individual level, scholarly work also points to drivers in individuals’ social ecosystem:
− Individualist cultures which generally reject outsiders and their ideas are often reluctant to accept external knowledge (de Pay 1995; Michailova & Husted 2003). Beyond cultural aspects, psychology research describes a similar phenomenon of in-group favoritism and out-group derogation (e.g. Brewer 1979; Tajfel & Turner 1986).
− Social environments in which colleagues have a negative perspective on external knowledge can lead to development from scratch being considered more prestigious (Mehrwald 1999; Michailova & Husted 2003).
− Incentive systems which reward development from scratch more than relying on external knowledge make external ideas a less preferred choice (de Pay 1995; Mehrwald 1999).
If individuals or organizations exhibit the not-invented-here syndrome due to the aforementioned antecedents, they are likely to evaluate external knowledge wrongly and to prefer internal ideas over better external ones (Katz & Allen 1982; de Pay 1995; Mehrwald 1999). Consequently, such behavior which avoids reusing existing knowledge is detrimental to value creation as opportunity costs are higher than they need to be or use value is lower than it could be.
3.2.4. Intermediate conclusion
Knowledge reuse in general and software reuse as its specific form in software innovation are strong levers to enhance value creation. Software reuse specifically can reduce opportunity costs through efficiency gains and easier maintenance and can increase use value through higher software quality. However, firms can only realize the positive value creation effects of knowledge reuse if their developers choose to rely on existing knowledge during innovation. From a process perspective, knowledge reuse consists of a “development for reuse” and a “development with reuse” part, with the “development with reuse” part being the focus of this dissertation. Much research effort has been spent in both the field of knowledge reuse and the domain of software reuse and scholars have made substantial advances, e.g. developing sets of success factors necessary to reuse existing knowledge in innovation processes. However, firm efforts to systematically reuse existing knowledge still fail frequently and the antecedents of effective reuse are not fully understood yet. While scholars have speculated that this failure of reuse might be related to individual developers, research putting them at the center of the analysis is scarce, or as Sen (1997, p. 418) formulates it, “unfortunately, the human role in the reusability process has received little attention.” Even more to the point, Maiden and Sutcliffe (1993, p. 176) write, “most software reuse research has ignored the role of the software engineer.” This is surprising, because ultimately it is the individual developer who decides whether to reuse existing knowledge or not. Isoda (1995, p. 183) for instance concedes: “Unless they [software engineers] find their own benefit from applying software reuse to their development project, they will not, of their own free will, perform reuse.” Consequently, and in the words of Ye and Fischer (2005, p. 200), one of the key aims of research on reuse should be to understand “[…] what triggers software developers to initiate the reuse process […].” There is however only little research addressing this question and especially large-scale quantitative studies are lacking.52 Striving to help close the above-mentioned gap, this part of the dissertation investigates knowledge reuse and its antecedents with special focus on human factors by analyzing
52 The only two large-scale quantitative surveys among software developers the author is aware of are Frakes and Fox (1995) and Mellarkod et al. (2007). Of these two, the first study however investigates only a few developer specific issues in the software reuse context and does not apply multivariate methods and the second one uses rather generic constructs based on the technology acceptance model which makes it difficult to derive deeper insights about developer beliefs and behavior in the context of software reuse.
36
Open source software developers’ perspectives on code reuse
code reuse in OSS development in order to help firms to better leverage knowledge reuse to create value. OSS development provides a unique context for this purpose for multiple reasons. First, contrary to developers in commercial firms, OSS developers can turn to most of the abundant existing OSS code when reusing. Second, OSS developers are a very heterogeneous population compared to samples drawn from a few firms. Third, the existing knowledge about OSS developers and the software development processes they follow provides a solid platform of scholarly work to start from in the analysis. This platform, together with OSS development as the empirical context of the study is introduced in the next section.
3.3. OSS and its development
OSS development is a special instance of software development which typically takes place in the form of informal collaborations of globally distributed teams which communicate over the internet (Markus et al. 2000; von Krogh & Von Hippel 2006; Crowston et al. 2007). Its specificities and the existing body of research about it make it a unique context to research knowledge reuse with a special focus on individual developers. Further, understanding code reuse in OSS development is also important in itself to fully grasp OSS as an instance of open innovation (Chesbrough 2003). While there is some initial research addressing this topic, the detailed mechanics and processes of code reuse in OSS have not yet been analyzed thoroughly with quantitative data on the level of individual developers.
This section describes OSS and its development as the empirical setting of this study. First, the historic roots of OSS are touched upon. This is followed by a brief overview of OSS licensing, which sets it apart from proprietary software typically developed by commercial firms. After that, the process of developing software in the “OSS fashion” is described. This is followed by an overview of the motivations which make developers contribute to OSS projects. The section on OSS concludes with a review of the limited existing research on code reuse in OSS. Based on the existing literature on knowledge reuse presented in Section 3.2 and the specificities of OSS discussed in this chapter, research questions regarding code reuse in OSS are then derived. These questions help to shed light on the mechanics and processes of code reuse in OSS, provide insights regarding knowledge reuse in general, and especially address the role of individuals in knowledge reuse processes as influencers of firm value creation.
3.3.1. History of OSS
The notion of “open source software” was first introduced in 1998. The underlying idea, however, is much older and goes back to the way software was developed in the 1960s and 1970s. It is also closely related to the development of UNIX. Back then, AT&T’s Bell Laboratories invented the mainframe operating system UNIX and with it the computer programming language C (Raymond 1999a; Lerner & Tirole 2002). Due to governmental regulation, AT&T was not allowed to exploit UNIX as a commercial product. As a consequence, it licensed its source code for free or a nominal fee, mainly to universities and other interested parties, and did not provide any service or support (Weber 2004). As no one at AT&T was going to help them, the mostly scientific users of UNIX formed communities through which they supported each other. In these communities they also freely shared their innovations and improvements of UNIX in human-readable source code form and built on each other’s work when improving and adapting UNIX (Lerner & Tirole 2002; Weber 2004). In doing so, the early users of UNIX practiced OSS development long before the term emerged.
Eventually, however, the regulations restricting AT&T from commercializing UNIX were lifted as AT&T was broken up in 1984 (Weber 2004). In the wake of this event AT&T started enforcing its copyright around UNIX for commercial purposes. It began offering commercial licenses for UNIX in which licensees were provided AT&T’s UNIX source code to make modifications, but were only granted the right to distribute their versions of UNIX in binary form (Raymond 1999a; Lerner & Tirole 2002).53 Furthermore, AT&T’s UNIX licenses were no longer free, but became more expensive year after year (Weber 2004; de Laat 2005). As a consequence, the free revealing of innovations ceased and each licensee of UNIX (e.g. SUN, IBM) began distributing its own binary version of UNIX which was incompatible with other versions (Weber 2004).
The “free software” movement started as a direct reaction to this privatization of UNIX. In 1984 Richard Stallman, a programmer at MIT, quit his job to create a new operating system (called GNU54) from scratch (Stallman 1999; Weber 2004). This new operating system was to be compatible with UNIX and to revive the tradition of sharing source code. As Stallman (1999, p. 55) explains his goal: “With a free operating system, we could again have a community of cooperating hackers – and invite anyone to join.” To support this goal, Stallman founded the Free Software Foundation (FSF) in 1985, which was intended to “preserve, protect and promote the freedom to use, study, copy, modify, and redistribute computer software, and to defend the rights of Free Software users” (Free Software Foundation 2009b). “Free software” in Stallman’s view was never software which was available for free, but rather software for which its users had access to its source code and had the right to modify this source code and also pass on the modified code (DiBona et al. 1999; Weber 2004).55 Given his experiences with AT&T taking the UNIX source code private, Stallman further wanted to ensure that code which had once been “free” would remain so forever (Stallman 1999). To achieve this goal, he licensed his GNU software under licenses which incorporated these “freedoms” and demanded that modifications of the software also be licensed under the same licenses (Stallman 1999). The most popular of these licenses, which are described in the next chapter, is the GNU General Public License (GPL).
While many programmers were sympathetic to Stallman’s ideas of source code availability for pragmatic reasons, they often disagreed with his fundamentalism (Raymond 2001; Weber 2004). Further, programmers, including those sympathetic to Stallman’s ideas, often wanted to combine proprietary code with “free” code, which Stallman’s GPL did not permit easily (Weber 2004). Lastly, “free software” turned out to be an unfortunate label despite Stallman’s continuous efforts to explain that he aimed at “freedom” and not at “gratis.” Consequently, in the wake of Netscape’s announcement that it would make the source code of its popular web browser publicly available, the term “open source” was coined and promoted in spring 1998 by some leading actors of the already existing “free software” movement (DiBona et al. 1999; Perens 1999). They introduced the new concept of “open source” in order to adopt a new rhetoric of pragmatism and market-friendliness and to do away with the old moralizing and confrontational attitude which had been associated with the term “free software,” while at the same time keeping the idea of free source code access (Perens 1999; Raymond 1999b; Open Source Initiative 2009a).
53
If software is available in binary form only as opposed to source code form, it cannot be modified by its users.
54
GNU is a recursive acronym for “GNU’s not UNIX.”
55
Stallman describes his intended meaning of “free” with “free as in freedom” (Stallman 1999, p. 56) and explicitly not as “free as in beer”. Further, he explicitly allows selling of software as long as the source code is included and the buyer has the right to pass on the software without interference from the original copyright holder.
Similar to Stallman, the founders of OSS relied on licenses to realize their goal of source code availability. The mechanics of these licenses are discussed in the following chapter.
3.3.2. OSS licenses56
Strictly speaking, software is OSS if it comes under an OSS license. Whether a license is an OSS license is determined by the Open Source Initiative57 (OSI), a non-profit organization which includes a license in its list of OSS licenses if it complies with its Open Source Definition (OSD) and completes its approval process (Open Source Initiative 2009b). The OSI owns a trademark on “Open Source” and through this ensures that only licenses which adhere to its basic ideas behind OSS can label themselves as OSS licenses (Perens 1999). Central to the OSD and thus ingrained in every OSS license is the requirement that every user of OSS has the right to access the human-readable source code of her software and may pass the software and its source code on to others without having to pay a royalty or other fees to the original copyright holder. Further, the user has to be allowed to modify the source code and distribute modified versions of it in both source code and binary form (Open Source Initiative 2009c).
As of April 2010 the OSI lists 66 licenses as approved OSS licenses (Open Source Initiative 2010). However, the distribution of these licenses among OSS projects is highly skewed, with the GPL being quite dominant and accounting for more than 50% of existing OSS (Lerner & Tirole 2005; Black Duck Software 2009a).58
56
For further discussion of the legal situation of OSS licensing beyond the scope of this part of the dissertation see Chapter 4.2.1.
57
http://www.opensource.org, last accessed 02.10.2009.
58
While the concept of OSS was created to do away with some of the issues of Stallman’s free software ideas, it does include the licenses drafted by Stallman because they comply with the OSD.
While every OSS license has to comply with the OSD, OSS licenses differ in many other aspects. An important characteristic which distinguishes various OSS licenses is the degree of restrictiveness which the license imposes on license choices of derived software. Lerner and Tirole (2005) propose three classes of OSS licenses based on the restrictiveness of redistribution rights and speak of highly restrictive, restrictive, and unrestrictive licenses:59
− Highly restrictive licenses: If software is licensed under a highly restrictive license, subsequent derivative software based on the original must also be licensed similarly. An example is the GPL, which demands that software derived from software licensed under its terms is also licensed under the GPL.60 This also implies that software which is tightly integrated with GPL licensed software has to be licensed under the GPL.61
− Restrictive licenses: As with highly restrictive licenses, subsequent derivative software based on software under a restrictive license usually must also be licensed similarly. However, restrictive licenses provide some exceptions under which the derivative software can be released under a different license. An example in this category is the GNU Lesser General Public License (LGPL), which allows its programs to link with other programs which are not themselves available under the LGPL.
− Unrestrictive licenses: Licenses in this least restrictive class allow subsequent derivative software based on software under them to be licensed under any license the developer of the subsequent software chooses. There is no obligation to inherit the license of the original software for any derivative software. However, there may still be obligations which the developer of the derivative software has to respect. The Berkeley Software Distribution License (BSD license), as an example of an unrestrictive license, demands that in the derivative software credit for the underlying original code is given to its copyright holders.
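The three-class scheme above can be illustrated with a minimal Python sketch. It is a deliberate simplification for illustration only: the names `Restrictiveness`, `LICENSE_CLASS` and `may_choose_other_license` are hypothetical, the mapping covers only the three example licenses named in the text, and the flag `exception_applies` is an assumed stand-in for the license-specific exceptions of restrictive licenses (such as the linking case mentioned for the LGPL), which in practice depend on the actual license terms.

```python
from enum import Enum


class Restrictiveness(Enum):
    """Three license classes following Lerner and Tirole (2005)."""
    HIGHLY_RESTRICTIVE = "highly restrictive"
    RESTRICTIVE = "restrictive"
    UNRESTRICTIVE = "unrestrictive"


# Hypothetical lookup table; it contains only the example licenses
# named in the text for each class.
LICENSE_CLASS = {
    "GPL": Restrictiveness.HIGHLY_RESTRICTIVE,
    "LGPL": Restrictiveness.RESTRICTIVE,
    "BSD": Restrictiveness.UNRESTRICTIVE,
}


def may_choose_other_license(license_name: str,
                             exception_applies: bool = False) -> bool:
    """Return True if derivative software may be released under a
    license other than that of the original software.

    `exception_applies` is an assumed simplification: it marks that one
    of the exceptions granted by a restrictive license holds.
    """
    license_class = LICENSE_CLASS[license_name]
    if license_class is Restrictiveness.UNRESTRICTIVE:
        # Any license may be chosen; other obligations, such as giving
        # credit to the copyright holders, may still apply.
        return True
    if license_class is Restrictiveness.RESTRICTIVE:
        # A different license is possible only under an exception.
        return exception_applies
    # Highly restrictive: the derivative work inherits the license.
    return False


if __name__ == "__main__":
    for name in LICENSE_CLASS:
        print(name, LICENSE_CLASS[name].value, may_choose_other_license(name))
```

The sketch only models the license choice for derived software. Obligations that remain even under unrestrictive licenses are noted in a comment but not modeled.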
59
While other authors use other labels to describe the different classes of OSS licenses (e.g. Fershtman & Gandal 2007; Sen et al. 2008), the distinction into three classes and the description of these classes is widely accepted.
60
Other examples of highly restrictive OSS licenses, though less common than the GPL, are the GNU Affero General Public License (AGPL), the Open Software License (OSL) and the Ricoh Source Code Public License (Sen et al. 2008).
61
This situation may lead to value appropriation issues for commercial firms combining GPL licensed code with code they want to keep secret (see Chapter 4).
3.3.3. OSS development
As the previous chapter has pointed out, software is OSS if it comes under an OSS license. However, since much OSS is developed by informal collaborations in public OSS projects (Lee & Cole 2003; Crowston & Scozzi 2008), the term “OSS” is often also understood to imply that the software has been developed in the “OSS fashion” (e.g. Brown & Booch 2002; von Hippel & Von Krogh 2003; Henkel 2007). This usage of the term “OSS” to describe a way of developing software is quite common but somewhat imprecise, since OSS licensed software may well be developed internally by commercial firms following the traditional software development patterns. The way software is usually developed in OSS is described in the following.
OSS as informal collaboration of distributed individuals
Software development in OSS projects typically differs strongly from traditional software development processes as they are still largely practiced in commercial firms (Vixie 1999; Scacchi 2004; Senyard & Michlmayr 2004). Traditional software development, which has been termed the “cathedral” approach to software development by Raymond (2001, p. 21), is usually performed by a static team of expert developers separated from the users of their software (e.g. Jones 2003; Weber 2004). This team follows a pre-established process, often the traditional sequential waterfall model (e.g. Cusumano et al. 2003), and releases a finished software product to its customers at the end of the process (Raymond 2001; Senyard & Michlmayr 2004). Contrary to this, OSS development, which Raymond (2001, p. 21) has labeled a “bazaar”, is open to contributions from everybody interested and especially encourages (potential) software users to become involved in software development (Raymond 2001; Weber 2004).62 This is possible because both interested developers and users can access, review and modify the software source code as the project progresses (Senyard & Michlmayr 2004). Contributions to the project can be in the form of code, but can also be non-technical, such as documentation, tutorials, bug reports or feature requests (Scacchi 2004).
Consequently, much OSS is developed by organizationally and geographically distributed developers from all over the world in a form of community-based development (Bonaccorsi & Rossi 2003; Lee & Cole 2003). This development typically takes place on the internet and various internet-based means of computer-mediated communication (e.g. email, mailing lists, forums, chat systems) are employed to organize OSS projects (Raymond 2001; Mockus et al. 2005; Crowston et al. 2009). A further difference between OSS development and traditional software development is that OSS development is not sequential. Rather, the different phases of software development occur concurrently, with development, testing, requirements analysis etc. happening in parallel (Raymond 2001; Scacchi 2004; Senyard & Michlmayr 2004).63 While the lack of tight coordination in OSS projects could be seen as a weakness from the perspective of traditional software development, this is in many situations more than compensated for by the large number of developers and users who can access the source code and thereby contribute to the project (Raymond 2001; Mockus et al. 2005).64 Raymond (2001, p. 19) for instance makes this point when claiming that “given enough eyeballs, all bugs are shallow.”
Two consequences resulting from this informal form of development relate directly to this research and the research model presented later (see Chapter 3.4.3):
− Due to their informal organization, OSS projects do not have static hierarchies with regard to authority.65 Developers emerge as leaders in a project due to their outstanding commitment and technical expertise (Scacchi 2004; Giuri et al. 2008). However, leaders do not possess “formal authority” (Lerner & Tirole 2002, p. 222) and have only limited ability to discipline project team members (Raymond 2001; Weber 2004; Scozzi et al. 2008). They can offer recommendations, but whether these are followed depends on their standing in the project team (Lerner & Tirole 2002; Scacchi 2004).
− Related to the lack of static hierarchies is the mechanism to assign tasks in OSS projects. Contrary to commercial software development, developers in OSS projects self-assign their tasks and often choose those activities which are most interesting or beneficial to them (Raymond 2001; Mockus et al. 2005; Crowston et al. 2007).
Following this general description of the mechanics of software development in an OSS project, two special aspects are highlighted below: first, the starting of an OSS project by its founder or founders, and second, the joining of an OSS project by new developers.
62
Raymond (2001, p. 38) speaks of the users of his OSS project Fetchmail as his “most valuable resource.”
63
It is important to note that while these software development phases do exist in both traditional software development and OSS development, they may look rather different in the different settings. For example requirements engineering in OSS projects rather emerges as a by-product of community discourse and is not a formal step as in traditional software development (Vixie 1999; Scacchi 2002).
64
Raymond (2001) stresses that the bug reports from users who are aware of the source code are much more helpful than that of other users. Thus, “enough eyeballs” (Raymond 2001, p. 19) are an advantage for OSS compared with traditional software development only because these “eyeballs” can access the source code.
65
There is however a strong hierarchy with regard to committing rights. Thus, those developers with a committer status can decide which code enters their project, but they cannot order other developers to take care of certain tasks in a certain way.
Starting an OSS project
While the above description of OSS projects as bazaars is quite accurate, this metaphor does not cover the starting phase of an OSS project (Bergquist & Ljungberg 2001; Raymond 2001). OSS projects are typically started by one founder or a small team of founders who have some functional requirement which can be fulfilled by software, but cannot find existing software which satisfies this requirement for them (Vixie 1999; Raymond 2001).66 Importantly, the founders’ need has to be rather strong, as delivering the required functionality themselves through an OSS project requires long-term commitment and substantial energy (Senyard & Michlmayr 2004). Raymond (2001, p. 23) summarizes this process of initiating an open source project as the “scratching [of] a developer’s personal itch.”
Once the project has been started, the founder or the team of founders typically builds a first version of the software. Importantly, this initial version of the software is usually not developed in the open in a bazaar-style environment, but rather in traditional “cathedral” fashion (Senyard & Michlmayr 2004). This implies that there is no or only limited informal collaboration between distributed individuals and that software development follows a typical process of requirements, design, implementation and testing (e.g. Vixie 1999). However, the founder or the founding team has to prepare the project for bazaar-style development in this phase. A precondition for others to join the project is that it offers interesting tasks and also seems feasible (Raymond 2001; von Krogh et al. 2003). The founder can achieve this by delivering a “plausible promise” (Raymond 2001, p. 47), which Lerner and Tirole (2002, p. 220) describe as “a critical mass of code to which the programming community can react. Enough work must be done to show that the project is doable and has merit.”
Delivering this plausible promise entails multiple aspects of the project. First, the project has to offer some functionality which is already working (Raymond 2001). Typically, this means that the project is capable of fulfilling the developers’ initial requirements (Senyard & Michlmayr 2004). If the project has no or very little functionality and lacks technical stability, e.g. because it constantly crashes, potential volunteer developers will not be interested in contributing. Developers considering joining a project will only do so if they can imagine the project becoming successful in the future (Raymond 2001; Shah 2006). Second, while the software already has to contain some working functionality, it must not be too complete either at this stage (Weber 2004). New developers will only join if the project’s code base is not too complex already and if there are still project tasks available to which they are capable of contributing (Senyard & Michlmayr 2004). Third, the project’s software architecture has to be modular. This means that the software system is divided into many subsystems with clear communication and interfaces which allow other developers to contribute to the project without close coordination (Michlmayr & Hill 2003; MacCormack et al. 2006). Lerner and Tirole (2002, p. 220) are quite explicit about this requirement for a modular architecture when they state: “Without an ability to parcel out work in different areas to programming teams who need little contact with one another, the effort [of an OSS project] is likely to be unmanageable.”
Joining an OSS project
While OSS developers who have started an OSS project usually aim at attracting other developers to support them, there are typically also requirements which a developer interested in joining a project has to fulfill (von Krogh et al. 2003; Ducheneaut 2005).67 Software development is a knowledge-intensive activity requiring high levels of knowledge, experience and learning by those involved in it (Pliskin et al. 1991; Fichman & Kemerer 1997). Unless developers interested in joining a project possess relevant knowledge and experience and seem committed to learn as the project grows, they would be a burden rather than a help (von Krogh et al. 2003; Ducheneaut 2005). Due to this, researchers point out that newcomers to technical projects in general typically must present some level of technical expertise as well as an understanding of what the community expects in terms of behavior in order to be accepted as new contributors (Wenger 1998; Lovgren & Racer 2000). This general phenomenon has also been found to exist in OSS projects. Von Krogh et al. (2003) introduce the notion of “joining scripts” as the process would-be developers have to go through in order to be accepted as a member of the project. Ducheneaut (2005) describes the same process, but labels it as “trajectory.”
66
Alternatively, the founders may be aware of existing software which delivers the required functionality or related functionality, but on purpose choose not to use this software or, in the case of software delivering related functionality, purposefully choose not to participate in this project to implement their own requirements there.
67
The joining of new project team members is of course also important after the start-up phase of a project as some key members are likely to leave throughout the life of the project and need to be replaced with new developers.
Analyzing the joining script of the OSS project Freenet,68 von Krogh et al. (2003) show that with on average 23 emails to the developer list, substantial proof of expertise is necessary before developers become part of the project team. Further, investigating differences in mails to the developer list between successful and unsuccessful joiners, they find that the project team appreciates “[…] hand[s]-on solutions to technical problem[s] [from developers willing to join], and that demonstration of technical knowledge in the form of software code submissions matters more than signaling of interest and experience” (von Krogh et al. 2003, p. 1229). Having established how software is developed in the OSS fashion, the next chapter turns to the reasons for individual developers to participate in OSS projects.
3.3.4. Motivations of OSS developers
The success of an OSS project is based on the contributions of developers who are willing to invest their time and effort into the project. Empirical work has found that this time invested is substantial (e.g. Hertel et al. 2003; Lakhani & Wolf 2005), which is startling at first sight because, given the free availability of their source code, OSS projects are basically a public good. Consequently, there is a broad stream of literature which has set out to understand the motivation of developers to participate in OSS projects, and asks the question, “why should thousands of top-notch programmers contribute freely to the provision of a public good?” (Lerner & Tirole 2002, p. 198). In answering this question, scholars have pointed out that despite contributing to a public good, developers may derive private benefit from its provision (e.g. Lerner & Tirole 2002; von Hippel & Von Krogh 2003). Literature seeking to describe these private benefits which explain developers’ participation in OSS projects is typically based on general work on the sources of human motivation.
Sources of human motivation
Scholarly work on motivation originates from psychology research where various frameworks have been developed to explain human motivation (e.g. Herzberg 1982; Deci & Ryan 1985; Maslow 1987). Somebody is motivated if she “is energized or activated toward an end” and “moved to do something” (Ryan & Deci 2000, p. 54). Motivation can be differentiated into the level of motivation (i.e. how much motivation an individual has regarding a specific action) and the type of motivation which energizes the individual for a certain task (Ryan & Deci 2000). The type of motivation explains why the individual wants to perform a task. As a basic classification of motivation types, Deci and Ryan (1985) propose the distinction between intrinsic and extrinsic motivation.69 An action is done out of intrinsic motivation if it is performed “for its inherent satisfactions rather than for some separable consequences” (Ryan & Deci 2000, p. 56). In such situations external pressures or rewards are irrelevant. The action is performed for the fun or challenge of doing it. In contrast, an activity is performed due to extrinsic motivation if it “is done in order to achieve some separable outcome” (Ryan & Deci 2000, p. 60). In this case the task is not done for the fun or enjoyment which an individual gains from performing it, but rather for its instrumental value. Despite this rigid distinction between intrinsic and extrinsic motivation, an action is typically performed for a mixture of both intrinsic and extrinsic reasons (Amabile 1983). Following this concept of motivation, research has identified multiple instances of both motivation types which lead developers to become involved and stay active in OSS projects.
68
http://freenetproject.org, last accessed 08.10.2009.
Intrinsic motivation
As intrinsic motivations which make developers engage in OSS projects, scholars have identified enjoyment and fun, altruism, community identification and ideology.
Enjoyment and fun. As the most genuine intrinsic motivation, enjoyment and fun has been identified as one of the most important motivations to contribute to OSS projects (Raymond 2001; Hertel et al. 2003; Lakhani & Wolf 2005). This finding is also supported by Linus Torvalds70 when he explains that “[…] most of the good programmers do programming […] because it is fun to program” (Ghosh 1998, highlighting as in original).
For developers motivated by the enjoyment and fun of coding, the actual end product is not a large concern (Lakhani & Wolf 2005). Enjoyment and fun as a key intrinsic motivation to participate in OSS projects can further be broken down into challenge seeking on the one hand and the experience of creative pleasure on the other hand (Amabile et al. 1994; Sen et al. 2008). In the challenge seeking component, developers experience enjoyment and fun as the result of overcoming a cognitive challenge and resolving a technical problem. In the creative pleasure component, developers pursue their development task not as a means to an end, but rather for its own sake. For such developers time and efficiency are not relevant. Based on a large-scale survey among OSS developers, Lakhani and Wolf (2005) find that about 73% of their respondents always or frequently lose track of time when programming and 60% would dedicate one additional hour to programming if the day had one additional hour. Enjoyment and fun as intrinsic motivation are strongly related to the concept of “flow” (Csíkszentmihályi 1975, 1990), which describes a state of maximized enjoyment in which the developer is very focused and likely to forget about time. Csíkszentmihályi (1975, p. 181) describes flow situations as those delivering feelings of “creative discovery, a challenge overcome and a difficulty solved.”
69
This classification is also accepted by other psychology scholars, e.g. Amabile (1996), and economists, e.g. Frey (1997).
70
Linus Torvalds is the founder of the Linux kernel, one of the most influential OSS projects.
Altruism is a variant of intrinsic motivation in which an individual voluntarily seeks to increase the welfare of others without expecting any form of reciprocity (Krebs 1970). In this way altruism is a form of intrinsic motivation, because the individual has developed a preference for the good of the community, e.g. by charitable giving (Frey & Meier 2004). Linus Torvalds points to altruism as a motivation to develop OSS when describing his experience of making Linux available as OSS as “it feels good to have done something that other people enjoy using” (Ghosh 1998). In empirical work, both Hars and Ou (2002) and Wu et al. (2007) find that altruism motivates developers to contribute to OSS projects.
Community identification. Hars and Ou (2002), Hertel et al. (2003) and Lakhani and Wolf (2005) identify community identification as a further intrinsic motivation for participation in OSS projects.
Community identification relates to the feeling of belonging to a group or community and makes developers act in the best interest of the group or community e.g. by helping other members of the community to receive internal satisfaction from the well-being of their community (von Krogh et al. 2008). Developers for whom community identification is a major motivation will treat other members of the community as kin and will be willing to do something which is beneficial for their kin, but not necessarily for themselves (Zeitlyn 2003).71
71
This idea is related to altruism. However, it differs in that for altruistic behavior the relationship with the receiver of an action is irrelevant, while behavior motivated by community commitment is directed only toward those considered kin.
As quantitative evidence, Lakhani and Wolf (2005) find in their survey that 83% of the
participating developers either “strongly” or “somewhat” agree that the OSS community is the primary source of their identity.
Ideology. As the last type of intrinsic motivation, ideological beliefs regarding the OSS movement have also been shown to make developers participate in OSS projects (Ghosh et al. 2002; Hertel et al. 2003; Lakhani & Wolf 2005). Since the inception of OSS in the form of Richard Stallman’s free software movement, ideology has played an important role in its development (Stallman 1999), and empirical work shows that developers who share ideological beliefs such as “software should be free for all” (Lakhani & Wolf 2005, p. 23) or “open source code should replace proprietary software” (Lakhani & Wolf 2005, p. 23) participate in OSS projects in order to realize their visions (Raymond 2001).
Extrinsic motivation
On the side of extrinsic reasons for developers to participate in OSS projects, scholars have identified personal needs, learning, reciprocity expectations, community reputation, commercial signaling and payment.
Personal needs. Developers frequently participate in OSS projects in order to satisfy their own personal needs for software functionality (DiBona et al. 1999; Lerner & Tirole 2002). The programming language PERL for example was created when Larry Wall found C programs inefficient in creating web pages dynamically (Lerner & Tirole 2002).
Learning. Developers have been found to participate in OSS projects to hone their skills and build their “human capital” in order to attain better job opportunities, higher salaries and more fulfilling jobs (Ghosh et al. 2002; Hars & Ou 2002; Lakhani & Wolf 2005). OSS projects are especially suited for personal development as developers can choose tasks and projects which meet their development needs and interests (Hars & Ou 2002). Additionally, OSS provides entry-level programmers the chance to participate in real projects.
Further, OSS projects typically entail an intensive peer-review mechanism in which developers receive feedback from other developers in their project teams and users of their software (von Krogh et al. 2003).
Reciprocity expectations. Originally a concept from anthropology, a gift economy based on reciprocity expectations has also been discussed by several scholars as extrinsic motivation for developers to participate in OSS (Bergquist & Ljungberg 2001; Raymond 2001; Zeitlyn 2003; Lakhani & Wolf 2005). They argue that developers participate in OSS
projects by supporting them because they expect to be supported by the OSS community at a later point in time. When asked about his motivation for releasing Linux as OSS, Linus Torvalds for instance explains that among other things, he did expect some “quid pro quo” (Ghosh 1998).
Community reputation. Striving for peer recognition derives from the desire for fame and esteem (Maslow 1987) and has been shown to motivate developers to participate in OSS projects (Ghosh et al. 2002; Lerner & Tirole 2002; Lakhani & Wolf 2005). As Raymond (2001, p. 94) explains, “[…] you do not become a hacker by calling yourself a hacker – you become a hacker when other hackers call you a hacker.” As giving credit to contributors in a project is an essential part of the OSS culture (Raymond 2001), developers receive more credit, and thus more reputation, the more they contribute.
Commercial signaling. Developers may also regard OSS projects as an opportunity to demonstrate their capabilities and skills in order to advance their careers (Lerner & Tirole 2002; Bonaccorsi & Rossi 2003). Through their achievements in OSS projects they can signal programming competence to potential employers and business partners (Raymond 2001). Linus Torvalds is quite explicit about this point when he says, “[…] you can trade your [OSS] reputation for money” (Ghosh 1998). The openness of code provides employers the transparency to evaluate a developer’s skill level. Thus, developers have an incentive to showcase their skills in (unpaid) OSS projects in order to convince others to remunerate them for their work in other situations such as employment (Lerner & Tirole 2002). Confirming the suitability of this behavior, Hann et al. (2002) find that a higher rank within the Apache OSS community correlates significantly with higher wages.
Payment. Lastly, payment, the most genuine form of extrinsic motivation, has also been found to make developers contribute to OSS.
Initially, scholars took OSS for a hobbyist phenomenon (e.g. Bessen 2002), and potentially it was one in its beginnings. Meanwhile, however, this has changed fundamentally. For example, only 9% of the developers working on the Linux kernel in 2007 did not receive payment for their work (Corbet 2007). Also outside the Linux kernel, empirical work has found that many developers are paid for their OSS contributions in various forms (Ghosh et al. 2002; Hars & Ou 2002; Lakhani & Wolf 2005).
With OSS and its development established as the empirical context of this study in the previous chapters, the next chapter turns to code reuse in OSS and reviews the limited existing literature on this topic in order to formulate research questions for this dissertation in Chapter 3.3.6.
3.3.5. Code reuse in OSS development
Research addressing code reuse in OSS development is still scarce, and only very recently have scholars begun to investigate this topic. In the course of this, two different approaches have been used. On the one hand, scholars have taken a high-level perspective and analyzed large samples of OSS projects for either code duplications (Mockus 2007; Chang & Mockus 2008) or project dependencies (German 2007; Spaeth et al. 2007). On the other hand, von Krogh, Spaeth and Haefliger (von Krogh et al. 2005; Haefliger et al. 2008) have chosen a more fine-grained project-level perspective and used case studies to explore reusing in a small group of selected OSS projects. Outside of academia, the code scan firm Black Duck Software (2007, 2009b) has repeatedly used its databases of OSS projects to publish information on OSS components reused in other OSS projects following an approach somewhat similar to that pursued by Mockus (2007).
High-level perspective
As the first scholarly work investigating code reuse in OSS projects, Mockus (2007) and Chang and Mockus (2008) seek to identify and quantify “large-scale code reuse.” They use the existence of text files with the same name in one directory across different projects as a metric for reuse and label these directories as components.72 While certainly pragmatic, this metric misses several forms of code reuse such as snippets, binary packages (e.g. JAR-files common in the Java environment) and reusing of components which are not part of the source package of the focal project but rather expected to already exist on the user’s system. Mockus (2007) uses an enormous database of 38.7 thousand unique OSS projects and finds that about half of all components of the sample are used in more than one project.
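A directory-based reuse metric of the kind described above can be illustrated with a minimal sketch. It is not the procedure used by Mockus (2007): it assumes, for simplicity, that two directories count as the same component only if they contain exactly the same set of file names, and the function name, the data layout and the example projects are hypothetical.

```python
from collections import defaultdict


def shared_directory_share(projects):
    """Count directory-level "components" and the share found in more than one project.

    `projects` maps a project name to a dict of {directory path: set of file names}.
    Assumption for this sketch: a component is identified by the exact set of
    file names in a directory, regardless of where the directory is located.
    """
    projects_by_signature = defaultdict(set)
    for project, directories in projects.items():
        for filenames in directories.values():
            signature = frozenset(filenames)
            if signature:  # ignore empty directories
                projects_by_signature[signature].add(project)

    total = len(projects_by_signature)
    reused = sum(1 for owners in projects_by_signature.values() if len(owners) > 1)
    return total, (reused / total if total else 0.0)


# Hypothetical example data: two projects ship the same two-file directory.
example = {
    "project_a": {"lib/zlib": {"inflate.c", "deflate.c"}, "src": {"main.c"}},
    "project_b": {"third_party/zlib": {"inflate.c", "deflate.c"}, "src": {"app.c"}},
    "project_c": {"src": {"main.c", "util.c"}},
}
total, share = shared_directory_share(example)
print(total, share)
```

As in the text, such a metric only sees reuse that leaves identically named files in the source tree; copied snippets, binary packages and components expected on the user's system remain invisible to it.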
Using a related approach outside of academia, Black Duck Software, a commercial firm which provides tools to detect OSS within software code bases has repeatedly published information on duplications between OSS projects (Black Duck Software 2007, 2009b). In contrast to Mockus’ (2007) and Chang and Mockus’ (2008) analyses, however, these analyses do not count duplications between text file names but scan for binary files (e.g. JAR-files) which are contained in multiple OSS projects.73 The most recent results of
72
In doing so, Mockus (2007) and Chang and Mockus (2008) apply a definition of “component” which is different to the one employed in this dissertation.
73
Thus, also the Black Duck Software (2007, 2009b) analyses capture only certain forms of code reuse.
these analyses (Black Duck Software 2009b) show that 1,311 different OSS binaries are reused in 200,000 other OSS projects more than 365,000 times. Another approach to research code reuse in OSS projects from a high-level perspective is analyzing package dependencies in Linux distributions. Linux distributions are compilations of OSS projects – including the Linux kernel, but also many other projects – which have been preselected and packaged by editors based on certain criteria (Spaeth et al. 2007). During the selection and the packaging process the distribution editors add information to the packages such as which other packages are required to execute the focal package because the focal package reuses functionality from other packages (Robles et al. 2006). Both German (2007) and Spaeth et al. (2007) have pursued this avenue which again does not cover the whole picture of code reuse as e.g. snippets and some forms of components are left out. German (2007) analyzes the Fink distribution of Linux74 and finds that more than 75% of the packages75 of Fink reuse functionality from at least one other package and about two thirds of the packages reuse at least two other packages. Spaeth et al. (2007) find that 1,146 library packages are reused 51,230 times in the Debian distribution of Linux76 they analyze. Furthermore, the latter authors also provide some information on reuse behavior. Factors which lead to higher reuse of a component are an older age, being developed in the C programming language, being referenced on freshmeat.net77 and being connected to an umbrella project or a legal entity (such as Apache). A large size of the component and a strict license such as the GPL are found to be detrimental to being reused. While they do not offer a complete picture, these findings of the high-level analyses presented above indicate that code reuse does exist in OSS projects. 
However, their coarse-grained approaches do not shed much light on the code reuse behavior of individual OSS developers, which is the focus of this dissertation.
Project-level perspective
Closer to the goals of this study is the work by von Krogh, Spaeth and Haefliger. Using case studies on 15 (von Krogh et al. 2005) and six (Haefliger et al. 2008) rather large and
74
http://www.finkproject.org, last accessed 02.10.2009.
75
Within a Linux distribution a package represents one or more OSS projects.
76
http://www.debian.org, last accessed 02.10.2009.
77
Freshmeat is an OSS project directory which provides tools to search for OSS projects, http://freshmeat.net, last accessed 17.11.2009.
successful OSS projects and explicitly addressing the project level, the authors confirm that code reuse exists in OSS projects. They further find that developers reuse both components and snippets. Components are typically reused without being modified. The reuse of snippets seems to be quite limited in their sample. Diving into the mechanics of code reuse in OSS, Haefliger et al. (2008) find that OSS developers reuse code in order to make their development work more effective because they lack the skills to implement certain functionality by themselves, or because they prefer some specific development work over other tasks. Providing more detail on the efficiency benefits which OSS developers gain from reuse, Haefliger et al. (2008) point out a two-fold nature. First, reuse saves developers time because they do not have to write software by themselves. Second, developers also perceive not having to maintain the software in the future as an efficiency benefit of reusing. This is possible for reused components which are developed further by their own projects, which will fix bugs and implement new functionality that the focal project can access for free, that is, without having to invest development effort. Lastly, Haefliger et al. (2008) show that developers use code reuse in order to deliver a plausible promise. While they continue reusing existing code throughout the whole life of their project, they do so more in the early phases. Describing the process of code reuse in OSS, Haefliger et al. (2008) show that OSS projects do not possess internal search repositories as are common in companies, but OSS developers turn to OSS repositories such as SourceForge.net78 and dedicated OSS search and index tools such as freshmeat.net or Koders.com79 in order to find reusable code. Moreover, Linux distributions like Debian containing a large number of OSS projects are found to be used to identify reusable code.
However, means of local search such as fellow developers or the project’s mailing list are considered to be more important than repositories and search engines when searching for reusable artifacts. The results of the project-level perspective case studies confirm that OSS developers do reuse existing code and also introduce some details about OSS developers’ code reuse behavior in descriptive fashion. However, the data are limited to a small number of OSS projects, are not quantitative in nature and do not offer comparative and multivariate insights needed to understand the determinants of code reuse by OSS developers.
78
http://sourceforge.net, last accessed 18.11.2009.
79
http://www.koders.com, last accessed 29.11.2009.
3.3.6. Intermediate conclusion and detailed research questions
This section has presented OSS with its licenses, its specific software development processes and the peculiarities of its developers as the empirical context chosen to study knowledge reuse with a special focus on the role of individual developers. Analyzing code reuse by OSS developers as a specific instance of knowledge reuse is a promising opportunity to further scholarly work on knowledge reuse because of the unique characteristics of OSS and its developers.
OSS was founded based on the ideas of sharing software innovations with others and giving users access to the source code of their software to provide them with the option to modify it. These ideas are deeply ingrained in the licenses which govern OSS. As a result of this, the existence of a large amount of code governed by these licenses provides OSS developers with the option to reuse existing code if they want to. This situation is different to that of software developers in commercial firms who are often restricted to the limited amount of code available in their firms’ reuse repositories. Consequently, an analysis of OSS developers’ code reuse behavior should shed light on the perspectives of individual developers on reuse and result in a picture which is not distorted by the lack or inaccessibility of reusable code.
Further, the existing scholarly work on the development processes in OSS and the motivations of OSS developers reviewed in this section provides a unique starting point to scrutinize developers’ reuse behavior. It is utilized to build the research model and formulate hypotheses in the next section (3.4) and helps to interpret the resulting findings.
Lastly, the heterogeneity of OSS developers spread all over the world and working in very different projects represents an interesting population for a large-scale quantitative analysis because this population is not affected by a common factor such as a single employer which might overshadow the perspectives of the individual developers. Beyond contributing to knowledge reuse literature, an analysis of code reuse in OSS development is also insightful for OSS research itself. As the review of existing scholarly work on code reuse in OSS development has shown, the detailed mechanics of this phenomenon are not fully understood yet and especially quantitative data on the level of individual developers are lacking. Because of that there is a lack of knowledge about the “receiving” side of the open innovation process of OSS. Striving to shed light on the role of individual developers in knowledge reuse processes to create value and to help complete the picture of OSS as an open innovation process, this
part of the dissertation empirically addresses the following detailed research questions in the context of code reuse in OSS development with a large-scale quantitative survey:
− How important is code reuse for the contributions individual OSS developers make to their projects and to what extent do individual OSS developers practice code reuse? (Question 1)
− What are the benefits which OSS developers see in code reuse and what are the drawbacks and issues of code reuse they perceive? (Question 2)
− Do social norms and project policies exist which encourage or discourage code reuse despite the informal setups in which OSS is developed? (Question 3)
− What are general impediments to code reuse in OSS which make it difficult for developers to reuse existing code even if they wanted to? (Question 4)
− How do OSS developers reuse existing code, that is, which forms of code do they prefer to reuse and how do they integrate the reused code with their own code? (Question 5)
− Where do OSS developers turn to when searching for existing code to reuse? (Question 6)
− How is the degree of code reuse an OSS developer practices determined by her individual characteristics and those of her project? (Question 7)
The first six questions are addressed in descriptive and exploratory fashion in Section 3.6 while question seven is discussed using multivariate analyses in Section 3.7. Before this, Section 3.4 describes the research model addressing question seven and Section 3.5 reports on the design and conduct of the survey employed to generate the data required for answering the questions.
3.4. Research model and hypotheses
To guide the choice of variables to be captured in the survey questionnaire and to formulate hypotheses for research question seven, a research model is developed in this section. Drawing on existing research on both knowledge and software reuse as well as on OSS development and the results from a qualitative pre-study, this research model aims at explaining how the code reuse behavior of an OSS developer is determined by her individual characteristics and those of her project.
To provide a solid theoretical base, the research model builds on the well-established Theory of Planned Behavior (TPB) (Ajzen 1991) which is one of the frameworks most frequently applied to explain human behavior in various research domains.80
3.4.1. The theory of planned behavior81
While initially developed in the context of social psychology, behavioral models such as TPB or the Technology Acceptance Model (TAM) (Davis et al. 1989) have found wide diffusion in various fields of management and information systems (IS) research to explain individual behavior. Related to this study, behavioral models have been used to understand software developers’ application of various development methodologies such as CASE tools82 (Riemenschneider & Hardgrave 2001), object-oriented software development83 (Hardgrave & Johnson 2003) or generally formalized software development processes (Riemenschneider et al. 2002; Hardgrave et al. 2003). Following this stream of research, the research model of this part of the dissertation, which aims at explaining the code reuse behavior of individual OSS developers, is based on TPB. TPB is favored over TAM because TPB provides more specific information regarding the factors which individuals consider when making a decision regarding behavior (Mathieson 1991).
TPB posits that behavior is determined by intention, which itself is predicted by the three factors attitude toward the behavior, subjective norm and perceived behavioral control:
Attitude toward the behavior is formed by the individual’s beliefs about the consequences and outcomes (both positive and negative) of the behavior. It is a “psychological tendency that is expressed by evaluating an entity with some degree of favor or disfavor” (Eagly & Chaiken 1996, p. 269).
Subjective norm refers to pressure from the social environment as perceived by the individual to perform or not perform the behavior, and is often also referred to as peer norms.
Perceived behavioral control is the perception of individuals of their ability to perform the behavior. It can be broken down into individuals’ “capability” of performing the behavior and the “controllability” (Ajzen 2002)
80
Much research has been conducted to validate TPB empirically. E.g. Armitage and Conner (2001) list 185 studies in various research areas which all rely on TPB, and find significant supportive evidence for TPB.
81
While TPB is employed to guide the research model for this study, this part of the dissertation does not follow the typical approach and setup of TPB studies. Due to that only the relevant portions of TPB are elaborated on here. For a more detailed review of TPB please see Chapter 4.3.3.
82
CASE (Computer-Aided Software Engineering) tools are software packages which automate activities along the software development process.
83
Object-orientation is a programming paradigm which has overtaken procedural programming as the dominant approach to software development in the 1990s.
the individuals have over the behavior, that is, whether the decision to perform the behavior is theirs or not. In its original form TPB proposes that an individual’s behavior is fully explained by her intention which is completely predicted by her attitude toward the behavior, her subjective norm and her perceived behavioral control. Moreover, in recent research84 all of these five constructs are typically treated as latent variables captured with rather generic scales. This study deviates from the standard approach of TPB-based research described above in two ways. First, for robustness purposes and to provide a richer picture, the model is tested in two setups. In the first, past behavior is the dependent variable while in the second, future intention is the dependent variable. Given TPB’s assumption that intention fully explains behavior, the two model setups should show similar results.85 As the second change, attitude and perceived behavioral control are not assessed with the usual generic scales but with items explicitly framed in the context of code reuse in OSS development.86 This approach is in line with early Theory of Reasoned Action (TRA) research (Ajzen & Fishbein 1980) which later gave rise to TPB and TAM.
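The relations posited by TPB and the two setups tested in this study can be written compactly as follows. This is an illustrative sketch only: the linear functional form, the coefficient symbols and the error terms are assumptions introduced here for exposition and are not taken from the study, which treats the constructs as latent variables and adds further groups of determinants.

```latex
% I = intention, B = behavior, A = attitude toward the behavior,
% SN = subjective norm, PBC = perceived behavioral control
% Original TPB as summarized above (linear form assumed for illustration):
\begin{align}
  I &= \beta_0 + \beta_A A + \beta_{SN}\, SN + \beta_{PBC}\, PBC + \varepsilon_I \\
  B &= \gamma_0 + \gamma_I I + \varepsilon_B
\end{align}
% The two setups of this study share the same explanatory variables x
% and differ only in the dependent variable:
\begin{align}
  \text{Setup 1:}\quad B_{\text{past}}   &= f(x) + \varepsilon_1 \\
  \text{Setup 2:}\quad I_{\text{future}} &= f(x) + \varepsilon_2
\end{align}
```

If intention fully explains behavior, as TPB assumes, estimating both setups on the same explanatory variables should yield similar results.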
3.4.2. Qualitative pre-study
The research questions presented above require a large-scale survey among OSS developers. Before conducting this survey, a qualitative pre-study was carried out. The purpose of the pre-study was three-fold. First, the information gathered in the pre-study helped to inform and refine the research model discussed in this section (Greene et al. 1989). Second, the pre-study helped to gain a better understanding of OSS development in general and to become familiar with the terminology used by OSS developers in the context of code reuse. Both aspects facilitated the design of the survey instrument. Third, the findings generated during the qualitative research were later employed to support the analysis and interpretation of the quantitative survey findings (Miles & Huberman 1994). In order to best leverage the qualitative pre-study for the questionnaire to be developed later on, interview partners were selected from SourceForge.net, the OSS collaboration platform which also served as the starting point for the survey. In February 2009 OSS
84
See Venkatesh et al. (2003) for an overview of recent work using behavioral models in the IS domain.
85
Obviously, there may be differences resulting from the backward-perspective of one setup and the forward-perspective of the other.
86
For subjective norm a generic scale seemed well suited and is thus retained.
developers selected at random from those registered on SourceForge.net were contacted with an email asking them for an interview on code reuse in OSS development. Twelve developers agreed and were subsequently interviewed. Contrary to the interviewees in Haefliger et al. (2008) and von Krogh et al. (2005), the interviewees in the pre-study do not exclusively represent comparably large and successful OSS projects, but reflect the full heterogeneity of SourceForge.net which the survey will also have to accommodate. The group of developers interviewed encompasses developers from both small and large projects, from projects of different topics (e.g. database front-end vs. game) and from different geographic regions. In line with the exploratory character of the pre-study, the interviews were conducted as semi-structured interviews to allow comparison of the answers, but still leave enough room to address new topics and questions (Bortz & Döring 2003; Schnell et al. 2005). Ten of the twelve interviews were conducted either by phone or internet-based voice communication and two in the form of an email exchange. The voice-based interviews lasted between 27 minutes and one hour and 44 minutes with an average duration of 49 minutes. Nine of the ten voice-based interviews were recorded; for the other interview careful notes were taken. The taped interviews were transcribed. In addition to the interviews, discussions which evolved from the survey pretest (see Chapter 3.5.3) were also included in the qualitative pre-study. Results of the pre-study are reflected in the research model, the setup of the questionnaire and the discussion of the results of the quantitative survey.
3.4.3. Determinants of code reuse behavior
The research model to explain OSS developers’ code reuse behavior (see Figure 3-1) consists of seven groups of components (labeled with the letters “A” to “G”) which are assumed to determine code reuse.
Figure 3-1: OSS code reuse research model
[Figure not reproduced. Dependent variable: Code reuse intention/behavior. Groups of determinants:
A Attitude toward code reuse: H1a Effectiveness effects (+); H1b Efficiency effects (+); H1c Software quality effects (+); H1d Task selection benefits (+); H1e Loss of control risks (-)
B Subjective norm
C Perceived behavioral control
D Access to local search: H2a Size of developer’s personal OSS network (+); H2b Total number of developer’s OSS projects (+)
E Project maturity: H3 Project phase (-)
F Compatibility with devs.’ goals: H4a Challenge seeking (-); H4b Creative pleasure (-); H4c Skill improvement (+); H4d Community commitment (+); H4e OSS reputation building (+); H4f Commercial signaling (+)
G Additional control variables
Variables shown without hypothesis labels: Supportive project policy; Perceived peer perspective on code reuse; Discouraging project policy; Lack of reusable code; OSS license conflicts; Programming language conflicts; Architectural issues; Developer skill level; Project size (# of developers); Project complexity; Project type (CO vs. ST)*; Dev. OSS age; Dev. weekly project hours; Dev. share in project development; Dev. experience as professional; Dev. education on reuse; Dev. professional reuse training; Dev. residence (continent)
*CO=component project, ST=standalone executable application project. Notes: The direction of the hypotheses is indicated by (+) and (-); “developer” is abbreviated with “dev.”; “developers” are abbreviated with “devs.”]
TPB research originally posits that the three groups “attitude toward a behavior,” “subjective norm,” and “perceived behavioral control” explain the behavior
comprehensively through intention (Ajzen 1991). The research model of this study stays true to this assumption despite its four additional groups of hypotheses and control variables, because all of these additional groups could be incorporated into the three original TPB groups of attitude, subjective norm, and perceived behavioral control.87 However, in order to better illustrate the ideas behind them, some of the hypotheses are displayed as independent groups of their own in the following.88 Moreover, some control variables are shown as a group of their own because their influence on attitude, subjective norm, and perceived behavioral control is rather indirect. Following TPB as a starting point, the research model proposes that developers’ code reuse behavior is influenced by their attitude toward code reuse (group A), their subjective norm on code reuse (group B), and the behavioral control they perceive regarding code reuse (group C). Beyond these, developers’ access to local search for reusable code (group D), the maturity of their project (group E) and the compatibility of code reuse with their individual goals in the project (group F) are hypothesized to influence their code reuse behavior. Finally, the model encompasses additional control variables (group G) which are mentioned either in the existing literature or result from the qualitative pre-study. Attitude toward code reuse (Group A) TPB suggests that developers with a more positive attitude toward code reuse, that is those developers who perceive the benefits of code reuse more strongly and its drawbacks and issues less prominently, will reuse more existing code. This is also consistent with research on the not-invented-here syndrome which points out that developers with no or negative prior experiences with external knowledge are less likely to reuse it (Mehrwald 1999). 
Based on existing research and the qualitative pre-study, eight benefits of code reuse and nine drawbacks and issues were identified. Using exploratory factor analyses these 17 items could be condensed to five constructs which describe developers’ attitude toward code reuse (see Chapter 3.6.3). These constructs are developers’ perceptions of the effectiveness effects of code reuse, the efficiency effects of code reuse, the software quality
87
Bagozzi and Dholakia (2006) follow a similar approach when they apply a model with the three original TPB groups and additional determinants derived from the model of goal-directed behavior (Perugini & Bagozzi 2001) in order to explain OSS developers’ intentions to participate in Linux user groups.
88
A direct mapping between the additional groups of hypotheses and control variables and the three original TPB groups is not possible.
effects of code reuse, the task selection benefits resulting from code reuse, and the potential loss of control over their project which might come with code reuse:
− Effectiveness effects of code reuse: OSS developers may reuse existing code to overcome programming problems which they cannot solve themselves (DiBona 2005). A developer from the qualitative pre-study points to these effectiveness benefits when explaining, “[…] we are reusing gnuchess and gnucap. Developing a chess engine and an electric simulator is out of my core competencies.” Following the above argumentation, developers who are more convinced of the effectiveness benefits of code reuse will rely more on existing code.
H1a: The more positively developers perceive the effectiveness effects of code reuse, the more existing code they will reuse.
− Efficiency effects of code reuse: Reusing existing code saves developers time and effort because they do not have to develop the functionality implemented in the code themselves from scratch (see Chapter 3.2.2). However, in order to achieve this, developers have to search for, understand, modify and integrate the code to be reused (see Chapter 3.2.2). The greater developers perceive the difference between the time and effort saved through code reuse and the time and effort necessary for code reuse to be, the more existing code they should reuse (Krueger 1992; Isoda 1995; Lynex & Layzell 1998). A developer from the qualitative pre-study summarizes this when saying, “[…] sometimes you can reuse and sometimes you have to modify and waste more time [on existing code] than writing it on your own [would take].”
H1b: The more positively developers perceive the efficiency effects of code reuse, the more existing code they will reuse.
− Software quality effects of code reuse: As has been shown in Chapter 3.2.2, code reuse can increase the quality of software by including high-quality code and better maintenance.
A developer from the qualitative pre-study exhibits a very strong opinion on these quality benefits of code reuse when explaining, “it’s nonsense to write your own JPEG, MP3 etc. algorithms because it wastes time and you will never make it as good as it is already done.” However, code reuse may also impact software quality negatively. If developers integrate or modify code they do not fully understand, they may introduce bugs and security issues into their project (Apte et al. 1990; DiBona 2005; Frakes & Kang 2005). A developer from the qualitative pre-study explains: “[Reuse] introduces dangers to the project as its code base is not
entirely understood by its developers, which may result in significant errors that are difficult to diagnose and correct.” Further, code reuse may impede the performance of software. For instance, if a developer reuses a certain piece of functionality of a component, but does not need the other functionality also included in the component, she still has to include the whole component in her project.89 This leads to overhead in the software which is not needed for functionality, but still needs resources and thereby may affect performance (Garlan et al. 1995, 2009). As a developer from the qualitative pre-study explains: “I have to make sure that [my project] stays coherent and easy to maintain. Integrating too large or too complex pieces of code may destabilize our code base.” Another developer nicely summarizes this double influence of code reuse on software quality when explaining, “I would reuse far more snippets and components, but it’s hard to find the quality we require.” Consequently, developers who perceive the quality benefits more strongly should reuse more while developers more strongly affected by the quality downsides should reuse less existing code.
H1c: The more positively developers perceive the software quality effects of code reuse, the more existing code they will reuse.
− Task selection benefits of code reuse: OSS developers prefer some software development tasks over others (see Chapter 3.3.3). Haefliger et al. (2008) point out in their case studies on code reuse that developers leverage code reuse as a means to allow them to spend their time on the interesting tasks while taking care of the less interesting ones by reusing existing code. Further, they quote a developer with the words, “code reuse is just helping us to get the job done, so I can work on something that is more interesting” (p. 190). Thus, the more developers rely on code reuse to help them focus on the interesting tasks of their project, the more they should reuse existing code.
H1d: The more strongly developers perceive the task selection benefits of code reuse, the more existing code they will reuse. − Loss of control risks from code reuse: By including foreign code into their project developers give up some of the control they have over their project. From a process
89 This situation may also occur if the reused component can be employed in different contexts, such as in different programming languages, operating systems etc. While this context independence of such components is certainly one of their main advantages, this feature often also requires extensive code to ensure compatibility which takes its toll in terms of performance (Garlan et al. 2009).
perspective, they might become dependent on the original developers of the reused code to fix bugs and make changes required for their project because they cannot do so themselves as they do not understand the reused code well enough. Raymond (2001, p. 37) describes this problem when writing about one of his projects in which he had reused existing code, but then chose to remove this code: “I had another purpose for rewriting besides improving the code and the data structure design, however. That was to evolve it [the project] into something I understood completely. It’s no fun to be responsible for fixing bugs in a program you don’t understand.” Further, having reused existing code that they cannot or do not want to maintain themselves, developers may also lose some control over their time schedule as the maintainer of the reused code decides when there will be updates and this may or may not be consistent with the reusing developers’ schedule. Finally, reused components may make installing the project difficult for its users with the project developers not having many chances to solve this issue. This situation can occur if installing the required components is difficult because the project developing the components does not put much effort into easy installation. According to a developer from the qualitative pre-study, “[…] having more than a few open source dependencies [i.e. components] results in a nightmare when building on many different architectures.” Summing up, developers who are more uncomfortable with the issues described should reuse less existing code.
H1e: The more strongly developers perceive the loss of control risks from code reuse, the less existing code they will reuse.
Subjective norm on code reuse (Group B)
As the second predictor of intention and thus indirectly also determining behavior, TPB posits subjective norm. Again, this is consistent with research on the not-invented-here syndrome which proposes that social environments in which colleagues have a negative perspective on external knowledge lead individual developers to rely less on ideas from the outside (Mehrwald 1999; Michailova & Husted 2003).90 Consequently, the research model of this study includes developers’ subjective norm as a determinant of code reuse. This effect is however treated as a control variable, first because the focus of this research is on individual developer characteristics and second because Mellarkod et al. (2007) have
90 Yet, contrary to research on the not-invented-here syndrome which focuses on negative influences of the social environment, subjective norm in TPB accounts for both positive and negative influences.
already tested a related construct in their model describing developers’ reuse behavior in a corporate environment.
Perceived behavioral control about code reuse (Group C)
As its third component, TPB points to perceived behavioral control as a determinant of behavior which is mediated through intention. Although most of the relationships regarding perceived behavioral control conjectured below have never been tested empirically before and are thus of high interest, they will be treated as control variables and not as hypotheses in this study because they mostly relate to project characteristics and not to the individual developer who is at the center of this research. Six aspects of the research model cater to the controllability portion of perceived behavioral control – as opposed to the capability portion – (see Chapter 3.4.1) which describes whether code reuse is under the control of the developer at all. First, both a project policy which supports code reuse and a policy discouraging code reuse should affect developers’ code reuse behavior. Pointing to the effects of a policy discouraging code reuse, a developer from the qualitative pre-study explains, “[…] we would like to do that [code reuse] more, but we have our restrictions about external dependencies [i.e. components].” Beyond these two aspects, four general impediments to code reuse derived from the literature and the qualitative pre-study are expected to influence developers’ behavior:
− First, if there is a lack of reusable code for the specific requirements of the developers’ project, the developers cannot reuse even if they wanted to. As a developer from the qualitative pre-study points out, “the reason that [my project] in general […] does not reuse a lot of code is that [my project] focuses on new and innovative libraries.
Thus, there isn’t a lot of existing code out there we could reuse.” In a similar way, another developer from the qualitative pre-study explains, “I like to make very unique projects where often I can reuse only low-level code.”
− Second, OSS license conflicts between the developers’ project and the code to be reused can make code reuse difficult regardless of whether the developers aim to reuse existing code or not (DiBona 2005). Describing this situation and alluding to the difficulties of reusing code under a license different from that of their own project, German (2007, p. 7) speaks of “islands” created by the OSS licenses. A situation in which license conflicts oppose code reuse could e.g. occur if the developers’ project were licensed under the BSD license and the developers wanted to reuse code under
the GPL license. Due to the license conditions of the GPL (see Chapter 3.3.2), reusing in the described scenario would require the developers either to change the license of their project to the GPL as well (Rosen 2004) or to integrate the GPL code in a modular fashion which avoids direct contact between their own project and the reused GPL code (Henkel & Baldwin 2009). Given that developers often choose the license of their project for good reasons (Lerner & Tirole 2005; Stewart & Gosain 2006), they should not be willing to change it for reuse purposes easily.91 Moreover, the high effort required to integrate the GPL code in a modular way might make reusing it prohibitive. A developer from the qualitative pre-study is quite explicit about license conflicts when stating, “license is a show stopper. I won’t look at the code unless the license is compatible with [the license of his project].” Further support for this argumentation comes from an analysis by Spaeth et al. (2007) who show that components not licensed under the GPL license are reused more frequently because they do not create so many license conflicts.
− Third, and similar to license conflicts, programming language conflicts could lead to less code reuse. Two different instances of this issue may exist. First, the programming language of the developers’ project may make it difficult to reuse external code (especially components), because it does not allow for easy linking and integration of larger chunks of external code. Second, if much of the code to be reused is written in a different language than the developers’ project, the additional effort required to bridge this language gap might discourage reuse (Garlan et al. 1995; DiBona 2005; Haefliger et al. 2008).
− Lastly, architectural issues could impede code reuse. In order to allow for easy reuse of external code (especially components), the developers’ project should feature a modular architecture which allows easy plugging-in of new code (Baldwin & Clark 2006; MacCormack et al. 2006). If the architecture of the project is not modular enough, the effort required to reuse existing code despite this obstacle could again make code reuse an unattractive choice for the developers (Garlan et al. 1995).
To cover the capability portion of perceived behavioral control the research model includes developers’ skill level in software development, arguing that a certain level of proficiency is required in order to develop the mental representation of the code to be
91 Making a change of license even more complicated, every single developer who has ever contributed code to the project would have to agree to the new license applied to her code. Projects can only avoid this if they ask developers to assign the copyright of their contributions to a central entity. This is however practiced only by a few projects (O’Mahony 2003).
reused which is needed to evaluate, modify and integrate it (Soloway et al. 1982; Davies 1989). A component developer highlights this point when he says, “[…] you cannot be a beginner developer and build my library [i.e. component] because it is kind of tricky and there are pitfalls […].”
Access to local search (Group D)
With the first group of constructs beyond the three original TPB groups, the research model proposes that developers who have better access to local search for reusable code will reuse more existing code. Banker et al. (1993) show that developers will reuse if their costs for searching and integrating the existing code are lower than for developing it from scratch. These costs for searching and integrating are lowered if OSS developers can turn to their own experience or that of fellow OSS developers who can point them to the code they need, assure them of its quality, and explain to them how it works and how to best integrate it instead of spending valuable time using search engines such as Google to find reusable code, evaluating its quality, and understanding its inner workings (Haefliger et al. 2008). Sambamurthy and Subramani (2005, p. 3) point out that “personal, social, or organizational networks” can help to find out “who knows what and who can be asked for help” and conjecture that access to these experts makes individuals more likely to reuse existing knowledge. Consequently, developers with a larger personal network within the OSS community should show a stronger code reuse behavior than those developers with no or only a small OSS network.
H2a: The larger developers’ personal OSS networks, the more existing code they will reuse.
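The cost argument attributed to Banker et al. (1993) above can be read as a simple comparison. The following Python sketch is only an illustration under stated assumptions: the function name `reuse_is_cheaper` and all effort figures (in hours) are invented for this example and are not taken from the study.

```python
def reuse_is_cheaper(search_hours: float, integration_hours: float,
                     scratch_hours: float) -> bool:
    """Return True if searching for and integrating existing code
    costs less than developing the functionality from scratch."""
    return search_hours + integration_hours < scratch_hours


# Hypothetical effort figures (hours), chosen only for illustration.
SCRATCH_HOURS = 20.0
INTEGRATION_HOURS = 8.0

# Without local search the developer spends a long time finding and vetting code.
without_network = reuse_is_cheaper(15.0, INTEGRATION_HOURS, SCRATCH_HOURS)

# A fellow developer points to suitable code, which lowers the search cost.
with_network = reuse_is_cheaper(3.0, INTEGRATION_HOURS, SCRATCH_HOURS)

print(without_network, with_network)
```

With these assumed numbers, only the lower search cost tips the comparison in favor of reuse, which is the mechanism behind the local search argument.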
As another facet of local search, developers who have been active in a large number of OSS projects in the past might turn to their own experiences and either remember having solved a similar programming problem before by themselves in another project or remember having reused existing code before to solve a similar problem. In the first case they can cheaply access the code they have written for the previous project and reuse it in the new project at low cost, because they know the code and its features and limitations very well. In the second case, they have already found, evaluated and understood external code in the past and can save this time and effort now when they integrate the same code into their new project. Consequently, developers who have been involved in a large number of OSS projects in the past should show a stronger code reuse behavior than those developers who have worked only in a small number of projects.
H2b: The greater the number of OSS projects developers have ever been involved in, the more existing code they will reuse.
Project maturity (Group E)92
As a further hypothesis, the research model infers a relationship between the maturity of an OSS project and the code reuse behavior of its developers. As pointed out in Chapter 3.3.3, OSS developers launching a project strive to build an interesting and promising code base as quickly as possible in order to attract other developers’ support. Code reuse is an excellent tool to accomplish that because it allows the addition of large blocks of typically stable and working functionality to a new project with limited effort (Haefliger et al. 2008). A developer from the qualitative pre-study also makes this point when explaining, “[…] reusing code is much more important initially just to get something going and working.” Another developer argues the same way: “[…] initially, in order to get up and running quickly, you try to [re]use as much as you possibly can. Because initially you want to get a project that works at a basic level and you can then improve later.” This conjectured relationship receives further support from Senyard and Michlmayr (2004) who find that developers considering starting a new OSS project often study other related projects in detail before launching their own project. It would be quite natural for these developers to leverage the knowledge gained by studying the related projects and reuse interesting parts that can make the start of their own project easier. Further, while code reuse is very helpful in the early phases of the life of an OSS project, its importance should decline once the project has reached a certain level of maturity. At that point, the developers have implemented all required basic functionality and turn toward fine-tuning and adding aspects which make their project unique, which by definition is difficult with reused code.
A developer from the qualitative pre-study exemplifies this when saying: “[In mature projects] the code has a level of originality which makes it more difficult to reuse external things.”93 In a similar way, another developer explains, “[…] as the project gets in the beta phases or the final phases […] the basic functionality that you can get from reusing […] is already in there and now you are improving how your application itself works.” Following this argumentation, developers in
92 Note that while project maturity reflects a project attribute, the argumentation regarding it is based on a particular characteristic of OSS developers who need to deliver “credible promises” in their projects. Because of that, this part of the research model is treated as a hypothesis rather than as a control variable.
93 Translated from German.
projects which are still early in their life should attribute a higher importance to code reuse and practice it more while developers in more mature projects should reuse less.
H3: The more mature developers’ project, the less existing code they will reuse.
Compatibility of code reuse with developers’ project goals (Group F)
In the final group of hypotheses, the research model argues that the compatibility of code reuse with developers’ own individual goals in their project will influence their code reuse behavior. This aspect is important because the “attitudes”-group of the model presented above captures developers’ general attitude toward code reuse, while the “compatibility”-group presented in the following helps to link these general attitudes to the developers’ work in one specific project. Following Moore and Benbasat (1991, p. 195) compatibility is defined as the degree to which code reuse “[…] is perceived as being consistent with the existing values, needs, and past experiences” of an OSS developer. Here the focus is primarily on “values” and “needs” because “experiences” have already been addressed in H2b. Based on the discussion of the reasons for developers to participate in OSS projects in Chapter 3.3.4, the research model proposes that OSS developers’ motivations to work on their project influence their reuse behavior. This argumentation follows Crowston et al. (2009, p. 36) who assume that “it seems likely that [OSS developers’] motivations are linked to other facets of contribution [to OSS projects].” In the following the conjectured relationships between several forms of motivation and code reuse behavior are discussed:
− Challenge seeking: Sen et al. (2008) show empirically that OSS developers for whom tackling difficult technical challenges is a main motivation to work on their project try to limit the number of team members involved in their project besides them because they want to solve the problems by themselves without the help of others.
In a similar fashion, OSS developers who work on their project to tackle difficult technical challenges should reuse less existing code because code reuse would solve some of the challenges for them.94 Referring to his own work on an OSS project, DiBona (2005, p. 23) nicely illustrates this hypothesis when describing how he deals with a specific storage problem: “I was (and am) also fascinated by a
94 In order to be able to focus on solving these difficult technical challenges by themselves, developers might very well show increased reuse behavior for other parts of their project. This effect is controlled for however by including developers’ perception of task selection benefits through reuse (see hypothesis H1d).
problem […]. I haven’t solved that problem as of this writing, but I don’t necessarily want to use other people’s code for that. […] the storage problem is mine, for now.” Similarly, Shah (2006) quotes a French OSS developer who says, “[…] it’s great when you find a challenging problem to work on – either on your own or because somebody needs it – you can spend hours on it.” Likewise, a developer from the qualitative pre-study explains, “[…] in open source you want to do it [write code] by yourself, so you only look at [other people’s code] if you are really stuck or something.” While this last statement is probably not representative of all OSS developers, DiBona et al. (1999, p. 13) generalize the above three personal experiences of OSS developers when they describe the “[…] satisfaction of the ultimate intellectual exercise” which OSS developers feel “[…] after completing or debugging a hideously tricky piece of recursive code that has been a source of trouble for days.” It seems quite plausible that code reuse would impede the joy described after solving the problem and thus developers for whom challenge seeking is a major motivation should reuse less existing code.
H4a: The more important tackling difficult technical challenges is as a reason for developers to work on their OSS project, the less existing code they will reuse.
− Creative pleasure: Related to the effect of challenge seeking described above, code reuse should not be of major importance for OSS developers who work on their project for the creative pleasure they perceive while coding either. Code reuse would reduce their need to write their own code because they would use existing code instead of it. However, as writing own code is what these developers enjoy, code reuse would reduce their creative pleasure. Quite bluntly, a developer from the qualitative pre-study points out, “I don’t reuse so much code because I enjoy writing everything myself […].” Similarly another developer from the qualitative pre-study explains, “one reason [for not reusing existing code] is that I do part of my [OSS] work just for fun and personal enrichment. So, sometimes you just do not want to have a library or something like that and you do not really want to use something that somebody has already done because it [writing code yourself] is much fun sometimes.” As a second argument, it seems likely that developers for whom the creative pleasure from writing code is a major motivation are not very susceptible to the effectiveness, efficiency and quality benefits of code reuse, because delivering a high-quality piece of software in short time is not required for them to fulfill their
individual goal. A developer participating in the survey of Hars and Ou (2002, p. 28) explains her motivation to work on OSS projects by her “innate desire to code, and code, and code until the day I die.” It would be surprising if this developer were overly concerned with spending her time efficiently and building high-quality software. Given the above two lines of thought regarding the relationship between creative pleasure as an OSS motivation and code reuse, the compatibility between the two constructs should be rather low.
H4b: The more important creative pleasure is as a reason for developers to work on their OSS project, the less existing code they will reuse.
− Skill improvement: Code reuse can help developers to solve problems which they cannot solve by themselves without having to deeply understand the code (see above). However, the general availability of OSS in source code form also allows developers to study and modify the code which they reuse. In this form, code reuse provides developers with a unique opportunity to improve their skills as they can start with working existing code and study and modify it to hone their software development skills (DiBona 2005). A developer from the qualitative pre-study explains this point: “I have used code reuse as a way of learning how to achieve certain goals […].” Similarly, another developer points out that especially snippets are helpful for developers who want to improve their skills: “Reusing code snippets can really help to learn a new programming language and develop a new application.” Developers can reuse existing OSS code as a black box if they aim mainly at the effectiveness, efficiency and quality effects, but if improving their development skills is important for them, they can also dive deeply into the existing code and use it to hone their qualifications.
H4c: The more important skill improvement is as a reason for developers to work on their OSS project, the more existing code they will reuse.
− Community commitment: Developers who are strongly committed to the OSS community want it to be successful. Emphasizing this point, Raymond (2001, p. 68) quotes a fictitious developer with a strong community commitment with the words “I exist to create useful, beautiful programs and information resources, and then give them away.” OSS developers who work on their project mainly to do good for the OSS community should reuse more existing code because code reuse helps them write better software faster, which makes the OSS community stronger. Further, as both knowledge reuse research (Fafchamps 1994; Szulanski 1996) and research on
the not-invented-here syndrome (de Pay 1995) point out, a relationship of trust between knowledge source and knowledge recipient supports knowledge reuse because the recipient is less reluctant to apply the existing knowledge. This should be true even more if the recipient identifies strongly with the values of a community which has been founded on the principles of sharing code and building on each other’s work. Consequently, developers who feel committed to the OSS community should also be less reluctant to reuse existing code created by somebody else in this community.
H4d: The more important community commitment is as a reason for developers to work on their OSS project, the more existing code they will reuse.
− OSS reputation building: From an abstract perspective code reuse could on the one hand increase developers’ OSS reputation because by leveraging existing code they contribute to the community more and in better quality. Yet on the other hand, code reuse could also diminish developers’ OSS reputation because by reusing existing code they do not prove their own programming proficiency. Raymond (2001, p. 24) dispels the second argumentation when he writes: “Good programmers know what to write. Great ones know what to rewrite (and reuse).” Also supportive of the argumentation that a high level of code reuse goes together well with the desire to build a reputation in the OSS community is the finding of von Krogh et al. (2003) who report that developers who need to prove their worthiness to join a project by making their initial contributions (see Chapter 3.3.3) often include reused code in these first contributions. Moreover, code reuse should make a project better and thus create more attention for the project in the OSS community and consequently also result in more attention for the developers associated with the project. This is nicely reflected in the statement of a developer from the qualitative pre-study who points out that “for me OSS is all about getting the code as good as it can be. If I or someone else does it is not important.” Also supportive of this argumentation, Sen et al. (2008) show that OSS developers for whom reputation building is important prefer to be part of a successful project with many developers over being one of only a few developers of a less successful project. Consequently, developers who contribute to their project mainly to enhance their OSS reputation should reuse more existing code.
H4e: The more important reputation building in the OSS community is as a reason for developers to work on their OSS project, the more existing code they will reuse.
− Commercial signaling: Following the same logic as presented above regarding the link between code reuse and reputation building within the OSS community, developers who work on their project to signal their skills to potential employers or business partners should reuse more existing code because parties outside of the OSS community are more likely to become aware of successful OSS projects (Lerner & Tirole 2002).
H4f: The more important signaling of skills toward potential employers and business partners is as a reason for developers to work on their OSS project, the more existing code they will reuse.
Further control variables (Group H)
Finally, multiple additional control variables are included in the research model to account for further contextual differences which could influence developers’ code reuse behavior. These control variables encompass four groups:
− First, the model accounts for some further project characteristics. The size of a project (i.e. the number of developers involved) as well as its technical complexity could influence developers’ reuse behavior because these two dimensions influence both the effort required to fulfill the project goals and the man-power available to reach these goals. For example, given their man-power, large project teams should be able to realize complex projects even without code reuse while small teams might only be able to bear such a project in finite time with heavy code reuse. Moreover, the type of the project, that is, whether the project aims at creating a standalone executable software program or a reusable component, could influence developers’ reuse behavior.
Reusable component projects often aim at being very portable and easy to integrate in other software and thus creating a large number of dependencies by reusing many components themselves is frequently not in line with their goals.
− Second, the level of professionalism and seriousness with which developers contribute to their project might influence their reuse behavior. These issues are controlled for by including into the research model the number of years that developers have already been involved in OSS, the average weekly hours they invest
into their project and the share of project functionality which has been developed by them as compared to their project team members. Further, the model controls for whether developers have ever worked or currently work as professional software developers.
− Third, the model accounts for developers’ education and training on reuse, which has been shown to be a determinant of reuse behavior in software development firms in previous research (Card & Comer 1994; Joos 1994; Frakes & Fox 1995). For better differentiation, training during developers’ education and training during their time as professional developers in firms are separated.
− Fourth and finally, developers’ geographic residence on a continent level is included in the model. Subramanyam and Xia (2008) show that developers from different geographies prefer, for example, different levels of modularity in their OSS projects. Following this line of thought, geographic residence might also influence code reuse behavior.
In order to answer the research questions presented in Chapter 3.3.6 and to test the research model discussed in this chapter, data on the code reuse behavior of individual OSS developers were collected with a survey. The design of this survey and the process of conducting it are discussed in the next section.
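The hypothesized directions developed in this chapter can be collected in a small lookup table. The following Python sketch is only a reading aid: the names `HYPOTHESES` and `hypotheses_with_sign` are illustrative and do not appear in the study, and the signs merely restate the wording of H1a to H4f (+1 for "more existing code", -1 for "less existing code"). Control variables are deliberately left out.

```python
# Expected direction of each hypothesis as worded in the text:
# +1 = "the more existing code they will reuse"
# -1 = "the less existing code they will reuse"
HYPOTHESES = {
    "H1a": ("effectiveness effects", +1),
    "H1b": ("efficiency effects", +1),
    "H1c": ("software quality effects", +1),
    "H1d": ("task selection benefits", +1),
    "H1e": ("loss of control risks", -1),
    "H2a": ("size of personal OSS network", +1),
    "H2b": ("number of OSS projects ever involved in", +1),
    "H3": ("project maturity", -1),
    "H4a": ("challenge seeking", -1),
    "H4b": ("creative pleasure", -1),
    "H4c": ("skill improvement", +1),
    "H4d": ("community commitment", +1),
    "H4e": ("OSS reputation building", +1),
    "H4f": ("commercial signaling", +1),
}


def hypotheses_with_sign(sign: int) -> list[str]:
    """Return the labels of all hypotheses with the given expected sign."""
    return sorted(label for label, (_, s) in HYPOTHESES.items() if s == sign)


# Hypotheses that predict less code reuse.
print(hypotheses_with_sign(-1))
```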
3.5. Survey design and methodology
3.5.1. Data source and sample selection
The research objects of this study are individual OSS developers. Obviously, there exists no complete directory of OSS developers which would have allowed contacting them for the survey. However, there do exist several large OSS collaboration platforms on the internet which provide OSS developers with the infrastructure they need for their projects. Moreover, these platforms often provide means to contact the developers registered with them. Of these platforms SourceForge.net is the largest one with 163,244 hosted OSS projects and 222,920 registered OSS developers on June 11th, 2009.95 In total more than two million users were registered on the SourceForge.net platform as of February 2009 (SourceForge.net 2009); however, only about ten percent of them are
95 This information is based on a database built for this study using SourceForge.net data. This database is described on the following pages. A developer registered on SourceForge.net is defined as a SourceForge.net user who is a member of at least one of the OSS projects hosted at SourceForge.net. In total there are more than two million users registered on SourceForge.net.
Open source software developers’ perspectives on code reuse
73
members of at least one of the OSS projects hosted at SourceForge.net and are thus labeled as “developers.” The developers registered with SourceForge.net have been analyzed in several scholarly projects (e.g. Lakhani & Wolf 2005; Wu et al. 2007; Sen et al. 2008) and were also chosen as the frame population for this survey because of the size of their platform which promises more heterogeneity in developer and project characteristics than smaller platforms. Despite it popularity among researchers, working with samples of SourceForge.net developers creates some selection bias since large OSS projects (e.g. Linux or Apache projects) are underrepresented on SourceForge.net (Lerner & Tirole 2005). However, while participants for this study were selected from developers registered on SourceForge.net, when completing the questionnaire they were explicitly allowed to refer to other OSS projects they are involved in and which are not developed on SourceForge.net. Since about 20 percent of the survey participants made use of this option also larger OSS projects such as Linux or various Apache projects are included in the data and potential selection bias concerns should be mitigated due to that. Every two months SourceForge.net exports selected data about the projects and the developers registered with them to a research project named FLOSSmole96 (Howison et al. 2006) which allows other researchers to use this information for their work. For this study the June 2009 dataset of FLOSSmole is used which reflects the projects and developers of SourceForge.net as of June 11th, 2009. In order to select developers for the survey and to create personalized invitations to the survey, a database was built based on the FLOSSmole data. Beyond the FLOSSmole data, the database was further amended with information on when the OSS developers on SourceForge.net had contributed to a project for the last time. This additional information was gathered with a self-developed Java program. 
Using the database, the survey population was constructed as follows (see Figure 3-2). The total number of OSS developers registered with SourceForge.net on June 11th, 2009 was 222,920. Of this total frame population 184,382 developers or about 83% had not exhibited any developer activities on SourceForge.net after January 1st, 2009 and were thus excluded from the survey as they seemed to be inactive.97 While this share of inactive developers seems high, it is plausible given SourceForge.net’s role in the OSS community and typical behavioral patterns in the OSS community. First, as developer accounts on SourceForge.net are not deleted and developers usually do not deregister even when leaving SourceForge.net (Lerner & Tirole 2005), it has to be assumed that there is a large number of “dead” user accounts.98 Further, there might also be developers registered with SourceForge.net who do not actually contribute to OSS projects, but use their SourceForge.net account only for “reading purposes.” Such OSS community participants are typically referred to as “lurkers” (von Krogh et al. 2003; David & Rullani 2008).
Of the remaining 38,538 developers, 1,026 are registered with roles which suggest that they are not involved in coding. These developers who support their projects as e.g. “web designers”, “translators” or “unix admins” were also removed from the survey population because only coding developers can reuse existing code. The remaining 37,512 developers either have roles which suggest that they actually write code for their projects (“developer” or “project manager”) or have roles which do not allow any conclusions about their project work (“all-hands person”, “unspecified” or “no specific role”).99
96 http://ossmole.sourceforge.net/index.htm, last accessed 11.06.2009.
97 The date of 01.01.2009 was chosen as a cut-off date in order to address only those developers with the survey who had been active on SourceForge.net within the first six months of 2009.
Figure 3-2: Construction of OSS code reuse survey population
[Bar chart: OSS developers* registered on SourceForge.net in June 2009; percentages are in % of total developers registered]
Total developers registered: 222,920
Developers inactive after 01.01.2009: -184,382 (82.7%)
Developers active after 01.01.2009: 38,538 (17.3%)
Developers with non-coding roles (roles such as “Web Designer”, “Translator”, “Unix Admin” etc.): -1,026 (0.5%)
Developers with potential coding roles (role “Developer”, “Project Manager”, “All-Hands Person”, “No specific role” or no role information at all): 37,512 (16.8%)
Pretest population: -1,943 (0.9%)
Survey population: 35,569 (16.0%)
*A developer registered on SourceForge.net is defined as a SourceForge.net user who is a member of at least one of the OSS projects hosted at SourceForge.net. In total there are more than two million users registered on SourceForge.net.
98 Being registered at SourceForge.net is free for a developer, so she has no incentive to deregister even if she does not use SourceForge.net anymore.
99 About 50% of all developers registered with SourceForge.net have not entered their project role because this information is not mandatory. Due to the large number of developers without information on their role, it has to be assumed that most of these developers actually write code. They were thus included in the survey population and a question was added to the survey which explicitly asks developers whether they write code for their project (see Chapter 3.5.2).
A further 1,943 developers were removed from the survey population because they had either been contacted for interviews during the qualitative pre-study (see Chapter 3.4.2) or had been asked to pretest the survey (see Chapter 3.5.3). After all these adjustments, 35,569 developers remained available as the population for the final survey, equaling 16% of all developers registered with SourceForge.net in June 2009. Of these 35,569 developers, a random sample of 7,500 developers was drawn and invited to participate in the survey.
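The construction of the survey population can be retraced with a few lines of arithmetic. The following Python sketch is illustrative only: it uses the developer counts reported in the text, while the variable names, the stand-in developer IDs and the fixed random seed are assumptions and do not reflect the self-developed Java program or the database actually used for the study.

```python
import random

# Counts reported in the text for SourceForge.net as of June 11th, 2009.
TOTAL_REGISTERED = 222_920
EXCLUSIONS = [
    ("inactive after 01.01.2009", 184_382),
    ("non-coding roles", 1_026),
    ("contacted for interviews or pretest", 1_943),
]
SAMPLE_SIZE = 7_500


def build_funnel(total, exclusions):
    """Return (label, remaining, share_of_total) after each exclusion step."""
    steps = [("total developers registered", total, 1.0)]
    remaining = total
    for label, removed in exclusions:
        remaining -= removed
        steps.append((f"after removing: {label}", remaining, remaining / total))
    return steps


funnel = build_funnel(TOTAL_REGISTERED, EXCLUSIONS)
survey_population = funnel[-1][1]

for label, remaining, share in funnel:
    print(f"{label:55s} {remaining:>8,d}  {share:6.1%}")

# Hypothetical developer IDs 0..N-1 stand in for the real database keys.
rng = random.Random(2009)  # fixed seed only to make the sketch reproducible
invited = rng.sample(range(survey_population), SAMPLE_SIZE)
```

Sampling without replacement, as `random.Random.sample` does, ensures that no developer is invited twice.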
3.5.2. Survey design
The survey was conducted via an online questionnaire (see Appendix A.1.1).100 Such an approach was highly suited for this survey, first, because of the high internet proficiency of the survey participants, second, for cost reasons given the large population which was addressed in the survey and, third, because the digital capturing of data allowed direct analyses without the risk of any media breaks (Forrest 2003). The survey was designed after a thorough review of the literature on both OSS (see Section 3.3) and reuse in software development (see Chapter 3.2.2) and after the interviews with OSS developers during the qualitative pre-study (see Chapter 3.4.2). Moreover, whenever possible existing scales were employed in the questionnaire to ensure validity and reliability. These efforts ensured that survey questions were asked in a systematic way in the given context and that the answers offered to OSS developers in closed questions made sense to them.
Further, in order to reduce common method bias, several measures were employed during data collection as suggested by Podsakoff et al. (2003). Care was taken to formulate simple and unambiguous questions for the survey, and when the survey was introduced to them, respondents were assured that their responses would be treated strictly confidentially.
Most questions of the survey were designed as mandatory questions. Exceptions were demographic questions because some cultural groups do not feel comfortable providing this information. Nearly all questions were conditional in order to ensure that developers were only presented questions relevant to them. Following Dillman (1978, pp. 123-127), similar questions were grouped together and presented on the same page. Moreover, question groups which were expected to be particularly interesting to the respondents were presented first while e.g. the group with demographic questions was presented last.
100 The online questionnaire was developed using the OSS survey application LimeSurvey (http://www.limesurvey.com, last accessed 12.06.2009).
The resulting survey structure with its eight sections is as follows:
− 1. Introduction: The first section asks the participant whether she is actually writing code for OSS projects. If she answers this question in the negative, she is not in the target audience of the survey because she cannot reuse existing code. Consequently, she will not be asked any questions about code reuse, but will be directed to the demographic questions at the end of the survey right away. Participants who do write code for OSS projects are asked to enter the name of their current main project, defined as the project on which they currently spend most of their OSS time. The questions in the following sections of the survey reference this project’s name and e.g. ask the developer about the importance of code reuse in this specific project or about her motivation to contribute to this specific project. This is important because developers may exhibit different behaviors in different projects.
− 2. Definitions: In order to ensure that all developers taking the survey have a common understanding of which software development behavior constitutes code reuse, the second section of the survey contains a definition of code reuse. The definition points out that reusing both snippets and components is considered code reuse.
− 3. Importance of reuse: The next section contains questions about the importance of code reuse for the developer’s current main project. Developers are asked about both the past and future importance of code reuse for their work. On a more fine-grained level, the survey further enquires about the role of snippets and components in developers’ work.
− 4. Sources of reuse: Part four focuses on the sources which developers turn to when searching for reusable components or snippets. The items are based on both existing literature and the results from the qualitative pre-study.
− 5. Benefits and drawbacks of reuse: This section asks developers about their agreement with various benefits and drawbacks or issues of code reuse in order to understand why developers reuse or do not reuse existing code. Again the items are based on existing literature and the qualitative pre-study.
− 6. Developer’s main project: The next block of questions deals with characteristics of the developer’s current main project such as its license, its main programming language etc.
− 7. Developer’s open source activities: Following the project characteristics, the seventh section asks developers for information on their OSS activities such as their motivation to contribute to their current main project, the size of their personal network within OSS and the number of hours they invest into OSS during an average week.
− 8. Demographic questions: Lastly, participants are asked demographic questions.
3.5.3. Pretest
In order to check for relevance, to make sure that the survey questions can be understood well and that all relevant answers are available for selection, an extensive pretest was conducted before launching the survey (Bortz & Döring 2003, p. 331; Schnell et al. 2005, p. 347). The pretest consisted of three steps. First, five academic peers knowledgeable about OSS provided feedback on the questionnaire, checking question types, phrasing, presentation and the order of the questions. Second, eight OSS developers of those who had been interviewed in the qualitative pre-study (see Chapter 3.4.2) were asked to review the questionnaire with regard to the definitions employed, clarity of questions, suitable response ranges etc. Third, in April 2009 two rounds of pilot studies were conducted with 1,000 developers selected at random from SourceForge.net each. These pilot studies had the primary purpose of assessing the quality of the instruments employed.
The feedback received was very positive. Especially the pretesters from SourceForge.net expressed a high interest in the topic of code reuse and many of them emailed asking for the results of the final survey. Following the pretest, the overall structure of the survey and its questions did not need any changes. Based on the comments received, minor changes were applied to the wording of some of the questions to avoid misunderstandings.
3.5.4. Conducting the survey
Of the total survey population of 35,569 developers, 7,500 were selected at random and sent an email invitation to take part in the survey. In order to personalize the invitation email, the real name of each developer and the number of SourceForge.net projects each developer is involved in were extracted from the database and used in the invitation text.
In order to achieve a high response rate, Dillman’s (1978, p. 12) suggestion to “minimize the costs of responding, maximize the rewards for doing so, and establish trust so that those rewards will be delivered” was followed. To minimize the costs of responding, participants were sent an e-mail containing a direct link to the survey which they only had to click on. Furthermore, the questionnaire was designed such that it should not take more than 15-20 minutes to complete. To maximize participants’ benefit from taking the time to complete the survey, they were promised a detailed aggregate report of the data and given the option to sign up for a raffle giving away ten book gift certificates. Finally, credibility was built with the participants by leveraging the reputation of Technische Universität München.
The survey was active from July 2009 to September 2009. Of the 7,500 emails sent to developers inviting them to participate in the survey, 293 could not be delivered. Of those developers who did receive an invitation, 701 completed the survey (see Table 3-1), yielding a response rate of 9.7%, which is in line with the typically low response rates of web surveys (Couper 2000) and matches the response rates of other current surveys among developers on SourceForge.net (e.g. Wu et al. 2007; Oreg & Nov 2008; Sen et al. 2008). Of the 701 responses, 17 had to be eliminated due to inconsistent or corrupt entries, resulting in a final data set with 684 observations.
Table 3-1: OSS code reuse survey response statistics
Total invitations sent | 7,500
thereof delivered to designated recipients | 7,207
thereof not delivered to designated recipients | 293
Total questionnaires completed | 701
Total response rate (based on delivered invitations) | 9.7%
Inconsistent or corrupt responses | 17
Total usable questionnaires completed | 684
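The response statistics in Table 3-1 follow from simple arithmetic on the reported counts. A minimal Python sketch, in which the variable names are illustrative and only the four input counts are taken from the text:

```python
# Counts reported in Table 3-1.
invitations_sent = 7_500
undelivered = 293
completed = 701
unusable = 17  # inconsistent or corrupt responses

delivered = invitations_sent - undelivered
response_rate = completed / delivered  # based on delivered invitations
usable = completed - unusable

print(f"delivered invitations: {delivered:,d}")
print(f"response rate:         {response_rate:.1%}")
print(f"usable questionnaires: {usable}")
```

Note that the response rate is computed against delivered invitations rather than against all invitations sent, as stated in the table.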
To estimate the presence of common method bias in the survey data, Harman’s one-factor test was employed. In this test all variables of a model are loaded onto a single factor in a principal component factor analysis. A significant amount of common method bias is assumed to exist if only one factor emerges or if one factor explains the majority of all the variance in the data (Podsakoff et al. 2003). In the data of this study the maximum variance explained by one factor is 6.2 percent, which does not hint toward strong common method bias.
Moreover, to test whether the respondents are representative of the population (non-response bias), a late-response analysis (Armstrong & Overton 1977) was conducted. In this analysis all variables which are later included in the multivariate model were tested for differences between early and late respondents to the survey invitation. Since Pace (1939) has shown that late respondents are more like non-respondents than like early respondents, differences between these two groups could point to a non-response bias.
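The logic of Harman’s one-factor test described above can be sketched as follows. This is a hedged illustration, not the procedure used in the study: the data are synthetic, the function names are hypothetical, and a simple power iteration is swapped in to approximate the largest eigenvalue of the correlation matrix instead of running a full principal component analysis in a statistics package. For standardized variables, the share of variance explained by the first component equals the largest eigenvalue divided by the number of variables.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns in `rows` (one row per respondent)."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    norms = [math.sqrt(sum(row[j] ** 2 for row in centered)) for j in range(p)]
    return [
        [
            sum(row[a] * row[b] for row in centered) / (norms[a] * norms[b])
            for b in range(p)
        ]
        for a in range(p)
    ]


def first_factor_share(corr, iterations=500):
    """Share of total variance captured by the first principal component.

    Power iteration approximates the dominant eigenvector of the correlation
    matrix; the total variance of standardized variables equals their count.
    """
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    image = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v * m for v, m in zip(vec, image))  # Rayleigh quotient
    return eigenvalue / p


# Synthetic stand-in data: 200 respondents, 6 unrelated 7-point items.
rng = random.Random(42)
responses = [[rng.randint(1, 7) for _ in range(6)] for _ in range(200)]

share = first_factor_share(correlation_matrix(responses))
dominant_single_factor = share > 0.5  # "majority of the variance" criterion
print(f"first factor explains {share:.1%} of the variance")
```

With unrelated items the first factor captures only slightly more than its proportional share of the variance, which is the pattern the test interprets as absence of strong common method bias.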
Survey participants on average were very fast in taking the questionnaire: 60% completed the survey on the day on which they had received the invitation. Consequently, participants who took the questionnaire more than four days after having received the invitation already have to be considered late respondents. The late respondents account for about ten percent of the total respondents.
Only four variables out of 44 differ significantly between early and late respondents. First, early respondents are more likely to consider an OSS project aiming to develop a reusable component instead of a standalone executable application as their current main project (paired t-test, p=0.0093).101 A possible explanation is that developers of component projects, whose own work aims at being reused, had a higher motivation to participate in the survey. Consequently, the share of reusable component projects in the survey might be higher than in the frame population. Second, with 31.2 years of age on average, early respondents are significantly younger than late respondents with on average 34.0 years of age (paired t-test, p=0.0216). This difference might be caused by the fact that older developers are more likely to have more social commitments (e.g. families) and/or jobs with more meetings etc. and thus could not respond to the survey invitation immediately. As a result, younger developers might be overrepresented in the survey data. As third and fourth differences, early respondents perceive a less positive subjective norm about code reuse (paired t-test, p=0.0853) and consider themselves better developers (paired t-test, p=0.0904) than late respondents. However, these two differences are significant only at the 10% level.
Based on the data gathered with the survey, the next section addresses the research questions regarding OSS developers’ code reuse behavior in descriptive and exploratory fashion while section 3.7 tests the research model with multivariate methods.
3.6. Descriptive and exploratory analyses
This section provides a detailed descriptive and exploratory analysis of the data collected in the survey. It serves the purpose of shedding light on the in-depth mechanics of the reusing side of OSS development on the one hand and of establishing the context for the multivariate analyses explaining determinants of developers’ reuse behavior on the other hand. Before turning to the descriptive research questions, key information about the participating developers and their projects is presented (3.6.1). Based on the data, the extent and importance of code reuse for OSS developers is described in Chapter 3.6.2 before developers’ reasons for and against code reuse are discussed (3.6.3). After that the mechanics of code reuse in OSS are investigated by exploring which forms of code OSS developers prefer to reuse and how they integrate this code with their own code (3.6.4). An analysis of the sources which OSS developers turn to when searching for reusable code (3.6.5) and a summary of the descriptive and exploratory findings (3.6.6) conclude the section.
101 Unless explicitly specified differently, all paired t-tests report two-tailed significance levels.
3.6.1. Survey participants and their OSS projects
Before addressing the research questions, this chapter provides selected information about the survey participants and their OSS projects. Further, the quality of the multi-item constructs measuring developers’ project motivations is assessed.
Description of survey participants
Of the 684 survey participants whose demographics are summarized in Table 3-2, the vast majority are male (98%). On average participants are 32 years old, and most live in Europe (54%) or North America (26%). Participants are well educated (84% of them hold a university degree, 19% even a Ph.D.) and most of them have studied IT-related subjects such as computer science (56%) or engineering (18%). The majority is not only active in OSS projects, but also works or has worked as a professional software developer (69%) with an average experience of 7.7 years. Importantly, the demographics of the survey participants are largely consistent with data reported in other studies among OSS developers (see David and Shapiro (2008) for a summary of several recent OSS surveys) and do not suggest that non-response has biased the sample to overrepresent less serious OSS developers.102
Of relevance for this study is the fact that only 92% of the survey participants actually write code for OSS projects. The others participate in OSS by taking care of tasks such as graphics design, translation or web site administration. As only those survey participants who write code can reuse existing code, the subsequent analyses refer only to these 632 participants, who are labeled “developers”.
Surprisingly, only about half of the participants with an IT-related university education have been taught about code reuse during their education.103 Further, only 19% of those participants who have worked as professional software developers or still do so have ever received any training on code reuse in their firms. Given the high importance of code reuse for modern software development (see Chapter 3.2.2) these low figures are startling.
102 Given the large number of surveys among SourceForge.net developers, one might suspect that especially the more active developers on this platform would show signs of “survey fatigue.” However, comparing the self-reported weekly hours developers spend working on their main project between this survey (mean: 8.7) and the first SourceForge.net survey ever by Lakhani and Wolf (2005) (mean: 7.5) mitigates these concerns. The additional finding that 69 percent of the developers in this survey have worked as professional software developers or are still working as professional software developers with an average tenure of 7.7 years rules out the further concern that only less skilled programmers took part in the survey.
Table 3-2: Demographics of OSS code reuse survey participants
Characteristic | Percentage
Age (mean: 31.6, median: 30)
1-19 | 5%
20-29 | 43%
30-39 | 34%
40-49 | 13%
50+ | 5%
Region of residence
North America | 26%
South America | 5%
Europe | 54%
Asia and rest of world (RoW) | 15%
Highest level of education
Non-university education | 16%
Undergraduate or equivalent | 35%
Graduate or equivalent | 30%
Ph.D. or equivalent | 19%
Subject of highest university degree*
Computer Science or related subject | 56%
Engineering or related subject | 18%
Mathematics or Physics | 10%
Other | 16%
Experience as professional software developer*
Yes | 69%
No | 31%
Self-assessment of software development skills*
Much worse than average | 4%
Slightly worse than average | 17%
Average | 41%
Slightly better than average | 27%
Much better than average | 11%
Training on reuse during education*
Yes | 48%
No | 52%
Training on reuse in job as software developer*
Yes | 19%
No | 81%
Task profile in OSS projects
Includes writing code | 92%
Does not include writing code | 8%
Years active in the OSS (mean: 5.4, median: 4)*
0-2 | 26%
3-4 | 25%
5-6 | 20%
7-8 | 9%
9+ | 20%
OSS projects ever involved in (mean: 4.6, median: 3)*
1-4 | 66%
5-9 | 26%
10-14 | 5%
15+ | 3%
Size of personal OSS network (mean: 12.6, median: 8)*
0-9 developers | 70%
10-14 developers | 5%
20+ developers | 12%
Weekly hours spent on project (mean: 8.7, median: 5)*
0-4 | 48%
5-9 | 19%
10-19 | 21%
20+ | 12%
*Percentages refer only to those developers for whom the segmentation is applicable, e.g. “training on reuse in job as software developer” refers only to those respondents who have worked or work as professional software developers.
Note: N=684.
The developers have been active in OSS projects for 5.4 years on average and during this time have contributed to an average of 4.6 OSS projects. On average they know 12.6 other OSS developers and 41% consider their software development skills as average when compared to other OSS developers. Developers with a higher level of education, a degree in computer science and those who have worked as professional software developers or still do so, self-assess their software development skills significantly more positively than other developers. Lastly, developers report spending on average 8.7 hours per week on their current main project.
103 Participants with a degree in computer science and younger participants have a significantly higher probability of having had reuse on their curriculum, but even for those groups the likelihood does not exceed 60%.
Participating developers’ motivations to work on their current main project
Given the assumed relationship between developers’ motivations to contribute to their OSS project and their code reuse behavior in the research model (see Chapter 3.4.3), these motivations were captured with multi-item constructs in the context of developers’ current main project (see Table 3-3). The constructs measured on 7-point Likert scales (“strongly disagree” to “strongly agree”) are adapted from both psychology literature (Spence & Robbins 1992; Amabile et al. 1994) and earlier scholarly work on OSS (Hars & Ou 2002; Lakhani & Wolf 2005; Roberts et al. 2006; Sen et al. 2008).
Table 3-3: Reliability of OSS developer motivation constructs
Construct / item | Mean | S.D. | IR
Challenge seeking (C’s α = 0.81, CR* = 0.80, AVE = 0.57)
In my work on [project] I enjoy trying to solve complex problems. [CHAL1] | 5.68 | 1.02 | 0.70
The more difficult the problem to solve in [project] the more I enjoy trying to solve it. [CHAL2] | 5.12 | 1.32 | 0.56
In my work on [project] I prefer difficult tasks over tasks that are straightforward. [CHAL3] | 4.60 | 1.35 | 0.45
Creative pleasure (C’s α = 0.75, CR* = 0.75, AVE = 0.50)
I lose track of time when writing my own code for [project]. [FUN1] | 5.10 | 1.45 | 0.40
I love writing my own lines of code for [project]. [FUN2] | 5.57 | 1.11 | 0.64
Sometimes I enjoy writing my own code for [project] so much I have a hard time stopping. [FUN3] | 4.82 | 1.42 | 0.46
Skill improvement (C’s α = 0.76, CR* = 0.79, AVE = 0.56)
Through working on [project] my coding skills get better. [LEARN1] | 5.81 | 1.10 | 0.76
I work on [project] to learn new developer skills. [LEARN2] | 5.34 | 1.42 | 0.54
The feedback I get from peers on my coding for [project] helps me become a better developer. [LEARN3] | 4.83 | 1.47 | 0.38
Community commitment (C’s α = 0.64, CR* = 0.69, AVE = 0.43)
I believe that source code should be open. [COM1] | 5.34 | 1.45 | 0.44
I work on [project] to implement needs of [project]'s non-commercial user community. [COM2] | 5.72 | 1.21 | 0.31
I identify with the open source community. [COM3] | 5.81 | 1.10 | 0.54
OSS reputation building (C’s α = 0.90, CR* = 0.91, AVE = 0.83)
I work on [project] to enhance my reputation in the open source software community. [REP1] | 3.74 | 1.75 | 0.79
I work on [project] because it gives me status among my open source peers. [REP2] | 3.47 | 1.66 | 0.87
Commercial signaling (C’s α = 0.87, CR* = 0.87, AVE = 0.69)
Working on [project] increases my opportunities for a better job. [SIG1] | 4.46 | 1.68 | 0.65
I work on [project] to increase my market value to potential business partners or employers. [SIG2] | 4.06 | 1.80 | 0.72
I work on [project] to enhance my professional reputation. [SIG3] | 4.43 | 1.70 | 0.71
*As the constructs are treated as tau-equivalent, Cronbach’s α and the Composite Reliability are quite similar.
Notes: In the questionnaire “[project]” was replaced with the name of the developer’s current main OSS project which she had entered earlier in the survey; “[CHAL1]” denotes the name of the item as it is referred to in later analyses; Abbreviations: S.D. = Standard Deviation, IR = Indicator Reliability, C’s α = Cronbach’s α, CR = Composite Reliability, AVE = Average Variance Extracted; N=632.
Following Homburg and co-authors (Homburg & Baumgartner 1995; Homburg & Giering 1996), several steps were taken to ensure validity and reliability of the constructs.
Content validity was qualitatively assessed through building on existing literature whenever possible, discussions with fellow OSS researchers, and two rounds of pretests. Regarding reliability, all constructs and items with the exception of the community commitment construct and item LEARN3 exceed the reliability criteria of indicator reliabilities greater than 0.4 (Bagozzi & Baumgartner 1994), Cronbach’s α greater than 0.7 (Nunnally 1978), composite reliability greater than 0.6 (Bagozzi & Yi 1988) and average variance extracted greater than 0.5 (Fornell & Larcker 1981) (see Table 3-3). After eliminating item COM2, also the community commitment construct would fulfill all the above reliability cut-off values. However, the construct is retained in its original form because it is not very far below the respective thresholds and because the idea of “giving to the community” captured in item COM2 is important for the argumentation of the research model (see Chapter 3.4.3). As item LEARN3 is only barely below the indicator reliability threshold and the overall construct exhibits good reliability criteria, the item is also retained. Convergent validity of the constructs is assessed through factor analysis, which confirms that all items have their highest loading with their respective intended construct and all loadings are higher than 0.5 (Hair et al. 2006) (see Table 3-4).
Table 3-4: Loadings of OSS developer motivation items
Rotated component matrix
Item | Commercial signaling | Challenge seeking | Community commitment | Creative pleasure | OSS reputation building | Skill improvement
CHAL1 | 0.044 | 0.791 | 0.136 | 0.211 | 0.003 | 0.038
CHAL2 | -0.038 | 0.882 | 0.138 | 0.142 | 0.030 | 0.024
CHAL3 | 0.040 | 0.806 | 0.059 | 0.155 | -0.020 | 0.022
FUN1 | 0.029 | 0.186 | 0.108 | 0.759 | -0.013 | 0.109
FUN2 | -0.028 | 0.259 | 0.237 | 0.737 | 0.074 | 0.021
FUN3 | 0.033 | 0.153 | 0.089 | 0.841 | 0.007 | 0.007
LEARN1 | -0.081 | 0.041 | 0.100 | 0.074 | 0.172 | 0.733
LEARN2 | 0.133 | 0.101 | 0.046 | 0.022 | -0.129 | 0.706
LEARN3 | -0.057 | -0.015 | 0.081 | 0.037 | 0.194 | 0.830
COM1 | 0.096 | 0.145 | 0.834 | 0.162 | 0.008 | 0.046
COM2 | 0.163 | 0.113 | 0.844 | 0.175 | 0.020 | 0.068
COM3 | 0.056 | 0.100 | 0.711 | -0.006 | 0.208 | 0.116
REP1 | 0.265 | -0.002 | 0.047 | 0.034 | 0.895 | 0.094
REP2 | 0.261 | 0.015 | 0.064 | 0.005 | 0.896 | 0.088
SIG1 | 0.852 | 0.005 | 0.175 | 0.060 | 0.099 | 0.018
SIG2 | 0.870 | -0.020 | 0.075 | -0.014 | 0.258 | -0.022
SIG3 | 0.812 | 0.051 | 0.043 | -0.007 | 0.359 | -0.032
Notes: The factor analysis uses principal component analysis and Varimax rotation; in the original table, the factor loadings on a-priori constructs are set in bold with gray shading; N=632.
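The relationship between the indicator reliabilities and the construct-level criteria in Table 3-3 can be illustrated with a short calculation. The sketch below assumes standardized loadings, so that each indicator reliability equals the squared loading and each error variance equals one minus the indicator reliability; under this assumption the average variance extracted is the mean indicator reliability, and the composite reliability is the squared sum of loadings divided by itself plus the summed error variances. The function names are illustrative, and the source does not state which software or exact formula variant was used.

```python
import math

# Indicator reliabilities (IR) as reported in Table 3-3.
INDICATOR_RELIABILITIES = {
    "Challenge seeking": [0.70, 0.56, 0.45],
    "Creative pleasure": [0.40, 0.64, 0.46],
    "Skill improvement": [0.76, 0.54, 0.38],
    "Community commitment": [0.44, 0.31, 0.54],
    "OSS reputation building": [0.79, 0.87],
    "Commercial signaling": [0.65, 0.72, 0.71],
}


def average_variance_extracted(reliabilities):
    """AVE as the mean squared standardized loading."""
    return sum(reliabilities) / len(reliabilities)


def composite_reliability(reliabilities):
    """CR from standardized loadings (square roots of the IR values)."""
    loadings = [math.sqrt(r) for r in reliabilities]
    explained = sum(loadings) ** 2
    error = sum(1.0 - r for r in reliabilities)
    return explained / (explained + error)


results = {
    name: (composite_reliability(irs), average_variance_extracted(irs))
    for name, irs in INDICATOR_RELIABILITIES.items()
}

for name, (cr, ave) in results.items():
    print(f"{name:25s} CR = {cr:.2f}  AVE = {ave:.2f}")

# Constructs whose rounded AVE falls below the 0.5 threshold.
below_ave_threshold = [n for n, (_, ave) in results.items() if round(ave, 2) < 0.5]
```

Rounded to two decimals, these values reproduce the CR and AVE columns of Table 3-3, and community commitment is the only construct falling below the AVE threshold.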
Lastly, discriminant validity is demonstrated by showing that the square root of the average variance extracted of each construct is greater than its correlations with other constructs (see Table 3-5), thus satisfying the Fornell-Larcker criterion (Fornell & Larcker 1981).
Table 3-5: Discriminant validity of OSS developer motivation constructs
Construct | Challenge seeking | Creative pleasure | Skill improvement | Community commitment | OSS reputation building | Commercial signaling
Challenge seeking | 0.756
Creative pleasure | 0.438*** | 0.707
Skill improvement | 0.289*** | 0.331*** | 0.748
Community commitment | 0.111*** | 0.143*** | 0.213*** | 0.656
OSS reputation building | 0.028 | 0.062 | 0.199*** | 0.192*** | 0.910
Commercial signaling | 0.047 | 0.056 | 0.253*** | 0.025 | 0.501*** | 0.833
* correlation significant at 10%, ** correlation significant at 5%, *** correlation significant at 1% level
Notes: The diagonal entries (bolded in the original table) are square roots of the average variance extracted (AVE) of the respective construct; the off-diagonal entries are correlations between constructs; N=632.
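The Fornell-Larcker criterion can be checked mechanically against the values in Table 3-5. The following Python sketch is illustrative: the numbers are copied from the table, while the data layout and function name are assumptions made for this example.

```python
CONSTRUCTS = [
    "Challenge seeking",
    "Creative pleasure",
    "Skill improvement",
    "Community commitment",
    "OSS reputation building",
    "Commercial signaling",
]

# Lower triangle of Table 3-5: diagonal = sqrt(AVE), off-diagonal = correlations.
LOWER_TRIANGLE = [
    [0.756],
    [0.438, 0.707],
    [0.289, 0.331, 0.748],
    [0.111, 0.143, 0.213, 0.656],
    [0.028, 0.062, 0.199, 0.192, 0.910],
    [0.047, 0.056, 0.253, 0.025, 0.501, 0.833],
]


def fornell_larcker(names, lower):
    """Map each construct to (sqrt_ave, largest_correlation, criterion_met)."""
    results = {}
    for i, name in enumerate(names):
        sqrt_ave = lower[i][i]
        # Correlations with earlier constructs sit in row i, with later ones in column i.
        correlations = [abs(lower[i][j]) for j in range(i)]
        correlations += [abs(lower[k][i]) for k in range(i + 1, len(names))]
        largest = max(correlations)
        results[name] = (sqrt_ave, largest, sqrt_ave > largest)
    return results


check = fornell_larcker(CONSTRUCTS, LOWER_TRIANGLE)
for name, (sqrt_ave, largest, ok) in check.items():
    print(f"{name:25s} sqrt(AVE) = {sqrt_ave:.3f}  max |r| = {largest:.3f}  met: {ok}")
```

The closest case is the pair of OSS reputation building and commercial signaling, whose correlation of 0.501 still lies well below both square roots of AVE.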
The resulting motivation constructs (see Figure 3-3) show that of those motivations captured in the survey, community commitment receives the highest level of agreement.104 It is followed by a group consisting of skill improvement, creative pleasure and challenge seeking. Commercial signaling and OSS reputation building are less important with only 52% and 28% of the developers agreeing to them as reasons for their work on their current main project, respectively.
Figure 3-3: OSS developers’ motivations to work on current main project
[Bar chart: Developers' motivations to work on current main project (in % of developers)]
Motivation | Share disagreement | Share agreement | Mean | S.D.
Community commitment (intrinsic) | 3% | 86% | 5.61 | 1.00
Skill improvement (extrinsic) | 7% | 78% | 5.33 | 1.10
Creative pleasure (intrinsic) | 7% | 73% | 5.16 | 1.09
Challenge seeking (intrinsic) | 6% | 73% | 5.13 | 1.05
Commercial signaling (extrinsic) | 29% | 52% | 4.32 | 1.54
OSS reputation building (extrinsic) | 41% | 28% | 3.61 | 1.63
Notes: The share of developers who are “indifferent” about the respective motivations is not shown; N=632.
With the demographics of the survey participants established, especially those of the developers whose code reuse behavior is analyzed below, selected characteristics of the developers’ current main projects are briefly reviewed next.
104 Note that the survey only captured those motivations which are part of the research model. Other motivations such as personal need or altruism (see Chapter 3.3.4) were omitted.
Description of participating developers’ current main projects

Asked about their current main project, the 632 developers named 620 unique OSS projects, and no project was named more than twice, which emphasizes the enormous heterogeneity in OSS captured by the survey. It is important to note that developers were not limited to their SourceForge.net projects. As a result, about one fifth of the developers reported on projects whose development was taking place outside of SourceForge.net. Developers’ current main projects are primarily licensed under the highly restrictive GPL license (61%) and developed in Java (25%), C++ (23%) or C (15%); about half of these projects have already reached a relatively mature development phase (see Table 3-6).

Table 3-6: Characteristics of OSS developers’ current main projects

Characteristic                                           Percentage
Main license
  GPL (v2 and v3)                                        61%
  LGPL (v2.1 and v3)                                     12%
  BSD                                                    9%
  MIT                                                    3%
  APL                                                    3%
  EPL                                                    2%
  Other                                                  10%
Main programming language
  Java                                                   25%
  C++                                                    23%
  C                                                      15%
  PHP                                                    10%
  Python                                                 7%
  Other                                                  20%
Development phase
  Pre-Alpha                                              12%
  Alpha                                                  14%
  Beta                                                   25%
  Stable/ Production                                     38%
  Mature                                                 11%
Project type
  Standalone executable application                      74%
  Reusable component                                     26%
Developers involved in project (mean: 6.1, median: 2)
  1                                                      46%
  2                                                      20%
  3                                                      12%
  4                                                      6%
  5                                                      4%
  6+                                                     12%
Technical complexity of project
  Much less than average                                 8%
  Slightly less than average                             25%
  Average                                                39%
  Slightly more than average                             21%
  Much more than average                                 7%

Notes: Only projects of participants who write code are considered; N=632.
Projects have on average 6.1 developers, but single-developer projects, at 46%, represent the largest group of projects. In terms of technical complexity, 39% of the developers consider their current main project to be of average technical complexity compared to other projects hosted on SourceForge.net.105 Of particular interest to this study is the finding that 26% of the developers consider a project as their current main project which develops a reusable component.106

After having established the demographics of the survey participants and the characteristics of their current main OSS project and after assessing the constructs employed to measure developers’ motivations to contribute to their current main project, the following chapters address the descriptive and exploratory research questions regarding code reuse in OSS development.

105 The project characteristics reported in this study are in part quite different from the characteristics of SourceForge.net projects reported by Lerner and Tirole (2005). These differences are due to several reasons: First, this study asked developers using a survey while Lerner and Tirole (2005) rely on metadata stored at SourceForge.net. As entering and updating this metadata is not mandatory at SourceForge.net, this information may very well be rather incomplete and outdated. Second, Lerner and Tirole (2005) consider all projects registered at SourceForge.net while this study only includes those projects which developers consider as their current main project. Thus, smaller “pet projects” which may have different characteristics are unlikely to be included in this study. Finally, Lerner and Tirole’s (2005) sample reflects only projects hosted on SourceForge.net while this study only uses developers registered with SourceForge.net as survey respondents, but does not restrict developers from describing projects that may be hosted somewhere else.
106 Due to non-response bias this number may be lower in the frame population (see Chapter 3.5.4).

3.6.2. Importance and extent of code reuse

As pointed out in Chapter 3.2.2, a broad range of artifacts can be reused in software development. Most common is the reuse of existing code which is also the focus of this study. In OSS development, code is reused in the form of components and snippets (see Chapter 3.3.5). In the survey, component reuse was defined as “reusing of functionality from external components in the form of libraries or included files. E.g., implementing cryptographic functionality from OpenSSL or functionality to parse INI files from an external class you have included. Please do not count functionalities from libraries that are part of your development language, such as the C libraries.”107 In a similar fashion, snippet reuse was defined as “reusing of snippets (several existing lines of code) copied and pasted from external sources. If you have modified the code after copying and pasting it by, e.g., renaming variables or adjusting it to a specific library you use, this would still be considered as […] reuse […].” The definition further pointed out that code refactoring and using code already existing at another place in the same project was not a form of code reuse.

107 The text in italics is a verbatim copy of the text presented in the questionnaire.

Traditionally, code reuse is measured by calculating the share of reused lines of code over total lines of code in a piece of software (e.g. Lee & Litecky 1997). An alternative approach is to divide the number of reused modules in a piece of software by the total number of modules (e.g. Cusumano 1991; Frakes & Fox 1995). After discussing both means with OSS developers in the qualitative pre-study, neither of the two approaches seemed well suited for this study. Code reuse measurement based on lines of code is difficult for survey participants because estimating the number of lines of code in reused components is nontrivial, especially if only selected parts of the component are reused.108 Further, measuring code reuse with the number of modules turned out to be difficult because many OSS projects are not large enough to be composed of multiple clearly defined modules.

In order to analyze developers’ code reuse behavior despite these obstacles, two alternative approaches to measure code reuse were developed, drawing mainly on the qualitative pre-study but also integrating existing scholarly work. In the following, developers’ code reuse behavior is measured with the importance of code reuse as perceived by the developers and the share of reused functionality in developers’ contributions to their OSS project.

Importance of code reuse

To capture the importance of code reuse as perceived by the developers, two multi-item constructs were developed. Both are related to research on general knowledge reuse (Watson & Hewett 2006; Ajila & Wu 2007) and the intention and behavior scales commonly employed in TAM or TPB research in the IS domain (e.g. Riemenschneider et al. 2002; Mellarkod et al. 2007). However, none of the items were adopted from existing research, but rather developed in dialogue with OSS developers during the qualitative pre-study. In the survey, all of the items require developers to indicate on a 7-point Likert scale their agreement with statements which describe code reuse as “very important” for their individual contributions to their current main project (see Table 3-7). The first scale captures the importance of code reuse for developers’ past work on their current main OSS project and in TPB terms thus describes past behavior, while the second scale refers to developers’ expectancy regarding the importance of code reuse for their future work on their current main OSS project.
In this form the second scale describes intention rather than behavior.109 Both constructs exhibit excellent reliability and validity characteristics (see Table 3-7).110 Both resulting constructs point to the high importance which OSS developers seem to attach to code reuse. The construct addressing past development exhibits a mean of 4.73, a standard deviation of 1.85 and a median of 5.25. 58% of the developers at least “somewhat agree” with the statements describing code reuse as very important for their past work. Also rather high, the mean on the construct aiming at future development is 4.57, its standard deviation is 1.69 and its median is 4.75. Here, 53% of the developers at least “somewhat agree” with the statements positioning code reuse as very important for their future work. Given the rather extreme formulation of the single items which position code reuse as “very important”,111 the high average values of the constructs emphasize that code reuse is of high relevance in OSS development.

108 OSS developers frequently need only parts of the functionality implemented in the components they reuse (Haefliger et al. 2008).
109 As pointed out in Chapter 3.4.3, for robustness purposes and to provide a richer picture, the research model is tested with both behavior and intention as dependent variable in this study.
110 See Chapter 3.6.1 for more details and references regarding construct validation.

Table 3-7: Reliability of OSS code reuse importance constructs

Construct: Importance of code reuse for past development work (C’s α = 0.93, CR* = 0.93, AVE = 0.77)

Item    Mean  S.D.  IR    Statement
RPAST1  5.12  1.95  0.84  Reusing has been extremely important for my past work on [project].
RPAST2  5.04  2.04  0.76  Without reusing [project] would not be what it is today.
RPAST3  4.54  2.01  0.80  I did reuse very much during my past work on [project].
RPAST4  4.23  2.13  0.71  My past work on [project] would not have been possible without reusing.

Construct: Importance of code reuse for future development work (C’s α = 0.93, CR* = 0.94, AVE = 0.89)

Item    Mean  S.D.  IR    Statement
RFUT1   4.91  1.82  0.81  Reusing will be extremely important in my future work on [project].
RFUT2   4.28  1.90  0.72  Realizing my future tasks and goals for [project] will not be possible without reusing.
RFUT3   4.57  1.81  0.86  I will reuse very much when developing [project] in the future.
RFUT4   4.51  1.84  0.76  Realizing my future tasks and goals for [project] will be very difficult without reusing.

*As the constructs are treated as tau-equivalent, Cronbach’s α and the Composite Reliability are quite similar.
Notes: In the questionnaire “[project]” was replaced with the name of the developer’s current main project which she had entered earlier in the survey; “[RPAST1]” denotes the name of the item as it is referred to in later analyses; Abbreviations: S.D. = Standard Deviation, IR = Indicator Reliability, C’s α = Cronbach’s α, CR = Composite Reliability, AVE = Average Variance Extracted; N=632.
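The reliability figures in Table 3-7 can be related to the indicator reliabilities with the usual textbook formulas. The following Python sketch is an illustration under stated assumptions, not the calculation performed in the study: it assumes standardized indicators, so that each indicator reliability (IR) equals the squared loading and the error variance is 1 - IR. The function names are hypothetical.

```python
import math

# Indicator reliabilities (IR) of the past-development items from Table 3-7.
PAST_ITEM_RELIABILITIES = {
    "RPAST1": 0.84,
    "RPAST2": 0.76,
    "RPAST3": 0.80,
    "RPAST4": 0.71,
}


def average_variance_extracted(reliabilities):
    """AVE as the mean squared loading (assumes standardized indicators)."""
    values = list(reliabilities)
    return sum(values) / len(values)


def composite_reliability(reliabilities):
    """Composite reliability from squared loadings (assumed equal to the IR)."""
    values = list(reliabilities)
    loadings = [math.sqrt(v) for v in values]
    error_variances = [1.0 - v for v in values]
    squared_sum = sum(loadings) ** 2
    return squared_sum / (squared_sum + sum(error_variances))


if __name__ == "__main__":
    irs = PAST_ITEM_RELIABILITIES.values()
    print(f"AVE: {average_variance_extracted(irs):.4f}")
    print(f"CR:  {composite_reliability(irs):.4f}")
```

Under these assumptions the sketch yields a composite reliability of about 0.93 and an AVE of about 0.78 for the past-development construct, close to the 0.93 and 0.77 reported in Table 3-7; the small difference may be due to rounding of the published indicator reliabilities.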
Interestingly, both mean and median are significantly lower (paired t-test, p=0.0008) in the construct addressing future development than in the one referring to past development. This might be a first indication supporting hypothesis H3, which states that code reuse is more important in earlier phases of an OSS project.

Share of reused functionality

The second approach to measure developers’ code reuse behavior captures the share of functionality based on reused code in their contributions to their current main OSS project. It is related to measuring code reuse via lines of code, but allows developers to indicate that they have e.g. reused only a small share of the functionality of a large component. Importantly, this measurement approach covers only past development because predicting the future share of functionality to be reused is rather difficult for developers.
111 This extreme formulation was intended to make participants choose lower levels of agreement than with a less extreme formulation.
In the questionnaire developers report that, on average, nearly one third (mean=30.0%, standard deviation=26.4%, median=20%) of the functionality they have added to their current main OSS project is based on reused code (see Figure 3-4). This again confirms that code reuse is indeed an important element of OSS development. This interpretation is further supported by the fact that only six percent of the developers surveyed report that all of the functionality they have contributed to their current main project has been developed completely from scratch by them. Furthermore, the maximum share of reused functionality of 99% shows that some developers rely very heavily on code reuse and see their role mainly in writing “glue-code” to integrate the various pieces of reused code. While a direct comparison is not possible due to the different measurement approaches, the mean of 30% reused functionality appears much higher than the 10% and 18% of reused lines of code which Cusumano and Kemerer (1991) report for American and Japanese software development firms, respectively. This, combined with the findings regarding the importance of code reuse for developers’ OSS contributions, suggests that code reuse is of high importance in OSS development and might even be practiced more intensively in OSS development than in traditional software development in commercial firms.

Figure 3-4: Share of reused code in functionality contributed to OSS projects

Share of functionality contributed to developers' current main project based on reused code (in % of developers)

Share of reused functionality  Number of developers in class  Share of developers
0%                             36                             6%
1%-9%                          113                            18%
10%-19%                        106                            17%
20%-29%                        97                             15%
30%-39%                        69                             11%
40%-49%                        37                             6%
50%-59%                        61                             10%
60%-69%                        27                             4%
70%-79%                        28                             4%
80%-89%                        35                             6%
90%-100%                       23                             4%

Mean: 30.0%; S.D.: 26.4%; Median: 20%; N: 632
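The class counts reported in Figure 3-4 can be turned into the percentage shares shown in the chart with a few lines of arithmetic. The Python sketch below is illustrative only; the function names and the class labels as written here are assumptions made for the example.

```python
# Classes and developer counts as reported in Figure 3-4.
CLASS_LABELS = [
    "0%", "1%-9%", "10%-19%", "20%-29%", "30%-39%", "40%-49%",
    "50%-59%", "60%-69%", "70%-79%", "80%-89%", "90%-100%",
]
CLASS_COUNTS = [36, 113, 106, 97, 69, 37, 61, 27, 28, 35, 23]


def class_shares(counts):
    """Share of developers per class, rounded to whole percent."""
    total = sum(counts)
    return [round(100 * count / total) for count in counts]


def median_class(labels, counts):
    """Label of the class that contains the median developer."""
    half = sum(counts) / 2
    cumulative = 0
    for label, count in zip(labels, counts):
        cumulative += count
        if cumulative >= half:
            return label
    raise ValueError("counts must not be empty")


if __name__ == "__main__":
    print("N =", sum(CLASS_COUNTS))
    for label, share in zip(CLASS_LABELS, class_shares(CLASS_COUNTS)):
        print(f"{label:>9}: {share}%")
    print("Median class:", median_class(CLASS_LABELS, CLASS_COUNTS))
```

The counts add up to the 632 respondents, the 0% class corresponds to the six percent of developers who wrote everything from scratch, and the median developer falls into the 20%-29% class, in line with the reported median of 20%.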
Despite the prominent role of code reuse in OSS development as consistently indicated by all three measures presented, the high standard deviations also reveal large heterogeneity in developers’ code reuse behavior. Developers’ reasons for and against code reuse in their development are expected to partially drive this heterogeneity and are explored in the following chapter.
3.6.3. Developers’ reasons for and against code reuse

In the analysis of developers’ reasons for and against code reuse five different sets of factors are considered. First, the benefits of code reuse as perceived by OSS developers are analyzed, followed by an investigation of the drawbacks and issues which developers see in code reuse. Third, social pressures regarding the reuse of existing code reflected in developers’ subjective norm are considered, and, fourth, the effect of project policies regarding code reuse is taken into account. Finally, general impediments to code reuse are addressed.

Developers’ perceived benefits of code reuse

Based on the qualitative pre-study as well as the existing literature, eight distinct benefits of code reuse have been identified. Survey participants were asked to indicate their agreement on a 7-point Likert scale with statements reflecting these benefits. Results are displayed in Figure 3-5 and show that all of the statements receive rather high shares of agreement. The two statements with the highest level of agreement both point to efficiency effects of reuse. 92% of the developers agree that code reuse helps developers realize their project activities faster and 91% agree that code reuse allows developers to focus on the most important tasks of the project, thereby again allowing their project to progress faster toward its goals. The two efficiency arguments are followed by a statement pertaining to effectiveness effects of code reuse (84% agreement), pointing out that by relying on existing code, developers can solve problems for which they themselves lack the knowledge. For the benefits on ranks four and higher, agreement drops significantly compared to rank three, yet is still relatively high. Ranked fourth and fifth are statements addressing effects of code reuse on the quality of the software being developed by making it more stable (74% agreement) and more compatible with standards (73% agreement). The statement ranked seventh, about the effects of code reuse on software security, also pertains to this group; however, with 57% it receives considerably less agreement. This could be explained by the fact that many OSS projects develop types of software for which security is not a major concern, for example, games.
Ranked sixth and eighth are statements which position code reuse as a means for developers to select their project tasks by preference and avoid mundane jobs. Code reuse can help developers to focus on the tasks they are most interested in by reusing existing code for those tasks which are less preferred by them (67% agreement). Further, by reusing existing code for certain functionality in their own project, developers can “outsource” the maintenance work for this functionality to developers outside of their project (60% agreement).

Figure 3-5: Code reuse benefits perceived by OSS developers

Benefits of code reuse as perceived by developers (in % of developers)

Item           Agreement  Disagreement  Mean  S.D.  Statement
BEN_FASTER     92%        3%            6.00  1.07  Reusing helps developers realize their project goals/ tasks faster.
BEN_MOST_IMP   91%        3%            5.96  1.06  Reusing allows developers to spend their time on the most important tasks of the project.
BEN_EXPERTISE  84%        9%            5.61  1.32  Reusing allows developers to solve difficult problems for which they lack the expertise.
BEN_RELIABLE   74%        14%           5.23  1.45  Reusing helps developers create more reliable/ stable software, e.g. less bugs.
BEN_STANDARD   73%        13%           5.15  1.41  Reusing ensures compatibility with standards, e.g. the look and feel of GUIs.
BEN_MOST_FUN   67%        14%           5.02  1.42  Reusing allows developers to spend their time on the development activities they have most fun doing.
BEN_SEC        57%        19%           4.73  1.45  Reusing helps developers create more secure software, e.g. less vulnerabilities.
BEN_OUTSOURCE  60%        24%           4.73  1.55  Reusing allows developers to "outsource" maintenance tasks for certain of their code to developers outside of their project, e.g. fixing bugs.

Notes: “[BEN_FASTER]” denotes the name of the item as it is referred to in later analyses; the text describing the items is a verbatim copy of the text presented in the questionnaire; the share of developers who are “indifferent” about the respective benefits is not shown; N=632.
In order to check consistency of responses and to construct factor scores to be used in the multivariate analysis later, an exploratory factor analysis is carried out with the benefits of code reuse. A KMO (Kaiser-Meyer-Olkin) value of 0.760 and a Bartlett test rejecting the null hypothesis with p<0.0001 suggest that the data are suited for factor analysis.112 The number of factors was initially determined by selecting all factors with an eigenvalue greater than one (Backhaus et al. 2008); yet the resulting two factors were difficult to interpret because neither BEN_EXPERTISE nor BEN_OUTSOURCE loaded properly on either of the two factors. Therefore, four factors are extracted which together explain 77.3% of total variance. The resulting factors (see Table 3-8) can be interpreted as software quality benefits from code reuse (including items BEN_RELIABLE, BEN_SEC and BEN_STANDARD), development efficiency benefits from code reuse (BEN_FASTER, BEN_MOST_IMP), task selection benefits from code reuse (BEN_MOST_FUN and BEN_OUTSOURCE), and effectiveness benefits from code reuse (BEN_EXPERTISE). With Cronbach’s α values of 0.717 and 0.807, respectively, the efficiency and quality factors exhibit good internal consistencies. Cronbach’s α for the task selection factor is, however, only 0.473, which is rather low. Yet, as both Churchill (1979) and Homburg and Baumgartner (1995) deem values of 0.5 acceptable for constructs which are measured for the first time and the task selection factor is only barely below this value, the factor is retained.113

112 Whether data are suitable for factor analysis is determined by their correlation matrix. The Bartlett test checks whether all variables in the data are uncorrelated. The KMO criterion is a further indicator testing the adequacy of data for factor analyses (Backhaus et al. 2008). If the KMO value is below 0.5 Kaiser and Rice (1974) discourage factor analysis. They consider values above 0.7 as “middling” (Kaiser & Rice 1974, p. 111).

Table 3-8: Rotated factor loadings of benefits of OSS code reuse items

Rotated component matrix

Item                  Factor 1  Factor 2  Factor 3  Factor 4
BEN_EXPERTISE          0.086     0.170     0.089     0.949*
BEN_FASTER             0.176     0.802*    0.008     0.314
BEN_MOST_IMP           0.166     0.833*    0.249     0.056
BEN_MOST_FUN          -0.015     0.390     0.758*    0.031
BEN_OUTSOURCE          0.344    -0.025     0.765*    0.161
BEN_RELIABLE           0.843*    0.274     0.124    -0.028
BEN_SEC                0.875*    0.123     0.109     0.091
BEN_STANDARD           0.739*   -0.008     0.104     0.246
Eigenvalue of factor   3.183     1.312     0.898     0.788

Notes: The factor analysis uses principal component analysis and Varimax rotation; figures marked with * (bold and with gray shading in the original) represent the highest factor loading of the respective item; see Figure 3-5 for the text behind each item; N=632.
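The eigenvalues in Table 3-8 allow the factor extraction reasoning above to be retraced. The Python sketch below is an illustration only; it assumes that the principal component analysis was run on standardized items, so that total variance equals the number of items (eight benefit items). The function names are hypothetical.

```python
# Eigenvalues of the four extracted factors as reported in Table 3-8.
BENEFIT_EIGENVALUES = [3.183, 1.312, 0.898, 0.788]
NUMBER_OF_BENEFIT_ITEMS = 8


def kaiser_factor_count(eigenvalues):
    """Number of factors with an eigenvalue greater than one."""
    return sum(1 for value in eigenvalues if value > 1.0)


def explained_variance_share(eigenvalues, number_of_items):
    """Share of total variance explained by the given factors.

    Assumes standardized items, i.e. total variance == number_of_items.
    """
    return sum(eigenvalues) / number_of_items


if __name__ == "__main__":
    share = explained_variance_share(BENEFIT_EIGENVALUES, NUMBER_OF_BENEFIT_ITEMS)
    print("Factors with eigenvalue > 1:", kaiser_factor_count(BENEFIT_EIGENVALUES))
    print(f"Variance explained by four factors: {share:.1%}")
```

Under this assumption the eigenvalue criterion selects two factors, and the four retained factors account for about 77.3% of total variance, matching the figures given in the text.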
Developers’ perceived drawbacks and issues of code reuse

Following the benefits of code reuse, nine drawbacks and issues of code reuse (shown in Figure 3-6) were presented to participants who were again asked to indicate their agreement with the respective statements. The statement with the highest share of agreement (81%) points to the loss of control which developers may have to accept when reusing existing code. Having reused existing code, the developers of a project may not be able to maintain and further develop this code because it is too complex or because fully understanding it at a low level would take too much time. In such a situation the developers have become dependent on the original authors of the code and must rely on these developers to e.g. fix bugs or implement additional functionality.

The statements ranked second and third also relate to losing control, however, with significantly lower levels of agreement. The statement ranked second points to software being more difficult to install and use by end-users due to technical dependencies resulting from reused code (68% agreement), while the statement ranked third reflects developers’ obligation to check and integrate updates of reused code (61% agreement).114 Especially integrating updates of reused components which have changed the structure of their interfaces has been described as a rather painful task by developers in the qualitative pre-study.

113 Wierenga and Oude Ophuis (1997) even argue that given the calculation of Cronbach’s α, in factors with only two or three items values of 0.4 should be accepted.

Figure 3-6: Code reuse drawbacks and issues perceived by OSS developers

Drawbacks and issues of code reuse as perceived by developers (in % of developers)

Item              Agreement  Disagreement  Mean  S.D.  Statement
DRAW_OTH_PROJ     81%        9%            5.23  1.19  Through reuse projects become dependent on other projects, e.g. to fix bugs or add functionality.
DRAW_DEPENDENCY   68%        22%           4.81  1.53  Dependencies created by reuse make a project more difficult to install and use.
DRAW_ADD_WORK     61%        21%           4.62  1.29  Reusing creates additional work, e.g. in the form of fixing broken linkages after an update of the reused component or checking for updates of reused components.
DRAW_SEC          54%        25%           4.45  1.37  Through reuse developers might introduce security risks to their project.
DRAW_QUALITY      54%        29%           4.37  1.44  Through reuse developers might introduce quality risks to their project, e.g. bugs.
DRAW_INTEGRATE    35%        49%           3.75  1.47  Adapting and integrating reusable resource usually takes longer than implementing the functionality from scratch.
DRAW_UNDERSTAND   33%        51%           3.67  1.51  Understanding reusable resources usually takes longer than implementing the functionality from scratch.
DRAW_PERFORMANCE  20%        56%           3.31  1.40  Reuse hurts the performance of a project (e.g. lower speed, higher demand for memory).
DRAW_FIND         22%        66%           3.16  1.42  Finding reusable resources usually takes longer than implementing the functionality from scratch.

Notes: “[DRAW_OTH_PROJ]” denotes the name of the item as it is referred to in later analyses; the text describing the items is a verbatim copy of the text presented in the questionnaire; the share of developers who are “indifferent” about the respective drawbacks and issues is not shown; N=632.
Ranked fourth and fifth, and again with significantly lower levels of agreement than the previous statements, are two potential issues of code reuse which point to security (54% agreement) and quality (54% agreement) risks which developers may introduce to their project when reusing existing code. Somewhat related to these two statements is the toll which reused code may take on the performance of a project. The respective statement receives, however, only 20% agreement and is ranked eighth. This may be because, given today’s large availability of processing power and memory, such performance considerations are only applicable to a small group of rather special projects.

The statements ranked sixth, seventh, and ninth all describe situations where development from scratch is more efficient than code reuse because finding reusable code (rank 9, 22% agreement), understanding the code (rank 7, 33% agreement) and adapting and integrating it (rank 6, 35% agreement) take very long. These statements, which reflect the main reasons existing literature states as explanations why individual developers do not reuse existing knowledge (see Chapters 3.2.1 and 3.2.2), do, however, receive about 50 percent or more disagreement, which emphasizes that most OSS developers do not deem searching, understanding, and adapting reusable code as inefficient.

114 Both statements mainly refer to component reuse and are only partially applicable to snippet reuse.

Similar to the benefits of code reuse, an exploratory factor analysis is conducted with the drawbacks and issues of code reuse. Again, the KMO value (0.726, “middling”) and the Bartlett test (p<0.0001) suggest that the data are suited for the analysis. The three factors with an eigenvalue greater than one explain 68.9% of total variance. The resulting factors (see Table 3-9) can be interpreted as inefficiency of code reuse (including items DRAW_FIND, DRAW_UNDERSTAND and DRAW_INTEGRATE), software quality risks from code reuse (DRAW_QUALITY, DRAW_SEC and DRAW_PERFORMANCE) and loss of control risks because of code reuse (DRAW_DEPENDENCY, DRAW_OTH_PROJ and DRAW_ADD_WORK). With 0.457, the highest loading of item DRAW_PERFORMANCE is rather weak; however, as there is some distance between its highest and its second highest loading and further the internal consistency of the software quality risk factor is rather high, the item is retained in the factor. All factors exhibit good internal consistencies with Cronbach’s α values of 0.844, 0.756 and 0.652 for efficiency, quality and control loss, respectively.

Table 3-9: Rotated factor loadings of drawbacks & issues of OSS code reuse items

Rotated component matrix

Item                  Factor 1  Factor 2  Factor 3
DRAW_FIND              0.859*    0.100     0.042
DRAW_UNDERSTAND        0.876*    0.116     0.101
DRAW_INTEGRATE         0.837*    0.184     0.146
DRAW_QUALITY           0.159     0.920*    0.103
DRAW_SEC               0.100     0.926*    0.110
DRAW_PERFORMANCE       0.232     0.457*    0.300
DRAW_DEPENDENCY        0.185     0.069     0.760*
DRAW_OTH_PROJ          0.016     0.131     0.819*
DRAW_ADD_WORK          0.163     0.208     0.725*
Eigenvalue of factor   3.278     1.635     1.284

Notes: The factor analysis uses principal component analysis and Varimax rotation; figures marked with * (bold and with gray shading in the original) represent the highest factor loading of the respective item; see Figure 3-6 for the text behind each item; N=632.
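The assignment of items to factors by their highest loading, and the distance between the highest and second highest loading used to justify retaining DRAW_PERFORMANCE, can be illustrated with the loadings from Table 3-9. The Python sketch below is only a possible way to retrace this step; the function name and the data layout are assumptions made for the example.

```python
# Rotated loadings (Factor 1, Factor 2, Factor 3) transcribed from Table 3-9.
DRAWBACK_LOADINGS = {
    "DRAW_FIND": (0.859, 0.100, 0.042),
    "DRAW_UNDERSTAND": (0.876, 0.116, 0.101),
    "DRAW_INTEGRATE": (0.837, 0.184, 0.146),
    "DRAW_QUALITY": (0.159, 0.920, 0.103),
    "DRAW_SEC": (0.100, 0.926, 0.110),
    "DRAW_PERFORMANCE": (0.232, 0.457, 0.300),
    "DRAW_DEPENDENCY": (0.185, 0.069, 0.760),
    "DRAW_OTH_PROJ": (0.016, 0.131, 0.819),
    "DRAW_ADD_WORK": (0.163, 0.208, 0.725),
}


def assign_items_to_factors(loadings):
    """Map each item to (factor number, gap to second highest loading).

    Factors are numbered from 1; loadings are compared by absolute value.
    """
    assignment = {}
    for item, values in loadings.items():
        ranked = sorted(range(len(values)), key=lambda k: abs(values[k]), reverse=True)
        best, runner_up = ranked[0], ranked[1]
        gap = abs(values[best]) - abs(values[runner_up])
        assignment[item] = (best + 1, round(gap, 3))
    return assignment


if __name__ == "__main__":
    for item, (factor, gap) in assign_items_to_factors(DRAWBACK_LOADINGS).items():
        print(f"{item:<17} -> Factor {factor} (gap to next loading: {gap:.3f})")
```

For DRAW_PERFORMANCE the sketch reports Factor 2 with a gap of 0.157 between the highest (0.457) and second highest (0.300) loading, which is the distance referred to in the text.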
To further consolidate the number of variables in the multivariate model employed later, the four factors representing the benefits of OSS code reuse (see Table 3-8) and the three factors identified for the drawbacks and issues of OSS code reuse (see Table 3-9) are subject to a further factor analysis. In the result of this factor analysis the software quality benefits factor and the quality risks factor are combined into one factor. Moreover, the development efficiency benefits factor and the inefficiency of code reuse factor are represented by one common factor. The resulting final set of factors which is used in the multivariate model later is: effectiveness benefits, efficiency benefits, quality benefits, task selection benefits, and loss of control risks.

Developers’ subjective norm on code reuse

Having discussed the benefits, drawbacks and issues which individual developers see in code reuse, the next analysis addresses social pressures to reuse or not reuse existing code perceived by OSS developers. This subjective norm is covered with a two-item construct which is related to current TPB research (Riemenschneider et al. 2002; Mellarkod et al. 2007). What is interesting about this construct, which exhibits excellent validity and reliability characteristics,115 is that half of the developers are “indifferent” about it (see Figure 3-7). Only about a quarter of them perceive social pressures to reuse existing code and believe that code reuse results in reputational gains.

Figure 3-7: OSS developers’ subjective norm on code reuse

Developers' subjective norm on code reuse (in % of developers)
[Stacked bar chart with the segments share agreement, share indifferent and share disagreement for each of the two items below; the outer bar segments are 25% and 25% for the first item and 28% and 24% for the second item.]

Mean  S.D.  Share indifferent  Statement
3.92  1.28  50%                Developers that reuse a lot are highly regarded among their peers.
3.91  1.34  49%                Developers that reuse a lot have a higher reputation in the open source community than developers that reuse little.

Notes: Differences to 100% in the bars are due to rounding; the text describing the items is a verbatim copy of the text presented in the questionnaire; N=632.
Project policies on code reuse Beyond subjective norm resulting from their social environment, developers’ decision to reuse or not reuse existing code could also be influenced by explicit policies within their OSS project. Such policies can be seen as the OSS equivalents of corporate reuse programs within firms which are considered to be very important for code reuse to work properly in commercial environments (see Chapter 3.2.2). Of the total 632 developers only 340 could be influenced by such a policy in their current main project because the other 292 are the only developer in their project (see Figure 3-8). Roughly two thirds of these 340 developers contribute to a project with a reuse 115
With indicator reliabilities of 0.87 and 0.80, a Cronbach’sGĮ value of 0.91, a composite reliability of 0.91 and an average variance extracted of 0.83 the construct and its items exceed all relevant cut-off values (see Chapter 3.6.1).
96
Open source software developers’ perspectives on code reuse
policy. However, these policies are typically rather informal and reflect a common understanding within the project team of how to deal with code reuse. Only 22 developers report a formal policy which is, e.g., explicitly explained to new developers when they join the project. Policies are usually supportive of code reuse, either explicitly allowing it (50%) or even actively promoting it (42%). Only three developers contribute to projects in which the policy strongly discourages reuse, and 15 face policies which limit code reuse to situations where there is no way to avoid it.

Despite the lack of formal authority in OSS projects (see Chapter 3.3.3), developers seem to adhere to the code reuse policies of their projects. The type of code reuse policy significantly influences developers’ code reuse behavior (oneway ANOVA, p<0.0001).116 Across the policy forms of strongly discouraging, limiting, allowing, and promoting code reuse, developers report that the share of the functionality they have added to their current main project which is based on existing code is 10.0%, 19.7%, 26.1% and 42.4%, respectively.117

Figure 3-8: OSS project policies on code reuse
Existence of project policies regarding code reuse (in # of developers): total developers 632; developers in single-developer projects 292; developers in multi-developer projects 340; developers in projects w/o reuse policy 133; developers in projects w/ reuse policy 207. The 207 developers in projects w/ reuse policy are further broken down by formality (formal policy, common understanding) and by policy type (policy strongly discouraging reuse, policy limiting reuse, policy allowing reuse, policy promoting reuse).
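The oneway ANOVA reported above compares the mean share of reused functionality across the four policy types. As a minimal sketch of that computation, the following Python function derives the F statistic from group samples. The sample values are hypothetical and serve only as an illustration; they are not the survey data, and the function name is an assumption.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical shares of reused functionality (in %) per policy type.
samples = {
    "discouraging": [5.0, 10.0, 15.0],
    "limiting": [10.0, 20.0, 30.0],
    "allowing": [15.0, 25.0, 40.0],
    "promoting": [30.0, 40.0, 55.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(samples.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p-value then follows from the F distribution with the two degrees of freedom; if SciPy is available, `scipy.stats.f_oneway` returns both the statistic and the p-value directly.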
Interestingly, policies seem to affect developers’ code reuse behavior regarding the number of reused components (oneway ANOVA, p=0.0023), but not their reusing of snippets (oneway ANOVA, p=0.9486). This may be because the policies explicitly only
116
This significance level of p<0.0001 is consistently confirmed for all three metrics of code reuse behavior introduced in Chapter 3.6.2.
117
An analysis using the importance of code reuse metrics leads to similar results.
address component reuse or because snippet reuse is easier to hide from other project team members.

General impediments to code reuse

While developers’ reasons for or against code reuse presented so far were subjective and reflected the perceptions of the individual developers, there also exist general impediments to reuse. These four general impediments make code reuse difficult or impossible even if the individual developer wanted to rely on existing code (see Figure 3-9).118 Interestingly, however, all statements offered to the surveyed developers receive more disagreement than agreement.

The statement “there exist only very few reusable resources for [my current main project]”119 ranks first among the general impediments but still receives only 39% agreement. A oneway ANOVA to identify the projects for which the fewest reusable resources exist finds that the target operating system of a project has a weakly significant influence on the availability of reusable code (p=0.0846). Projects developed for POSIX operating systems (e.g. Linux) or Windows have more reusable code at their disposal than projects aiming at less common operating systems such as MacOS. Beyond that, neither the topic of the project (oneway ANOVA, p=0.3628) nor the type of graphical user interface employed by the project (oneway ANOVA, p=0.1912) exhibits a significant influence.120 However, for projects with below-average technical complexity there exists more reusable code than for projects with above-average technical complexity (paired t-test, p=0.0291). This may be because the functional requirements of less complex projects are lower and less unique. Lastly, there is no difference in the existence of reusable code between projects developing software positioned low in the software stack121 and those working on software positioned high in the software stack (paired t-test, p=0.3037).
This is surprising because projects positioned higher in the software stack should be able to
118
While these “general impediments” are rather objective compared to developers’ beliefs about the benefits, drawbacks and issues of code reuse and the social pressures which developers perceive, they may still reflect individual developers’ opinions, having been measured by asking the developers.
119
The text in italics is a verbatim copy of the text presented in the questionnaire.
120
Some graphical user interfaces such as KDE bring with them large libraries of components for various functionalities such as XML processing or audio (German 2005). Thus, developers working on projects with such graphical user interfaces could have been expected to have more reusable code at their disposal than developers working on projects using, e.g., the console as user interface.
121
The software stack describes the set of software subsystems required to deliver a solution to the end-user. Very low in the stack there is typically an operating system, very high there are end-user applications such as word processing software.
reuse code from projects positioned lower while the other way is not possible. However, apparently there is enough code to be reused in the lower levels of the software stack as well.

Figure 3-9: General impediments to code reuse perceived by OSS developers
General impediments to code reuse as perceived by developers (share of agreement and share of disagreement in % of developers)

Impediment                                                           Agreement  Disagreement  Mean  S.D.
There exist only very few reusable resources for [project].         39%        48%           3.77  1.82
License issues make reusing in [project] very difficult             24%        63%           3.01  1.85
The software architecture of [project] make reusing very difficult  17%        72%           2.63  1.59
[Project]'s programming language makes reusing very difficult       9%         85%           2.15  1.39

Notes: The share of developers who are “indifferent” about the respective impediments is not shown; in the questionnaire “[project]” was replaced with the name of the developer’s current main project which she had entered earlier in the survey; the text describing the items is a verbatim copy of the text presented in the questionnaire; N=632.
Ranked as the second general impediment to code reuse, with only 24% agreement, are license incompatibilities. A license incompatibility would occur, e.g., if a developer wanted to reuse code snippets licensed under the GPL in a project licensed under the BSD license (see Chapter 3.4.3 for a more detailed explanation of license incompatibilities). As expected, the license of developers’ main project significantly influences this general impediment (oneway ANOVA, p<0.0001), with developers working on GPL-licensed projects least likely to perceive this as an issue. However, the low share of agreement is surprising. Three explanations for this finding seem plausible. First, there exists enough reusable code on each license “island”; second, developers are able to mitigate the license incompatibilities through modular project architectures which clearly separate modules under different licenses and thus avoid contamination issues (Henkel & Baldwin 2009); third, developers are not knowledgeable about license incompatibilities and ignore the potential issues. Somewhat contradicting the third possible explanation is the fact that at least some component developers seem to be aware of the issue and purposefully choose licenses other than the GPL for their projects to mitigate the license incompatibility problems which developers reusing their component might have. At 41%, component projects are significantly less likely (paired t-test, p<0.0001) to be licensed under a highly restrictive GPL license than standalone executable projects at 68%.

Ranked third and fourth with 17% and 9% agreement, respectively, are the architecture of the developers’ current main project not being modular enough to allow for easy
integration of reusable code (rank 3) and incompatibilities between the project’s main programming language and the programming language of the code the developer wants to reuse (rank 4). Both are significantly dependent on the programming language of the developer’s project (oneway ANOVA, p=0.0059 and p<0.0001 for rank 3 and rank 4, respectively), with C++ and Java as object-oriented languages posing the fewest issues.

In summary, only a minority of the developers seems to be affected by the general impediments to code reuse, with the lack of existing reusable code being the most important such impediment. Potentially, the scale which OSS has reached by now implies that there exist various parallel implementations of nearly any functionality, so that developers can choose the existing code which best fits their OSS license, programming language and software architecture.
3.6.4. Component and snippet reuse

After having established the importance of code reuse for OSS developers and having investigated developers’ reasons for and against code reuse, this chapter goes one level deeper to explore how exactly OSS developers reuse existing code. In the course of this, the role of both components and snippets as the two forms of code reuse in OSS is analyzed. Further, the questions of how the two different forms of code are reused and why developers might prefer one form over the other are addressed.

Component reuse

On average, developers reuse 5.6 components in their current main OSS project (see Figure 3-10). The distribution is, however, highly skewed with a median of 2 reused components per developer. Further, at the extremes, 21% of the developers do not reuse components at all while 9% reuse more than ten components. An analysis of the top five components reused by developers yields 1,001 different components, highlighting the broad variety of existing OSS code which can be reused.122 The five components reused most frequently are Apache Commons (reused 39 times), zlib (28), Qt (27), SDL (26), and Apache log4j (23). All of these provide rather generic low-level functionality which is required in many projects but is time-consuming and difficult to develop internally. The type of these most frequently reused components supports the
122
Developers were asked for the names of their five most important components only. The names of further components were not collected. If developers reuse fewer than five components, only the names of the reused components were asked for. Different spellings of the same component were corrected manually.
above finding (see Chapter 3.6.3) that efficiency and effectiveness gains are the main benefits that developers see in code reuse.

Of the developers reusing at least one component, 51% have reused their components without any modifications, 36% have made minor changes to their components and 13% have significantly changed their components when reusing them in their main project. Given the potential issues resulting from component modifications (see Chapter 3.2.2), it is interesting to understand under which circumstances developers choose to modify the components they reuse.

Figure 3-10: Number of reused components in OSS projects
Number of components reused by developer in current main project (in % of developers)
Mean: 5.6; S.D.: 14.7; Median: 2; N: 632

Number of components   Share of developers   Number of developers in class
0                      21%                   133
1-2                    31%                   193
3-4                    20%                   128
5-6                    10%                   63
7-8                    4%                    26
9-10                   5%                    31
More than 10           9%                    58
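The gap between the mean of 5.6 and the median of 2 follows from the skewed class distribution in Figure 3-10. As a minimal sketch, the following Python lines recompute the class shares from the class counts shown in the figure and locate the class which contains the median developer. The function names are illustrative assumptions, and the mapping of counts to classes assumes that the counts appear in the same order as the class labels.

```python
# Class labels and developer counts as read from Figure 3-10.
labels = ["0", "1-2", "3-4", "5-6", "7-8", "9-10", "More than 10"]
counts = [133, 193, 128, 63, 26, 31, 58]


def class_shares(counts):
    """Return the share of each class in percent, rounded to whole numbers."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


def median_class(labels, counts):
    """Return the label of the class that contains the median observation."""
    midpoint = (sum(counts) + 1) / 2
    cumulative = 0
    for label, count in zip(labels, counts):
        cumulative += count
        if cumulative >= midpoint:
            return label
    raise ValueError("counts must not be empty")


print(sum(counts))
print(class_shares(counts))
print(median_class(labels, counts))
```

More than half of the developers fall into the two lowest classes, so the median lies in the class of one to two components, while the small group of developers with more than ten components pulls the mean upward.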
The analysis addressing this topic finds that developers who work on their projects to improve their software development skills tend to modify reused components more frequently (logistic regression, p=0.030).123 These developers presumably use existing components as starting points for their own trial and error learning as posited in hypothesis H4c. Further, modification of components seems to be more common in projects with many developers (logistic regression, p=0.077). Such projects might have the man-power to afford “customizing” components to perfectly fit their needs, while smaller projects might be required to make a trade-off and accept a reasonable component at the
123
The reasons explaining modification of components are determined by logistic regression analysis with robust standard errors (observations=499, pseudo R²=0.1117, χ²(38)=61.58, p=0.0091) using a dummy as dependent variable which indicates whether a developer has modified at least one component in her current main project or not. The model contains explanatory variables addressing the usage frequency of the various sources to search for existing code to reuse (see Chapter 3.6.5), the developer’s project motivation, her attitude toward code reuse, perceived general impediments to code reuse, demographics regarding the developer and characteristics of her current main project.
expense of having some extra time to work on other features of their project. Similarly, developers who spend more time on their main project also tend to modify reused components more frequently (logistic regression, p=0.013). Analogous to the additional man-power of larger projects, these developers might possess the additional time required to customize the components. Moreover, developers in C++ and Java projects tend to modify less frequently (logistic regression, p=0.011). This may be due to the standardized object-oriented paradigm of these languages, which allows the integration of external components without any changes to them.

Finally, developers who frequently rely on their own other OSS projects to find reusable components modify them more often (logistic regression, p=0.002), probably to adapt them to the new context and because they know them well, while developers searching in Linux distributions (logistic regression, p=0.076) and their personal network (logistic regression, p=0.083) do so less frequently. Developers who reuse components which are part of Linux distributions most likely do so because these components can be expected to already exist on the computers of many potential users of their software and thus do not have to be distributed separately by the developer. If they changed the component, they would forgo this advantage. Developers who frequently reuse components referred to them by their personal network should have the option to ask the referring developers about the technical peculiarities of the component and to discuss them. If they started changing the component, they would give up this possibility because their individual component would then differ from the one that the other developers in their network have experience with.

Snippet reuse

On average, 9.6% of the lines of code developers contribute to their current main project are snippets copied and pasted from other existing code. However, the distribution is again skewed, with a median of 5% and 23% of the developers not reusing snippets at all. Nonetheless, at the other extreme, 9% of the developers report that more than 30% of their contributed lines of code result from snippets. This is interesting as snippet reuse is usually considered inefficient in existing scholarly work and is typically not included in any corporate reuse program (see Chapter 3.2.2). Haefliger et al. (2008) likewise do not find significant snippet reuse in their case studies on OSS projects.
Of the developers reusing snippets, 4% do not change the snippets at all but copy and paste them into their code as they are. As expected, this number is rather low because when reusing snippets, typically at least minor changes such as adapting variable names are necessary. 33% of the developers make minor changes to the snippets and 63% apply major changes to the code.

Figure 3-11: Share of snippets in lines of code contributed to OSS projects
Share of snippets in developers’ lines of code contributed to current main project (in % of developers)
Mean: 9.6%; S.D.: 15.8%; Median: 5%; N: 632

Share of snippets   Share of developers   Number of developers in class
0%                  23%                   145
1-4%                25%                   158
5-9%                19%                   120
10-14%              12%                   73
15-19%              4%                    26
20-29%              8%                    52
More than 30%       9%                    58
An analysis of the reasons explaining the differences in snippet modification behavior highlights that developers concerned about losing control over their project modify their snippets more heavily (logistic regression, p=0.028), presumably in order to better understand the code and make sure that the snippets are fully under their control.124 Developers convinced of the quality benefits of code reuse modify their snippets less (logistic regression, p=0.091) and thus avoid introducing quality issues into the code. Similarly, developers perceiving the task selection benefits of code reuse more strongly apply major changes to their snippets more rarely (logistic regression, p=0.055), potentially because this would require them to deal with project tasks they would rather avoid. Similar to the component modification analysis, developers in larger teams tend to apply greater changes to their snippets (logistic regression, p=0.092), while developers who have been educated about reuse at university modify less (logistic regression, p=0.003). Further, developers who have more reusable snippets at hand modify them less (logistic regression,
124
The reasons explaining major modification of snippets are determined by logistic regression analysis with robust standard errors (observations=487, pseudo R²=0.1064, χ²(40)=64.37, p=0.0086) using a dummy as dependent variable which indicates whether a developer has applied major changes to at least one snippet in her current main project or not. The model contains the same explanatory variables as the model described in footnote 123.
p=0.054) and apparently just keep looking for the perfect code they can integrate with the least modification effort. Moreover, developers for whom building their reputation in the OSS community is important change snippets less radically (logistic regression, p=0.042). This could be due to the unwritten rule in the OSS community to respect other people’s code and give credit when using it (Haefliger et al. 2008). In a more radical interpretation, this rule could also be understood as a rule not to tamper with other people’s code. Similar to the modification of components, developers who frequently use their own other OSS projects as sources to search for snippets to reuse modify them more heavily (logistic regression, p=0.023), as do developers who rely heavily on code example web pages to search for snippets (logistic regression, p=0.025). In both situations the snippet should be easy for the developer to understand, either because it is her own code or because it is commented well on the code example web page (see Chapter 3.6.5). However, the price of being easier to understand should be that the snippet needs large modifications because its original context is quite different from the new one the developer wants to apply it in.

Differences between component and snippet reuse

To better understand the differences between component and snippet reuse and shed light on the question of why developers might prefer one over the other, developers who rely mainly on reusing components are compared with developers who mainly reuse snippets. In order to do so, developers are grouped into quintiles according to the number of components they have reused and the share of snippets in the code they have contributed (see Figure 3-12). Developers who are in a high quintile with regard to snippet reuse but a low quintile with regard to component reuse are considered snippet reuse focused developers. The group of component reuse focused developers is constructed analogously.
Figure 3-12: Component and snippet focused OSS developer groups
Matrix of snippet reuse intensity (in quintiles, 1st to 5th) against component reuse intensity (in quintiles, 1st to 5th), with the groups of snippet reuse focused developers and component reuse focused developers marked.
Notes: The numbers in the boxes indicate the number of developers at the intersection of the two respective quintiles. N=632.
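The grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cut-offs for a "high" (4th or 5th) and a "low" (1st or 2nd) quintile, the function names and the rank-based treatment of ties are all hypothetical and are not taken from the study.

```python
def quintile_ranks(values):
    """Assign each value a quintile from 1 (lowest) to 5 (highest) by rank.

    Ties are split by original position, which is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position * 5 // len(values) + 1
    return ranks


def reuse_focus(component_quintile, snippet_quintile, low=2, high=4):
    """Classify a developer by the two quintiles (cut-offs are assumptions)."""
    if snippet_quintile >= high and component_quintile <= low:
        return "snippet"
    if component_quintile >= high and snippet_quintile <= low:
        return "component"
    return "neither"


# Hypothetical developers: number of reused components and snippet share in %.
components = [0, 1, 2, 2, 3, 4, 6, 8, 12, 30]
snippet_share = [40, 25, 0, 10, 5, 5, 2, 0, 1, 0]

component_q = quintile_ranks(components)
snippet_q = quintile_ranks(snippet_share)
groups = [reuse_focus(c, s) for c, s in zip(component_q, snippet_q)]
print(groups)
```

Developers labeled "neither" would be left out of the group comparison, which is in line with the comparison covering only part of the full sample.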
The comparison of the two groups points out that developers motivated by tackling technical challenges are more snippet reuse focused (logistic regression, p=0.002) while developers who contribute to their projects because of the creative pleasure they experience while coding are more component reuse focused (logistic regression, p=0.032).125 The preference for snippets over components among developers interested in tackling challenges seems reasonable because components usually contain so much functionality that they solve a technical problem completely and thereby leave no technical challenge for the developer to solve herself. With snippets, on the other hand, developers can decide themselves which parts of the problem they want to solve with the snippet and which parts they want to leave for themselves. They might even reuse snippets to get rid of the less interesting facets of a problem in order to have more time to mull over the really interesting issues. For developers interested mainly in the creative pleasure experienced when coding, the situation is exactly the opposite. They thrive on writing their own lines of code and consequently might not like to include other people’s lines of code (i.e. snippets) in their code structure. Reusing components, however, does not affect the structure of their code much as they typically need only one line of code to link to the component and can then go on doing what they enjoy, namely writing their own lines of code.

As a third difference, developers in projects with greater technical complexity tend to be rather component reuse focused (logistic regression, p<0.001), presumably because they need the large blocks of functionality which can be imported by components in order to be able to realize their project. In contrast, developers in projects with many other developers are more geared to snippet reuse (logistic regression, p=0.037), most likely because they have the man-power to implement things by themselves for which smaller projects would rely on an existing component. Moreover, developers in projects not developed in the object-oriented Java or C++ programming languages tend to reuse snippets rather than components (logistic regression, p<0.001). This should be because linking components in programming languages with no or less standardized interfaces is difficult and can take a lot of time. Finally, developers who
125
The reasons explaining developers’ tendency to be either more snippet or more component reuse focused are determined by logistic regression analysis with robust standard errors (observations=207, pseudo R²=0.3345, χ²(22)=63.14, p<0.0001) using a dummy as dependent variable which indicates to which group a developer belongs. The model contains the same explanatory variables as the model described in footnote 123 with two changes: First, an additional dummy is included controlling whether the developer’s current main project is licensed under the GPL. Second, the developer’s attitude toward code reuse is not included because it does not differentiate between her attitude toward component reuse and snippet reuse.
have been involved in OSS longer (logistic regression, p=0.014) and those that work on a project located rather low in the software stack favor component reuse (logistic regression, p=0.069).
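The logistic regressions used throughout this chapter model a dummy (0/1) dependent variable, such as whether a developer has modified at least one component. The following Python sketch shows the basic idea with a single explanatory variable fitted by gradient ascent on the log-likelihood. The data are hypothetical, the function names are assumptions, and the sketch leaves out the robust standard errors and significance tests reported in the footnotes.

```python
import math


def sigmoid(z):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit intercept and coefficients by batch gradient ascent."""
    weights = [0.0] * (len(xs[0]) + 1)  # weights[0] is the intercept
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for x, y in zip(xs, ys):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
            error = y - sigmoid(z)
            gradient[0] += error
            for j, v in enumerate(x):
                gradient[j + 1] += error * v
        weights = [w + learning_rate * g / len(xs)
                   for w, g in zip(weights, gradient)]
    return weights


def predict_probability(weights, x):
    """Return the modeled probability that the dummy equals 1."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return sigmoid(z)


# Hypothetical data: weekly hours spent on the project and a dummy
# indicating whether the developer modified at least one component.
hours = [[1], [2], [3], [4], [5], [6], [7], [8]]
modified = [0, 0, 1, 0, 1, 0, 1, 1]

weights = fit_logistic(hours, modified)
print(weights)
print(predict_probability(weights, [8]))
```

A positive coefficient, as in this hypothetical example, corresponds to statements of the form "developers who spend more time on their project tend to modify more frequently"; whether such a coefficient is significant requires the standard errors that the sketch omits.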
3.6.5. Developers’ sources to search for existing code to reuse

As the last descriptive analysis, the sources which OSS developers turn to when searching for code to reuse are investigated. To analyze the importance of different sources, developers were offered ten sources to search for components which had been identified in the qualitative pre-study and existing literature. For snippets, two additional sources were offered. For both code types developers were asked to indicate on a five-point scale ranging from “never” to “always” how often they turn to the respective sources when searching for code to reuse.

Contrary to Haefliger et al. (2008, p. 188), who find that “[…] even more important than repositories and search engines were means of local search […]”, the surveyed developers point to general purpose search engines (e.g. Google) as their first port of call when searching for both components and snippets (see Figure 3-13). This is interesting because general purpose search engines do not allow systematically searching for code characteristics such as programming language or OSS license. Specific code search engines (e.g. Koders.com) which offer exactly these features are used only very rarely (rank 8 for components and rank 9 for snippets). The popularity of general purpose search engines may be due to two reasons. First, general purpose search engines also aggregate information available in several of the other sources and are thus an ideal starting point which may, e.g., link to an OSS repository in the second step. A developer from the qualitative pre-study makes this point when explaining, “the best way to search RubyForge is Google.”126 Second, the search algorithms of general purpose search engines typically rank highly those entries which are referred to a lot on other web pages and which are clicked on frequently. Because of that, the ranking of existing code in a general purpose search engine can help developers to determine the quality of the code they consider reusing.
126
RubyForge is an OSS repository that focuses on projects developed in the Ruby programming language, http://rubyforge.org, last accessed 05.11.2009. Like most other OSS repositories, RubyForge has its own search functionality on its web site.
Figure 3-13: OSS developers’ sources to find existing code to reuse
Developers’ sources to search for existing code to reuse (in average frequency of use by developers), shown separately for components and snippets. Sources in bar order: general purpose search engines, OSS repositories, developers’ other OSS projects, developers’ personal network, other OSS developers in related communities, other OSS projects similar to developers’ project, Linux distributions, source code search engines, mailing list or forum of project, developers’ CSS* projects, code examples web pages**, books and magazines**.
*CSS=Closed Source Software, i.e. software under a proprietary license. **Not suited for component search. Notes: Bars ordered by source importance for component search; frequency scale: 1=Never, 2=Rarely, 3=Sometimes, 4=Often, 5=Always; N=499 for components; N=487 for snippets.
Ranked second for components, but at a significant distance from general purpose search engines, and fifth for snippets are OSS repositories such as SourceForge.net. OSS repositories contain several thousand OSS projects classified along multiple dimensions and provide search functionalities. They represent the closest match in OSS development to the reuse repositories of commercial firms. What is interesting about OSS repositories is that research about reuse in commercial firms argues that reuse repositories become inefficient with increasing size (e.g. Ravichandran & Rothenberger 2003). OSS repositories such as SourceForge.net with more than 100,000 projects can certainly be considered large. However, developers apparently still consider them their second most important source to search for components to reuse. When searching for snippets, OSS repositories are considered less frequently, probably because their search mechanisms are more applicable to whole projects and thus to components.

With the same difference profile between components and snippets, Linux distributions as sources of reusable code rank seventh for components and eleventh for snippets. Similar to OSS repositories, Linux distributions are compilations of typically several thousand OSS projects which are catalogued in the distribution and which have been tested by the distribution’s editors. The finding that they are considered significantly less frequently than OSS repositories may be because their offering spectrum is narrower, because searching in them is less comfortable or because their software is less up-to-date. Linux distributions are released in cycles of several months, which might be an issue for developers looking for
the most recent versions of code to reuse. The difference between the importance for components and snippets should be because the project structure of Linux distributions makes searching for components easier than searching for snippets.

Ranked second for snippets are code example web pages.127 Such web pages, which are often organized as forums, give developers a platform to ask for technical solutions in the form of code or to post their own code and ask for comments on it. Of interest is the finding that code example web pages, which typically do not offer very sophisticated search methods, are preferred over source code search engines, which do offer these functionalities and, similar to code example web pages, also mainly cater to snippets. A reason for this might be that developers are not only interested in the code itself, but also in the comments describing and explaining the code which exist on the code example web pages. This need for explanation of the code lends support to hypothesis H4c that developers reuse existing code to improve their coding skills.

Ranked next are several sources which pertain to local search such as developers’ other OSS projects, their personal network, developers in related communities and similar projects. For all of these there are only very small differences between component and snippet search. Developers’ closed source software projects and the mailing lists and forums of their projects, which are further means of local search, seem to be employed much less frequently when searching for existing code to reuse. This low importance of mailing lists and forums is presumably due to the high number of small and not very widespread projects in the sample. The mailing lists and forums of such projects usually do not have many subscribers and thus are not very helpful in finding existing code to reuse. Lastly, when searching for snippets, developers also “rarely” consider books and magazines which contain code examples.
To gain further insights into the sources which developers turn to when searching for existing code to reuse and to better understand the effects of access to local search which are part of the research model presented in Chapter 3.4.3, the differences in the usage frequencies of the various sources stated by developers with better access to local search are compared to the frequencies stated by developers with worse access to local search (see Table 3-10).
127
Such code example web pages are not suitable to search for components and because of that are not ranked for components.
The results of this comparison point out that developers with better access to local search, due to their larger OSS network or the greater number of OSS projects they have already been involved in, rely on means of local search more frequently than other developers. For both component and snippet search, means of local search such as developers’ other OSS projects are used much more frequently by developers with better access to local search than by other developers. The usage frequencies of other means of search such as general purpose search engines, however, differ only slightly between the two groups of developers.

Table 3-10: OSS developers’ sources to find existing code by access to local search
Frequency of use when searching for components and for snippets by quality of access to local search***

Sources to search for existing code to reuse          Components        Snippets
                                                      Better  Worse     Better  Worse
General purpose search engines                        4.1     4.1       3.9     4.0
OSS repositories                                      3.4     3.1       2.7     2.4
Developers’ other OSS projects                        3.1     2.3       3.2     2.4
Developers’ personal network                          2.8     2.4       2.7     2.4
Other OSS developers in related communities           2.8     2.2       2.6     2.2
Other OSS projects similar to developers’ project     2.6     2.4       2.6     2.4
Linux distributions                                   2.5     2.2       1.9     1.6
Source code search engines                            1.9     1.7       2.1     1.8
Mailing list or forum of project                      2.0     1.6       2.1     1.6
Developers’ CSS* projects                             1.5     1.3       1.6     1.4
Code examples web pages**                             n/a     n/a       3.3     3.2
Books and magazines**                                 n/a     n/a       2.2     2.2

*CSS=Closed Source Software, i.e. software under a proprietary license. **Not suited for component search. ***Following the research model from Chapter 3.4.3, developers are assumed to have better access to local search if they have a larger personal OSS network (known other OSS developers greater than median of 8) or have been involved in a greater number of OSS projects (number of OSS projects greater than median of 2). Notes: Top-3 differences highlighted in bold and gray shading; frequency scale: 1=Never, 2=Rarely, 3=Sometimes, 4=Often, 5=Always; N=499 for components; N=487 for snippets.
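The comparison in Table 3-10 rests on the difference between the two groups' mean usage frequencies per source. As a minimal sketch, the following Python lines compute these differences for component search and list the largest ones. The values are read from the table under the assumption that its columns follow the order in which the sources are listed; the variable and function names are illustrative.

```python
# (better access, worse access) mean frequencies for component search,
# as read from Table 3-10 (column order is an assumption).
component_search = {
    "General purpose search engines": (4.1, 4.1),
    "OSS repositories": (3.4, 3.1),
    "Developers' other OSS projects": (3.1, 2.3),
    "Developers' personal network": (2.8, 2.4),
    "Other OSS developers in related communities": (2.8, 2.2),
    "Other OSS projects similar to developers' project": (2.6, 2.4),
    "Linux distributions": (2.5, 2.2),
    "Source code search engines": (1.9, 1.7),
    "Mailing list or forum of project": (2.0, 1.6),
    "Developers' CSS projects": (1.5, 1.3),
}


def largest_gaps(frequencies, top=3):
    """Return the sources with the largest better-minus-worse difference."""
    gaps = {
        source: round(better - worse, 1)
        for source, (better, worse) in frequencies.items()
    }
    ranked = sorted(gaps.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


for source, gap in largest_gaps(component_search):
    print(f"{source}: {gap:+.1f}")
```

Under these assumptions the largest gap appears for developers' other OSS projects, a means of local search, while general purpose search engines show no gap at all.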
This finding emphasizes the benefits of means of local search such as a greater effectiveness and efficiency because developers who can employ these means apparently do so. Summarizing, developers rely on general purpose search engines most frequently when searching for existing code to reuse and not on dedicated code search engines. For components, OSS repositories are ranked second and contrary to existing reuse literature their size seems not to lead to inefficiencies. For snippets, code example web pages are used second most frequently pointing to the skill improvement effects of code reuse. The usage frequency of means of local search which are considered on the following ranks is highly dependent on developers’ access to them.
3.6.6. Summary
The purpose of the descriptive and exploratory analyses in this section was to shed light on the mechanics of the reusing side of OSS development and establish the context for the multivariate analyses explaining determinants of developers’ code reuse behavior. This was achieved by establishing the importance and extent of code reuse in OSS development, investigating OSS developers’ reasons to reuse existing code or not, understanding how OSS developers reuse existing code and exploring where they turn to when they search for code to reuse.
The previous chapters have shown that code reuse plays a prominent role in OSS development with nearly 60% of the surveyed developers at least “somewhat” agreeing that code reuse is very important for the work on their projects. Further, nearly a third of the functionality developers have contributed to their current main project is based on reused code. OSS developers point to the efficiency and effectiveness effects of code reuse as its main benefits and to a loss of control over their projects as its main drawback and issue. In terms of social factors influencing developers’ code reuse behavior, only about a quarter of the OSS developers perceive social rewards from reusing existing code. Of the developers working in team projects about two thirds report a project policy regarding code reuse. However, this policy is typically informal. Furthermore, policies mostly either allow reuse or explicitly promote it. All of the general impediments to code reuse identified (lack of reusable code, license and programming language incompatibilities and architectural issues) receive more disagreement among OSS developers than agreement, indicating that if OSS developers want to reuse existing code they typically can do so.
On a more fine-grained level of analysis, OSS developers reuse on average 5.6 components (the median is 2) with nearly half of them modifying these components, among other reasons for skill improvement purposes. Further, nearly 10% of the code OSS developers have contributed to their current main project is based on reused snippets on average. A comparison of developers relying mainly on component reuse with those mainly reusing snippets finds that developers for whom tackling difficult technical challenges is a main motivation for their project work tend to be more on the snippet side while developers contributing to their project for the creative pleasure experienced while coding rather reuse components. Finally, an analysis of the sources which developers turn to when searching for code to reuse identifies general purpose search engines, OSS repositories and code example web pages as the main sources, but it also shows that developers leverage means of local search more frequently if they have good access to them.
The detailed picture of the mechanics of code reuse in OSS development painted in this section provides the context to explore determinants of developers’ code reuse behavior in the next section.
3.7. Multivariate analysis of determinants of code reuse
Following the descriptive and exploratory analyses, this section addresses research question seven in multivariate fashion. By testing the research model developed in Chapter 3.4.3, determinants of developers’ code reuse behavior are identified to answer the question under which conditions OSS developers reuse existing code. Before the results of the multivariate analyses are presented (3.7.4) and discussed (3.7.5), the research model hypotheses are summarized (3.7.1), the variables employed in the multivariate models are described (3.7.2) and the statistical methods used are introduced (3.7.3).
3.7.1. Hypotheses
The research model developed in Chapter 3.4.3 proposes hypotheses which infer relationships between the code reuse behavior of an OSS developer and her attitude toward code reuse, her access to local search, the maturity of her project and the compatibility of code reuse with her individual goals in the project. See Table 3-11 for a recapitulation of the detailed hypotheses.

Table 3-11: Summary of OSS code reuse research model hypotheses
Attitude toward code reuse
H1a: The more positive developers perceive the effectiveness effects of code reuse, the more existing code they will reuse.
H1b: The more positive developers perceive the efficiency effects of code reuse, the more existing code they will reuse.
H1c: The more positive developers perceive the software quality effects of code reuse, the more existing code they will reuse.
H1d: The more strongly developers perceive the task selection benefits of code reuse, the more existing code they will reuse.
H1e: The more strongly developers perceive the loss of control risks from code reuse, the less existing code they will reuse.
Access to local search
H2a: The larger developers’ personal OSS networks, the more existing code they will reuse.
H2b: The greater the number of OSS projects developers have ever been involved in, the more existing code they will reuse.
Project maturity
H3: The more mature developers’ project, the less existing code they will reuse.
Compatibility of code reuse with developers’ project goals
H4a: The more important tackling difficult technical challenges is as a reason for developers to work on their OSS project, the less existing code they will reuse.
H4b: The more important creative pleasure is as a reason for developers to work on their OSS project, the less existing code they will reuse.
H4c: The more important skill improvement is as a reason for developers to work on their OSS project, the more existing code they will reuse.
H4d: The more important community commitment is as a reason for developers to work on their OSS project, the more existing code they will reuse.
H4e: The more important reputation building in the OSS community is as a reason for developers to work on their OSS project, the more existing code they will reuse.
H4f: The more important signaling of skills toward potential employers and business partners is as a reason for developers to work on their OSS project, the more existing code they will reuse.
3.7.2. Variables
Dependent variables. To test the research model three different measures of reuse behavior are employed as dependent variables in order to provide greater robustness of the results. See Table 3-12 for the descriptive statistics of these three different variables. The first variable (ImpRePast) describes the importance of code reuse in developers’ past work on their current main OSS project in an interval ranging from 1 to 7. The second variable (ReuseSharePast) reports the share (i.e. percentage) of reused code in the functionality developers have contributed to their current main OSS project in the past. The third variable (ImpReFut) describes the importance that developers expect code reuse to have in their future work on their current main project in an interval ranging from 1 to 7.

Table 3-12: Descriptive statistics of dependent variables
Variable | Explanation | Mean | S.D. | Median | Min | Max | Correlation with 1. | Correlation with 2. | Correlation with 3.
1. ImpRePast | Importance of code reuse in developers’ past work on their current main OSS project. Construct created as index of four 7-point Likert scale items describing code reuse as “very important.” See Chapter 3.6.2 for validity and reliability of construct. | 4.73 | 1.85 | 5.25 | 1.00 | 7.00 | 1.00 | |
2. ReuseSharePast | Share of reused code in functionality contributed to developers’ current main OSS project during their past work. | 29.96 | 26.43 | 20.00 | 0.00 | 99.00 | 0.67 | 1.00 |
3. ImpReFut | Importance of code reuse for developers’ future work on their current main OSS project. Construct created as index of four 7-point Likert scale items describing code reuse as “very important.” See Chapter 3.6.2 for validity and reliability of construct. | 4.57 | 1.69 | 4.75 | 1.00 | 7.00 | 0.76 | 0.57 | 1.00
Note: All correlations are significant at p<0.001; N=632.
Independent variables. See Table 3-13 for descriptive statistics of the dummy variables and Table 3-14 for descriptive statistics of the ordinal and metric variables.

Table 3-13: Descriptive statistics of explanatory dummy variables
Variable | Dummy variable equal to “1” if… | Frequency of “0” | Frequency of “1”
ProjPolSupport | Developer’s current main project has a policy (formal or as common understanding) which either allows or explicitly encourages code reuse | 443 (70%) | 189 (30%)
ProjPolDiscourage | Developer’s current main project has a policy (formal or as common understanding) which either strongly discourages or limits code reuse | 614 (97%) | 18 (3%)
ProjStandalone | Developer’s current main project is a standalone executable application project and not a component project | 162 (26%) | 470 (74%)
DevProf | Developer is working as professional developer or has worked as professional developer for a commercial firm | 195 (31%) | 437 (69%)
DevEduReuse | Developer has received training on reuse during her education | 415 (66%) | 217 (34%)
DevProfEduReuse | Developer has received training on reuse as professional developer | 550 (87%) | 82 (13%)
Residence-N. America | Developer resides in North America | 464 (73%) | 168 (27%)
Residence-S. America | Developer resides in South America | 602 (95%) | 30 (5%)
Residence-Asia & RoW | Developer resides in Asia, Africa, Australia or Oceania | 543 (86%) | 89 (14%)
Note: N=632.
Table 3-14: Descriptive statistics of ordinal and metric explanatory variables
Variable | Explanation | Mean | S.D. | Median | Min | Max
BenefitEffectiveness* (H1a) | Developer’s perception of effectiveness effects of code reuse | 0.00 | 1.00 | 0.19 | -4.75 | 2.00
BenefitEfficiency* (H1b) | Developer’s perception of efficiency effects of code reuse | 0.00 | 1.00 | 0.08 | -3.52 | 2.30
BenefitQuality* (H1c) | Developer’s perception of software quality effects of code reuse | 0.00 | 1.00 | -0.02 | -3.97 | 2.91
BenefitTaskSelection* (H1d) | Developer’s perception of task selection benefits of code reuse | 0.00 | 1.00 | 0.03 | -3.90 | 3.03
IssueControlLoss* (H1e) | Developer’s perception of loss of control risks from code reuse | 0.00 | 1.00 | 0.07 | -3.81 | 2.41
DevOSSNetsize (log) (H2a) | Number of other developers in developer’s personal OSS network (as logarithm) | 2.00 | 1.03 | 2.20 | 0.00 | 6.22
DevOtherProjects (H2b) | Number of OSS projects besides current main project that developer has ever been involved in | 3.57 | 5.35 | 2.00 | 0.00 | 48.00
ProjPhase (H3) | Development phase of developer’s current main project (1=Pre-Alpha, 2=Alpha, 3=Beta, 4=Stable/Production, 5=Mature) | 3.22 | 1.18 | 3.00 | 1.00 | 5.00
MotChallenge** (H4a) | Importance of challenge seeking as motivation for developer to contribute to current main project | 5.13 | 1.05 | 5.33 | 1.00 | 7.00
MotCreaPleasure** (H4b) | Importance of creative pleasure as motivation for developer to contribute to current main project | 5.16 | 1.09 | 5.00 | 1.67 | 7.00
MotLearning** (H4c) | Importance of skill improvement as motivation for developer to contribute to current main project | 5.33 | 1.10 | 5.33 | 1.00 | 7.00
MotCommunity** (H4d) | Importance of community commitment as motivation for developer to contribute to current main project | 5.61 | 1.00 | 5.67 | 1.00 | 7.00
MotOSSReputation** (H4e) | Importance of OSS reputation building as motivation for developer to contribute to current main project | 3.61 | 1.63 | 4.00 | 1.00 | 7.00
MotSignaling** (H4f) | Importance of commercial signaling as motivation for developer to contribute to current main project | 4.32 | 1.54 | 4.67 | 1.00 | 7.00
DevNorm*** | Developer’s perception of positive reputation effects of code reuse | 3.92 | 1.25 | 4.00 | 1.00 | 7.00
DevSkill | Self-assessment of developer’s software development skills compared to the average OSS developer (1=Much worse,…, 5=Much better) | 3.26 | 1.57 | 4.00 | 1.00 | 7.00
ConditionLack | Developer’s agreement to lack of reusable code as impediment to code reuse (7-point Likert scale) | 3.80 | 1.82 | 4.00 | 1.00 | 7.00
ConditionLicense | Developer’s agreement to license incompatibilities as impediment to code reuse (7-point Likert scale) | 3.01 | 1.85 | 2.00 | 1.00 | 7.00
ConditionLanguage | Developer’s agreement to programming language incompatibilities as impediment to code reuse (7-point Likert scale) | 2.15 | 1.39 | 2.00 | 1.00 | 7.00
ConditionArchitecture | Developer’s agreement to issues with project architecture as impediment to reuse (7-point Likert scale) | 2.63 | 1.59 | 2.00 | 1.00 | 7.00
ProjSize | Number of developers in developer’s current main project | 6.05 | 44.14 | 2.00 | 1.00 | 999†
ProjComplexity | Complexity of developer’s current main project compared to average project on SourceForge.net (1=Much less complex,…, 5=Much more complex) | 2.95 | 1.03 | 3.00 | 1.00 | 5.00
DevOSSExperience | Number of years developer has been active in OSS | 5.45 | 4.07 | 4.00 | 1.00 | 25.00
DevProjTime | Average weekly hours developer works on her current main project | 8.74 | 10.67 | 5.00 | 0.50 | 58.00
DevProjShare | Share of work that has been done by developer in her current main project as opposed to other project team members | 67.52 | 37.05 | 90.00 | 5.00 | 100.0
*Variable represents a factor score, see Chapter 3.6.3 for corresponding exploratory factor analysis. **Variable represents an index created from items on 7-point Likert scales, see Chapter 3.6.1 for corresponding confirmatory factor analysis. ***Variable represents an index created from items on 7-point Likert scales, see Chapter 3.6.3 for corresponding confirmatory factor analysis. †The main project of this developer is Linux where a very high number of project team members seems reasonable.
Notes: Bolded variables reflect hypotheses; N=632.
The benefits, drawbacks and issues of code reuse addressed in the five H1 hypotheses are covered by five factor scores resulting from the analyses presented in Chapter 3.6.3. Access to local search (H2) is captured by the size of developers’ personal OSS network as logarithm and the number of projects they have ever been involved in. Project maturity (H3) is reflected in the development phase of the project measured on an ordinal scale ranging from 1 (“Pre-Alpha”) to 5 (“Mature”). The compatibility of code reuse with developers’ project goals (H4) is tested with six indices created from multiple items on 7-point Likert scales in Chapter 3.6.1 which reflect the importance of developers’ various motivations to contribute to their current main project. For the further variables, which control for the various other aspects of the research model, see Table 3-13 and Table 3-14. Neither the correlation matrix of the independent variables (see Table 3-15) nor the variance inflation factors (with a maximum value of 1.70) calculated after the regressions suggest that multicollinearity influences the test results.128
128 Neter et al. (1996) suggest that multicollinearity might be an issue if variance inflation factors are above 10. Cohen et al. (2002) propose a lower threshold of 2.
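To make the thresholds in this footnote more tangible: the variance inflation factor of a regressor is commonly defined as 1/(1 − R²), where R² stems from an auxiliary regression of that regressor on all other regressors. The following minimal Python sketch converts between the two quantities. The function names are illustrative assumptions and not part of the original study; only the value 1.70 is taken from the text.

```python
def vif_from_r_squared(r_squared: float) -> float:
    """Variance inflation factor implied by an auxiliary R^2 in [0, 1)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(vif: float) -> float:
    """Auxiliary R^2 implied by a variance inflation factor >= 1."""
    if vif < 1.0:
        raise ValueError("vif must be at least 1")
    return 1.0 - 1.0 / vif


# Maximum VIF reported in the text; the function names above are hypothetical.
max_vif = 1.70
print(round(r_squared_from_vif(max_vif), 2))  # 0.41
```

Under this definition, the reported maximum of 1.70 means that at most about 41% of the variance of any regressor is explained by the remaining regressors, which lies below both thresholds named in the footnote.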
Table 3-15: Correlation matrix of independent variables
Variables (rows and columns): 1. BenefitEffectiveness; 2. BenefitEfficiency; 3. BenefitQuality; 4. BenefitTaskSelection; 5. IssueControlLoss; 6. DevOSSNetsize (log); 7. DevOtherProjects; 8. ProjPhase; 9. MotChallenge; 10. MotCreaPleasure; 11. MotLearning; 12. MotCommunity; 13. MotOSSReputation; 14. MotSignaling; 15. DevNorm; 16. DevSkill; 17. ProjPolSupport; 18. ProjPolDiscourage; 19. ConditionLack; 20. ConditionLicense; 21. ConditionLanguage; 22. ConditionArchitecture; 23. ProjSize; 24. ProjComplexity; 25. ProjStandalone; 26. DevOSSExperience; 27. DevProjTime; 28. DevProjShare; 29. DevProf; 30. DevEduReuse; 31. DevProfEduReuse; 32. Residence-N. America; 33. Residence-S. America; 34. Residence-Asia & RoW.
[Individual coefficients of the 34 × 34 correlation matrix are not reproduced here.]
Notes: Correlations displayed are Pearson product-moment correlation coefficients; correlations with a significance level <= 10% are bolded and shaded in gray; n.m.=correlation not meaningful because variables are dummy variables of the same characteristic or variables are scores of the same factor analysis; N=632.
3.7.3. Statistical methods used
All three dependent variables are of double censored nature. The two importance indices (ImpRePast and ImpReFut) range from 1 to 7 and the variable capturing the share of reused code (ReuseSharePast) is constrained to values between 0 and 100 percent. Due to this nature of the data, OLS regression is not suitable (Dougherty 2002, p. 293 ff.) and Tobit analyses are applied (Greene 2003, p. 764 ff.).
For further robustness purposes each dependent variable is also tested in an Ordered Probit model.129 For these models the dependent variables are transformed to ordinal scales. The two dependent variables reflecting the importance of code reuse (ImpRePast and ImpReFut) are converted to an ordinal scale with the set [1, 2, 3] in which “1” indicates that the value in the original scale is below 3 and “3” reflects an original scale value of higher than or equal to 5. The dependent variable reporting the share of contributed functionality based on reused code (ReuseSharePast) is changed to an ordinal scale with the set [1, 2, 3, 4, 5] in which each number represents a 20% bucket of the original variable, e.g. “1” indicates that the share of contributed functionality based on reused code is between 0 and 20 percent.
The models are primarily interpreted regarding the sign and significance of the coefficients.130 Additional information about marginal effects and standardized coefficients is available in Appendix A.1.2.
In total, 12 different models are tested with the three dependent variables. Models 1 to 4 use ImpRePast as the dependent variable, models 5 to 8 employ ReuseSharePast and models 9 to 12 contain ImpReFut. Models with even numbers are reduced models obtained by successive elimination of insignificant variables from the preceding model.131 Models 1, 2, 5, 6, 9 and 10 are Tobit models while models 3, 4, 7, 8, 11 and 12 are Ordered Probit models with the above transformations of the dependent variables.
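The ordinal transformations described above can be illustrated with a short Python sketch. It is not the code used in the study, and the function names are hypothetical. The text does not state how values exactly on a bucket boundary of ReuseSharePast are assigned, so the sketch assumes that lower bounds are inclusive (20 percent falls into bucket 2) and that 100 percent falls into bucket 5.

```python
def recode_importance(score: float) -> int:
    """Map a 1-7 importance index to the ordinal set {1, 2, 3}.

    Following the text: 1 if the score is below 3, 3 if it is at least 5,
    and 2 otherwise.
    """
    if not 1.0 <= score <= 7.0:
        raise ValueError("score must lie between 1 and 7")
    if score < 3.0:
        return 1
    if score >= 5.0:
        return 3
    return 2


def recode_share(share_percent: float) -> int:
    """Map a 0-100 percent share to 20% buckets numbered 1 to 5.

    Assumption (not stated in the text): lower bucket bounds are inclusive
    and a share of exactly 100 percent is assigned to bucket 5.
    """
    if not 0.0 <= share_percent <= 100.0:
        raise ValueError("share_percent must lie between 0 and 100")
    return min(int(share_percent // 20) + 1, 5)


# Medians of ImpRePast and ReuseSharePast as reported in Table 3-12.
print(recode_importance(5.25), recode_share(20.0))
```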
3.7.4. Results
This chapter presents the statistical results of the different models which are discussed from an aggregate perspective in the following chapter (3.7.5).
129 Ordered Probit models are used for dependent variables measured on ordinal scales.
130 Due to their censored nature especially the calculation of marginal effects for Tobit models is difficult (Greene 1999; Cong 2001). See Appendix A.1.2 for more information on this topic.
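The censored nature referred to in this footnote can be made concrete with the textbook log-likelihood of a two-limit Tobit model. Observations at the lower or upper limit contribute the probability mass beyond that limit, while interior observations contribute the normal density. The Python sketch below is an illustration under these standard assumptions and not the estimation code of the study; all names and the example data are hypothetical. For fitted values extremely far from a limit, the logarithm could underflow, which the sketch does not guard against.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_log_pdf(z: float) -> float:
    """Logarithm of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def tobit_log_likelihood(y, x_rows, beta, sigma, lower, upper):
    """Log-likelihood of a two-limit Tobit model.

    y      : observed (censored) outcomes
    x_rows : one list of regressor values per observation
    beta   : coefficient vector, sigma: error standard deviation (> 0)
    lower, upper : censoring limits, e.g. 1 and 7 for the importance indices
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for y_i, x_i in zip(y, x_rows):
        mu = sum(b * v for b, v in zip(beta, x_i))
        if y_i <= lower:
            # Probability that the latent outcome lies at or below the lower limit.
            total += math.log(normal_cdf((lower - mu) / sigma))
        elif y_i >= upper:
            # Probability that the latent outcome lies at or above the upper limit.
            total += math.log(normal_cdf((mu - upper) / sigma))
        else:
            # Density of an uncensored observation.
            total += normal_log_pdf((y_i - mu) / sigma) - math.log(sigma)
    return total


# Hypothetical data: an intercept plus one regressor, limits 1 and 7.
y = [1.0, 4.5, 7.0]
x_rows = [[1.0, -1.2], [1.0, 0.1], [1.0, 1.4]]
print(tobit_log_likelihood(y, x_rows, beta=[4.5, 1.0], sigma=1.8, lower=1.0, upper=7.0))
```

Maximizing this function over beta and sigma would yield Tobit estimates; the sketch only evaluates the likelihood for given parameter values.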
131 Likelihood ratio tests are used to ensure that the eliminated variables are also jointly insignificant.
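A likelihood ratio test of the kind mentioned in this footnote compares the maximized log-likelihoods of the full and the reduced model. Twice their difference is asymptotically chi-square distributed with as many degrees of freedom as variables were eliminated. The Python sketch below illustrates the calculation with only the standard library. The log-likelihood values are hypothetical placeholders because the text does not report them, and the function names are assumptions; the 20 restrictions correspond to the step from 34 to 14 regressors between models 1 and 2.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper tail probability of a chi-square distribution.

    Uses the series expansion of the regularized lower incomplete gamma
    function P(df/2, x/2) and returns 1 - P.
    """
    if x <= 0.0:
        return 1.0
    a = df / 2.0
    half_x = x / 2.0
    term = math.exp(-half_x + a * math.log(half_x) - math.lgamma(a + 1.0))
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    return max(0.0, 1.0 - total)


def likelihood_ratio_test(loglik_full: float, loglik_reduced: float, n_restrictions: int):
    """Return the LR statistic and its p-value for nested models."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, n_restrictions)


# Hypothetical log-likelihoods; 20 variables eliminated (34 -> 14 regressors).
statistic, p_value = likelihood_ratio_test(-1200.0, -1208.5, 20)
print(statistic, p_value)
```

A p-value above the chosen significance level would indicate that the eliminated variables are jointly insignificant, which is the condition the footnote refers to.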
Models 1-4. The results for the models with past importance of code reuse (ImpRePast) as dependent variable are depicted in Table 3-16. All models are statistically significant (p<0.0001) and the pseudo R² values are 0.10 and 0.17 for the full Tobit and Ordered Probit model, respectively.132 Direction and significance of coefficients are largely consistent across the four models. The results support the attitude toward code reuse hypotheses H1a to H1d, confirming that the more positive developers perceive the effectiveness, efficiency, software quality and task selection effects of code reuse, the more importance they attach to code reuse. H1e addressing the loss of control risks from code reuse is however not supported. Regarding access to local search, the positive effect of a larger OSS network (H2a) is only partially supported while the assumption that a greater number of other OSS projects increases the importance of code reuse (H2b) is consistently confirmed in all four models. Moreover, project maturity (H3) is found to have the expected negative effect on code reuse behavior. Of the hypotheses addressing the compatibility of code reuse with developers’ individual project goals the assumed negative effect of challenge seeking (H4a) is partially supported and the positive effect of community commitment (H4d) is consistently significant. The hypotheses regarding the effect of creative pleasure (H4b), skill improvement (H4c), OSS reputation building (H4e) and commercial signaling (H4f) do not find support in the models. Of the control variables a positive subjective norm regarding code reuse exhibits a consistently positive significant effect as does a project policy supporting code reuse. Project policies discouraging code reuse and the lack of code to reuse impact code reuse behavior negatively in all models. 
Beyond that, developers who invest more time into their current main project attribute more importance to code reuse, as do developers who have received training on code reuse during their time as professional software developers.
132 Pseudo R² values for both Tobit and Ordered Probit models reflect “McFadden’s R²” which measures the share by which the log-likelihood of the full model is smaller than the log-likelihood of a model containing only the intercept (Dougherty 2002, p. 309). The R² value of the corresponding OLS regression is 0.34.
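In formula terms, the measure described in this footnote is commonly written as follows, where ln L_full and ln L_0 denote the maximized log-likelihoods of the full model and of the intercept-only model. The symbol names are chosen here for illustration and do not appear in the original text.

```latex
R^2_{\text{McFadden}} = 1 - \frac{\ln L_{\text{full}}}{\ln L_{0}}
```

Because log-likelihoods of this kind are typically much closer to each other than residual sums of squares, McFadden values tend to be considerably lower than the R² of a corresponding OLS regression, as the comparison of 0.10 and 0.17 with 0.34 in the text suggests.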
Table 3-16: Model: Importance of past code reuse (ImpRePast) 1) Tobit Coef. Std. Err. Attitude toward code reuse (research model group A) BenefitEffectiveness (H1a) 0.23*** 0.07 0.68*** 0.09 BenefitEfficiency (H1b) BenefitQuality (H1c) 0.32*** 0.08 BenefitTaskSelection (H1d) 0.16** 0.08 IssueControlLoss (H1e) -0.03 0.07
2) Tobit (red.) Coef. Std. Err.
3) Ord. Probit Coef. Std. Err.
0.24*** 0.66*** 0.34*** 0.16**
0.07 0.08 0.08 0.08
0.12** 0.39*** 0.18*** 0.12** -0.02
0.05 0.06 0.06 0.05 0.05
Access to local search (research model group D) DevOSSNetsize (log) (H2a) 0.14* 0.08 DevOtherProjects (H2b) 0.02* 0.01
0.15* 0.03**
0.08 0.01
0.09 0.02**
Project maturity (research model group E) ProjPhase (H3) -0.15**
-0.12*
0.07
0.07
Compatibility with developers’ goals (research model group F) -0.15* 0.09 MotChallenge (H4a) MotCreaPleasure (H4b) 0.09 0.08 MotLearning (H4c) 0.02 0.08 MotCommunity (H4d) 0.19** 0.09 0.21** MotOSSReputation (H4e) 0.00 0.06 MotSignaling (H4f) -0.06 0.06 Subjective norm (research model group B) 0.13** DevNorm
0.06
Perceived behavioral control (research model group C) DevSkill -0.08 0.09 ProjPolSupport 0.42** 0.19 ProjPolDiscourage -1.11** 0.51 ConditionLack -0.25*** 0.05 ConditionLicense 0.06 0.05 ConditionLanguage 0.03 0.06 ConditionArchitecture 0.03 0.06
0.09
0.11*
0.06
0.36** -1.28** -0.24***
0.17 0.53 0.05
4) Ord. Probit (red.) Coef. Std. Err. 0.13*** 0.39*** 0.20*** 0.13**
0.05 0.05 0.05 0.05
0.05 0.01
0.03***
0.01
-0.10**
0.05
-0.09**
0.04
-0.06 0.06 0.03 0.10* -0.02 -0.04
0.06 0.05 0.05 0.06 0.04 0.04
0.13**
0.05
-0.07**
0.03
0.12***
0.04
0.11***
0.04
-0.01 0.29** -0.62** -0.14*** 0.06* 0.01 0.01
0.06 0.14 0.30 0.03 0.03 0.04 0.03
0.25** -0.72** -0.14*** 0.06*
0.12 0.29 0.03 0.03
Additional control variables (research model group G) ProjSize 0.00 0.00 0.00 0.00 ProjComplexity 0.10 0.09 0.01 0.06 ProjStandalone 0.24 0.20 0.15 0.12 DevOSSExperience 0.02 0.02 0.01 0.02 DevProjTime 0.02** 0.01 0.02*** 0.01 0.01** 0.01 0.01*** 0.01 DevProjShare 0.00 0.00 0.00 0.00 DevProf 0.03 0.17 -0.06 0.11 DevEduReuse -0.19 0.16 0.14 0.16 DevProfEduReuse 0.56** 0.24 0.51** 0.24 -0.03 0.12 Residence-N. America -0.15 0.18 -0.16 0.12 Residence-S. America 0.26 0.35 0.00 0.26 Residence-Asia & RoW -0.11 0.23 -0.07 0.16 3.73*** 0.88 3.85*** 0.58 Constant Observations 632 632 632 632 Pseudo R² 0.10 0.10 0.17 0.16 F test (Tobit) / Wald test (Probit) F(34, 598)=9.44, F(14, 618)=20.95, χ²(34)=222.15, χ²(14)=200.39, p<0.0001 p<0.0001 p<0.0001 p<0.0001 σ (Tobit) / cuts (Probit) 1.79 1.82 -0.13, 0.80 -0.56, 0.36 * significant at 10%, ** significant at 5%, *** significant at 1% Notes: Reported standard errors are robust standard errors; see Appendix A.1.2 for standardized coefficients and marginal effects.
Models 5-8. To increase the robustness of the model test, the same independent variables are tested with two different dependent variables. The results for the models with the share of reused code in developers’ past contributions to their current main OSS project (ReuseSharePast) as dependent variable are shown in Table 3-17. All models are statistically significant (p<0.0001) and the pseudo R² values are 0.03 and 0.08 for the full Tobit and Ordered Probit model, respectively.133 Direction and significance of coefficients are largely consistent across the four models. Similar to models 1-4, the results of models 5-8 confirm the attitude toward code reuse hypotheses H1a to H1d (effectiveness, efficiency, software quality and task selection effects) while H1e (loss of control risks) is not supported. Also the two hypotheses regarding access to local search (H2a and H2b) are supported. Contrary to the results of models 1-4, the effect of a larger personal OSS network (H2a) is now consistently positive and significant across all four models. The expected negative effect of project maturity (H3) is again fully confirmed. Regarding the hypotheses addressing the compatibility of code reuse with developers’ individual project goals the proposed negative effect of challenge seeking is now consistently confirmed in all four models (it was only partially supported in models 1-4) while the conjectured positive effect of community commitment is only significant in the two Tobit models (it was consistently supported in models 1-4). Similar to models 1-4, the other assumed effects of individual project goals find no support. As for the control variables, a positive subjective norm again exhibits a significant positive effect in all models while the coefficient addressing the lack of existing code to reuse is again always negative and significant. 
Contrary to models 1-4, neither a supportive nor a discouraging project policy toward code reuse is found to exhibit a significant effect on the share of reused code in developers’ contributions to their current main project. Training on code reuse during developers’ time as professional developers is again partially significant and positive while the effect of developers’ weekly hours spent on their project has disappeared. Finally, the Tobit models indicate that developers in larger projects report a lower share of reused code while developers in technically more complex projects report a higher share.
133 The R² value of the corresponding OLS regression is 0.22.
Table 3-17: Model: Share of code reuse in past contributions (ReuseSharePast) 5) Tobit Coef. Std. Err. Attitude toward code reuse (research model group A) BenefitEffectiveness (H1a) 2.62*** 0.95 5.89*** 1.12 BenefitEfficiency (H1b) BenefitQuality (H1c) 1.82* 1.01 BenefitTaskSelection (H1d) 3.44*** 1.01 IssueControlLoss (H1e) -0.38 1.02
6) Tobit (red.) Coef. Std. Err.
7) Ord. Probit Coef. Std. Err.
8) Ord. Probit (red.) Coef. Std. Err.
2.47*** 5.71*** 1.90* 3.28***
0.93 1.04 0.99 1.02
0.13*** 0.28*** 0.10** 0.12*** 0.02
0.05 0.05 0.05 0.04 0.05
0.12*** 0.29*** 0.12*** 0.12***
0.05 0.05 0.04 0.04
Access to local search (research model group D) DevOSSNetsize (log) (H2a) 2.30** 1.10 DevOtherProjects (H2b) 0.36** 0.16
2.28** 0.42***
1.05 0.16
0.09* 0.02**
0.05 0.01
0.10** 0.02**
0.05 0.01
Project maturity (research model group E) ProjPhase (H3) -3.11***
-3.19***
0.94
-0.13***
0.04
-0.13***
0.04
0.99
-0.10** -0.01 -0.03 0.04 0.03 -0.01
0.05 0.05 0.05 0.05 0.03 0.04
-0.10**
0.04
0.86
0.11***
0.04
0.11***
0.04
-2.36***
0.60
-0.02 0.09 -0.26 -0.09*** 0.01 -0.01 0.02
0.06 0.12 0.28 0.03 0.03 0.03 0.03
-0.09***
0.03
-0.02** 1.97*
0.01 1.07
5.19*
3.10
0.00 0.09 0.08 0.00 0.00 0.00 -0.01 0.16 -0.01 -0.10 -0.09 0.07
0.00 0.06 0.11 0.01 0.00 0.00 0.10 0.14 0.11 0.11 0.21 0.13
1.01
Compatibility with developers’ goals (research model group F) -2.67** 1.13 -2.75*** MotChallenge (H4a) MotCreaPleasure (H4b) 0.30 1.09 MotLearning (H4c) -1.14 1.04 MotCommunity (H4d) 2.01* 1.11 2.10** MotOSSReputation (H4e) 0.15 0.73 MotSignaling (H4f) 0.34 0.80 Subjective norm (research model group B) 2.22** DevNorm
0.89
Perceived behavioral control (research model group C) DevSkill -0.14 1.23 ProjPolSupport 1.39 2.66 ProjPolDiscourage -4.96 5.07 ConditionLack -2.40*** 0.63 ConditionLicense 0.20 0.60 ConditionLanguage -0.15 0.77 ConditionArchitecture 0.60 0.71 Additional control variables (research model group G) ProjSize -0.02** 0.01 ProjComplexity 2.29* 1.31 ProjStandalone 1.25 2.51 DevOSSExperience 0.02 0.27 DevProjTime -0.01 0.10 DevProjShare 0.04 0.03 DevProf -0.55 2.40 DevEduReuse -1.40 2.17 DevProfEduReuse 5.44* 3.11 Residence-N. America -3.35 2.41 Residence-S. America -3.44 4.03 Residence-Asia & RoW 1.42 3.07 27.99** 11.33 Constant Observations 632 Pseudo R² 0.03 F test (Tobit) / Wald test (Probit) F(34, 598)=6.68, p<0.0001 σ (Tobit) / cuts (Probit) 24.35
2.15**
1.06
29.81*** 8.63 632 0.03 F(14, 618)=14.50, p<0.0001 24.51
632 0.08 χ²(34)=163.00, p<0.0001 -0.44, 0.35, 0.91, 1.37
632 0.08 χ²(10)=142.26, p<0.0001 -0.83, -0.05, 0.50, 0.95
* significant at 10%, ** significant at 5%, *** significant at 1% Notes: Reported standard errors are robust standard errors; see Appendix A.1.2 for standardized coefficients and marginal effects.
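The significance markers in the table can be related to the printed coefficients and robust standard errors with a short calculation. The following is a minimal sketch, assuming a two-sided test and a standard normal approximation; the study itself does not state the test distribution, and the published asterisks are based on unrounded estimates. The function names `two_sided_p` and `stars` are hypothetical helpers introduced for illustration only; the numbers are the model 5 values printed next to their variable names in Table 3-17.

```python
import math


def two_sided_p(coef: float, std_err: float) -> float:
    """Two-sided p-value of coef / std_err under a standard normal approximation."""
    z = abs(coef / std_err)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2))


def stars(coef: float, std_err: float) -> str:
    """Map a p-value to the table's convention: * 10%, ** 5%, *** 1%."""
    p = two_sided_p(coef, std_err)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# (variable, coefficient, robust std. err.) as printed for model 5 (Tobit).
MODEL_5 = [
    ("BenefitEffectiveness", 2.62, 0.95),
    ("BenefitQuality", 1.82, 1.01),
    ("IssueControlLoss", -0.38, 1.02),
    ("DevOSSNetsize (log)", 2.30, 1.10),
    ("ConditionLack", -2.40, 0.63),
]

if __name__ == "__main__":
    for name, coef, se in MODEL_5:
        print(f"{name:22s} {coef:6.2f}{stars(coef, se):3s} ({se:.2f})")
```

For these rows the approximation gives the same markers as the table, for example three asterisks for BenefitEffectiveness and none for IssueControlLoss.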
Models 9-12. Finally, the results for the models with the expected future importance of code reuse (ImpReFut) as dependent variable are depicted in Table 3-18. All models are statistically significant (p<0.0001) and the pseudo R² values are 0.12 and 0.18 for the full Tobit and Ordered Probit model, respectively.134 Direction and significance of coefficients are largely consistent across the four models. Similar to models 1-8, the results of models 9-12 confirm the attitude toward code reuse hypotheses H1a to H1d (effectiveness, efficiency, software quality and task selection effects) while H1e (loss of control risks) is not supported. The two hypotheses regarding access to local search are supported in all specifications. Furthermore, the expected negative effect of project maturity is again fully confirmed. Regarding the hypotheses addressing the compatibility of code reuse with developers’ individual project goals, the proposed positive effect of community commitment is significant in all models, while the effect of challenge seeking, which is significant in some previous models, cannot be supported. Similar to models 1-8, the assumed effects of other individual project goals find no support. Of the control variables, a positive subjective norm regarding code reuse again exhibits a consistently positive significant effect, as does a project policy supporting code reuse. Project policies discouraging code reuse and the lack of code to reuse again impact code reuse behavior negatively in all models. As in previous models, developers who have received training on code reuse during their time as professional software developers expect a higher importance of code reuse in their future work. The effect is, however, not significant in model 11. Similar to models 5 and 6, there is a partially significant negative effect of the team size of developers’ current main OSS project. Contrary to models 1-4, the results show no effect with regard to the time developers invest in their project. As further effects, developers in standalone projects attribute a higher future importance to code reuse, and the results of models 9 and 10 suggest that developers who have received training on code reuse during their education deem it less important for their future work than developers without such training.
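The pseudo R² values reported for the Tobit and Ordered Probit models are easier to set against the R² of the corresponding OLS regression with a definition in hand. The text does not state which pseudo R² is reported; a common choice for models estimated by maximum likelihood is McFadden's measure, shown here purely as an assumption:

```latex
R^{2}_{\text{McFadden}} = 1 - \frac{\ln \hat{L}_{\text{full}}}{\ln \hat{L}_{\text{null}}}
```

Here $\hat{L}_{\text{full}}$ denotes the maximized likelihood of the fitted model and $\hat{L}_{\text{null}}$ that of a model without explanatory variables. Because such a measure compares log-likelihoods rather than shares of explained variance, it is not directly comparable to the R² of an OLS regression.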
134 The R² value of the corresponding OLS regression is 0.36.
Table 3-18: Model: Importance of future code reuse (ImpReFut) 9) Tobit Coef. Std. Err. Attitude toward code reuse (research model group A) BenefitEffectiveness (H1a) 0.16** 0.06 0.53*** 0.07 BenefitEfficiency (H1b) BenefitQuality (H1c) 0.25*** 0.07 BenefitTaskSelection (H1d) 0.13** 0.06 IssueControlLoss (H1e) 0.01 0.06
10) Tobit (red.) Coef. Std. Err.
11) Ord. Probit Coef. Std. Err.
12) Ord. Probit (red.) Coef. Std. Err.
0.16** 0.52*** 0.25*** 0.13**
0.06 0.07 0.06 0.06
0.14*** 0.39*** 0.17*** 0.09* -0.03
0.05 0.05 0.05 0.05 0.05
0.12*** 0.37*** 0.17*** 0.09*
0.05 0.05 0.05 0.05
Access to local search (research model group D) DevOSSNetsize (log) (H2a) 0.22*** 0.07 DevOtherProjects (H2b) 0.03*** 0.01
0.22*** 0.03***
0.07 0.01
0.19*** 0.02*
0.06 0.01
0.20*** 0.02**
0.05 0.01
Project maturity (research model group E) ProjPhase (H3) -0.21***
-0.21***
0.06
-0.15***
0.05
-0.15***
0.04
-0.04 0.00 0.00 0.13** 0.00 0.03
0.06 0.06 0.05 0.05 0.04 0.04
0.12**
0.05
0.17***
0.04
0.17***
0.04
-0.01 0.22 -0.50* -0.11*** 0.01 0.05 0.04
0.06 0.13 0.29 0.03 0.03 0.04 0.04
0.23** -0.54* -0.10***
0.12 0.29 0.03
0.06
Compatibility with developers’ goals (research model group F) MotChallenge (H4a) -0.07 0.08 MotCreaPleasure (H4b) 0.04 0.07 MotLearning (H4c) 0.00 0.07 MotCommunity (H4d) 0.16** 0.07 0.14** 0.08* MotOSSReputation (H4e) 0.07 0.05 MotSignaling (H4f) 0.01 0.06 Subjective norm (research model group B) DevNorm 0.19***
0.06
Perceived behavioral control (research model group C) DevSkill -0.02 0.08 ProjPolSupport 0.33** 0.16 ProjPolDiscourage -1.29*** 0.45 ConditionLack -0.17*** 0.04 ConditionLicense 0.01 0.04 ConditionLanguage 0.05 0.05 ConditionArchitecture 0.02 0.05
0.07 0.04
0.20***
0.06
0.37** -1.30*** -0.16***
0.14 0.45 0.04
Additional control variables (research model group G) ProjSize -0.00** 0.00 -0.00*** 0.00 0.00 0.00 ProjComplexity 0.02 0.08 0.01 0.07 ProjStandalone 0.30* 0.16 0.33** 0.15 0.23* 0.12 0.22* 0.11 DevOSSExperience 0.00 0.02 0.00 0.01 DevProjTime 0.01 0.01 0.01 0.01 DevProjShare 0.00 0.00 0.00 0.00 DevProf 0.14 0.15 -0.15 0.11 -0.28** 0.13 -0.26** 0.13 0.24 0.16 DevEduReuse DevProfEduReuse 0.36* 0.19 0.41** 0.18 0.13 0.12 0.27* 0.15 Residence-N. America 0.12 0.15 -0.02 0.12 Residence-S. America 0.00 0.26 0.01 0.20 Residence-Asia & RoW -0.07 0.19 -0.03 0.15 2.95*** 0.73 3.22*** 0.49 0.00 0.00 Constant Observations 632 632 632 632 Pseudo R² 0.12 0.11 0.18 0.17 F test (Tobit) / Wald test (Probit) F(34, 598)=11.89, F(17, 615)=22.09, χ²(34)=251.18, χ²(14)=210.85, p<0.0001 p<0.0001 p<0.0001 p<0.0001 σ (Tobit) / cuts (Probit) 1.50 1.51 0.34, 1.58 0.33, 1.26
* significant at 10%, ** significant at 5%, *** significant at 1% Notes: Reported standard errors are robust standard errors; see Appendix A.1.2 for standardized coefficients and marginal effects.
3.7.5. Discussion and summary
After having presented the results of the three sets of multivariate models with the three different dependent variables employed to test the research model hypotheses, this chapter aggregates and discusses the resulting findings and summarizes the testing of the research model (see Figure 3-14).
Figure 3-14: Summary of tested OSS code reuse research model hypotheses
[Figure: one result symbol per hypothesis and set of models. Columns: Models 1-4 (importance of code reuse for past work); Models 5-8 (share of reused code in past contributions); Models 9-12 (importance of code reuse for future work). Legend: significant positive effect; significant negative effect; partially significant positive effect; partially significant negative effect; insignificant effect.
A Attitude toward code reuse: H1a: Effectiveness effects (+); H1b: Efficiency effects (+); H1c: Software quality effects (+); H1d: Task selection benefits (+); H1e: Loss of control risks (-)
D Access to local search: H2a: Size of dev.’s personal OSS network (+); H2b: # of developer’s OSS projects (+)
E Project maturity: H3: Project phase (-)
F Compatibility with developers’ goals: H4a: Challenge seeking (-); H4b: Creative pleasure (-); H4c: Skill improvement (+); H4d: Community commitment (+); H4e: OSS reputation building (+); H4f: Commercial signaling (+)]
Notes: The direction of the hypotheses is indicated by (+) and (-); “developer” is abbreviated with “dev.”
Attitude toward code reuse (Group A). The multivariate results consistently confirm hypotheses H1a to H1d. Developers who perceive more positive effectiveness, efficiency and software quality effects as well as stronger task selection benefits of code reuse attribute a higher importance to it and practice it more. In contrast, hypothesis H1e is not confirmed. The data do not show that developers who fear to lose control over their project reuse less code. This is surprising as, in the descriptive analysis, loss of control was ranked as the main issue developers have with code reuse (see Chapter 3.6.3). A plausible interpretation is that developers’ concerns about losing control over their project affect their decision as to which code to reuse, but do not affect the total amount of code they reuse. For example, developers concerned about losing control might choose to reuse only components developed by other projects which have a proven track record of fixing bugs quickly and keeping the structure of their code stable (Haefliger et al. 2008). Access to local search (Group D). The effect of developers’ access to local search on their code reuse behavior was captured by the logarithm of the size of their OSS network
(H2a) and the number of other OSS projects they have been involved in (H2b). Hypothesis H2a is supported only partially, its coefficient not being significant in models 3 and 4, while H2b is confirmed in all models. Thus, the positive and mostly significant coefficients lend support to the assumption that developers who can access, evaluate, understand, and integrate reusable code more easily due to local search practice more code reuse. Project maturity (Group E). The hypothesis that developers reuse less existing code once their project has matured (H3) is confirmed across all dependent variables and models. Developers do indeed seem to leverage code reuse as a tool to deliver a “plausible promise” early on, while later project phases call for specific refinements of developers’ projects where there is less code to reuse available. Compatibility of code reuse with developers’ project goals (Group F). Regarding the compatibility of code reuse with a developer’s individual project goals, hypothesis H4d (community commitment) is confirmed in all models except models 7 and 8; H4a (challenge seeking) is confirmed consistently in all models with the share of reused code in developers’ contributions as dependent variable and partially in the models with the importance of code reuse for developers’ past work as dependent variable. For all other hypotheses (creative pleasure (H4b), skill improvement (H4c), OSS reputation building (H4e), and commercial signaling (H4f)) the null hypothesis cannot be rejected. The partial support for hypothesis H4d suggests that developers who feel they are part of the OSS community and want it to grow and be successful rely more on code reuse than other developers. Code reuse seems compatible with their goal of contributing to the OSS community because by leveraging code reuse they can contribute more and in higher quality. 
Moreover, the higher trust these developers put in existing code developed by “their” community presumably makes them less reluctant to reuse it. The partial confirmation of hypothesis H4a lends some support to the assumption that developers’ goal to seek and tackle technical challenges impedes code reuse. By reusing existing code, developers would be denied the pleasure of solving a problem by themselves. Thus, they would rather refrain from code reuse if challenge seeking were of major importance to them in their OSS work. The finding that, interestingly, the respective coefficient is mostly not significant when the dependent variable addresses the importance of code reuse rather than the share of reused code might result from the size of the effect not being large enough. This might be because even if developers reuse less existing code to tackle some difficult technical challenges, they might still consider code reuse as “very important” because they still reuse existing code for non-challenging technical problems.
Regarding the hypotheses not supported, the research model had argued that, similarly to challenge seeking, the creative pleasure developers experience when writing code leads them to reuse less code (H4b), but the data do not confirm this hypothesis. As the descriptive and exploratory analyses in Chapter 3.6.4 have shown, developers for whom creative pleasure is an important motivation tend to reuse components rather than snippets, and in doing so they might have found a way to satisfy their need to write their own code and experience creative pleasure while still reusing existing code. As explained in Chapter 3.6.4, component reuse does not affect the code structure much because typically only one or a few lines of code are necessary to link to the component. Because of that, developers might not perceive component reuse as detrimental to the creative pleasure of writing their own code. For the remaining unconfirmed hypotheses, skill improvement (H4c), OSS reputation building (H4e) and commercial signaling (H4f), the signs of the nonsignificant coefficients partially vary across the different dependent variables. This could be an indication that, contrary to the research model assumptions, code reuse could be both supportive of and detrimental to these goals. While reused code could be used as an example to improve programming skills, it could also hamper learning if developers treat the reused code as a black box. Regarding reputation building and commercial signaling, the research model had expected that developers who create more and higher quality code (with the help of code reuse) are regarded more highly in the OSS community and can present themselves as better developers to potential employers or business partners. However, it is also possible that in certain situations the code created by developers themselves without the help of code reuse is important to build their OSS reputation or signal skills to potential employers and partners.
In these situations developers would refrain from code reuse if reputation building or signaling is a main motivation for their OSS work. Besides investigating the effects of the above project goals for which hypotheses have been derived and posited in the research model (see Chapter 3.4.3) it is also interesting – for exploratory purposes – to analyze whether there is a difference in code reuse behavior between hobbyist developers and developers who are paid for their work on an OSS project or even work on an OSS project as part of their job. To address this question an additional construct reflecting the role of payment as a motivation for developers to work
on their OSS project135 is included in an additional specification of the research model.136
Estimating this further specification of the research model with the three different dependent variables employed to assess code reuse behavior finds no significant difference in past code reuse behavior between hobbyist developers and developers who work on their projects as part of their jobs or are otherwise paid to do so. However, when the expected future importance of code reuse is the dependent variable, there is a significant effect at the five percent level. Developers for whom payment or their job is a stronger motivation to work on their OSS project expect a significantly higher importance of code reuse for their future work on this project than other developers. Yet, as this result is not confirmed when the dependent variables reflect past behavior, it seems as if more professional developers know that they should reuse existing code, probably in order to be efficient and effective, and thus attach a higher importance to code reuse for their future work, but ultimately do not reuse more code than hobbyist developers. Control Variables (Groups B, C, G). Due to the large number of control variables included in the model, only a few main results are pointed out. The subjective norm as perceived by developers shows a consistently significant and positive influence on code reuse as predicted by TPB. Consequently, OSS developers who feel that their peers appreciate them reusing existing code reuse more. Of the variables describing developers’ perceived behavioral control, the lack of reusable code has a consistently negative and significant influence on reuse behavior, while potential license or programming language conflicts or architectural issues seem irrelevant to developers’ code reuse decisions. This may be because, given the size which OSS has reached by now, there might exist different flavors of code with the same functionality under different licenses, developed in different programming languages and catering to various architectures.
These parallel implementations of the same functionality in different flavors would mitigate incompatibility issues. A developer interviewed in the qualitative pre-study alludes to this when explaining why he does not have any license issues even if his license “island” is rather small:137 “Actually I didn’t think about it [license issues] in [my EPL licensed project], because I [re]used only EPL licensed code. Not because of the license problems, which I am aware of […], but because I did not need them [components under licenses different than EPL].” Lastly, as one of the other control variables, developers who had received training on reuse when working as software developers in commercial companies seem to practice significantly more code reuse, while training on reuse during academic education does not show a positive and significant coefficient. In models 9 and 10, developers with academic reuse education even intend to reuse less than developers without this form of training. This might be an indication that reuse training in academic institutions is not practical enough to actually affect developers’ code reuse behavior. This assumption is supported by a developer from the qualitative pre-study complaining about reuse education at university: “I spent two years with Java in college and nobody ever mentioned Maven [a software project management tool which facilitates reuse].”
135 This construct is composed of three items: “In one way or the other I make money from my work on [project]”; “I work on [project] to implement needs from my business or job”; “I work on [project] because I am paid to do so”. All items are measured on a 7-point Likert scale (“strongly disagree” to “strongly agree”) and the construct has a Cronbach’s α value of 0.79.
136 Including this construct in the research model does not qualitatively change any of the results (coefficients, signs and significance levels) presented in Table 3-16, Table 3-17 and Table 3-18.
137 Only 0.45% of all OSS projects are licensed under the Eclipse Public License (EPL) used by the developer (Black Duck Software 2009a).
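The reliability figure reported for the three-item payment construct (a Cronbach’s alpha of 0.79) follows a standard formula that can be sketched in a few lines. The example below is a minimal illustration, not the study’s computation: the function name `cronbach_alpha` is a hypothetical helper, and the six response vectors are invented 7-point Likert answers that do not come from the survey data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists aligned by respondent.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(sum scores))
    """
    k = len(items)
    n = len(items[0])
    if k < 2 or any(len(item) != n for item in items):
        raise ValueError("need at least two items with equally many responses")
    # Sum score of each respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of six respondents (1 = strongly disagree, 7 = strongly agree).
makes_money = [1, 2, 6, 7, 3, 5]    # "I make money from my work on [project]"
business_need = [2, 1, 5, 7, 4, 4]  # "... to implement needs from my business or job"
paid_to_do_so = [1, 1, 6, 6, 2, 5]  # "... because I am paid to do so"

if __name__ == "__main__":
    alpha = cronbach_alpha([makes_money, business_need, paid_to_do_so])
    print(f"Cronbach's alpha: {alpha:.2f}")
```

The sample variance is used for both the items and the sum score; since the ratio of the two enters the formula, using the population variance throughout would give the same alpha.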
3.8.
Conclusion
This part of the dissertation has quantitatively investigated the code reuse behavior of OSS developers. Code reuse in OSS development has served as an example of knowledge reuse in which the perspectives and behavior of individual developers can be observed well. Understanding the role of individual developers in knowledge reuse is crucial because only if individual developers choose to reuse existing knowledge can their firms benefit from the resulting value creation effects. Beyond this general perspective on knowledge reuse, this part of the dissertation has also furthered scholarly work on code reuse in OSS development as the “receiving” side of the open innovation process OSS. Summary and theoretical contributions. With its findings, this part of the dissertation contributes to two streams of literature: literature on OSS and its development, and literature on knowledge reuse, particularly in software development. First, regarding scholarly work on OSS, this part of the dissertation represents the first quantitative large-scale investigation of code reuse in OSS development on the level of individual developers. The results point out that code reuse is important for the development of OSS, with on average about 30% of the functionality contributed by developers to their projects being based on reused code. Further, OSS developers consider the efficiency and effectiveness benefits from code reuse as their main reasons to rely on existing knowledge in their work, while loss of control over their projects is the main issue
they see in code reuse. Interestingly, developers do not perceive license and programming language conflicts as major impediments to code reuse. On a more fine-grained level, OSS developers reuse both components and snippets, with snippets accounting for about ten percent of the lines of code developers have submitted to their projects. Whether developers favor component or snippet reuse is, among other reasons, influenced by their motivations to contribute to their project (e.g. whether challenge seeking or creative pleasure is more important for them). Interestingly, a large number of developers (about 50%) modify the components they reuse, partly for skill improvement reasons. Thereby they forego several of the advantages of component reuse. Finally, OSS developers turn most frequently to general purpose search engines, OSS repositories and code example web pages when searching for existing code to reuse. Yet, they consider means of local search such as their personal networks or other OSS projects they have been involved in as more efficient. These findings are of relevance to OSS literature because they help to answer Crowston et al.’s (2009) call for a more detailed understanding of how OSS is developed on the one hand and on the other help to comprehend OSS as an open innovation process with a “giving” and a “receiving” or “reusing” side. Second, with its findings regarding the determinants of code reuse as one instance of knowledge reuse, this part of the dissertation reaches out beyond the scope of OSS and also contributes to literature on knowledge reuse, particularly in software development. By developing and partially confirming hypotheses regarding developers’ reuse behavior this part of the dissertation provides answers to Ye and Fischer’s (2005, p. 
200) question about “[…] what triggers software developers to initiate the reuse process […].” The investigation of determinants of OSS developers’ code reuse behavior finds that developers with better access to local search due to a larger personal network or more exposure to different projects reuse more because their costs of searching for, understanding, adapting and integrating existing knowledge are lower. Further, developers convinced of the benefits of code reuse (efficiency and effectiveness gains, enhanced software quality, and the chance to work on preferred tasks) practice code reuse more, as do developers who can use code reuse to support their goal of serving the OSS community. Moreover, developers see code reuse as a means to kick-start new projects as it helps them deliver a “plausible promise”. Lastly, the study finds partial support for the hypothesis that those developers who desire to solve technical problems for the satisfaction of it rather refrain from reuse and, thus, make their projects less efficient and effective than they could be.
With these determinants of individual developers’ reuse decisions, this part of the dissertation sheds light on the human role in the knowledge reuse process which has so far received only limited scholarly attention (e.g. Maiden & Sutcliffe 1993; Sen 1997). A better understanding of this human role should allow firms to better leverage knowledge reuse to create value because with their decisions to reuse existing knowledge or not, individual developers strongly influence the value creation of their firms. Managerial implications. Beyond their scholarly implications, the findings of this study are also of relevance to managerial practice, especially in the field of software development. The outcomes of this part of the dissertation can help firms, particularly firms developing software, to increase the knowledge reuse of their developers and thereby enhance their value creation. First, the results presented highlight the high level of code reuse within the OSS community. This should provide motivation to firms to also leverage existing OSS code in their software development, thereby partly mitigating the typically high upfront investment costs of building an internal reuse library for artifacts that are not firm-specific (Frakes & Kang 2005).138 If firms intend to pursue this avenue of reusing OSS code, they should encourage and support their employees to enhance their access to local search for OSS code by building personal OSS networks and by becoming involved in various OSS projects. Second, and beyond the reuse of OSS code, firms should foster the networking of their developers also within the firm and provide them with a broad range of experiences in various projects to increase the reuse of firm-internal knowledge, too. Moreover, modified incentive structures and development processes based on the findings presented could support internal corporate knowledge reuse activities in software development and beyond. 
As part of such modifications, developers should be provided with the option to select tasks themselves according to their preference, they should be compensated according to the work results they deliver (in terms of efficiency and quality) rather than the time they have spent at work, and they should be required to deliver “credible promises” in new development projects. These modifications would allow software developers to become “software entrepreneurs” (Haefliger et al. 2008, p. 192) and in this role they should have strong reasons to reuse existing knowledge whenever possible. Furthermore, firms should create an overall culture which endorses knowledge reuse and in which reusing leads to reputational gains (subjective norm) and they should explicitly train their employees in knowledge reuse instead of relying on educational institutions to do so. Lastly, to accommodate developers’ desire to tackle difficult technical challenges, which makes them reuse less than they could, firms should consider job enrichment (e.g. Herzberg 1968) as a means to integrate such challenges into developers’ work which are in the best interest of the firm, thereby accommodating the needs of both developer and firm. All of these measures address determinants of knowledge reuse identified in this study and should thus increase the share of existing knowledge which developers reuse in innovation processes in their firms. Due to that, these measures should ultimately enhance firm value creation.
138 Obviously this has to be in accordance with the licenses of the OSS code. See Chapter 4 for an investigation of determinants which make commercial software developers deal with OSS licenses properly.
Future research. Both the results of this part of the dissertation and its limitations suggest multiple avenues for future research. First, in the domain of research on knowledge reuse in general, the research model of this study could be tested in contexts other than OSS development. While the model was developed with the context of OSS in mind, many of its determinants (especially the developer-centric ones like access to local search or compatibility) should be well applicable to other settings, too. For example, testing the model in the context of software development in commercial firms or also outside of the domain of software development would improve its robustness and make its contribution to knowledge reuse research stronger. In the domain of OSS development, academic work on code reuse has only just begun with this study being the most comprehensive quantitative account on the level of individual developers. Consequently, this field of study merits further research to better comprehend the role of code reuse in OSS development.
While this study has addressed “development with reuse”, future work could investigate “development for reuse”, researching the development of reusable OSS code. One question of relevance in this context is why OSS developers bear the reportedly large additional costs of writing reusable code (see Chapter 3.2.2) and if they have found ways to mitigate them. Preliminary insights from the qualitative pre-study and some side-analyses of the quantitative data point to commercial signaling as an important reason for OSS developers to engage in projects which explicitly develop reusable components. Additionally, as has already been pointed out previously by both Haefliger et al. (2008) and Mockus (2007), the strategies which OSS developers employ to make their reusable code known and reused deserve investigation.
Finally, the limitations of this work open up several further research avenues which could add robustness to the research model explaining OSS developers’ code reuse behavior. First, the dependent variables employed in this study reflect developers’ subjective perception of the importance and the share of code reuse in their OSS work. Given the fuzziness resulting from this approach to measuring code reuse, objective measures generated by code analyses could capture the dependent variable of the research model more objectively and precisely. Similarly, independent variables captured from other data sources could be added to the model. For example, social network data derived from SourceForge.net (e.g., Fershtman & Gandal 2009) could be employed to further extend and test the hypotheses regarding local search. Lastly, while this study has focused on developers and their projects as determinants of code reuse, future work could employ an even more fine-grained approach and analyze single reuse incidents, incorporating developers, their projects, and the artifacts they consider for reuse. Such an approach could, for instance, analyze the impact of the quality of the relationship between the “giving” and the “receiving” side of the open innovation process of OSS development.
4.
Commercial software developers’ perspectives on internet code reuse139
4.1.
Introduction
Conventional wisdom holds that knowledge reuse is positive for firms, and scholarly work likewise usually takes for granted that firms benefit from it (e.g. Langlois 1999; Markus 2001; Majchrzak et al. 2004).140 Building on this premise, researchers have focused on exploring how firms can enhance the share of knowledge reused in their products and services, typically assuming implicitly that this is in the best interest of the firm (see Chapter 3 for a review of existing literature following this tradition). However, analyzing knowledge reuse with the duality of value creation and value appropriation in mind suggests that firm benefits are not a natural consequence of knowledge reuse. As has been shown in Section 3.2, knowledge reuse does indeed typically enhance value creation. Yet, as Chapter 2 has pointed out in exploring the duality of value creation and value appropriation, firms typically need not only to create value but also to appropriate a sufficient share of it in order to ensure profitability. Consequently, even if knowledge reuse enhances firm value creation, it may not be in the best interest of the firm if it negatively impacts value appropriation.
139 Parts of this part of the dissertation have already been made available in Sojer and Henkel (2010b) and Sojer et al. (2010).
140 See Chapter 3 for a detailed account of the benefits of knowledge reuse.
M. Sojer, Reusing Open Source Code, DOI: 10.1007/978-3-8349-6135-8_4, © Gabler Verlag | Springer Fachmedien Wiesbaden GmbH 2011
With the exception of knowledge reused from the public domain, all explicit knowledge is governed by IP rights (de Laat 2005). Through these, the owner of the knowledge can set obligations with which others reusing the knowledge have to comply in order to be allowed to make use of it (Rosen 2004; Murray 2009). Even if these obligations do not affect value creation, they may negatively influence value appropriation if they weaken the bargaining position of the reusing firm and thereby reduce the share of value which the firm can capture (see Chapter 2).
An example where this issue is highly relevant in practice is the reuse of “internet code” in commercial software development. The large amount of OSS and other code available as explicit knowledge for free (i.e. gratis) download on the internet (this code is referred to as “internet code” in the following) is a highly attractive resource pool for commercial firms and their software developers, who can reuse the existing internet code to increase the efficiency, effectiveness and quality of their work (e.g. Norris 2004; Ruffin & Ebert 2004; Ajila & Wu 2007). As a commercial software developer from the qualitative pre-study (see Chapter 4.3.1) explains: “I think every developer is tempted to reuse code from the internet [in her commercial software development tasks].”
When leveraging this resource pool of internet code, commercial software developers usually, and in line with existing theory, enhance the value creation of their firm, but they may also put the value appropriation of their firm in jeopardy. This is because, despite being freely available, the majority of internet code is still protected by IP rights, usually in the form of copyright, and the owners of the code have often set obligations through licenses with which those reusing their work need to comply in order to be allowed to employ their knowledge (O’Mahony 2003; de Laat 2005). If internet code reused in a commercial software development setting is, for example, licensed under a highly restrictive OSS license such as the GPL (see Chapter 3.3.2), firm value appropriation is potentially at risk because, as in the Cisco/Linksys case (see Section 1.1), the firm might be required to also make other parts of its software available under the GPL (O’Mahony 2003; Murray 2009). As a consequence, the customers of the firm would have to be allowed to access the particular software in source code form, modify it and distribute it further without having to ask the firm for permission and without having to pay a fee to the firm. This would erode the bargaining position of the firm versus its customers and consequently reduce the share of value which the firm could appropriate.
Software firm VMware (2008, p. 33), for example, writes in the “risk” section of its quarterly filings to the U.S. Securities and Exchange Commission that as a consequence of reusing internet code, “[…] we may be subjected to certain conditions, including requirements that we offer our products that use the open source software for no cost.”141 Further, VMware (2008, p. 34) points out that “[…] if we combine proprietary software with open source software in a certain manner, under some open source licenses we could be required to release the source code of our proprietary software, which could substantially help our competitors develop products that are similar to or better than ours.”
141 Legally, VMware could not be forced to make its proprietary software available at “no cost”, but once the software had to be made open source, customers might no longer be willing to pay for it.
The importance of this topic for commercial software development is underlined by the fact that, in recent years, multiple firms (e.g. Black Duck Software,142 Palamida143 or Protecode144) have been founded and have grown strongly by offering software and services that allow commercial firms to scan their code bases for reused internet code and for potential violations of the obligations attached to it. Just as Chapter 3 argued that firm value creation through knowledge reuse depends heavily on whether individual developers choose to reuse existing knowledge, the value appropriation risks potentially resulting from knowledge reuse also depend on individual developers’ decisions. As a software auditor145 interviewed in the qualitative pre-study (see Chapter 4.3.1) comments: “As long as they [firms] have engineers connected with the internet you are going to have some kind of a problem [with reused internet code].” If individual developers do not check thoroughly for the obligations attached to the knowledge they reuse, or even ignore these obligations, they may put the value appropriation of their firm in jeopardy. Consequently, understanding the perspectives of individual commercial software developers on internet code reuse and the obligations potentially resulting from it is of high interest to both research and practice.
142 http://www.blackducksoftware.com, last accessed 21.12.2009.
143 http://www.palamida.com, last accessed 21.12.2009.
144 http://www.protecode.com, last accessed 21.12.2009.
145 Software auditors analyze the code base of software and, among other things, scrutinize it for violations of obligations from reused internet code.
In order to explore the value appropriation side of knowledge reuse, this part of the dissertation researches internet code reuse in commercial software development as an example where the topic surfaces frequently. The focus of the study is on the obligations which may result from internet code reuse and on how individual developers deal with them. Existing scholarly work has already addressed the reuse of internet code, and especially OSS code, in commercial software development and has also pointed to the obligations which may come with reused internet code (e.g. Brown & Booch 2002; Madanmohan & De 2004; Spinellis & Szyperski 2004). However, quantitative work in this research domain, and especially analyses based on large-scale data, are scant. Moreover, scholars have typically addressed the topic in the context of systematic internet code reuse, that is, assuming that firms have integrated internet code reuse into their software development processes (e.g. Madanmohan & De 2004; Norris 2004; Spinellis & Szyperski 2004). Yet, anecdotal evidence suggests that a large portion of internet code reuse takes place in ad-hoc fashion, conducted by individual developers who spontaneously search the internet for existing code and integrate it into their work (e.g. Levi & Woodard 2004; McGhee 2007; Bennett & Ivers 2008). Addressing this gap, this part of the dissertation employs a large-scale survey to shed light on the role of ad-hoc internet code reuse by individual commercial software developers, with special attention to how they deal with the obligations which may come with reused code. The results of this study on the one hand augment research on the reuse of internet code in commercial software development. On the other hand, and more generally, this study provides insights into the value appropriation side of knowledge reuse.
In the course of the analysis, answers to the following blocks of questions are presented. First, how well aware are individual commercial software developers of the obligations which may come with reusing internet code? Second, how important is ad-hoc reuse of internet code for the work of individual commercial software developers? Third, how frequently do individual commercial software developers violate obligations when reusing internet code in ad-hoc fashion? Fourth and finally, which factors influence whether individual commercial software developers violate obligations when reusing internet code in ad-hoc fashion?
The remainder of this part of the dissertation is organized as follows. The next section (4.2) lays the foundations for this part of the dissertation by providing details on the obligations which may come with internet code reuse and by reviewing existing scholarly work on internet code reuse in commercial software development.
As this part of the dissertation again deals with knowledge reuse in the context of software development and OSS, it builds on the foundations presented in Sections 3.2 and 3.3 which are not repeated in this part of the dissertation for the sake of brevity. Section 4.2 ends with a summary and formulates detailed research questions regarding the reuse of internet code in commercial software development. Section 4.3 develops a research model to guide the quantitative study which explains why commercial software developers might violate obligations when reusing internet code in ad-hoc fashion. After that, Section 4.4 describes the survey design and methodology employed to collect data, before first quantitative results are presented in descriptive and exploratory fashion in Section 4.5. Section 4.6 finally elaborates on the testing of the research model with structural equation modeling techniques and Section 4.7
concludes this part of the dissertation with a summary of the most important findings, an overview of theoretical contributions and managerial implications and a discussion of limitations and future research avenues.
4.2. Foundations of internet code reuse in commercial software development
Building on the foundations of knowledge and software reuse (Section 3.2) and OSS and its development (Section 3.3) established earlier, this section first reviews the obligations which may come with the reuse of internet code (Chapter 4.2.1). After that, existing work addressing the reuse of internet code (particularly OSS code) in commercial software development is discussed (Chapter 4.2.2). The section ends with a summary and the formulation of detailed research questions for this part of the dissertation in Chapter 4.2.3.
4.2.1. Obligations from internet code reuse
Obligations from internet code reuse are rooted in IP rights affecting software development. This topic is discussed first, before various obligations from OSS and other internet code and the potential consequences of violating these obligations are reviewed.
IP rights in software development146
While other mechanisms such as patents, trade secrets or trademarks might also be applicable, the most important IP mechanism in software development is copyright (de Laat 2005; McGhee 2007; Boyle 2009). Copyright was originally developed for literary works and works of art; since the 1980s, however, it has also been applied to software (Samuelson 1990; Fitzgerald & Bassett 2005). Copyright is assigned automatically at the time of creation and gives the creator of a work the following exclusive rights regarding the work (Rosen 2004; Fitzgerald & Bassett 2005):147 to make copies of the work, to create derivative works based on the original work, to distribute copies of the work and derivative works for sale, rent, lease or lending, to perform the work in public, and to display the work in public. Typically, copyright protection exceeds the creator’s lifetime (St. Laurent 2004; de Laat 2005; Arne 2008).148 Under U.S. legislation, the creator of a work can, however, also forfeit the copyright of her work (German & Hassan 2009; Murray 2009). Such works are then considered to be in the public domain: the rights otherwise granted by copyright are available to everybody, and nobody can be excluded by legal means from exercising them.
146 The IP status of software described here reflects the legal situation in the United States of America. However, the effect of copyright on software (the focus of this discussion) is quite similar in most other countries (Rosen 2004; Boyle 2009). Further, while there is still some legal uncertainty regarding the enforceability of licenses which put requirements on the licensee (such as OSS licenses), the latest decision of the U.S. Court of Appeals for the Federal Circuit in Jacobsen v. Katzer (13.08.2008) is clearly supportive of such licenses (Arne 2008; Bennett & Ivers 2008; Hogle 2008).
147 If the work is created in the course of employment, typically the employer becomes the owner of the copyright (Rosen 2004; Fitzgerald & Bassett 2005).
148 When the copyright protection of a work ends, it becomes part of the public domain (e.g. St. Laurent 2004).
Thus, the creator of a piece of software code is automatically assigned the copyright of her code upon creation and initially holds the exclusive rights listed above unless she puts her code in the public domain. In order to allow others to use the code in a way which would otherwise be forbidden by copyright, copyright owners can issue licenses (Rosen 2004; St. Laurent 2004). For example, the copyright holder of a piece of code could issue a license to somebody else which allows this other person to make copies of the code and sell them. When issuing licenses regarding their code, copyright holders can specify obligations which the licensees (i.e. the recipients of the license) have to fulfill (Rosen 2004; Boyle 2009; Murray 2009). In such a situation the license is only valid if the obligations are met (Rosen 2004; McGhee 2007; Arne 2008; Hogle 2008). To minimize transaction costs, licenses addressing internet code are typically unilateral, meaning that whoever complies with the obligations automatically receives a license and can use the code in accordance with the license (Rosen 2004; Arne 2008).
Finally, while copyright is assigned automatically, it is the copyright holder’s responsibility to enforce her rights (de Laat 2005; McGhee 2007). In order to prosecute those who are violating her copyright, e.g. by ignoring the obligations from her license, a copyright owner has to become aware of the violation and provide initial proof (Rosen 2004).
Given the IP situation regarding software development described above, all internet code is covered by copyright unless its creators have placed it in the public domain. While public domain internet code does exist, the large majority of the code available on the internet is still covered by copyright (Bennett & Ivers 2008; Ebert 2008; Murray 2009).
Consequently, developers interested in reusing internet code that is not in the public domain need a license to do so because at the very least they need to make a copy of the code.149 There is, however, one exception in which those reusing copyright-protected code do not need a license and thus also do not have to account for the obligations which come with the code. If the amount of code reused is very small, the reuse can be considered “fair use” and does not require a license (St. Laurent 2004). However, what exactly constitutes “fair use” needs to be determined on a case-by-case basis (Fitzgerald & Bassett 2005; McGhee 2007). In the domain of software, courts have held that even the reuse of short snippets of fewer than 100 lines of code is not “fair use” and requires a license (McGhee 2007; Mertzel 2008). In the following, the different obligations which developers may have to deal with when reusing OSS or other internet code are discussed.
149 Beyond creating a copy, they might also need or intend to create derivative works based on the original code or distribute copies of the code for sale, rent, lease or lending.
OSS code obligations150
As has already been pointed out in Chapter 3.3.2, there exist many different OSS licenses (66 as of April 2010 (Open Source Initiative 2010)), with the GPL accounting for more than 50% of the existing OSS code (Black Duck Software 2009a). The different licenses partially result in different obligations when code governed by them is reused. However, there exist common patterns which are described in the following.
The obligation directly affecting value appropriation, and thus the most problematic one for commercial firms, is the “reciprocity”-effect exhibited by highly restrictive (e.g. GPL, Open Software License (OSL)) and restrictive OSS licenses (e.g. LGPL, Mozilla Public License (MPL)).151 These licenses demand that other code which is tightly integrated with code governed by them is also made available under their terms (Rosen 2004; Fitzgerald & Bassett 2005). For example, reusing a large snippet of GPL licensed code by combining it with code developed internally in a firm would result in the obligation that this internally developed code is also put under GPL terms (e.g. de Laat 2005; Meeker 2008). The consequence of this would be that users of the resulting software would have the right to access, modify and redistribute not only the code originally licensed under the GPL but also the code developed internally in the firm (see Chapter 3.3.2).
150 Note that only the main OSS licenses and their main obligations are covered here. For a more comprehensive overview see e.g. Rosen (2004), St. Laurent (2004) or Meeker (2008).
151 The “reciprocity”-effect is sometimes also called “copyleft”-effect or “hereditary”-effect. See Murray (2009) for an overview of the various terminologies employed.
The exact terms of this “reciprocity”-effect may differ from license to license (Rosen 2004; Olson 2008). For the most common licenses GPL, LGPL and MPL they are as follows:
− The GPL as the main OSS license implements a strong “reciprocity”-effect which reaches out to all software “[…] containing the Program [code] or a portion of it, either verbatim or with modifications and/or translated into another language.”152
− The LGPL is a sister license of the GPL and basically follows the same principles. However, it differs in that it does not demand that code which is connected to LGPL code via linking (both static and dynamic) is also licensed under its terms (Rosen 2004; Meeker 2008). Nonetheless, merging a large snippet of LGPL licensed code with other code would still result in a “reciprocity”-effect (Rosen 2004).
− In yet another form, the “reciprocity”-effect induced by the MPL covers only other code in the same file as the MPL licensed code, while code in other files is not affected (Rosen 2004; de Laat 2005).
In addition to the differences between the licenses regarding the “reciprocity”-effect, there is also legal uncertainty regarding which forms of connection between two pieces of code actually trigger a “reciprocity”-effect because this question has not yet been tried in court.153 As a result, opinions differ about the situations in which a “reciprocity”-effect applies. For example, some practitioners and scholars argue that the scope of the “reciprocity”-effect demanded by the GPL is not supported by copyright law and suggest that the “reciprocity”-effect is not applicable when GPL licensed code is reused via dynamic linking (e.g. Madanmohan & De 2004; Rosen 2004; Ruffin & Ebert 2004; Knoll 2009). The FSF, as the organization that drafted the GPL, however, disagrees in this matter and claims that the “reciprocity”-effect also reaches out to code dynamically linked to GPL licensed code (Free Software Foundation 2009a).154
152 GPL, version 2, section 0, http://www.gnu.org/licenses/old-licenses/gpl-2.0.html, last accessed 17.12.2009.
153 See German and Hassan (2009) for a technical overview of the various forms of connection between two pieces of code.
154 Only the first version of the GPL was drafted by Richard Stallman (see Chapter 3.3.1), while the subsequent second and third versions were drafted by the FSF, which was, however, founded by Richard Stallman and follows his beliefs regarding software freedom.
As another class of obligations resulting from OSS reuse, some licenses demand that the entity reusing OSS code does not file a patent infringement suit against the original creator of the code (Rosen 2004). Depending on the specific license, this may cover only patent infringement suits regarding the reused code or patent infringement suits in general (Meeker 2008).
Beyond the “reciprocity”-effect and patent obligations, OSS licenses may also contain other obligations regarding reuse which are, however, typically less problematic for commercial firms to deal with. Among these obligations are (Rosen 2004; St. Laurent 2004; Fitzgerald & Bassett 2005; Meeker 2008):
− Not deleting any copyright, patent, trademark or attribution notices in the reused code.
− Inclusion of the original license text of the reused code in the software containing the code.
− Inclusion of the file containing the contributors to the reused code in the software containing the code.
− Distribution of the warranty disclaimer addressing the reused code with the software containing the code.
− Acknowledging the creator of the reused code in the end-user documentation of the software containing the code or in the software itself.
− Explicitly stating which parts of the reused code have been modified and including the date of the change.
− Renaming modified reused files to avoid confusion with original versions.
Non-OSS internet code obligations
While OSS code accounts for the majority of code available on the internet, there also exists other code which is not in the public domain but governed by non-OSS licenses. Olson (2008) reports having counted over 400 different licenses of such code. The obligations which the licenses of such other code may impose are very heterogeneous and there does not exist a scheme to classify them (Meeker 2008; Murray 2009). Some common obligations are not to use the code for commercial purposes, or to reuse the code only for a limited period of time or only in certain countries (Cohn-Sfetcu & Mayer 2009; Murray 2009).
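The license-specific scope of the “reciprocity”-effect described above for the GPL, LGPL and MPL can be condensed into a small lookup table. The following Python sketch is purely illustrative and rests on several assumptions: the names `Connection`, `Verdict`, `RECIPROCITY_SCOPE` and `reciprocity_applies` are hypothetical, the three connection types are a strong simplification of the forms of connection discussed by German and Hassan (2009), and the entry for static linking under the GPL is an assumption of the sketch, since the text only discusses dynamic linking explicitly for the GPL. Dynamic linking under the GPL is reported as contested rather than decided, reflecting the disagreement described above.

```python
from enum import Enum


class Connection(Enum):
    """How reused code is combined with a firm's own code (simplified)."""
    MERGED_SAME_FILE = "snippet merged into the same file"
    STATIC_LINK = "statically linked"
    DYNAMIC_LINK = "dynamically linked"


class Verdict(Enum):
    APPLIES = "reciprocity applies"
    DOES_NOT_APPLY = "reciprocity does not apply"
    CONTESTED = "legally contested"


# Scope of the "reciprocity"-effect as summarized in the text.
RECIPROCITY_SCOPE = {
    "GPL": {
        Connection.MERGED_SAME_FILE: Verdict.APPLIES,
        Connection.STATIC_LINK: Verdict.APPLIES,     # assumption of this sketch
        Connection.DYNAMIC_LINK: Verdict.CONTESTED,  # FSF view vs. other views
    },
    "LGPL": {
        Connection.MERGED_SAME_FILE: Verdict.APPLIES,
        Connection.STATIC_LINK: Verdict.DOES_NOT_APPLY,
        Connection.DYNAMIC_LINK: Verdict.DOES_NOT_APPLY,
    },
    "MPL": {
        # Only other code in the same file is covered.
        Connection.MERGED_SAME_FILE: Verdict.APPLIES,
        Connection.STATIC_LINK: Verdict.DOES_NOT_APPLY,
        Connection.DYNAMIC_LINK: Verdict.DOES_NOT_APPLY,
    },
}


def reciprocity_applies(license_name: str, connection: Connection) -> Verdict:
    """Look up whether own code would fall under the reused code's license."""
    try:
        return RECIPROCITY_SCOPE[license_name.upper()][connection]
    except KeyError:
        raise ValueError(
            f"license not covered by this sketch: {license_name!r}"
        ) from None
```

For instance, `reciprocity_applies("MPL", Connection.STATIC_LINK)` yields `Verdict.DOES_NOT_APPLY`, whereas the same connection type under the GPL yields `Verdict.APPLIES` under the assumption stated above.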
Consequences of violating obligations
The preceding paragraphs have established that code available for free download on the internet is typically not public domain knowledge, but still comes with obligations which need to be accounted for when reusing the code. This is especially important for the reuse of internet code in commercial firms because, first, copyright holders of internet code are more likely to enforce their rights against firms than against individuals (O’Mahony 2003; Barraclough 2008)155 and, second, such enforcement is likely to negatively affect firms’ financials in one of the ways described in the following hypothetical situations.156
155 In the OSS community, OSS developers usually try to handle violations of the obligations of their code by other OSS developers informally (O’Mahony 2003).
156 When enforcing their copyright versus commercial firms, OSS developers’ main goal is to “[…] encourage compliance, not to seek damages” (O’Mahony 2003, p. 1188). Therefore, the commercial firms are typically contacted directly at first, and the copyright holders sue in court only if no solution can be found (McGhee 2007; Barraclough 2008; Strod 2009).
If a developer within a firm has reused internet code in one of the firm’s products without fulfilling the obligations of the reused code, the license of the code is violated and no longer valid. Legally, the firm is then not allowed to distribute the code as part of its product until the infringement is resolved. In this situation the copyright holder of the code could obtain an injunction which denies the firm the right to sell its product (Rosen 2004; Arne 2008; Bennett & Ivers 2008; Hogle 2008). If the obligation violated is one which can be resolved easily, such as attribution of the original creator of the code, solving the issue should be only a matter of days; still, the firm might lose sales and profit during the short period in which it is not allowed to sell its product (Olson 2008).
Alternatively, the obligation violated may be of a type which takes long to resolve or is even impossible to resolve. For example, if the product of the firm contained code licensed under the GPL in such a way that it is tightly coupled with other code which the firm considers the basis of its competitive advantage, complying with the GPL obligation to make available the source code of this other code under GPL terms is most likely off-limits for the firm (Knoll 2009).157 To solve this issue, the firm might redesign the architecture of its product to clearly segregate the GPL licensed code from its own code to avoid the obligation (Henkel & Baldwin 2009) or substitute the GPL licensed code with other code which provides the same functionality (Rosen 2004). Both solutions would most likely take a long time during which the firm must not sell its product. Alternatively, they might even be cost prohibitive or technically not feasible, which would permanently bar the firm from selling its product.
157 Besides such situations where the “reciprocity”-effect requires firms to make their own code available under the license of the internet code, the “reciprocity”-effect may also lead to license incompatibilities between different pieces of internet code when reused together (Meeker 2008, p. 59 ff.; German & Gonzalez-Barahona 2009; German & Hassan 2009). If two pieces of internet code come under different licenses both creating a “reciprocity”-effect and these two pieces of internet code are tightly coupled in the software product being developed, these two pieces of code are incompatible because the obligations of the two licenses can never be fulfilled simultaneously. As a consequence, the firm must not sell the software product.
As a third alternative, the firm might choose to make its own code which is tightly integrated with internet code licensed under a highly restrictive license (e.g. the GPL) available under the license of the reused code in order to comply with the obligation and to be allowed to offer its product again. In this case customers and potentially also competitors would be able to inspect the code and imitate or enhance its functionality which, as in the Cisco/Linksys case (see Section 1.1), would most likely negatively affect the firm’s bargaining position versus its customers and consequently reduce the share of value the firm can appropriate.
Finally, despite this being rather uncommon,158 the firm might also be required to pay damages for the products already sold, in the worst case as high as the profits generated by violating the obligations (Moskin & Wettan 2009; Jaeger 2010). Theoretically, the firm might alternatively be forced to ensure compliance with the obligations for the products already sold (Henley 2009), e.g. by replacing parts of the software contained in them.
Having ascertained the legal situation of internet code, the potential obligations which need to be considered when reusing internet code and the potential issues which commercial firms may face if their products include reused internet code but do not comply with the obligations, the next chapter discusses existing scholarly work on the reuse of internet code in software development in commercial firms.
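The license incompatibility rule stated in this section (two tightly coupled pieces of internet code under different licenses that both create a “reciprocity”-effect cannot be combined) lends itself to a simple automated check. The Python sketch below is a minimal illustration under stated assumptions: the function name `incompatible_pairs`, the set `RECIPROCAL_LICENSES` and the input format are hypothetical, the set contains only the four licenses named in this section as examples of highly restrictive and restrictive licenses, and the check applies the simplified rule literally. Individual licenses may contain compatibility provisions that this sketch does not model.

```python
from itertools import combinations

# Licenses named in this section as creating a "reciprocity"-effect.
RECIPROCAL_LICENSES = {"GPL", "OSL", "LGPL", "MPL"}


def incompatible_pairs(components, tightly_coupled):
    """Return component pairs that violate the simplified rule.

    components:      dict mapping a component name to its license name
    tightly_coupled: iterable of 2-tuples of component names that are
                     tightly coupled in the product (hypothetical input)
    """
    coupled = {frozenset(pair) for pair in tightly_coupled}
    conflicts = []
    for first, second in combinations(sorted(components), 2):
        license_a = components[first]
        license_b = components[second]
        if (
            frozenset((first, second)) in coupled
            and license_a != license_b
            and license_a in RECIPROCAL_LICENSES
            and license_b in RECIPROCAL_LICENSES
        ):
            conflicts.append((first, second))
    return conflicts
```

With the hypothetical components `{"parser": "GPL", "codec": "MPL", "logger": "BSD"}` and the couplings `[("parser", "codec"), ("parser", "logger")]`, only the pair of `codec` and `parser` is reported, because the license of `logger` is not in the reciprocal set.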
158 OSS developers in particular usually aim at enforcing compliance and do not seek damages (O’Mahony 2003). Moreover, in none of the few existing court cases addressing violated internet code license obligations of which the author is aware have such damages been claimed (Carver 2005; Jaeger 2010).
4.2.2. Internet code reuse in commercial software development
Given the compelling value creation advantages of knowledge reuse in general and of software reuse in particular (see Section 3.2), it would be surprising if reusing the abundance of available internet code, which can typically be accessed for free (i.e. gratis), were not an interesting option for software development in commercial firms. And indeed, despite some initial concerns regarding the quality, security and support available for internet code (e.g. Brown & Booch 2002; Spinellis & Szyperski 2004), recent scholarly work indicates that internet code reuse allows software development in commercial firms to lower costs and development time while even increasing software quality (e.g. Norris 2004; Ajila & Wu 2007; Ebert 2008). In addition, case studies have found that some firms actively try to exploit these benefits of internet code reuse (e.g. Madanmohan & De 2004; Ruffin & Ebert 2004; Ebert 2008) and point out that internet code reuse has grown in importance for commercial software development in recent years (Madanmohan & De 2004; Ajila & Wu 2007). Both Mäki-Asiala and Matinlassi (2006) and Ven and Mannaert (2008) have even identified small firms for which internet code reuse is an essential part of their business model and which could most likely not exist without it.159
Of particular interest to commercial software development, and different from conventional reuse, is the fact that internet code reuse allows tapping into the benefits of code reuse without having to build reusable resources internally upfront or buy them from other firms (Ruffin & Ebert 2004; Ajila & Wu 2007; Ebert 2008). Based on their intensive study of software reuse at an Israeli firm, Morad and Kuflik (2005) even suggest that OSS code reuse may provide higher benefits for commercial firms than conventional software reuse. Norris (2004) similarly highlights several advantages of internet code reuse over conventional code reuse when comparing both forms of code reuse as applied in a NASA software development project. He points to the higher quality of the internet code, better support for it and easier integration.
To realize these benefits in commercial software development, internet code can be reused either systematically or in ad-hoc fashion (Brown & Booch 2002; Morad & Kuflik 2005).
159 Similarly, a venture capitalist interviewed in the qualitative pre-study (see Chapter 4.3.1) pointed out that “[…] for some software start-up firms reusing OSS code is their only chance to get on par with the incumbents in their industry as quickly as possible” (translated from German).
Systematic internet code reuse in commercial software development
If internet code is reused systematically in commercial software development, it has become part of the software development process in the respective firm (Madanmohan & De 2004; Morad & Kuflik 2005). In such a setup the firms have full control over the integration of internet code reused in their software development projects. They have implemented a clear process through which internet code is introduced into their products and qualify all internet code before it enters their software (Levi & Woodard 2004; Norris 2004; Ruffin & Ebert 2004; Olson 2008).160 While there is no standard setup for systematic internet code reuse in commercial software development, a potential implementation might look like this (Madanmohan & De 2004; Norris 2004; Morad & Kuflik 2005). At the beginning of a software development project, functionality requirements which could potentially be addressed with internet code are collected. Following that, internet code meeting these requirements is identified and evaluated against predefined criteria including license obligations. Only code which meets all criteria is reused. Once the code to be reused has been selected, the obligations which come with its reuse are integrated into the requirements of the software development project, and compliance with them is checked along the way during development and at the release of the product.
In line with this systematic approach, Palamida (2008), a services firm that helps other firms make the best use of OSS code during their software development, has for example pre-evaluated 25 OSS components which it suggests for reuse in commercial software development in order to reduce costs and time while at the same time increasing quality. Given the structured process of systematic internet code reuse, it seems reasonable to assume that the obligations of the code to be reused are met because both evaluating them and ensuring compliance should be steps in the software development process. Motorola, for instance, has established a “Strategic Technology Asset Management Process” in its overall software development process to account for licensing and IP issues of reused internet code, while Philips covers these topics in its “Intellectual Property and Standardization” review (Madanmohan & De 2004).
Ad-hoc internet code reuse in commercial software development
Besides the systematic reuse of internet code in commercial software development, individual commercial software developers may also reuse internet code in ad-hoc fashion.
When doing so, a single developer rather spontaneously searches the internet for existing code related to her current development task, downloads this code and integrates it into her work (Morad & Kuflik 2005; Hummel et al. 2008).161 Ad-hoc reuse of internet code is typically independent from and unrelated to other developers or formalized processes within the developer’s firm. Commercial software developers can easily reuse internet code to help them solve a development problem faster or better as the required code is only a few clicks away from them on the internet.
160 Given the administrative effort necessary for systematic internet code reuse, it is only applicable to component reuse and not to snippets.
161 Contrary to systematic reuse, ad-hoc reuse of internet code may also entail snippet reuse.
While ad-hoc reuse of internet code by individual commercial software developers has not yet been researched systematically, anecdotal evidence and practitioner accounts suggest that it is quite common (Bennett & Ivers 2008; Kaneshige 2008; Olson 2008; Alexy 2009, p. 31). Moreover, existing literature also suggests that it is especially during the ad-hoc reuse of internet code that license obligations are either not inquired about thoroughly or even ignored. Levi and Woodard (2004, p. 8) for example sketch the following scenario: “A programmer needs certain software modules to complete a project. Rather than writing the modules himself, he searches the Internet and finds readily available source code that meets his needs. The programmer incorporates this code into the project, paying scant attention to any license terms that may be included with the code that he is using.” Levi and Woodard (2004, p. 8) further claim that their scenario “[…] happens daily in hundreds of large corporations, often without the knowledge of anyone other than the programmers themselves” and describe such behavior as the “standard coding practice for many programmers.” Other authors (IPX 2004; Palamida 2005; Davidson 2006; Waters 2006; McGhee 2007; Arne 2008; Bennett & Ivers 2008; Koohgoli 2008; Cohn-Sfetcu & Mayer 2009; Knoll 2009) report similar situations and similarly claim that internet code is often introduced into commercial software development projects via the ad-hoc reuse of individual developers who might not account properly for the resulting license obligations. Emphasizing this point Olson (2008, p.
6) for example quotes a fictitious software developer with the words: “Everybody is using it [internet code] and it’s really good, so it’s really stupid not to take advantage.” Software development firm VMware (2008), which makes use of systematic internet code reuse, points to the potential risks of its developers reusing internet code in ad-hoc fashion and not accounting properly for the resulting obligations in the “risk” section of its quarterly filings to the U.S. Securities and Exchange Commission. There VMware (2008, p. 34) writes that they “[…] have established processes to help alleviate these [internet code reuse] risks [i.e. their systematic internet code reuse process], including a review process for screening requests from our development organizations for the use of open source, but we cannot be sure that all open source software is submitted for approval [in their systematic internet code reuse process] prior to use in our products.” As a consequence of this, VMware (2008, p. 33) warns that internet code “[…] could negatively affect our ability to sell our products and subject us to possible litigation.”
In order to mitigate the risks which firms may face as a consequence of their developers’ ad-hoc internet code reuse without accounting for the obligations of the code, both scholars and practitioners suggest that firms acknowledge the ad-hoc reuse of internet code by their developers and actively manage it (e.g. Levi & Woodard 2004; Bennett & Ivers 2008; Koohgoli 2008; Meeker 2008, p. 74 ff.).162 As part of this approach, firms need to educate their developers, making them aware of the issues of internet code reuse and the obligations which can be attached to code which is available for free download. Further, firms should introduce policies “[…] balancing the benefits of the use of open source with business and legal risks” (Bennett & Ivers 2008, p. 4). Such policies should offer guidance to individual developers when reusing internet code, e.g. by generally permitting the reuse of code under certain licenses, banning the introduction of code under certain other licenses and defining a process for dealing with code not covered by these general rules (Levi & Woodard 2004; Meeker 2008, p. 121 f.; Olson 2008).163
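The three-part policy structure described above (licenses that are generally permitted, licenses that are banned, and a process for everything else) can be sketched in a few lines of Python. This is a minimal illustration only: the license lists, the `Decision` labels and the function name `check_license` are assumptions made for this example and are not taken from any of the cited corporate policies.

```python
from enum import Enum


class Decision(Enum):
    PERMITTED = "permitted"
    BANNED = "banned"
    NEEDS_REVIEW = "needs review"


# Hypothetical policy lists; each firm would define its own.
PERMITTED_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0"}
BANNED_LICENSES = {"GPL-3.0-only", "AGPL-3.0-only"}


def check_license(license_id: str) -> Decision:
    """Classify a license identifier according to the example policy."""
    normalized = license_id.strip()
    if normalized in PERMITTED_LICENSES:
        return Decision.PERMITTED
    if normalized in BANNED_LICENSES:
        return Decision.BANNED
    # Code under any license not covered by the general rules
    # is routed to a manual review process.
    return Decision.NEEDS_REVIEW


if __name__ == "__main__":
    for candidate in ["MIT", "GPL-3.0-only", "MPL-2.0"]:
        print(candidate, "->", check_license(candidate).value)
```

Treating unknown licenses as "needs review" rather than as permitted mirrors the idea that a policy should define how to deal with code not covered by its general rules.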
4.2.3. Intermediate conclusion and detailed research questions
Code available on the internet is an attractive resource for both firms and individual developers involved in commercial software development. Recent research analyzing the benefits of internet code reuse highlights the positive value creation effects of knowledge reuse and the fact that internet code is typically available for free. Moreover, some scholars have argued that internet code reuse holds higher benefits for commercial firms than conventional code reuse.
However, despite being available for free download, internet code is rarely in the public domain. On the contrary, most internet code is still protected by copyright and comes under licenses which result in obligations with which those reusing the code need to comply in order to be allowed to capitalize on the code. Some of these obligations can be implemented easily, such as not deleting any copyright, patent, trademark or attribution notices or including the original license text of the reused code with the software in which it is reused. Other obligations, however, may pose major issues for commercial firms. Some code for example may demand to be used only for a limited period of time or request that other code integrated with it is also made available in source code form to the users of the software. This last obligation for example is a consequence of reusing code governed by the GPL, the license most frequently applied to internet code (McGhee 2007). Not accounting properly for the obligations of the internet code reused in their software can force firms to stop offering their software for a limited time or even permanently. Further, if they are forced to comply with the obligations of reused internet code governed by licenses such as the GPL, their value appropriation is in danger.
162 Alternatively, firms could also ban the reuse of internet code categorically; however, they would thereby also forgo its value creation advantages.
163 See Meeker (2008, p. 123-134) for an example of a corporate policy addressing the reuse of internet code.
Despite these risks, recent research and also practitioner accounts highlight that internet code reuse has grown in importance for commercial software development. It can occur in two different forms. In systematic internet code reuse the integration of code existing on the internet into commercial software development projects has become part of firms’ software development approach and follows a structured and controlled process. In such a setting it should be feasible to properly address the obligations from the reused internet code. In its second form, internet code is reused in ad-hoc fashion when individual software developers search the internet for code useful to their current development task and integrate it, often without making this known to other developers in their project or their manager. Anecdotal evidence suggests that this behavior is not uncommon and, going further, that it is especially during the ad-hoc reuse of internet code that the obligations of the reused code are not thoroughly inquired about or are even knowingly ignored.
Existing research addressing internet code reuse has mainly focused on its benefits, that is, its value creation side. On the value appropriation side most scholars have touched on the topic of obligations resulting from internet code reuse (e.g.
Brown & Booch 2002; Madanmohan & De 2004; Spinellis & Szyperski 2004), but with the notable exception of Chen et al. (2008), the discussion of this issue is typically limited to either merely acknowledging the existence of such obligations or to formulating normative approaches for dealing with them by means such as policies. Going beyond this, Chen et al. (2008) find in their survey that most Chinese commercial software developers do not read the licenses of the OSS code they reuse and, if they read them, often do not understand the terms.164 Thus, despite its importance for commercial firms and their value appropriation, empirical evidence addressing internet code reuse obligations in commercial software development is scarce or, as Chen et al. (2008, p. 91) observe, “[…] few follow-up studies have been performed to examine how the licensing issues are managed in practice.”
164 Chen et al.’s (2008) study does not explore the topic of obligations from internet code reuse beyond the descriptive statistics describing whether licenses are read and understood.
In addition to the neglect of the topic of obligations from internet code reuse, most existing research addresses internet code reuse in the context of systematic reuse, assuming that there exist dedicated processes through which internet code is reused (e.g. Brown & Booch 2002; Norris 2004; Ruffin & Ebert 2004). The ad-hoc reuse of internet code by individual commercial software developers, which might pose the bigger risk to their employers, has been largely ignored in scholarly work. Despite the anecdotal evidence that individual developers do practice ad-hoc reuse of internet code and when doing so might not thoroughly check for resulting obligations or even knowingly ignore them, there is no systematic investigation addressing this topic. Little is known about how knowledgeable individual commercial software developers are regarding the obligations which may result from internet code reuse, the role which internet code reuse plays for their individual work or how frequently they reuse internet code in a way creating risks for their employer. Moreover, research addressing the determinants of developer behavior potentially endangering value appropriation through the ad-hoc reuse of internet code is also lacking.
Addressing these gaps in existing research, this part of the dissertation strives to analyze the perspectives of individual commercial software developers on ad-hoc internet code reuse, focusing on obligations and value appropriation risks, by investigating the following research questions with a large-scale quantitative survey:
− How well aware are individual commercial software developers of the obligations which may come with reused internet code? (Question 1)
− Do commercial firms provide guidance to their software developers regarding the reuse of internet code? (Question 2)
− How important is reusing internet code in ad-hoc fashion for the work of individual commercial software developers? (Question 3)
− How does the importance of reusing internet code in ad-hoc fashion differ between different individual commercial software developers? (Question 4)
− How frequently do individual commercial software developers reuse internet code in ad-hoc fashion in a way violating resulting obligations? (Question 5)
− Which factors influence whether individual commercial software developers reuse internet code in ad-hoc fashion in a way violating resulting obligations? (Question 6)
The first five questions are addressed in descriptive and exploratory fashion in Section 4.5 while question six is discussed using structural equation modeling techniques in Section 4.6. Before this, Section 4.3 describes the research model addressing question six and Section 4.4 reports the design and conduct of the survey employed to generate the data required to answer the questions. Since the remainder of this part of the dissertation deals exclusively with the ad-hoc reuse of internet code and no longer addresses the systematic reuse of internet code, the formulation “internet code reuse” relates only to “ad-hoc internet code reuse” when employed in the following. This allows for shorter and less complex formulations.
4.3. Research model and hypotheses
To guide the choice of variables to be captured in the survey questionnaire and to formulate hypotheses for research question six, a research model is developed in this section. Drawing on existing research on ethical behavior, especially ethical behavior in the IS domain (Chapter 4.3.1), and the results from a qualitative pre-study (Chapter 4.3.2), the research model presented in Chapter 4.3.3 aims at shedding light on the question of why individual commercial software developers reuse internet code in a way (potentially) violating resulting obligations.
When reusing internet code, commercial software developers may, among other issues, endanger the value appropriation of their employers through two forms of behavior which are both covered by the research model. First, developers may choose not to thoroughly investigate the obligations which come with the code they reuse. With this behavior they potentially violate obligations because the code they reuse may or may not come with critical obligations. Second, developers may be aware of the obligations of the code they reuse, but knowingly choose not to account for them. In this situation they actually violate obligations.
Beyond these two generic forms of critical behavior, the research model is also intended to cover the reuse of different forms of internet code. Following the foundations of software reuse presented in Chapter 3.2.2, developers may rely on the internet to reuse whole components or only small snippets.
4.3.1. Theoretical models to predict ethical behavior
For the individual commercial software developer, reusing internet code in a way (potentially) violating resulting obligations165 is a form of unethical behavior. It would not be a form of unethical behavior if developers were not aware at all of the existence of obligations which may come with reused internet code. However, as the data presented in Chapter 4.5.2 highlight, nearly all developers know at least that such obligations do exist. Furthermore, developers were explicitly made aware of obligations of internet code before completing the part of the survey addressing the research model.
Following Thong and Yap (1998, p. 214), a behavior is unethical when “one party in pursuit of its goals engages in behavior that [in an unjust way]166 affects the ability of another party to pursue its goals.” In business and employment contexts individuals frequently face situations in which they might choose to behave unethically, yet such situations seem to be especially common in the IS domain (e.g. Paradice 1990; Thong & Yap 1998; Pierce & Henry 2000).167 Examples of such situations in the IS field frequently surface in the context of intellectual property, privacy, accuracy or accessibility (Mason 1986; Culnan 1993). Some scholars (e.g. Loch & Conger 1996; Marshall 1999; Winter et al. 2004) propose that the field of IS is especially susceptible to ethically critical situations because legal and moral guardrails frequently cannot keep pace with technological progress.
Addressing unethical behavior in business and employment contexts, various scholars have developed dedicated models to predict ethical and unethical behavior (e.g. Kohlberg 1969; Rest 1986; Trevino 1986; Bommer et al. 1987; Jones 1991) and all of these models have found a following.
However, the theory of reasoned action (TRA) and the theory of planned behavior (TPB) have been employed most frequently by scholars to analyze behaviors with an ethical dimension (e.g. Beck & Ajzen 1991; Randall & Gibson 1991; Vallerand et al. 1992; Kurland 1995; Flannery & May 2000; Buchan 2005). Both theories offer general behavioral models from the domain of social psychology which were not originally developed with the specific purpose of explaining ethical behavior. The IS domain in particular relies heavily on the two models (e.g. Lin & Ding 2003; Peace et al. 2003; Leonard et al. 2004; Woolley & Eining 2006; Cronan & Al-Rafee 2008; Liao et al. 2009) with some researchers arguing that TPB is more useful than TRA in predicting ethical behavior in the context of IS (e.g. Loch & Conger 1996; Chang 1998).
165 In the following the formulation “(potentially) violating obligations of reused internet code” is employed to cover the two instances of critical behavior introduced above.
166 The “unjust way” may result from violations of law and/or moral principles (Lefkowitz 2006).
167 See O’Fallon and Butterfield (2005) for a recent literature review of scholarly work regarding ethical and unethical behavior in general.
Staying in the tradition of ethics research in the IS context, the research model employed in this study also relies on TRA and TPB as its guiding structure. However, and also similar to many other studies using these models to predict ethical behavior (e.g. Kurland 1995; Banerjee et al. 1998; Flannery & May 2000; Peace et al. 2003; Buchan 2005; d'Astous et al. 2005), the original models are modified and extended to better fit the behavior of interest in this investigation.168 Further, and also in line with the majority of studies addressing ethical behavior (e.g. Randall & Gibson 1991; Flannery & May 2000; Peace et al. 2003; Leonard et al. 2004; Buchan 2005), the dependent variable of the research model is the intention to engage in the behavior and not the behavior itself. Because all data for this study are captured with a questionnaire at one single point in time (see Section 4.4), a behavior variable would reflect past behavior while the other variables would reflect current intention, current attitude etc. As survey participants most likely reconsider the ethical dimension of their behavior when completing the questionnaire, this temporal discrepancy would invalidate the model results if the dependent variable were behavior (Ajzen 1991). However, as is laid out below in Chapter 4.3.3, intention is a highly reliable predictor of behavior and thus well suited as dependent variable in studies of ethical behavior (Ajzen 1991; Beck & Ajzen 1991).
4.3.2. Qualitative pre-study
The research questions of this part of the dissertation require a large-scale survey among commercial software developers. Before conducting this survey, a qualitative pre-study was carried out. The purpose of the pre-study was three-fold. First, the information gathered in the pre-study helped to inform and refine the research model discussed in this section (Greene et al. 1989). Second, the pre-study helped to gain a better understanding of software development in commercial firms, of the reuse of internet code in this setting and of issues potentially resulting from obligations of internet code. Both aspects facilitated the design of the survey instrument. Third, the findings generated during the qualitative research were later employed to support the analysis and interpretation of the quantitative survey findings (Miles & Huberman 1994).
168 Ajzen (1991, p. 199) himself supports modifications of TPB, stating that “the theory of planned behavior is, in principle, open to the inclusion of additional predictors if it can be shown that they capture a significant portion of the variance in intention or behavior after the theory’s current variables have been taken into account.”
In order to take a broad perspective on the topic of internet code reuse in commercial software development and the obligations which may result from it, 20 interviews were conducted in the period from October 2008 to July 2009. Interviewees were experts from four different groups related to the topic of this investigation:
− Six interviews were conducted with professionals who work for firms developing software and act as experts regarding the use of internet and OSS code in their respective firms.169
− Seven interviews were conducted with experts working for professional services firms which help other firms to best leverage internet code reuse in their software development by tapping into the benefits of internet code reuse while at the same time avoiding violations of resulting obligations.
− Three interviews were conducted with lawyers who specialize in internet code licenses and advise firms when and how to comply with obligations resulting from these licenses.
− Four interviews were conducted with professionals working for venture capital (VC) funds specializing in software investments. In the wake of cases such as the Cisco/Linksys issue (see Section 1.1), venture capital investors have become highly sensitive to internet code reuse. Value appropriation problems and other issues resulting from violated obligations of internet code can heavily affect VC funds’ deal valuations of their portfolio companies. Consequently, scrutinizing the code base of a potential investment target for internet code reuse has become a common step in VC funds’ due diligence process (e.g. Davidson 2006; Egger & Hogg 2006).
In line with the exploratory character of the pre-study, the interviews were conducted as semi-structured interviews to allow comparison of the answers, but still leave enough room to address new topics and questions (Bortz & Döring 2003; Schnell et al. 2005). Sixteen of the 20 interviews were conducted either by phone or internet-based voice communication and four interviews took place in personal meetings. The interviews lasted between 25 minutes and one hour and 42 minutes with an average duration of 51 minutes. Fifteen of the 20 interviews were recorded and transcribed; for the other interviews careful notes were taken.
169 The explicit roles of these professionals are e.g. CTO, “head of embedded Linux team” or “member of open source core team.”
In addition to the interviews, discussions which evolved from the survey pretest with commercial software developers (see Chapter 4.4.3) were also included in the qualitative pre-study.
Results of the qualitative pre-study are reflected in the research model, the setup of the questionnaire and the discussion of the results of the quantitative survey.
4.3.3. Determinants of violations of internet code reuse obligations
Based on TRA, TPB, the qualitative pre-study and existing scholarly work on internet code reuse and ethical behavior, a research model is developed in the following to explore the determinants which lead individual commercial software developers to reuse internet code in a way (potentially) violating resulting obligations.
Theory of reasoned action (TRA)
The theory of reasoned action (TRA) (Fishbein & Ajzen 1975; Ajzen & Fishbein 1980) is a parsimonious model to explain human behavior. It assumes that the behavior of individuals is determined by their intention (motivation) to engage in the behavior and argues that the stronger the individual’s intention, the greater the likelihood of her engaging in the behavior. As Ajzen and Fishbein (1980, p. 181) formulate it, “intentions are assumed to capture the motivational factors that influence a behavior; they are indications of how hard people are willing to try, of how much effort they are willing to exert in order to perform the behavior.”
Having asserted that behavior is determined by the intention to engage in the behavior, TRA proceeds by identifying the determinants of intention, namely attitude toward the behavior and perceived subjective norm regarding the behavior (see Figure 4-1). If an individual expects a positive outcome of a behavior and feels that important others encourage or at least accept the behavior, she is likely to develop a positive intention regarding the behavior (Fishbein & Ajzen 1975; Ajzen & Fishbein 1980).
The development of an attitude toward a behavior follows Fishbein and Ajzen’s (1975) expectancy-value model in which they argue that behavioral beliefs lead to the formation of an overall attitude toward a behavior.
Typically an individual holds multiple beliefs regarding a behavior, and each one reflects an expected outcome or consequence of engaging in the behavior which may be positive or negative.170 The overall attitude toward a behavior is thus the sum of the expected outcomes of the multiple individual behavioral beliefs weighted by the likelihood of the respective expected outcomes.
170 People can hold a large number of beliefs regarding a certain behavior, but as Miller (1956) shows, only a rather small number is considered when forming an attitude.
Figure 4-1: Theory of reasoned action and theory of planned behavior (adapted from Ajzen (2002, p. 1))
[Figure: behavioral beliefs lead to attitude toward the behavior, normative beliefs to subjective norm, and control beliefs to perceived behavioral control; these factors lead to intention, which leads to behavior. Behavioral beliefs, attitude, normative beliefs, subjective norm, intention and behavior are included in both TRA and TPB; control beliefs and perceived behavioral control are included in TPB, but not in TRA.]
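The weighted sum described above for the attitude toward a behavior is commonly written in the following form. The symbols are introduced here for illustration only and do not appear in the original text: $A_B$ denotes the attitude toward behavior $B$, $b_i$ the subjective likelihood that the behavior leads to outcome $i$, $e_i$ the evaluation of that outcome, and $n$ the number of beliefs considered.

```latex
\[
  A_B \;\propto\; \sum_{i=1}^{n} b_i \, e_i
\]
```

A belief about a likely and positively valued outcome thus raises the attitude, while a belief about a likely and negatively valued outcome lowers it.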
In addition to attitude toward a behavior, perceived subjective norm regarding a behavior influences intention. Individuals’ normative beliefs whether important others support or discourage a behavior determine the motivation (intention) to engage in the behavior. These important others can be individuals or small groups such as friends or colleagues and may vary depending on the context. The normative beliefs regarding the opinion of important others can obviously vary for each important other, as can the willingness to comply with the opinion of that other person or group. The overall perceived subjective norm regarding the behavior is the sum of individual normative beliefs weighted by the respective willingness to comply with each belief.
Theory of planned behavior (TPB) as an extension of the theory of reasoned action
A key limitation of TRA is that it is only applicable to behaviors under volitional control, that is, to behaviors which can be performed or not performed at will (Ajzen 1985). To overcome this issue, which severely limits the range of behaviors to which the theory can be applied, Ajzen (1985, 1991) developed the theory of planned behavior (TPB) as an extension to TRA. TPB stays true to the propositions of TRA that intention predicts behavior and that attitude toward the behavior and perceived subjective norm regarding the behavior are determinants of the intention to perform the behavior. However, it introduces perceived behavioral control as an additional factor influencing both intention and behavior (see Figure 4-1). Perceived behavioral control accounts for the factors (control beliefs) which may be perceived as impediments to the behavior but also for the perceived opportunities to engage in the behavior. Ajzen and Madden (1986, p. 457) describe it as
“the person’s belief as to how easy or difficult performance of the behavior is likely to be.” An individual with a high perceived behavioral control is confident that she can carry out the behavior. More specifically, perceived behavioral control can be differentiated into a “capability” and a “controllability” portion (Ajzen 2002). The first part refers to the internal control the individual has over the behavior and is quite similar to Bandura’s (1997) concept of perceived self-efficacy. It accounts for aspects such as the information the individual has or her skills, abilities, emotions and compulsions concerning the behavior (Ajzen 1988, p. 128 f.). The second part of perceived behavioral control addresses situational issues outside of the individual and “[…] determine[s] the extent to which circumstances facilitate or interfere with the performance of the behavior” (Ajzen 1988, p. 129). Both portions of perceived behavioral control seem to be applicable to commercial software developers reusing internet code in a way (potentially) violating resulting obligations. On the “capability” side, developers with lower technical skills should find it quite difficult to reuse existing internet code and thus may not come into a situation where they might violate obligations in the first place. Especially the integration of components has been found to require some level of technical proficiency (see Chapter 3.4.3). On the “controllability” side, developers might be confronted with various impediments incorporated into their workplace setting which make it difficult to reuse internet code in a way (potentially) violating resulting obligations. For example one developer from the qualitative pre-study explains that he cannot access the internet at work and thus finds it difficult to reuse internet code: “I’m a COBOL programmer. 
In my company, COBOL programmers are not able to access the internet.” Alternatively, developers working with rare programming languages might not find much reusable internet code. Another developer from the qualitative pre-study notes that “there is no xharbour code (what I use) on the internet.” Additionally, even if developers can access the internet and if there is reusable code for them available, their firms might have implemented technical preventive systems making it difficult for them to either reuse internet code or to not fully investigate the resulting obligations and ensure compliance. A third developer from the qualitative pre-study refers to this situation when commenting, “they also were implementing code scanning tools as a watch dog [at my last employer].” As an example for such a technical preventive system, Protecode (2009, p. 3) offers a software tool labeled “Developer IP Assistant” which can be integrated into the software development workstation of individual developers and then
“[…] detects source and binary code (introduced by cut/paste or drag/drop of code snippets, or copying complete files or folders from an external source such as a web site or a memory stick), logs it, analyzes it and identifies its IP attributes, checking them against enterprise IP policies and providing real time feedback.”171
While empirical studies of behaviors with an ethical dimension have found support for TPB and its assumptions (e.g. Beck & Ajzen 1991; Flannery & May 2000; Peace et al. 2003), the relative importance of attitude toward the behavior, subjective norm and perceived behavioral control is expected to vary with regard to the behavior at stake (Ajzen 1991; Beck & Ajzen 1991). Attitude for example has been found to be the main TPB factor predicting intention in the decision to pirate software or other digital content (Peace et al. 2003; d'Astous et al. 2005) while perceived behavioral control has been identified as the primary determinant of the intention to shoplift (Beck & Ajzen 1991). It is therefore important to analyze each specific behavior of interest, such as the reuse of internet code in a way (potentially) violating resulting obligations by commercial software developers, and test the significance of each factor in predicting intention.
Summarizing the usefulness of TRA and TPB as a basis for a research model explaining why commercial software developers reuse internet code in a way (potentially) violating resulting obligations, three hypotheses can be posited: Commercial software developers who hold a more positive attitude toward reusing internet code in a way (potentially) violating resulting obligations, those who perceive a more positive subjective norm regarding this behavior and those who sense a higher level of perceived behavioral control regarding this behavior should exhibit a greater intention to reuse internet code in a way (potentially) violating resulting obligations.
H1: A more positive attitude toward reusing internet code in a way (potentially) violating resulting obligations…
H2: A higher level of subjective norm supportive of reusing internet code in a way (potentially) violating resulting obligations…
H3: A higher level of perceived behavioral control regarding the reuse of internet code in a way (potentially) violating resulting obligations…
…will lead to greater intention of commercial software developers to engage in the behavior.
171 This tool however only reduces perceived behavioral control regarding the reuse of internet code in a way actually violating resulting obligations. Regarding potential violations it rather helps developers by reducing their costs for investigating the obligations which come with the code they want to reuse.
Ethical work climate theory
While the traditional TRA and TPB concepts take into account individuals’ attitude toward a behavior and also the attitude of peers via subjective norm, some scholars have argued that when ethical decisions are at stake, the greater context in which these decisions are made also influences the decision makers (Trevino 1986; Wyld & Jones 1997; Cohen 1998). Tetlock (1985, p. 298) for instance posits that “both individuals and small groups of individuals are constrained by the norms, procedures, and resources of the institutions in which they live and work.” Similarly, Victor and Cullen (1988, p. 101) speak of organizations as “[…] social actors responsible for the ethical or unethical behavior of their employees.” Some scholars have even proposed that in employment situations the employing organization has a stronger influence on individual behavior than the personal characteristics of the individual (e.g. Perrow 1997; Andreoli & Lefkowitz 2009). In line with the above theorizing, organizational ethics research has found empirical support for the relationship between an ethical climate within an organization and ethical behavior of the employees of this organization (e.g. Trevino et al. 1998; VanSandt et al. 2006; Andreoli & Lefkowitz 2009).
Conceptualizing ethical climate within organizations, Victor and Cullen (1988, p. 101) have developed the multidimensional construct of ethical work climate172 which they define as “the prevailing perception of typical organization practices and procedures that have ethical content […].” While the ethical work climate within an organization is, by definition, a macro-level construct (Wyld & Jones 1997), the way it is perceived by the members of the organization influences them with regard to “[…] the types of ethical conflicts considered, the process by which such conflicts are resolved, and the characteristics of their resolution” (Victor & Cullen 1987, p. 55).
172 In general, work climate can be defined as perceptions which “are psychologically meaningful molar descriptions that people can agree characterize a system’s practices and procedures” (Schneider 1975, p. 474). The ethical work climate of an organization is one aspect of its total work climate. See Wyld and Jones (1997) for an overview of other aspects of organizational work climate.
The ethical work climate is grounded in the institutionalized normative systems within organizations. Drawing on foundations of moral philosophy and the theory of cognitive moral development, Victor and Cullen (1987, 1988) have proposed that ethical work climate is multidimensional and have deduced nine dimensions of ethical work climate. Their
empirical validation found however only support for five such dimensions which they have labeled caring, law and code, rules, instrumental, and independence.173 Using the constructs developed by Victor and Cullen (1988) for the empirical dimensions of ethical work climate, these dimensions can be integrated into and tested in TPB-based research models (e.g. Flannery & May 2000; Buchan 2005). The inclusion of ethical work climate constructs adds the effect of the organizational context on individual ethical behavior to the model. As the majority of behaviors studied with TPB does not take place in an organizational context, extending the theory with a factor representing this context is a logical extension of the existing theory (Buchan 2005). Further, the ethical work climate should capture information different from that covered by subjective norm as it not only reflects a small group of people but the whole organization or at least a substantial subgroup of it (Victor & Cullen 1988). In addition to the above considerations, also insights from the qualitative pre-study suggest that ethical work climate influences whether commercial software developers reuse internet code in a way (potentially) violating resulting obligations. One developer of the qualitative pre-study for instance explains that “in our company law is considered a necessary evil, but rules of fairness, fairplay and honesty are a priority. We treat open source software with all due respect while conveniently ignoring copyright on proprietary software if the copyright holder’s requirements are strongly unfair compared to the value of the software.” Given the behavior of interest, the two ethical work climate dimensions law and code and rules seem to be most appropriate for inclusion in the research model. The law and code dimension reflects how important complying with laws and professional codes of conduct is in the organization. 
Given that obligations from internet code reuse result from legal instruments such as copyright (see Chapter 4.2.1) and respecting IP is part of many IS codes of conduct (see Chapter 4.4.2), developers in firms with an ethical work climate of complying with laws and codes should have a lower intention to reuse internet code in a way (potentially) violating resulting obligations.
H4a: An ethical work climate of complying with laws and codes will lead to a lesser intention of commercial software developers to reuse internet code in a way (potentially) violating resulting obligations.
173 The discrepancy between the five ethical work climate dimensions supported empirically and the nine dimensions deduced theoretically is because some of the theoretical dimensions turned out to be part of the same underlying factor.
While the foundations for ethical deliberation are societal, professional or legal and thus extra-organizational in the case of the law and code dimension, the rules dimension of ethical work climate reflects intra-organizational, firm-based principles such as firm policies, procedures, rules and norms (Wyld & Jones 1997). Given that some firms developing software have policies, processes and rules on how to deal with internet code which typically also address the topic of obligations resulting from reuse (see Chapter 4.2.2), developers in firms with an ethical work climate of complying with firm rules should have a lower intention to reuse internet code in a way (potentially) violating resulting obligations.174
H4b: An ethical work climate of complying with firm rules will lead to a lesser intention of commercial software developers to reuse internet code in a way (potentially) violating resulting obligations.
Tetlock’s (1985, p. 298) proposition that “both individuals and small groups of individuals are constrained by the norms, procedures, and resources of the institutions in which they live and work” suggests that ethical work climate not only influences individuals’ behavioral intention as proposed in H4a and H4b, but also affects the subjective norm which individuals perceive from their colleagues who are subject to the same “norms, procedures, and resources” within their firm.175 Consequently, in firms with an ethical work climate of complying with laws and codes and in firms with an ethical work climate of complying with firm rules, individual developers should perceive a more negative subjective norm regarding the reuse of internet code in a way (potentially) violating resulting obligations.
H4c: An ethical work climate of complying with laws and codes…
H4d: An ethical work climate of complying with firm rules…176
…will have a negative effect on subjective norm supportive of reusing internet code in a way (potentially) violating resulting obligations.
174 In addition to hypothesis H4b one could argue that the relationship between an ethical work climate of complying with firm rules and commercial software developers’ intention to reuse internet code in a way (potentially) violating resulting obligations is moderated by the existence of a firm-internal policy addressing the reuse of internet code by developers. This issue is addressed during research model testing (see Chapter 4.6.4), but no additional hypothesis regarding it is posited.
175 The subjective norm individuals perceive results from the opinion of a few others they interact with such as friends or colleagues while the ethical work climate they perceive has a much broader base reflecting practices and procedures agreed on within the organization they work in.
176 The considerations regarding the existence of a firm-internal policy addressing the reuse of internet code by developers laid out in footnote 174 for hypothesis H4b also apply to hypothesis H4d.
Having established the determinants of commercial software developers’ intention to reuse internet code in a way (potentially) violating resulting obligations, the second part of the research model addresses factors influencing the attitude commercial software developers hold regarding this behavior. As has been discussed before, an individual’s attitude toward a behavior is the sum of her behavioral beliefs regarding the positive and negative outcomes and consequences of performing the behavior (Fishbein & Ajzen 1975; Ajzen 1991). In the case of reusing internet code in a way (potentially) violating resulting obligations in commercial software development, expected utility theory and deterrence theory are useful to identify the behavioral beliefs which may influence commercial software developers’ attitude.
Expected utility theory
Benefits and costs as basic economic factors are commonly also claimed to be factors in human decision making. Building on this, expected utility theory (e.g. Fishburn 1970; Savage 1972; Schoemaker 1982) states that in situations with risky choices, a rational, self-interested individual will favor the behavior which maximizes her own utility. To identify this behavior the individual weighs the potential outcome of each alternative by taking into account its expected benefits and costs.177 When developing functionality for a software project, commercial software developers can usually choose between three alternatives. They can implement the software functionality themselves, they can reuse internet code making sure that all resulting obligations are met or they can reuse internet code in a way (potentially) violating resulting obligations. The expected utility of each alternative is the sum of all expected benefits of the respective alternative less its expected costs.
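The comparison of the three alternatives described above can be illustrated with a minimal sketch. It is not part of the original study: the `Alternative` type, the helper functions and all numeric values are hypothetical and serve only to show how "expected benefits less expected costs" leads to a preferred option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    """One option a developer can choose (names and numbers are hypothetical)."""
    name: str
    expected_benefits: float  # arbitrary utility units
    expected_costs: float     # same utility units


def expected_utility(alt: Alternative) -> float:
    """Expected utility = expected benefits less expected costs."""
    return alt.expected_benefits - alt.expected_costs


def preferred(alternatives: list[Alternative]) -> Alternative:
    """Return the alternative with the highest expected utility."""
    return max(alternatives, key=expected_utility)


# Purely illustrative values, not taken from the study.
ALTERNATIVES = [
    Alternative("implement functionality oneself", 10.0, 8.0),
    Alternative("reuse internet code, meeting all obligations", 10.0, 5.0),
    Alternative("reuse internet code, (potentially) violating obligations", 10.0, 4.0),
]

if __name__ == "__main__":
    for alt in ALTERNATIVES:
        print(f"{alt.name}: {expected_utility(alt):.1f}")
    print("preferred:", preferred(ALTERNATIVES).name)
```

With these made-up values the third alternative is preferred only because its assumed costs are lowest; raising its expected costs, for example through a higher perceived punishment, would change the outcome.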
177 Benefits and costs do not need to be monetary because individuals are assumed to convert noncomparable items such as time saved by reusing internet code and punishment by the employer into a comparable unit through a utility function.
Developers will favor reusing internet code in a way (potentially) violating resulting obligations if the expected utility of this option is greater than that of the other two options. The benefits and costs involved in evaluating the alternatives can be incorporated into the research model as antecedents of attitude toward the behavior (Peace et al. 2003; Buchan 2005). High perceptions of the expected benefits of reusing internet code in a way (potentially) violating resulting obligations should lead to a more positive attitude toward the behavior while high perceptions of the expected costs of the behavior should lead to a
more negative attitude (Ajzen 1991; Peace et al. 2003). In order to extend the research model in this way, benefits and costs involved are discussed in the following.
Benefits: Usefulness of internet code. As the first benefit, internet code reuse can make a developer’s job easier by allowing her to solve a technical problem she could not solve herself or by allowing her to develop better software in shorter time (see Chapter 3.2.2 for an overview of the potential positive effects of code reuse). Different developers may perceive these benefits differently. For example, one developer from the qualitative pre-study explains that, “I feel the open source community is a lifesaver and I use the internet daily to do my work, sometimes to find modules [i.e. components], but mostly […] to find examples of how to achieve a task correctly in complicated logic.” Quite to the contrary, another developer comments, “I have never found code from the internet to be useful beyond showing an approach to some subject. Porting others’ code is too hard to be worth the trouble.” It seems likely that the first developer holds a more positive attitude toward internet code reuse (potentially) violating resulting obligations because he needs internet code as a “lifesaver” for his job. Even if he incurs expected costs from violating obligations, these costs should be made up for by the high benefit he perceives from the reuse of internet code. Quite differently, the second developer should have a more negative attitude toward internet code reuse (potentially) violating resulting obligations because he perceives few benefits from internet code reuse which could offset the costs possibly resulting from (potentially) violating obligations. In his own words the second developer explains, “this [low usefulness of internet code] makes the ethical considerations of using another’s code very simple: It’s too much work to be worth the trouble.” Summarizing, developers who perceive a higher usefulness of internet code should also be more positive toward reusing it in a way (potentially) violating resulting obligations.
H5a: Usefulness of internet code will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
Benefits: Severity of time pressure.178 The previous argumentation has stated that internet code reuse can help make developers’ job easier by increasing the effectiveness, efficiency and quality of their work. In addition to these direct benefits, internet code reuse also offers the indirect benefit of helping developers to avoid missing deadlines and facing the resulting negative consequences. Since experts from the qualitative pre-study point out that “there is enormous pressure both on costs and time in today’s commercial software development”179 (software development manager interviewed in qualitative pre-study), this effect of internet code reuse should be a special consideration of commercial software developers. More generally, research on software quality shows that developers under time pressure tend to take “short-cuts” in order to meet their deadlines (e.g. Brooks 1975; Paté-Cornell 1990). Such “short-cuts” are “[…] decisions made in private that are motivated by a desire to stay on schedule, but are not in the best interest of the project” (Austin 2001, p. 195). It seems likely that reusing internet code and due to time pressure not considering (potentially) resulting obligations is one such “short-cut” for commercial software developers in which they “[…] hope for the best [and] leave potential sources of difficulty [such as reuse obligations] unexplored […]” (Austin 2001, p. 195). In a theoretical model Austin (2001) shows that developers who perceive more severe consequences from missing deadlines, such as not being considered for promotions or pay raises are more likely to take “short-cuts” ignoring the potential negative issues which might follow as a result of the “short-cut.” Transferring this logic to the context of reusing internet code in a way (potentially) violating resulting obligations is supported by the comment of a developer from the qualitative pre-study: “A high-stress environment requires the programmer to take risks like using code from the internet if available, to meet deadlines.
Usually the programmer doesn't have the liberty to delay a product. It's a sink or swim. I worked within a high-stress environment and was in a familiar quandary: When to ask a question and risk to be labeled an idiot? Should I spend time figuring out a problem or seek help? The chance is high that the question is deemed elementary and hence, would be labeled as sub-par for the assignment. However, not asking would delay development. This is a common lose-lose situation and ripe for quickly grabbing code from a risky source like the internet.” The outlined benefit of avoiding consequences from missing deadlines should be more appealing to developers working in firms where missing deadlines and thus time pressure is perceived to be more severe than to other developers. Consequently, the first group of developers should also be more positive toward reusing internet code in a way (potentially) violating resulting obligations.
H5b: Severity of time pressure in the developer’s firm will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
178 The research model assumes a relationship between the severity of not meeting deadlines and developers’ attitude toward reusing internet code in a way (potentially) violating resulting obligations. It does not test a relationship between the general existence of time pressure and developers’ attitude. The rationale for this is that while most software development projects have some share of time pressure, developers will only react to this if they perceive “severe” consequences from not meeting the resulting deadlines. Further, the existence of time pressure may vary from project to project and could even differ for different points in time within a project. Contrary to this, the severity of time pressure should be rather stable over time within a firm and thus relates better to attitude which is also a construct expected to be rather stable over time.
179 Translated from German.
Benefits: Cost of compliance. The last relevant benefit is avoiding the cost for ensuring compliance with obligations when reusing internet code. This cost which developers might want to avoid can be broken down into two components. The first one addresses the costs for investigating which obligations come with the internet code the developer wants to reuse while the second one relates to the costs borne for ensuring that all obligations previously identified are accounted for properly. Similar to the perceived usefulness of internet code reuse, developers can also hold different perceptions regarding these two cost components and thus expect different levels of benefits from avoiding the cost. Concerning the costs to thoroughly check for the obligations of internet code one developer of the qualitative pre-study for instance finds them to be rather high and explains, “my problem is that licenses are often written in legalese which is hard to comprehend.” It seems likely that developers with this position consider avoiding the cost of compliance by not checking for reuse obligations thoroughly as an attractive option.
To the other extreme, another developer has exactly the opposite opinion: “Software license issues are easy to check, and clearly nobody should integrate code without checking the license.” The statement nicely emphasizes that due to the perceived low cost of compliance this developer considers not checking for reuse obligations an unattractive behavior. The same discrepancy can also be observed with regard to the second component addressing the costs for accounting for the obligations. Here developers typically have to
engage with others in their firm (often their supervisors) to determine whether and how the obligations of the code can be fulfilled.180 If developers expect high costs in the form of a lengthy and difficult discussion with their firm combined with a high likelihood of not being allowed to reuse the internet code in the end, they might very well consider avoiding this step, reusing the code right away and violating the obligations they are aware of as attractive. Strod (2009, p. 1) provides an illustrative example for a time-consuming process which developers must pass through to ensure compliance with the obligations of the internet code they want to reuse: “In one major software company, developers must fill out a 10-page form and make a presentation before a review board in order to use open source code.” It would not be too surprising to find developers in this firm who deem avoiding these substantial costs and simply reusing internet code without accounting for resulting obligations a reasonable choice. Once more a developer from the qualitative pre-study has a perception opposite to the situation described by Strod: “Where I work we have used some open source code, but done it properly. If I wanted to use some [internet code], I would ask our CTO […]. This would be as easy as walking into his office and having a five minute chat about it.” The difference in the time and effort necessary between filling out a 10-page form and a short conversation with one’s manager suggests that developers might consider simply ignoring the obligations of the internet code they have reused if they perceive the cost of compliance to be too high, because the process is very difficult or has a high probability of not allowing the developer to reuse the code at stake.
H5c: Cost of compliance will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
The second factor on the cost side is the potential punishment which can result for the individual software developer and/or for her firm from (potentially) violating obligations from reused internet code.181 This cost type is closely linked to deterrence theory which is a particular implementation of expected utility theory in the context of punishment. It is discussed in the following.
180 This is necessary as for example software developers can typically neither decide by themselves whether proprietary code affected by a “reciprocity”-effect can be made available under an OSS license nor can they legally assign a license to code for which they do not hold the copyright.
181 For the firm punishment can only result from actually violating obligations while for the developer punishment may result from potentially and actually violating obligations.
Deterrence theory
In deterrence theory literature punishment is usually decomposed into punishment severity and punishment certainty (e.g. Tittle 1980). Related to the expected utility theory, deterrence theory proposes that the level of illegal behavior decreases when punishment severity and/or punishment certainty are increased (Ehrlich 1973, 1996). In the context of IS, Straub (1990) finds support for the deterrence theory in the context of computer abuse in organizations and Peace et al. (2003) identify a link between both punishment severity and punishment certainty and the intention of employees to pirate software at their workplace. Similar to usefulness of internet code, severity of time pressure and cost of compliance, punishment severity and punishment certainty are factors which directly relate to the expected outcome of reusing internet code in a way (potentially) violating resulting obligations. Thus, in the logic of the research model they should affect the attitude of individual commercial software developers. In the context of reusing internet code in a way (potentially) violating resulting obligations, two different types of punishment severity need to be considered: the severity of the punishment for the firm and the severity of the punishment for the individual developer.182
Costs: Punishment severity (firm). Regarding punishment severity for the firm there should be developers who are well aware of the consequences which their employers might face if they violate obligations from internet code reuse (see Chapter 4.2.1 for an overview of these consequences) and who consider these consequences as significant. One example for such a developer is a participant of the qualitative pre-study who explains, “a license can be legally enforced.
You have to read it before use.” This statement illustrates that developers who know that violating obligations from reused code could create substantial (legal) trouble for their employer should have a more negative view on reusing internet code in a way (potentially) violating resulting obligations. On the contrary there should however also be developers who are less aware of the issues which can result for their employer from incorrectly reusing internet code. Again a developer from the qualitative pre-study illustrates this by stating that “many of the snippets I find on the internet can be considered folklore. Thus, there is no need to refrain from reusing them.”183 It seems plausible that this developer also holds a more positive view on reusing internet code in a way (potentially) violating resulting obligations since he perceives the consequences of this behavior for his firm as less severe. Summarizing, developers who perceive a high punishment severity for their firms because they are more aware of the consequences of violating internet code obligations should be less positive toward reusing internet code in a way (potentially) violating resulting obligations.
H6a: Punishment severity for the developer’s firm will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
182 Deterrence theory as applied by Straub (1990) and Peace et al. (2003) assumes that individuals react to the punishment they personally have to expect following a certain behavior. However, Thornton et al. (2005) point out that individuals may also react to the punishment their employers have to expect.
183 Translated from German.
Costs: Punishment severity (developer). Regarding punishment severity for the individual developer, the qualitative pre-study has revealed that there exist firms with explicit rules on how their developers have to deal with internet code and that some of these firms also strictly enforce these rules. As one developer emphasizes this point, “I work for a company that expressly prohibits including open source software. […] if I were to cut and paste [internet code], it would cost me my job.” While this particular firm bans reuse of internet code altogether, there are also firms which allow only the reuse of code under certain licenses or only code with certain obligations and have introduced punishments for developers not complying with these rules. Yet, on the other side, there seems to be also a large number of firms which do not address the topic of obligations from internet code reuse at all and which thus also should not have established any form of punishments for developers who violate or potentially violate obligations of internet code. As the CTO of a software development firm which actively manages the reuse of internet code interviewed in the qualitative pre-study explains, “I believe that 80% of the firms developing software are not aware of the risks of internet code.
[…] And of the remaining 20%, 5% to 15% knowingly ignore the potential issues.”184 As a result, developers who, based on the rules of their firm, perceive a high punishment severity for themselves when reusing internet code in a way (potentially) violating resulting obligations should have a less positive attitude toward the behavior.
H6b: Punishment severity for the developer will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
184 Translated from German.
Costs: Punishment certainty.185 Similar to punishment severity which is differentiated into the punishment for the firm and for the individual developer, punishment certainty can refer to detecting that the firm’s software contains internet code and does not account for the resulting obligations on the one hand and to identifying the individual developer who is responsible for this issue on the other hand. The research model focuses however only on the probability of detecting internet code reused in the firm’s software, as it should generally be feasible to trace which developer was responsible for developing the part of the software in which a violation of an obligation has surfaced. It is generally assumed that “determining whether [internet code] is present in a corporation’s code base is a difficult task to perform accurately” (McGhee 2007, p. 8). Yet, recently organizations such as gpl-violations.org186 have been founded to actively pursue the violation of obligations from reused internet code by commercial firms. Further, experts in combing software for reused internet code such as Hemel (2008) have published hands-on instructions for copyright holders to investigate whether their code is reused by firms without paying attention to the resulting obligations. Developers who are more aware of these recent developments should perceive a higher punishment certainty. Beyond that, there are various other factors which might influence the punishment certainty developers perceive such as the programming languages employed (as the binary code created by some programming languages can be analyzed more easily than that of others) or the deployment mode of the software (e.g. embedded software vs. standalone software or few customers vs. many customers). As a result, developers who perceive a high punishment certainty for the software they develop should be less positive toward reusing internet code in a way (potentially) violating resulting obligations.
H6c: Punishment certainty will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
185 Punishment certainty captures the likelihood that somebody outside of the developer’s firm determines that the firm’s software includes internet code and does not account for the resulting obligations. Typically, somebody outside of the developer’s firm has only access to the binary code of the software. The likelihood of the developer’s firm determining itself that its own software reuses internet code and does not account for the resulting obligations is covered through the “controllability” portion of perceived behavior control.
186 http://gpl-violations.org, last accessed 12.01.2010.
Summarizing, the proposed research model (see Figure 4-2) suggests that in line with TPB the intention of commercial software developers to reuse internet code in a way (potentially) violating resulting obligations is determined by their attitude toward the
behavior (H1), the subjective norm they perceive (H2) and their perceived behavioral control (H3). Further, derived from ethical work climate theory, an ethical work climate of complying with laws and codes is conjectured to affect intention (H4a) as well as perceived subjective norm (H4c). The same two effects are proposed to exist for an ethical work climate of complying with firm rules (H4b, H4d). As determinants of attitude, expected utility theory suggests perceived usefulness of internet code (H5a), perceived severity of time pressure (H5b) and perceived cost of compliance (H5c). Further, following deterrence theory, relationships between attitude and perceived punishment severity for the individual developer’s firm (H6a) and herself (H6b) as well as perceived punishment certainty (H6c) are inferred.
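The hypothesized paths summarized above can also be written down as a small lookup table. The following is a minimal sketch and not part of the original study: the construct labels and expected signs follow hypotheses H1 to H6c as stated in the text, while the `Path` type and the helper `paths_into` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Path(NamedTuple):
    """One hypothesized path: predictor -> outcome with an expected sign."""
    hypothesis: str
    predictor: str
    outcome: str
    sign: str  # "+" or "-"


# Labels and expected signs as stated in hypotheses H1 to H6c.
RESEARCH_MODEL = [
    Path("H1", "attitude", "intention", "+"),
    Path("H2", "subjective norm", "intention", "+"),
    Path("H3", "perceived behavioral control", "intention", "+"),
    Path("H4a", "ethical work climate: law & code", "intention", "-"),
    Path("H4b", "ethical work climate: rules", "intention", "-"),
    Path("H4c", "ethical work climate: law & code", "subjective norm", "-"),
    Path("H4d", "ethical work climate: rules", "subjective norm", "-"),
    Path("H5a", "usefulness of internet code", "attitude", "+"),
    Path("H5b", "severity of time pressure", "attitude", "+"),
    Path("H5c", "cost of compliance", "attitude", "+"),
    Path("H6a", "punishment severity (firm)", "attitude", "-"),
    Path("H6b", "punishment severity (developer)", "attitude", "-"),
    Path("H6c", "punishment certainty", "attitude", "-"),
]


def paths_into(outcome: str) -> list[Path]:
    """Return all hypothesized paths that point at the given outcome."""
    return [p for p in RESEARCH_MODEL if p.outcome == outcome]


if __name__ == "__main__":
    for outcome in ("intention", "subjective norm", "attitude"):
        labels = [f"{p.hypothesis} ({p.sign})" for p in paths_into(outcome)]
        print(f"{outcome}: {', '.join(labels)}")
```

Listing the paths by outcome makes the structure of the model explicit: attitude, subjective norm and intention are the three dependent constructs, and each hypothesis contributes exactly one signed path.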
Figure 4-2: Internet code reuse obligation violation research model (path diagram of the hypothesized relationships H1 to H6c between the model constructs)
In order to answer the research questions presented in Chapter 4.2.3 and to test the research model developed in this chapter, data on internet code reuse by individual commercial software developers and (potential) violations of obligations resulting from this were collected with a survey. The design of this survey and the process of conducting it are discussed in the next section.
4.4. Survey design and methodology
4.4.1. Data source and sample selection
The research objects of this study are individual software developers working in commercial firms. Obviously, there exists no complete directory which would have allowed contacting a large number of developers from a broad range of firms for the survey. However, given the need for commercial developers from many different firms,187 participants in software development newsgroups seemed to be a well-suited population. Newsgroups are similar to discussion forums on web pages, yet differ from these as they rely on a different technical infrastructure (e.g. Hauben & Hauben 1997, p. 161 ff.). They allow many users in different locations to post, read and reply to messages. Software development is a popular topic in these newsgroups and it seemed likely that many of the people discussing software development topics there would also develop software for a living. In total, 528 different newsgroups dealing with software development could be identified. In order to be able to contact software developers active in these groups, a database was constructed by downloading all messages available in the 528 software development newsgroups with a self-developed Java program in late July 2009.188
Using the database, the survey population was constructed as follows (see Figure 4-3). The database contains a total of 1,314,336 messages posted in the 528 newsgroups between July 14th, 2006 and July 16th, 2009. These messages had been posted by 93,541 unique participant profiles.189 Of this total frame population, 55,329 participant profiles or about 59% had not posted in any one of the 528 newsgroups after July 1st, 2008 and were thus excluded from the survey as they seemed to be inactive.190 Of the remaining 38,212 participant profiles, 13,525 were excluded due to various issues. First, several participant profiles seemed to refer to the same participant who was active in the newsgroups with different email addresses. Further, for some profiles it could safely be assumed after manual inspection that the participant was not a software developer.191 Finally, some profiles did not contain valid email addresses.192
187
Developers from only one or only a few firms might be rather homogeneous in their beliefs and opinions since they are influenced by the same firm or firms.
188
Newsgroup messages are deleted after a certain period of time. The oldest messages included in the database had been posted on 14.07.2006.
189
A participant profile had been created for each unique email address in the database.
190
The date of 01.07.2008 was chosen as a cut-off date in order to address with the survey only newsgroup participants who had been active at least once in the twelve months before the survey was sent.
Figure 4-3: Construction of internet code reuse survey population (participants in software development newsgroups between July 2006 and July 2009*; shares in % of total participant profiles)
Total participant profiles                        93,541   100.0%
Participant profiles inactive after 01.07.2008   -55,329    59.1%
Participant profiles active after 01.07.2008      38,212    40.9%
Participant profiles with issues**               -13,525    14.5%
Participant profiles without issues               24,687    26.4%
Pretest population                                -1,212     1.3%
Survey population                                 23,475    25.1%
*The exact cut-off dates are 14.07.2006 and 16.07.2009
**E.g. clearly invalid email address, duplication of other profile, participant clearly not active in software development
The remaining 24,687 participant profiles seemed to be valid regarding the above elimination criteria. As a last step, a further 1,212 participant profiles were removed from the survey population because they had been asked to pretest the survey (see Chapter 4.4.3). After all these adjustments, 23,475 participant profiles remained available as the population for the final survey, equaling 25% of all participant profiles initially constructed from the database. Of these 23,475 participant profiles, a random sample of 14,000 was drawn and invited to participate in the survey.
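The construction of the survey population described above is simple arithmetic and can be retraced in a few lines. The following is a minimal sketch, not the Java program used in the study: the counts are taken from the text, while the variable names, the `share` helper and the seeded random draw are assumptions made for illustration only.

```python
import random

# Counts reported in the text; all names here are illustrative assumptions.
total_profiles = 93_541   # unique participant profiles in the database
inactive = 55_329         # no posting after 01.07.2008
with_issues = 13_525      # duplicates, non-developers, invalid addresses
pretest = 1_212           # profiles used for the pretest

active = total_profiles - inactive
valid = active - with_issues
survey_population = valid - pretest


def share(count, base=total_profiles):
    """Return count as a percentage of base, rounded to one decimal."""
    return round(100 * count / base, 1)


funnel = {
    "Total participant profiles": total_profiles,
    "Inactive after 01.07.2008": inactive,
    "Active after 01.07.2008": active,
    "Profiles with issues": with_issues,
    "Profiles without issues": valid,
    "Pretest population": pretest,
    "Survey population": survey_population,
}
for label, count in funnel.items():
    print(f"{label:<28}{count:>8,}{share(count):>7.1f}%")

# Hypothetical reproduction of the random draw of 14,000 invitees;
# the seed is arbitrary and only makes the sketch repeatable.
invited = random.Random(0).sample(range(survey_population), 14_000)
```

Sampling without replacement, as `random.Random.sample` does, matches the idea that each profile can be invited at most once.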
4.4.2. Survey design
The survey was conducted via an online questionnaire (see Appendix A.2.2).193 Its design and implementation followed the same general considerations and the same process as laid out in Chapter 3.5.2 for the survey among OSS developers. For the sake of brevity these considerations and this process are not repeated here.
191
This was e.g. the case if the participant had only posted to the newsgroups trying to sell something not related to software.
192
Some newsgroup participants use fake email addresses in order to avoid receiving spam mail.
193
The online questionnaire was developed using the OSS survey application LimeSurvey (http://www.limesurvey.com, last accessed 12.06.2009).
Most questions of the survey were designed as mandatory questions. Exceptions were demographic questions because some cultural groups do not feel comfortable providing this information. Nearly all questions were conditional in order to ensure that developers were only presented questions relevant to them. This was especially important because the survey had to accommodate four different types of participants: first, people currently developing software for a firm; second, people who have formerly developed software for a firm; third, people who develop software but have never done so for a firm; and fourth, people who are not capable of developing software but had still received an invitation to the survey. Based on their individual situations, members of the four groups were presented only questions adequate for them.
Beyond the general considerations of designing a survey, which were addressed similarly to the survey in Chapter 3, the ethical nature of this survey required some additional deliberations. As Randall and Gibson (1990) point out, a key problem of survey research addressing ethical topics is the omission of contextual information necessary to elicit realistic decision making. As a solution they suggest Fredrickson’s (1986) scenario methodology as a means to embed realism into ethics surveys. This methodology suggests using a scenario to provide a standardized decision stimulus to the survey participants. Similarly, Cavanagh and Fritzsche (1985) propose using scenario vignettes to capture real situations and make conditions comparable for each respondent. Based on these suggestions, scholars in IS (e.g. Banerjee et al. 1998; Thong & Yap 1998; Pierce & Henry 2000; Leonard et al. 2004; Moores & Chang 2006; Haines & Leonard 2007), but also in other research domains such as sales and marketing (e.g. DeConinck & Lewis 1997) or health care (e.g. Randall & Gibson 1991) have employed scenario vignettes in their studies of ethical topics.194 Following this stream of research, this survey also used scenario vignettes to infuse realism into the questionnaire.
The vignettes employed are based on the Code of Ethics and Professional Conduct adopted by the Association for Computing Machinery (ACM) in 1992. The purpose of the code was to “[…] deter unethical behavior of the members [of the ACM]” by “[…] list[ing] possible violations and threaten[ing] sanctions for such violations” (Anderson et al. 1993, p. 98). When presenting and discussing this code, Anderson et al. (1993) list nine cases describing situations calling for ethical decision making in the IS domain. The first one of these cases (see Appendix A.2.1) describes a commercial software developer who, under time pressure and stuck with technical problems, reuses existing code without thoroughly checking for all obligations of the code and without accounting for those obligations she is aware of.195 Using the learnings from the qualitative pre-study (see Chapter 4.3.1), the existing case of Anderson et al. (1993) was modified in such a way that it reflects the situation of a commercial software developer reusing internet code today as accurately as possible. The modified case now presents a commercial software developer named Joe who is under time pressure to complete his module of a software development project and who is not sure how to implement a certain piece of functionality specified for his module. In order to resolve this situation Joe reuses internet code in ad-hoc fashion.
In order to account for the heterogeneity of internet code reuse behaviors which (potentially) violate obligations (see Section 4.3), three different scenarios were derived from the above base case:196
− In scenario 1 (see Figure A-2 in the Appendix) the form of internet code which Joe reuses is a snippet for which he does not check thoroughly whether there are obligations which need to be fulfilled when reusing it.
− In scenario 2 (see Figure A-3 in the Appendix) the form of internet code which Joe reuses is a component and, similar to scenario 1, he does not check thoroughly whether there are obligations which need to be fulfilled when reusing the component.
− In scenario 3 (see Figure A-4 in the Appendix) Joe, like in scenario 1, reuses a snippet. However, he does check for the obligations resulting from reuse and finds that the snippet carries a “reciprocity”-effect similar to that of the GPL (see Chapter 4.2.1). Joe believes that discussing compliance with the “reciprocity”-effect with his firm would take a long time and sees a chance that his firm would not consider compliance as an option but rather forbid him to use the snippet. Because of that he chooses to simply ignore the obligation, alters the snippet a little bit and integrates it into his work.
194
In a review of 174 recent empirical studies investigating ethical situations, O’Fallon and Butterfield (2005) find that 55% of those studies rely on scenarios.
195
Reflecting the time when this code of conduct was developed, the developer obviously does not reuse internet code, but code that a colleague has made her aware of.
196
While the three scenarios derived do not cover all possible behaviors (potentially) violating obligations when reusing internet code, the qualitative pre-study results suggest that these scenarios reflect the most common behaviors.
All three scenarios present a positive ending, in the short run at least, in which Joe manages to deliver his module with all required functionality on time and in which the software development project of Joe’s firm is commercially successful.
The resulting questionnaire consists of six sections. Sections one to four and section six were identical for each respondent, while for section five one of the three scenarios was randomly selected for each developer invited to participate in the survey. The resulting survey structure is as follows:
− 1. Demographics: The survey starts out with a general section addressing demographic information about the participants. Moreover, this section inquires about the participants’ software development history to find out for which situations they have ever developed software. The following sections and their questions build on this information. Participants who have never developed software in any form are directed to the end of the survey right away.
− 2. Software development in the participant’s firm: Part two deals with questions regarding the current or last firm for which the participants have been developing software and their work for this firm.197
− 3. Reuse of internet code: The next section focuses on the reuse of internet code and asks the participants about the role of internet code for their work at their current or last firm and their familiarity with obligations which may result from the reuse of internet code.198
− 4. Scenario setting: This section informs the developers that they will be presented a scenario in the following and asks them to assume that they are in the situation described in the scenario while working at the last firm they have been developing software for. If the participants have never worked as software developers for a firm, they are asked to imagine a fictitious firm they work for as a software developer.199
− 5. Internet code reuse scenario: Part five presents one of the three scenarios described above to the participants and inquires how the participants would behave if they were in the situation described in the scenario.
− 6. Concluding questions: The survey ends with a block of five quiz questions to objectively assess participants’ knowledge about obligations which may come with reused internet code and an instrument to assess the effect of social desirability in participants’ answers.
197
Obviously, this section is only presented to participants who either currently develop software for a commercial firm or have done so in the past.
198
Participants who have never developed software for a commercial firm are only asked about their familiarity with obligations from reused internet code.
199
In the analyses referring to commercial software developers presented later these developers are not considered.
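The random selection of one scenario per invited developer for section five can be sketched as follows. This is an illustration under stated assumptions, not the mechanism actually configured in the survey application: it assumes an independent, uniform draw per invitee, and the function name `assign_scenarios`, the integer participant identifiers and the seed are hypothetical.

```python
import random
from collections import Counter

SCENARIOS = (1, 2, 3)  # the three vignettes described in the text


def assign_scenarios(participant_ids, seed=None):
    """Assign each invitee one scenario by an independent uniform draw.

    Uniform, independent assignment is an assumption; the text only
    states that the scenario was randomly selected per invitee.
    """
    rng = random.Random(seed)
    return {pid: rng.choice(SCENARIOS) for pid in participant_ids}


# Hypothetical identifiers for the 14,000 invited profiles.
assignment = assign_scenarios(range(14_000), seed=42)
counts = Counter(assignment.values())
for scenario in SCENARIOS:
    print(f"scenario {scenario}: {counts[scenario]:,} invitees")
```

With an independent draw the three groups are only approximately equal in size, which is one possible reason why group sizes among respondents can differ even before differences in response behavior come into play.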
4.4.3. Pretest
Similar to the survey among OSS developers (see Chapter 3.5.3), a pretest was conducted before launching the survey to make sure that the survey questions can be understood well and that all relevant answers are available for selection (Bortz & Döring 2003, p. 331; Schnell et al. 2005, p. 347). Again, four academic peers first provided feedback on the questionnaire, checking question types, phrasing, presentation and the order of the questions. Following that, in June and July 2009 two rounds of pilot studies were conducted with 300 and 1,000 newsgroup participants selected at random from the database. These pilot studies had the primary purpose of assessing the quality of the instruments employed and determining whether newsgroup participants were willing to take part in the survey. The feedback received was very positive and the overall structure of the survey and its questions did not need any changes. Based on the comments received, minor changes were applied to the wording of some of the questions to avoid misunderstandings.
4.4.4. Conducting the survey
Of the total survey population of 23,475 participant profiles, 14,000 were selected at random and sent an email invitation to take part in the survey. In order to personalize the invitation email, the real name of each newsgroup participant was extracted from the database and used in the invitation text.200 Similar to the survey conduct described in Chapter 3.5.4, Dillman’s (1978, p. 12) suggestions were followed in order to achieve a high response rate. Again, participants were sent an email containing a direct link to the survey, which they only had to click. Furthermore, the questionnaire was designed in such a way that it should not take more than 15 minutes to complete. To maximize participants’ benefit of taking the time to complete the survey, they were promised a detailed aggregate report of the data and given the option to sign up for a raffle giving away ten book gift certificates. Finally, credibility was built with the participants by leveraging the reputation of Technische Universität München.
The survey was active from August 2009 to November 2009. Of the 14,000 emails sent, 2,227 could not be delivered. While the participant profiles had been scoured for invalid email addresses (see Chapter 4.4.1), the high number of email invitations which could not be delivered reflects that some newsgroup participants apparently use fake contact information in their profiles. Of those newsgroup participants who did receive an invitation, 1,171 completed the survey (see Table 4-1), yielding a response rate of 9.9%, which is in line with the typically low response rates of web surveys (Couper 2000). Of the 1,171 responses, 38 had to be eliminated due to inconsistent or corrupt entries, resulting in a final data set with 1,133 observations. 412 answers relate to scenario 1, 343 to scenario 2 and 378 to scenario 3.
Table 4-1: Internet code reuse survey response statistics
Total invitations sent                                   14,000
  thereof delivered to designated recipients             11,773
  thereof not delivered to designated recipients          2,227
Total questionnaires completed                            1,171
Total response rate (based on delivered invitations)       9.9%
Inconsistent or corrupt responses                            38
Total usable questionnaires completed                     1,133
  thereof with scenario 1                                   412
  thereof with scenario 2                                   343
  thereof with scenario 3                                   378
200
For most participant profiles a real name could be identified. For the others the nickname of the participant as stated in their newsgroup postings was used.
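The figures in Table 4-1 follow from a few subtractions and one division, retraced in the sketch below. The numbers are those reported in the text; the variable names are assumptions chosen for readability.

```python
# Figures reported in Table 4-1; variable names are illustrative only.
invitations_sent = 14_000
undelivered = 2_227
completed = 1_171
discarded = 38  # inconsistent or corrupt responses
usable_by_scenario = {1: 412, 2: 343, 3: 378}

delivered = invitations_sent - undelivered
response_rate = completed / delivered  # based on delivered invitations
usable = completed - discarded

print(f"delivered invitations: {delivered:,}")
print(f"response rate:         {100 * response_rate:.1f}%")
print(f"usable questionnaires: {usable:,}")
for scenario, n in usable_by_scenario.items():
    print(f"  scenario {scenario}: {n} ({100 * n / usable:.1f}% of usable)")
```

Note that the response rate is computed on delivered rather than sent invitations; based on all 14,000 invitations it would be lower.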
To estimate the presence of common method bias in the survey data, Harman’s one-factor test was employed. In this test all variables of a model are loaded onto a single factor in a principal component factor analysis. A significant amount of common method bias is assumed to exist if only one factor emerges or if one factor explains the majority of all the variance in the data (Podsakoff et al. 2003). Neither of these two conditions holds for the data of this study since the maximum variance explained by one factor is 22.7 percent and a total of ten factors have eigenvalues greater than 1. Consequently, there should not be large common method bias in the data.
Besides common method bias, another issue which could affect the results of this investigation is social desirability. According to Podsakoff et al. (2003, p. 881) social desirability is a “[…] tendency on the part of individuals to present themselves in a favorable light, regardless of their true feelings about an issue or topic.” Effects resulting from this tendency are particularly relevant for studies with ethical topics because of their sensitive nature (Beck & Ajzen 1991; Randall & Fernandes 1991).
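The logic of Harman’s one-factor test referred to above can be sketched with an unrotated principal component analysis of the item correlation matrix. The example below is an assumption-laden illustration and not the analysis run for this study: the data are synthetic, the function name `harman_single_factor` is hypothetical, and the number of respondents, items and latent factors is chosen arbitrarily.

```python
import numpy as np


def harman_single_factor(items):
    """Unrotated principal components of the item correlation matrix.

    Returns the share of total variance explained by the first
    component and the number of components with eigenvalue > 1.
    """
    corr = np.corrcoef(items, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted descending
    first_share = eigenvalues[0] / eigenvalues.sum()
    n_above_one = int((eigenvalues > 1.0).sum())
    return first_share, n_above_one


# Synthetic data (assumption): 500 respondents, 12 items driven by
# three independent latent factors with four items each.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
loadings = np.kron(np.eye(3), np.ones((1, 4)))  # shape (3, 12)
items = latent @ loadings + rng.normal(scale=0.8, size=(500, 12))

first_share, n_factors = harman_single_factor(items)
print(f"variance explained by first factor: {100 * first_share:.1f}%")
print(f"factors with eigenvalue > 1:        {n_factors}")
```

In this synthetic setting several factors emerge and none explains the majority of the variance, which is the pattern the test interprets as the absence of strong common method bias.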
Following Niederhof (1985) several steps had been taken in order to minimize social desirability effects. First, the survey was administered through an online application in anonymous form. Second, the scenarios employed to describe the unethical behavior helped to create some psychological distance between the respondent and the protagonist. Third, all critical survey items were presented in a nonthreatening, neutral tone. Yet, because such measures can only reduce social desirability effects and do not rule them out completely, the effect of social desirability was also measured in the questionnaire. The method employed most frequently to assess social desirability effects is the Marlowe-Crowne social desirability scale (Crowne & Marlowe 1960).201 In order to spare survey participants answering the 33 items of the original scale, a shorter version developed by Strahan and Gerbasi (1972) with only ten items was applied in this study.202
In order to assess the degree of social desirability effects in the data, the correlations between the social desirability scale and the constructs of the research model are analyzed. Most correlations are significant (see Table 4-2), suggesting the existence of social desirability effects. For example, participants seem to report an intention to (potentially) violate obligations lower than it actually is and present the two dimensions of ethical work climate inquired about more positively than they actually are. However, correlations are rather weak, suggesting that there are only mild social desirability effects.
Table 4-2: Correlation of social desirability scale with other variables
Variable (research model constructs and other variables)   Correlation
Intention                                                   -0.14***
Attitude                                                    -0.07**
Subjective norm                                             -0.07**
Perceived behavioral control                                -0.06**
Ethical work climate: Law & code                             0.17***
Ethical work climate: Rules                                  0.14***
Usefulness of internet code                                 -0.10***
Severity of time pressures                                  -0.01
Cost of compliance                                          -0.08***
Punishment severity (firm)                                   0.06**
Punishment severity (developer)                              0.05*
Punishment certainty                                        -0.03
Past behavior                                               -0.06*
Importance of internet code reuse                           -0.05
* significant at 10%, ** significant at 5%, *** significant at 1%
Notes: With the M-C 1(10) scale a short form of the original Marlowe-Crowne social desirability scale (Crowne & Marlowe 1960) developed by Strahan and Gerbasi (1972) was employed; correlations displayed are Pearson product-moment correlation coefficients; multi-item research model constructs built as indices; Cronbach’s α of social desirability scale is 0.585; N=1,121.
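The correlations and significance markers in Table 4-2 follow a standard pattern that can be sketched as below. This is only an illustration: the data are synthetic, the names `pearson_with_stars`, `desirability` and `intention` are hypothetical, and the p-value uses the Fisher z-transform with a normal approximation, which is assumed here for simplicity and is not necessarily the test used in the study.

```python
import math
from statistics import NormalDist

import numpy as np


def pearson_with_stars(x, y):
    """Pearson r with an approximate two-sided p-value and star label.

    The p-value relies on the Fisher z-transform (normal approximation),
    which is reasonable for samples as large as the N=1,121 in the table.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = float(np.corrcoef(x, y)[0, 1])
    z = math.atanh(r) * math.sqrt(len(x) - 3)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    stars = "***" if p < 0.01 else "**" if p < 0.05 else "*" if p < 0.10 else ""
    return r, p, stars


# Synthetic illustration (assumption): a scale score and a construct
# index that is negatively related to it.
rng = np.random.default_rng(7)
desirability = rng.normal(size=1_121)
intention = -0.5 * desirability + rng.normal(size=1_121)

r, p, stars = pearson_with_stars(desirability, intention)
print(f"r = {r:.2f}{stars} (approximate p = {p:.4f})")
```

With more than a thousand observations even weak correlations reach conventional significance levels, which is why the text stresses the small size of the coefficients rather than their significance.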
Finally, similar to the survey among OSS developers (see Chapter 3.5.4) a late-response analysis (Armstrong & Overton 1977) was conducted to test whether the respondents are representative of the population (non-response bias). Survey participants on average were very fast in taking the questionnaire. 49% completed the survey on the day on which they had received the invitation. Due to this, participants who took the questionnaire more than three days after receiving the invitation have to be considered late respondents already. The late respondents account for about ten percent of the total respondents. Regarding the variables employed in the research model there are no significant differences between early and late respondents. Regarding other variables there is one significant difference. With 34.6 years of age on average, early respondents are significantly younger than late respondents with 37.5 years of age on average (paired t-test, p=0.0050). This difference in age is also reflected in significant differences in the years of software development experience (paired t-test, p=0.0007) and the years of experience as software developer in a commercial firm (paired t-test, p<0.0001). These differences might be caused by the fact that older respondents are more likely to have more social commitments (e.g. families) and/or jobs with more meetings etc. and thus could not respond to the survey invitation immediately. As a result, younger software developers might be overrepresented in the survey data.
Based on the survey data, the next section addresses the research questions regarding the reuse of internet code and the (potential) violations of resulting obligations by software developers in commercial firms in descriptive and exploratory fashion. After that, section 4.6 tests the research model with structural equation modeling techniques.
201
Thompson and Phua (2005) find more than 1,900 papers referencing the Marlowe-Crowne social desirability scale for the period from 1974 to 2002.
202
Of the different scales developed by Strahan and Gerbasi (1972) the M-C 1(10) scale was employed. With 256 cited references in the period from 1974 to 2002, Thompson and Phua (2005) find this short form developed by Strahan and Gerbasi the most popular reduced version of the Marlowe-Crowne social desirability scale.
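The late-response analysis described above splits respondents by response delay and compares the two groups. The sketch below illustrates this under explicit assumptions: the respondent records are synthetic, the field names `delay_days` and `age` are hypothetical, and the comparison uses Welch’s two-sample t statistic with a large-sample normal approximation for the p-value, which is swapped in here for simplicity and is not the paired t-test named in the text.

```python
import math
import random
from statistics import NormalDist, mean, variance

LATE_THRESHOLD_DAYS = 3  # threshold for late respondents stated in the text


def split_by_delay(respondents):
    """Split respondents into early and late groups by response delay."""
    early = [r for r in respondents if r["delay_days"] <= LATE_THRESHOLD_DAYS]
    late = [r for r in respondents if r["delay_days"] > LATE_THRESHOLD_DAYS]
    return early, late


def welch_t(a, b):
    """Welch's t statistic with a normal-approximation two-sided p-value."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Synthetic respondents (assumption): about ten percent respond late.
rng = random.Random(1)
respondents = (
    [{"delay_days": rng.randint(0, 3), "age": rng.gauss(34.6, 9)} for _ in range(900)]
    + [{"delay_days": rng.randint(4, 60), "age": rng.gauss(37.5, 9)} for _ in range(100)]
)

early, late = split_by_delay(respondents)
t, p = welch_t([r["age"] for r in early], [r["age"] for r in late])
print(f"early: {len(early)}, late: {len(late)}, t = {t:.2f}, approximate p = {p:.4f}")
```

The same comparison would be repeated for each variable of interest, for example the years of software development experience.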
4.5. Descriptive and exploratory analyses
4.5.1. Survey participants and their firms
Before addressing the research questions, this chapter provides data about the survey participants and the firms for which they develop software or have developed software.
Description of survey participants
Of the 1,133 survey participants whose demographics are summarized in Table 4-3 the vast majority is male (98%), on average 35 years old and lives in Europe (54%) or North America (27%). Participants are well educated (86% of them hold a university degree or currently study toward one, 15% even hold a Ph.D. or are enrolled in a Ph.D. program) and most of them have studied or study IT-related subjects such as computer science (49%), engineering (20%), physics (9%) or mathematics (8%).203
203
19% of the participants are still students.
Table 4-3: Demographics of internet code reuse survey participants
Age (mean: 34.9, median: 32)                                    Percentage
  1-19                                                           4%
  20-29                                                         34%
  30-39                                                         32%
  40-49                                                         17%
  50+                                                           13%
Region of residence
  North America                                                 27%
  South America                                                  3%
  Europe                                                        54%
  Asia and rest of world (RoW)                                  16%
Highest level of education
  Non-university education                                      14%
  Undergraduate or equivalent                                   35%
  Graduate or equivalent                                        35%
  Ph.D. or equivalent                                           16%
Subject of highest university degree*
  Computer Science or related subject                           49%
  Engineering or related subject                                20%
  Mathematics or Physics                                        17%
  Other                                                         14%
Self-assessment of software development skills*
  Basic                                                          2%
  Below average                                                  7%
  Average                                                       26%
  Above average                                                 46%
  Excellent                                                     19%
Software development exp. (mean: 13.8, median: 13)*
  Less than 1 year                                               1%
  1-4 years                                                     14%
  5-9 years                                                     23%
  10-14 years                                                   26%
  15-19 years                                                   13%
  20 years or more                                              23%
Exp. as software developer in firms (mean: 9.7, median: 7)*
  Less than 1 year                                               4%
  1-4 years                                                     29%
  5-9 years                                                     25%
  10-14 years                                                   23%
  15-19 years                                                    6%
  20 years or more                                              12%
Software development experience in…
  …commercial firms                                             77%
  …own entrepreneurial activities                               43%
  …private purpose software development                         79%
  …OSS projects                                                 54%
  …education                                                    78%
  …other software development activities                         8%
  No software development experience at all                      1%
*Percentages refer only to those participants for whom the segmentation is applicable, e.g. “self-assessment of software development skills” refers only to those respondents who have ever developed software.
Notes: “Experience” is abbreviated with “exp.”; N=1,133.
77% of the survey participants have experience as software developers working for commercial firms. 43% have developed software for their own entrepreneurial ventures and 79% have developed software for private purposes. 78% have developed software during their education and 54% have contributed code to OSS projects. A further 8% have developed software for other purposes and only 12 participants (1%) have never developed software at all. On average the participants with software development experience have been programming for 13.8 years. Interestingly, 46% consider their software development skills as “above average” and 19% even think of themselves as “excellent” software developers. The participants who have developed software for commercial firms have an average experience of 9.7 years in this job.
Three aspects stand out in the demographics just presented. First, the survey participants seem relatively young with an average age of 35 years. Second, the share of participants who have developed software for OSS projects (54%) seems relatively high and third, the finding that 65% of the participants consider their software development skills better than average is startling. The combination of these three findings suggests that the survey respondents, and most likely participants in software development newsgroups more generally, are not perfectly representative of the average software developer employed in a commercial firm.204 Presumably people who publicly discuss software development with others on the internet are younger, more skilled and more OSS savvy than the average commercial software developer. Consequently, newsgroup participants might also be more aware of the obligations which may result when reusing internet code. Thus, this study potentially understates the issues which commercial firms might face from the reuse of internet code by their developers.
Given the objectives of this study, the remainder of this part of the dissertation focuses on those 869 survey participants who are currently developing software for commercial firms or who have done so in the past.205 These participants are referred to as “developers” in the following.
Description of developers’ firms and their work there
Of the 869 developers 79% are currently developing software for commercial firms. The other 21% have done so in the past, but are not working as commercial software developers anymore. All information inquired about developers’ firms and their work there (see Table 4-4) is related to the last firm for which the developers have been creating software. On average, developers have been with their firm for 4.8 years206 and 51% of them work there as programmer. 28% are employed as software/system architect and 5% as project manager. The majority of the developers (77%) are permanently employed at their firm and only a minority works on a time-limited basis such as freelance contracts. Developers’ firms are largely headquartered in Europe (52%) and North America (35%). Regarding the size of their firms, 19% of the developers work for firms with fewer than 10 employees while 22% are employed in firms with more than 5,000 employees. The largest group (37%) is formed by developers working for firms sized between 10 and 199 employees. Only about a third of the developers work for firms older than 20 years while 22% are employed by firms younger than five years. In terms of the programming languages employed most frequently by the developers there is quite some heterogeneity. Most popular are C++ (20%), Java (14%) and Python (12%); however, 14% of the developers primarily use niche languages which had not been offered for selection in the questionnaire.
Table 4-4: Characteristics of commercial software developers’ firms
Developers’ tenure at their firm (mean: 4.8, median: 3)         Percentage
  Less than 1 year                                              18%
  1-4 years                                                     50%
  5-9 years                                                     17%
  10 years or more                                              15%
Developers’ role at their firm*
  Programmer                                                    51%
  Software/system architect                                     28%
  Project manager                                                5%
  Systems/requirements/business analyst                          2%
  Database designer/developer                                    2%
  Tester                                                         1%
  Other role                                                    11%
Developers’ main programming language at their firm
  C++                                                           20%
  Java                                                          14%
  Python                                                        12%
  C                                                              9%
  Ruby                                                           8%
  PHP                                                            5%
  C#                                                             5%
  Other                                                         14%
Location of firm’s headquarters**
  North America                                                 35%
  South America                                                  3%
  Europe                                                        52%
  Asia and rest of world (RoW)                                  10%
Size of firm**
  1-9 employees                                                 19%
  10-199 employees                                              37%
  200-999 employees                                             12%
  1,000-5,000 employees                                         10%
  More than 5,000 employees                                     22%
Age of firm**
  Less than 1 year                                               2%
  1-5 years                                                     20%
  6-10 years                                                    20%
  11-20 years                                                   21%
  More than 20 years                                            37%
*Information about developers’ roles is only available for 807 of the 869 developers.
**Developers who were not able to provide this information and selected “I do not know” in the questionnaire are excluded from the calculation of percentages.
Note: N=869.
204
In a survey among software developers within one commercial firm Alexy (2009, p. 92 ff.) reports an average age of 40 years and a 42% share of developers who have contributed to OSS projects.
205
Since the majority (72%) of the developers who no longer develop software for commercial firms have still done so within the last five years and more than 50% have even been active within the last two years, this group is also included in the analyses.
206
Developers who are not creating software anymore today were with their last firm for 4.79 years on average while developers still developing software have been with their current firm for 4.85 years on average.
54% of the developers work for firms for which software development is their main business and 68% of the developers write software for external customers of their firms as opposed to software for internal use within the firm. Of those developers writing software for external customers about one third develop custom-built software while two thirds write off-the-shelf software for multiple customers. These distinctions are important because checking for obligations from reused internet code and ensuring compliance is especially relevant for those developers creating software for multiple external customers. There are two reasons for this: First, the more widely software is distributed, the more likely it is that somebody becomes aware of potential violations. Second and more
importantly, the majority of the value appropriation issues related to “reciprocity”-effects (see Chapter 4.2.1) surface in software developed for multiple external customers.207 Lastly, 20% of the developers write software for embedded applications. This might also be relevant with regard to (potential) violations of obligations of reused internet code as software for embedded applications is typically considered to be more difficult to analyze for reused code. After having established the demographics of the survey participants and the characteristics of their commercial software development activities, the following chapters present descriptive and exploratory answers to the research questions regarding internet code reuse in commercial software development and the (potential) violations of resulting obligations in this context.
4.5.2. Developer awareness of internet code reuse obligations
A first indication of whether commercial software developers account properly for potential obligations when reusing internet code should be their level of awareness regarding the potential obligations of internet code reuse in commercial software. Quite surprisingly, 24% of the developers have never received any training or other information regarding internet code reuse in commercial software development and the resulting benefits and risks (see Figure 4-4). Also interesting is the additional finding that only very few developers have been exposed to “formal” forms of training and information. Only 20% of the developers have learned about internet code reuse in commercial firms and only 17% of them have discussed the topic during their education.208 Much more important than these “official” channels are the internet (65% of the
207
As has been laid out in Chapter 4.2.1, “reciprocity”-effects demand that proprietary code tightly integrated with reused code is made available under the license of the reused code. In case the reused code is licensed under an OSS license this entails that users of the resulting software have the right to access, modify and distribute the source code of the formerly proprietary code. If the software at stake is developed only for firm internal purposes, the firm is its own only customer and no value appropriation issues can occur. If the software is developed for only one customer (i.e. custom-built software) only this one customer can demand to access, modify and distribute the proprietary code, but only after already having paid for the development of the software. Thus, also in this case there is no direct value appropriation risk. However, an indirect value appropriation risk exists if the single-customer redistributes the software to multiple other customers of its own. In this case the single-customer could face direct value appropriation issues and pass them on to its supplier. Despite this, value appropriation is mainly in danger in software developed for multiple external customers. However, some rather uncommon licenses such as the GNU Affero General Public License (AGPL) may even create value appropriation issues through their special “reciprocity”-effect if software is used only internally.
208
Of the developers younger than 30 years, 27% have discussed internet code reuse during their education.
developers), friends or colleagues (44% of the developers) and even IS related magazines (32%). While both the internet and friends or colleagues may be sources of reliable and correct information regarding internet code reuse in commercial software development, there is also the risk that incorrect information which understates the obligations resulting from internet code reuse proliferates in these channels. This would be less of an issue if commercial firms and institutions of education were correcting potential misconceptions among developers; however, the data suggest that this is not happening.

Figure 4-4: Commercial software developers’ training regarding internet code
Sources from which developers have received training/information on the benefits and risks of internet code reuse (in % of developers; number of developers in class in parentheses)
  Internet: 65% (564)
  Friends or colleagues: 44% (385)
  IS related magazines: 32% (274)
  Commercial firm: 20% (174)
  Education/university: 17% (147)
  Other: 5% (43)
  No training and information at all: 24% (207)
Note: N=869.
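The percentages in Figure 4-4 can be recomputed from the class sizes reported with the figure. The following is only a minimal sketch: the variable and function names are illustrative, and the remark on multiple answers is an inference from the counts summing to more than N, not a statement taken from the questionnaire.

```python
# Number of developers naming each source, as listed under Figure 4-4.
N_DEVELOPERS = 869
source_counts = {
    "Internet": 564,
    "Friends or colleagues": 385,
    "IS related magazines": 274,
    "Commercial firm": 174,
    "Education/university": 147,
    "Other": 43,
    "No training and information at all": 207,
}


def share_in_percent(count, total=N_DEVELOPERS):
    """Return the share of `count` in `total`, rounded to whole percent."""
    return round(100 * count / total)


shares = {source: share_in_percent(count) for source, count in source_counts.items()}

# More mentions than respondents suggests that several sources could be named
# by one developer (an inference, not confirmed by the text).
total_mentions = sum(source_counts.values())

for source, pct in shares.items():
    print(f"{source}: {pct}%")
print(f"Mentions in total: {total_mentions} from {N_DEVELOPERS} developers")
```

Because the shares are computed against all 869 developers rather than against the number of mentions, they do not add up to 100%.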
Given the prominent role of potentially unreliable sources for developers to learn about internet code reuse in commercial software development and the fact that nearly a quarter of the developers have not received any information at all, it is interesting to investigate developers’ knowledge regarding obligations from internet code reuse directly. When self-assessing their knowledge, 49% of the developers consider themselves “familiar” with obligations from internet code reuse and claim, “I am aware of nearly all potential obligations and can deal with them well” (see Table 4-5).209 Only 5% think of themselves as “not very familiar” or “not familiar at all” while 16% see themselves as “very familiar,” considering themselves an “[…] expert on obligations of code from the internet […]” and stating that “[…] other people ask me for my opinion on this topic.”
209
The text in italics is a verbatim copy of the text presented in the questionnaire.
Contrasting this self-assessment with the results of a self-developed five-question quiz regarding obligations potentially resulting from internet code reuse suggests that developers overestimate their knowledge.210 With a Pearson product-moment correlation coefficient of 0.340 (p<0.001), self-assessment and quiz score are not correlated very strongly and even those developers self-assessing their knowledge as “very familiar” on average failed on two questions in the quiz, obtaining only a mean score of 3.08 out of a maximum of 5 (see Table 4-5).

Table 4-5: Commercial software developers’ internet code reuse knowledge
Self-assessment group; share of developers self-selecting into respective group; developers’ average score in quiz on obligations of internet code reuse (max. score attainable: 5, average score across groups: 2.52)
  Not familiar at all; 2%; 1.08
  Not very familiar; 3%; 1.33
  Somewhat familiar; 30%; 2.06
  Familiar; 49%; 2.74
  Very familiar; 16%; 3.08
Notes: N=869.
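The average quiz score across groups reported with Table 4-5 can be cross-checked as a share-weighted mean of the group scores. This is a minimal sketch using the rounded shares from the table. Since the exact group sizes are not reported, the result can only approximate the published value of 2.52.

```python
# (self-assessment group, share of developers, mean quiz score) from Table 4-5.
groups = [
    ("Not familiar at all", 0.02, 1.08),
    ("Not very familiar", 0.03, 1.33),
    ("Somewhat familiar", 0.30, 2.06),
    ("Familiar", 0.49, 2.74),
    ("Very familiar", 0.16, 3.08),
]

# Normalize by the sum of shares in case rounding makes them deviate from 1.
total_share = sum(share for _, share, _ in groups)
weighted_mean = sum(share * score for _, share, score in groups) / total_share

print(f"Share-weighted mean quiz score: {weighted_mean:.2f}")
```

The small gap to the reported average is consistent with the shares being rounded to whole percentages.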
Developers do not necessarily have to be experts on the obligations from internet code reuse if they can for example turn to legal professionals within their firms with their questions. However, the situation that even those developers who claim to be asked for help regarding obligations from internet code reuse by colleagues have difficulties answering the quiz questions correctly hints toward a risk potential for their employers. Having established that many developers are not aware of some of the obligations which may surface when reusing internet code, it is interesting to investigate determinants of developers’ knowledge. First, and as expected, developers who are active in OSS projects are significantly more knowledgeable regarding the various obligations of internet code reuse than other developers (Tobit regression, p<0.001).211 Further, both North American and Asian developers are significantly less knowledgeable than European developers (Tobit regression, p=0.045 and p=0.037, respectively) while there is no significant difference between South American and European developers (Tobit regression, p=0.590). Moreover, developers with a background in mathematics or physics, business administration and other subjects scored significantly worse on the quiz than developers
210
See Appendix 4.2.3 for the quiz questions and statistics describing developers’ answers. The quiz was developed following the qualitative pre-study and reflects the most common obligations which commercial software developers may face when reusing internet code.
211
The factors influencing developers’ knowledge regarding internet code reuse obligations are determined by Tobit regression analysis with robust standard errors (observations=869, pseudo R²=0.0413, F(16, 853)=8.09, p<0.0001) using their quiz score ranging from 0 to 5 as dependent variable. As independent variables the model includes developers’ experience as professional software developers in years, their gender, their geographic residence on a continent level and the subject of their studies. Further variables indicate whether they have ever been active in OSS projects and to which forms of trainings and information regarding internet code reuse they have been exposed.
with degrees in computer science and related subjects or engineering (Tobit regression, p=0.060, p=0.034 and p=0.036, respectively).212 Surprisingly, most forms of training and information covered in the questionnaire do not improve developers’ knowledge regarding internet code reuse obligations. Developers who have been exposed to training or information on the topic within commercial firms (Tobit regression, p=0.303), from friends (Tobit regression, p=0.474), from magazines (Tobit regression, p=0.425) or from other sources (Tobit regression, p=0.669) are not significantly more proficient regarding internet code reuse obligations than developers without such training and information. Developers who have been taught about the topic during their education even scored significantly worse on the quiz than developers without internet code reuse in their curriculum (Tobit regression, p=0.055). Only information which developers have acquired on the internet significantly enhances their knowledge (Tobit regression, p=0.001). This finding somewhat mitigates the concerns raised earlier regarding the internet being developers’ main source of information about internet code reuse obligations. Given the above findings that developers’ average knowledge regarding the potential obligations from internet code reuse seems to be limited and that training and information are, first, mostly ineffective and, second, largely delivered outside of firms and universities, it would seem reasonable for firms to introduce explicit policies providing close guidance to developers considering reusing internet code.
Despite this, of the 869 developers only 302 or about one third work in firms with policies explicitly addressing internet code reuse.213 A comparison of those firms with policies with firms without policies yields the following insights:
− Large firms (with more than 5,000 employees) are significantly more likely to have policies than smaller firms (logistic regression, p<0.001).214 Further, very small firms (with fewer than ten employees) do not differ from medium sized firms (with 1,000 to
212
Developers with a background in engineering do not differ significantly from those with a degree in computer science or a related subject (Tobit regression, p=0.555).
213
The existence of such policies was queried via the developers completing the survey. Thus, it is possible that the number of firms with policies is higher, but their developers are not aware of the policies.
214
The characteristics of firms with policies addressing internet code reuse are determined by logistic regression analysis with robust standard errors (observations=818, pseudo R²=0.1171, χ²(14)=103.46, p<0.0001) using a dummy as dependent variable which indicates whether the respective developer’s firm has policies addressing internet code reuse or not. As independent variables the model contains the size of the firm, the location of its headquarters, its age, information regarding whether software development is its main business activity and a construct indicating its ethical climate regarding laws and codes (see Chapter 4.6.3 for a description of the construct).
5,000 employees) regarding their likelihood of having policies (test of equality of coefficients after logistic regression, p=0.664). It seems reasonable that rather small firms do not have written policies because direct communication between employees might be more efficient for them. Yet, this should not apply anymore to firms with more than 1,000 employees and thus it is startling that this class of firms is not more likely to address internet code reuse with policies.
− 53% of the firms founded less than a year ago have policies while only 35% of the firms older than one year have policies. In the logistic regression this difference is not significant (e.g. p=0.151 for the difference between the start-ups and incumbents older than 20 years), but this may be due to the very low number of developers working for such start-ups (17). This finding might be an indication that very recently, founders have begun to take not only the benefits of internet code reuse very seriously but also the potential issues.
− Contradicting the opinion of several interviewees from the qualitative pre-study,215 firms headquartered in Asia are not significantly less likely to have policies than firms with their head office in Western Europe or North America (test of equality of coefficients after logistic regression, p=0.777 and p=0.603, respectively). Also surprisingly, firms with their headquarters in South America are significantly more likely to have policies than firms with their center of operations in any other region.216
− Firms for which developing software is their main business activity significantly more often have policies regarding internet code reuse than firms which develop software next to one or several other main activities (logistic regression, p<0.001). Presumably this is because their upper management has a better understanding of current issues in software development.
− Finally, firms with a stronger ethical work climate regarding complying with laws and codes are significantly more likely to have policies in place that address internet code reuse (logistic regression, p<0.001).
A further interesting finding regarding firms’ policies addressing internet code reuse is that about one quarter of the developers working in firms with such policies claim not to
215
Several interviewees had pointed out that Asian companies are less concerned about IP rights.
216
Logistic regression, e.g. p=0.051 for the difference between firms headquartered in South America and Western Europe.
have read these policies. Partially mitigating concerns resulting from this finding, developers working in settings less critical with regard to reuse obligations, that is internal software development projects or custom-built software projects, are significantly less likely to have read the firm policies than other developers (logistic regression, p=0.022).217 A potential source of concern, however, is that developers’ happiness with their job at their current employer also has a significant impact on whether developers read policies or choose to ignore them (logistic regression, p=0.010). Moreover, among the different roles, programmers are, surprisingly, significantly less likely to have read policies than developers with other roles, second only to database developers.218 Among the other roles (e.g. project manager, tester, software/system architect) there are no significant differences. Other factors such as developers’ geographic residence, their employment status (permanently employed vs. time-limited contracts) or the ethical work climate within the firm (regarding both compliance with laws and codes and firm rules) do not significantly influence developers’ reading of the policies. Summarizing, while developers are rather confident when self-assessing their knowledge regarding obligations from internet code reuse, an objective quiz raises some concerns. Further, educating developers about internet code reuse seems not to be high on the agenda in commercial firms and institutions of education. Developers’ main source of information on this topic is the internet, which seems to be a reliable channel providing developers with correct information on obligations from internet code reuse. Surprisingly, training and information provided within commercial firms or during education do not appear to improve developers’ proficiency regarding internet code reuse obligations.
Of the firms developing software only a minority uses policies to guide their developers when reusing internet code. Especially smaller and mid-sized firms and firms for which software development is not their main business do not employ such policies.
217
The characteristics of developers which influence whether they have read their employer’s policies regarding internet code reuse are determined by logistic regression analysis with robust standard errors (observations=283, pseudo R²=0.1293, χ²(16)=35.41, p=0.0035) using a dummy as dependent variable which indicates whether the respective developer has read the policies or not. As independent variables the model contains the time the developer has already been with her firm, her gender, her geographic residence on a continent level and her software development role at her firm. Further, her employment status, whether she develops embedded software or not and the criticality of the software she develops regarding reuse obligation are contained in the model. Lastly, her happiness at her firm and the ethical work climate she perceives regarding laws and codes and rules are included in the multivariate analysis.
218
With a likelihood of 65% of having read the policies programmers and database developers (50%) differ significantly from the reference group of software/system architects (87%) (logistic regression, p=0.002 and p=0.020, respectively).
Also interestingly, about a quarter of the developers have not read the policies of their firms. Having established developers’ knowledge about internet code reuse obligations and firms’ approaches regarding explicit policies which address this topic, the next chapter deals with the actual importance of internet code reuse for commercial software developers.
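The effects of the individual training and information channels on the quiz score, as reported in this chapter, can be lined up against a significance threshold. The sketch below assumes a 10% threshold. This is an inference from the text describing p=0.055 as significant, since no threshold is stated explicitly. The direction labels paraphrase the prose and all names are illustrative.

```python
ALPHA = 0.10  # assumed threshold; the text does not state one explicitly

# (channel, p-value from the Tobit regression, direction described in the text)
training_effects = [
    ("Internet", 0.001, "higher quiz score"),
    ("Education/university", 0.055, "lower quiz score"),
    ("Commercial firm", 0.303, None),
    ("Friends or colleagues", 0.474, None),
    ("IS related magazines", 0.425, None),
    ("Other sources", 0.669, None),
]

# Keep only the channels whose effect passes the assumed threshold.
significant = [
    (channel, direction)
    for channel, p_value, direction in training_effects
    if p_value < ALPHA
]

for channel, direction in significant:
    print(f"{channel}: {direction}")
```

Under a stricter 5% threshold only the internet would remain, so the reading of the education result depends on the threshold chosen.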
4.5.3. Internet code reuse in commercial software development
Taking into account the difficulties of measuring code reuse with a questionnaire discussed in Chapter 3.6.2, a rather coarse-grained measure is applied in this study to understand to which extent commercial software developers rely on reusing internet code in their work. Developers were asked to indicate the importance of reusing internet code for their individual work at their firm on a five-point scale ranging from “not important at all” to “very important”. The results depicted in Figure 4-5 show that the majority of the developers (56%) consider internet code reuse at least “somewhat important” for their work. 18% of the developers even deem internet code reuse “very important” and only 16% apparently do not reuse internet code at all, considering it “not important at all.” On the five-point scale the mean importance of internet code reuse is 2.91, the standard deviation is 1.33 and the median is 3, representing “somewhat important”.

Figure 4-5: Importance of internet code reuse for commercial software developers
Importance of internet code reuse for commercial software developers (in % of developers; number of developers in class in parentheses)
  Not important at all: 16% (139)
  Not very important: 28% (243)
  Somewhat important: 23% (199)
  Important: 15% (134)
  Very important: 18% (154)
Note: N=869.
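The summary statistics quoted for the five-point scale can be reproduced from the class sizes given with Figure 4-5. The following minimal sketch expands the counts into individual ratings and uses Python's standard `statistics` module. Whether the study used the sample or the population standard deviation is not stated. The sketch uses the sample version, and both round to the same value here.

```python
import statistics

# Number of developers per rating, from Figure 4-5
# (1 = "not important at all" ... 5 = "very important").
counts = {1: 139, 2: 243, 3: 199, 4: 134, 5: 154}

# Expand the frequency table into one rating per developer.
ratings = [rating for rating, n in counts.items() for _ in range(n)]

mean = statistics.mean(ratings)
sd = statistics.stdev(ratings)  # sample standard deviation
median = statistics.median(ratings)

# Share of developers rating internet code reuse at least "somewhat important".
at_least_somewhat = sum(n for rating, n in counts.items() if rating >= 3) / len(ratings)

print(f"N={len(ratings)}, mean={mean:.2f}, S.D.={sd:.2f}, median={median}")
print(f"At least somewhat important: {at_least_somewhat:.0%}")
```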
To provide a historical perspective on the evolution of internet code reuse for commercial software developers’ work, in the following, the importance of internet code
reuse as perceived by developers is compared along the years in which the developers have stopped creating software (see Figure 4-6). The data show that developers who stopped writing software before 2004 apparently did not rely on internet code reuse very much.219 With an average importance value of 1.8 these developers rate internet code reuse between “not important at all” and “not very important” for their individual work. Only after 2003 did the importance of internet code reuse for commercial software developers’ work increase significantly (oneway ANOVA, p<0.0001),220 rising from a mean importance value of 1.8 in 2002 and 2003 to 3.0 (equaling “somewhat important”) in 2008 and 2009.

Figure 4-6: Evolution of importance of internet code reuse over time
Importance of internet code reuse for commercial software developers by developers' last year as developer (in average importance perceived by developers surveyed; S.D. and number of developers in class in parentheses)
  Before 2002: 1.8 (S.D. 1.2; 32 developers)
  2002 & 2003: 1.8 (S.D. 1.3; 13 developers)
  2004 & 2005: 2.2 (S.D. 1.3; 17 developers)
  2006 & 2007: 2.5 (S.D. 1.2; 28 developers)
  2008 & 2009: 3.0 (S.D. 1.3; 779 developers)
Notes: Average values displayed for multi-year groups; S.D. = standard deviation; importance scale: 1=Not important at all, 2=Not very important, 3=Somewhat important, 4=Important, 5=Very important; N=869.
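As a rough plausibility check of the one-way ANOVA mentioned above, an F statistic can be approximated from the group sizes, means and standard deviations shown in Figure 4-6. This is only a sketch under stated assumptions: the assignment of means and standard deviations to the year groups is read off the figure, all inputs are rounded to one decimal, and the resulting F value is not reported in the text and does not reproduce the study's exact computation.

```python
# (group, number of developers, mean importance, standard deviation),
# as read from Figure 4-6; values are rounded in the source.
groups = [
    ("Before 2002", 32, 1.8, 1.2),
    ("2002 & 2003", 13, 1.8, 1.3),
    ("2004 & 2005", 17, 2.2, 1.3),
    ("2006 & 2007", 28, 2.5, 1.2),
    ("2008 & 2009", 779, 3.0, 1.3),
]

n_total = sum(n for _, n, _, _ in groups)
grand_mean = sum(n * mean for _, n, mean, _ in groups) / n_total

# Between-group and within-group sums of squares from summary statistics.
ss_between = sum(n * (mean - grand_mean) ** 2 for _, n, mean, _ in groups)
ss_within = sum((n - 1) * sd ** 2 for _, n, _, sd in groups)

df_between = len(groups) - 1
df_within = n_total - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"Grand mean: {grand_mean:.2f}")
print(f"Approximate F({df_between}, {df_within}) = {f_stat:.1f}")
```

The grand mean recovered this way agrees with the overall mean importance of 2.91 reported for the five-point scale, which supports the assignment of values to groups assumed here.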
Before 2004 code available from the internet might have been suited for reuse in only a small number of situations in commercial software development because it was not yet mature enough and covered only a few areas of functionality. However, as a result of the strong growth of OSS in recent years (Deshpande & Riehle 2008), both the quality of code on the internet and the fields for which such code exists should have increased strongly, making internet code reuse much more attractive for commercial software developers. Besides the significant positive effect of time already discussed (ordered logistic regression, p<0.001), several other factors influence the importance commercial software developers attribute to internet code reuse for their work.221 First, similar to the results of
219
Besides the growing importance of internet code reuse, the data might also reflect selection effects on the side of the software developers.
220
The significance of the effect of time on the importance of internet code reuse by commercial software developers is also confirmed in multivariate analysis with multiple additional developer characteristics as independent variables. This analysis is presented in the further course of this chapter.
221
The reasons explaining the importance of internet code reuse for developers’ work are determined by ordered logistic regression analysis with robust standard errors (observations=807, pseudo R²=0.1024, χ²(41)=426.38, p<0.0001) using an ordinal variable ranging from 1 (=“not important at all”) to 5 (=“very important”) as dependent variable. As independent variables the model contains information about the developer (experience as professional software developer in years, gender, geographic residence, subject of studies, OSS experience, software development skill level, knowledge about internet code reuse obligations (both self-assessed and tested) and previous training or information about internet code reuse) and information about her professional activities as software developer (last year as commercial software developer, employment status at her firm (permanent versus time-limited contract), software development role (programmer, tester etc.), main programming language and type of software developed (embedded vs. traditional software, internally used vs. externally sold software, custom-built vs. software for multiple customers)). In addition to the factors discussed in the following which significantly influence the importance of reuse of internet code, also male developers seem to be more inclined to rely on internet code than female developers (p=0.094).
Sojer and Henkel (2010a)222 regarding code reuse in OSS development, there seems to be a positive effect of access to local search regarding the reuse of internet code in commercial software development. Developers who have been or who are active in OSS projects deem internet code reuse significantly more important for their work as commercial software developers than other developers (ordered logistic regression, p=0.011). Probably they are more likely to rely on internet code because their costs of reusing (i.e. searching for internet code, evaluating, adapting and integrating it; see Chapter 3.2.2) are reduced due to their access to the OSS community. Further, developers with more experience as commercial software developers also attach a higher importance to internet code reuse (ordered logistic regression, p=0.060). Also their costs of reusing should be reduced as they can turn to their own experiences when pondering internet code reuse. Second, developers with different roles within their firms attach different levels of importance to internet code reuse.223 With mean importance values of 3.16 and 3.00, respectively, software/system architects and project managers deem internet code reuse most important for their work.224 They are followed by programmers with a mean importance value of 2.80 which is significantly lower than the importance attached to internet code reuse by architects (ordered logistic regression, p=0.045). Following the programmers is a group consisting of systems analysts (mean=2.56), testers (mean=2.25) and database developers (mean=2.24). All members of this group seem to rely significantly less on internet code reuse than programmers (test of equality of coefficients after ordered logistic regression, p=0.047, p=0.021 and p=0.023, respectively). Compared to programmers, systems analysts, testers and database developers are less concerned with actually writing code, thus it seems reasonable that reusing internet code is less important
222
See also Chapter 3 of this dissertation.
223
Whether differences between the different roles are significant is tested in the multivariate regression described in footnote 221.
224
There is no significant difference in the importance of internet code reuse as perceived by software/system architects and project managers (ordered logistic regression, p=0.593).
for them. The differences between software/system architects and project managers on the one hand and programmers on the other might be rooted in the greater latitude of the first group. The architecture of a piece of software reflects some basic “design choices” which may make reusing external code easier or more difficult (Baldwin & Clark 2006; MacCormack et al. 2006).225 Since it is architects and project managers who make these “design choices” these two groups might have more control over reusing internet code than programmers for whom the architecture of the software they develop is usually exogenous. Third, the main programming language a developer uses influences the importance of internet code reuse for her work.226 Developers using programming languages which are especially reuse-friendly and e.g. provide means to reuse code from various other programming languages with little or no modification, such as Ruby (mean importance=3.89) or Python (mean importance=3.65), seem to rely most heavily on internet code.227 This group of developers is followed, at a significant distance, by another group working with languages such as Perl (mean importance=3.22), JavaScript (mean importance=3.07), Java (mean importance=3.03) or PHP (mean importance=2.93).228 Finally, developers using more traditional programming languages such as C (mean importance=2.72) or C++ (mean importance=2.62) and less common programming languages such as Fortran (mean importance=2.57), Visual Basic (mean importance=2.50), C# (mean importance=2.42) or Pascal (mean importance=1.80) form the last group and consider internet code reuse least important.229 Besides the factors significantly influencing the extent of internet code reuse, also some factors which do not exhibit a significant influence provide interesting insights.230 First,
225
See Chapter 3.4.3 for a more detailed discussion of this topic.
226
Whether differences between the different programming languages are significant is tested in the multivariate regression described in footnote 221.
227
There is no significant difference in the importance of internet code reuse as perceived by developers mainly working with Ruby and those primarily programming in Python (ordered logistic regression, p=0.343).
228
The differences in the importance of internet code reuse between developers using Ruby and those using Perl, JavaScript, Java or PHP are significant with p=0.049, p=0.028, p<0.001 and p<0.001, respectively, in the ordered logistic regression described in footnote 221.
229
In a postestimation after the ordered logistic regression described in footnote 221, coefficients regarding the importance of internet code reuse for C, C++, Fortran, Visual Basic, C# are significantly lower than the coefficient for Java with p=0.077, p=0.001, p<0.001, p=0.054 and p=0.001, respectively. With a significance level of p=0.160 the coefficient for Pascal does not differ significantly from the coefficient for Java, however, this programming language is found only in a very small number of observations.
230
Beyond the factors discussed below also the following independent variables seem not to significantly influence the importance of internet code reuse for commercial software developers: Software development proficiency (ordered logistic regression, p=0.593), embedded vs. traditional software (ordered logistic regression, p=0.389) and education background of the developer (e.g. difference between engineers and developers with degrees in computer science or related subjects, ordered logistic regression, p=0.205).
developers’ reuse of internet code seems to be independent of the risk potential of internet code reuse. There is no significant difference in the importance of internet code reuse for developers working on software to be sold to multiple external customers and developers working on custom-built software for only one customer or software for firm-internal use (ordered logistic regression, p=0.247). This finding might imply that even if their employer’s value appropriation might become at risk, developers do not refrain from reusing internet code. However, the insignificance of the coefficient might also result from the fact that there exists less reusable code for internal or custom-built software due to its tailored nature. As a result, developers in software development projects for multiple external customers might still be taking into account the potential value appropriation risks for their employers and consequently reuse less internet code than they could. However, the data might not show this effect because it could be counterbalanced by the lower availability of reusable internet code for custom-built or internal use software development projects. Second, developers who have never received any form of training or information on internet code reuse do not differ significantly in their perceived importance of internet code reuse from developers who have been made aware of its benefits and potential obligations (ordered logistic regression, p=0.121). Adding to that, while developers who self-assess their knowledge regarding the risks resulting from internet code reuse as better also rely more on internet code reuse (ordered logistic regression, p<0.001), this relationship does not hold for the objective level of knowledge assessed with the quiz contained in the questionnaire (ordered logistic regression, p=0.596). Apparently, developers do not account for their own proficiency or lack thereof regarding internet code reuse obligations when considering the reuse of internet code. Third, a strong belief shared by many of the experts interviewed in the qualitative pre-study is that internet code reuse and especially internet code reuse (potentially) violating resulting obligations is more popular in geographies where IP is honored less, particularly in Asia.231 Further, interviewees also suggest more internet code reuse and (potential) violations in project setups which are more difficult for firms to control, such as outsourced software development. Regarding the importance of internet code reuse, both assumptions cannot be confirmed with the quantitative data:
231
For example one interviewee rather bluntly claimed: “Recycling of existing things is part of the Asian culture and this is also reflected in the way they develop software” (Translated from German).
− Asian developers do not differ significantly in the importance they attach to internet code reuse from European (test of equality of coefficients after ordered logistic regression, p=0.365) and North American (test of equality of coefficients after ordered logistic regression, p=0.278) developers. Interestingly, however, South American developers consider internet code reuse significantly more important than any other developers.232 One potential explanation for the discrepancy between the qualitative and quantitative findings might be that the Asian developers in the survey are not representative of the average software developer in Asia. An indication of this might be the fact that the Asian participants in the survey engage in English-language newsgroup discussions with other developers, mainly from Europe and North America.
− Addressing the second point, developers employed on time-limited contracts, such as freelancers, as an example of a project setup that is more difficult to control, do not differ from permanent employees in the importance they attribute to internet code reuse (ordered logistic regression, p=0.824).
Summarizing, reuse of internet code by commercial software developers seems to be common and has grown in importance in recent years. The extent of internet code reuse practiced by individual developers is influenced, among other things, by their OSS involvement, their role in software development and their main programming languages. It seems, however, to be independent of the risk potential of internet code reuse and also of the sophistication of developers’ knowledge about the obligations which may result from it. To quantify the risk which firms developing software incur from their developers’ reuse of internet code, the next chapter analyzes how frequently developers have (potentially) violated internet code reuse obligations in the past.
232 Logistic regression, e.g. p=0.035 for the difference between European and South American developers.
4.5.4. Extent of (potential) violations of internet code obligations
Across all three scenarios employed in the survey (see Chapter 4.4.2), the majority of the developers claim never to have reused internet code in the way described in the scenario they were presented with, i.e. in a way (potentially) violating resulting obligations (see Figure 4-7). In more detail, 21% of the developers have at least once reused an internet snippet without thoroughly investigating the obligations which might have come with the snippet (scenario 1) and 16% have at least once reused an internet component without checking thoroughly for potential obligations (scenario 2). In the same range, 15% of the developers have at least once reused an internet snippet while consciously ignoring and thus violating some of its obligations (scenario 3). As expected, knowingly violating obligations (scenario 3) seems less common than potentially violating obligations by not checking thoroughly for them (scenarios 1 and 2). Further, not checking thoroughly for obligations from snippet reuse (scenario 1) seems more common than not checking in detail for obligations from component reuse (scenario 2). This is probably because detecting a reused snippet in a piece of software is more difficult than detecting a reused component, and thus the punishment certainty (see the research model in Chapter 4.3.3) is lower when reusing a snippet without thoroughly investigating the resulting obligations than when reusing a component. Statistically, developers’ past behavior in scenario 1 differs significantly from that in scenario 3 (paired t-test, p=0.0451), but there are no significant differences between the behaviors in scenario 1 and scenario 2 (paired t-test, p=0.2035) or between the behaviors in scenario 2 and scenario 3 (paired t-test, p=0.5211).
Figure 4-7: Frequency of (potential) violations of internet code obligations
Frequency of (potential) violations of obligations from internet code reuse by scenario (in % of developers; response categories: Never, Once, Rarely, Sometimes, Often)
Scenario 1 (N=316): Not checking thoroughly for obligations from internet snippet reuse. Share of developers who have (potentially) violated an obligation at least once: 20.6%. Mean frequency*: 1.42.
Scenario 2 (N=256): Not checking thoroughly for obligations from internet component reuse. Share of developers who have (potentially) violated an obligation at least once: 16.0%. Mean frequency*: 1.33.
Scenario 3 (N=297): Ignoring obligations from internet snippet reuse. Share of developers who have (potentially) violated an obligation at least once: 14.5%. Mean frequency*: 1.29.
*Mean frequency calculation based on frequency scale: 1=Never, 2=Once, 3=Rarely, 4=Sometimes, 5=Often. Note: N=869.
When generalizing this data, the findings presented should be treated as a lower threshold of internet code reuse (potentially) violating resulting obligations. This is first because the data are affected by some mild social desirability effects (see Chapter 4.4.4) and second because the survey population is most likely not representative of software developers in commercial firms in general (see Chapter 4.5.1).
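The two summary measures reported for each scenario (share of developers with at least one (potential) violation, and mean frequency on the 1 to 5 scale) can be illustrated with a minimal sketch. The response counts below are hypothetical and are not the survey data; the function name `summarize` is likewise only an illustrative choice.

```python
# Frequency scale as given in the note to Figure 4-7: 1=Never ... 5=Often.
SCALE = {"Never": 1, "Once": 2, "Rarely": 3, "Sometimes": 4, "Often": 5}


def summarize(counts):
    """Return (share with at least one violation, mean frequency) for one scenario."""
    n = sum(counts.values())
    # Everyone who did not answer "Never" has (potentially) violated at least once.
    share_at_least_once = 1 - counts["Never"] / n
    # Mean of the numeric scale values, weighted by the number of respondents.
    mean_frequency = sum(SCALE[label] * c for label, c in counts.items()) / n
    return share_at_least_once, mean_frequency


# Hypothetical response counts (NOT the survey data), chosen only for illustration.
example_counts = {"Never": 80, "Once": 10, "Rarely": 6, "Sometimes": 3, "Often": 1}
share, mean = summarize(example_counts)
print(f"share={share:.1%}, mean={mean:.2f}")
```

Because "Never" is coded as 1, a mean frequency close to 1 indicates that most respondents report no (potential) violation at all.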
4.5.5. Summary The purpose of the descriptive and exploratory analyses in this section was to shed light on the reuse of internet code by individual commercial software developers with a special focus on (potential) violations of the possibly resulting obligations. Further, this section was intended to establish the context for the analyses presented in the next chapter explaining determinants of internet code reuse (potentially) violating resulting obligations by individual commercial software developers. The previous chapters have shown that internet code reuse by commercial software developers is not uncommon with more than 50% of the surveyed developers considering it at least as “somewhat important” for their work. In the last five years the amount of internet code reused in commercial software development seems to have grown strongly (in parallel to the growth of OSS). The role which internet code reuse plays for individual commercial software developers is influenced by their familiarity with OSS, their role in the software development process and the programming languages they rely on primarily. Interestingly, software/system architects and project managers seem to consider internet code reuse as more important than programmers. Among the different programming languages Ruby and Python appear to be most supportive of internet code reuse. Leading to some concern regarding the (potential) violation of obligations resulting from internet code reuse, there seems to be no difference in the importance of internet code reuse between developers working on projects for multiple external customers and developers creating internal or custom-built software. The lack of a difference between those groups of developers may however also be related to the fact that there might exist less reusable code for custom-built or internal-use software. 
Further, while the objectively assessed knowledge which developers have regarding obligations from internet code reuse seems not to influence how much they rely on it, developers who subjectively (and probably wrongly) consider themselves as experts reuse more internet code. Regarding developers’ awareness of internet code reuse obligations the data highlights that nearly a quarter of the developers have never received any training or information on this topic and that neither commercial firms nor institutions of education seem to consider educating developers in these matters as very important. Because of this, developers’ main source of information about internet code reuse is the internet. Strikingly, the internet is also the only effective source of information. Trainings and information provided in commercial firms or in universities seem not to address the relevant topics and appear not to increase developers’ knowledge about internet code reuse obligations. Given this setting
it is not surprising that many developers seem to have some gaps in their knowledge regarding internet code reuse obligations. Unfortunately, they appear not to be well aware of this and overestimate their own proficiency. Not surprisingly, developers with OSS experience not only reuse more internet code in their work as commercial software developers, but are also more aware of the potentially resulting obligations. In this situation where many developers lack detailed knowledge of internet code reuse obligations it is further surprising that only about one third of the developers work in firms which provide guidance regarding internet code reuse through policies. Especially smaller and mid-sized firms as well as firms with a main business activity outside of software development appear not to employ this mechanism to prevent their developers from (potentially) violating obligations of reused internet code. An additional concern with regard to policies is the fact that about one quarter of the developers working for firms with such policies has not read them. This is especially true for developers who are not happy with their current job and developers working on internal or single-customer software development projects. Regarding geographical segmentation two findings are worth noting. First, the commonly held belief that internet code reuse is very common in Asia is not supported by this study. Asian developers seem not to reuse more internet code than their Western counterparts and are also not less knowledgeable about potentially resulting obligations than North American developers. Further, Asian firms have a comparable likelihood of employing policies to regulate internet code reuse as Western firms. However, these findings may be affected by selection biases resulting from the Asian developers who have participated in the survey. As a second finding, internet code reuse seems to be most advanced in South America. 
South American developers make most use of internet code reuse, are equally aware of potential obligations as European developers and receive more guidance through policies from their firms than developers from any other region. Finally, 15 to 20 percent of the developers have either violated or potentially violated obligations of reused internet code in the past. As expected, developers seem more likely not to check thoroughly for obligations than to knowingly ignore them. Further, more developers have not checked thoroughly for obligations from snippets than from components. As a concluding note, the demographics of the survey population (see Chapter 4.5.1) and the tests for social desirability effects (see Chapter 4.4.4) suggest that the (potential) violations of internet code reuse reported in this study constitute a lower boundary while
the importance of internet code reuse and developers’ knowledge regarding potentially resulting obligations should be rather considered as an upper boundary. This section has painted a detailed picture of the reuse of internet code by commercial software developers and of the (potential) violations of resulting obligations. This picture provides the context for the next section exploring determinants of internet code reuse in a way (potentially) violating resulting obligations by commercial software developers.
4.6. Research model testing and results
Following the descriptive and exploratory analyses, this section addresses research question six with structural equation modeling techniques. By testing the research model developed in Chapter 4.3.3, determinants describing why commercial software developers reuse internet code in a way (potentially) violating resulting obligations are identified. Chapter 4.6.1 summarizes the hypotheses of the research model before the statistical methods employed are discussed (4.6.2). The measurement part of the research model is described and assessed in Chapter 4.6.3 and the hypotheses of the research model are tested in Chapter 4.6.4 and discussed in Chapter 4.6.5.
4.6.1. Hypotheses
The research model developed in Chapter 4.3.3 proposes hypotheses to explain why commercial software developers reuse internet code in a way (potentially) violating resulting obligations. Table 4-6 provides a recapitulation of these hypotheses.
Table 4-6: Summary of hypotheses regarding violations of internet code obligations
Theory of planned behavior model
H1: A more positive attitude toward reusing internet code in a way (potentially) violating resulting obligations will lead to greater intention of commercial software developers to engage in the behavior.
H2: A higher level of subjective norm supportive of reusing internet code in a way (potentially) violating resulting obligations will lead to greater intention of commercial software developers to engage in the behavior.
H3: A higher level of perceived behavioral control regarding the reuse of internet code in a way (potentially) violating resulting obligations will lead to greater intention of commercial software developers to engage in the behavior.
Ethical work climate theory
H4a: An ethical work climate of complying with laws and codes will lead to a lesser intention of commercial software developers to reuse internet code in a way (potentially) violating resulting obligations.
H4b: An ethical work climate of complying with firm rules will lead to a lesser intention of commercial software developers to reuse internet code in a way (potentially) violating resulting obligations.
H4c: An ethical work climate of complying with laws and codes will have a negative effect on subjective norm supportive of reusing internet code in a way (potentially) violating resulting obligations.
H4d: An ethical work climate of complying with firm rules will have a negative effect on subjective norm supportive of reusing internet code in a way (potentially) violating resulting obligations.
Expected utility theory
H5a: Usefulness of internet code will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
H5b: Severity of time pressure in the developer’s firm will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
H5c: Cost of compliance will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
Deterrence theory
H6a: Punishment severity for the developer’s firm will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
H6b: Punishment severity for the developer will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
H6c: Punishment certainty will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
4.6.2. Statistical methods used
As Ajzen (2002, p. 2) explains, “the theoretical constructs [of TPB] are hypothetical or latent variables. They cannot be directly observed but must instead be inferred from observable responses.” Traditionally, models with latent variables were tested in two independent steps in which an exploratory factor analysis is followed by classical regression (e.g. Beck & Ajzen 1991; Banerjee et al. 1998; Flannery & May 2000). More recently, research has relied on structural equation modeling techniques to analyze factor structure and research model simultaneously (e.g. Chin 1998a; Gefen et al. 2000). Of the two general approaches to structural equation modeling, partial least squares (PLS) is chosen over covariance-based structural equation modeling (CBSEM)233 in this study. This is consistent with other recent research investigating ethical behavior in the IS context, which frequently uses PLS as its method (e.g. Chang 1998; Peace et al. 2003; Limayem et al. 2004; Moores & Chang 2006; Goles et al. 2008). Specifically, the software SmartPLS 2.0 (M3) (Ringle et al. 2005) is utilized in this study.
233 CBSEM is frequently also referred to as LISREL (Linear Structural Relations). LISREL is the software most frequently used for CBSEM.
While both approaches serve the same general purpose, PLS, in contrast to CBSEM, does not require the items employed to form the latent constructs to be normally distributed (Herrmann et al. 2006; Vilares et al. 2009) and is less demanding with regard to minimum sample sizes (Wold 1989; Chin & Newsted 1999). Moreover, PLS avoids problems of factor indeterminacy and inadmissible solutions in complex models (Fornell & Bookstein 1982; Wold 1985; Krijnen et al. 1998). Since this study aims at employing medium-sized samples to identify determinants which predict whether commercial software developers reuse internet code in a way (potentially) violating resulting obligations, PLS seems to be more suitable than CBSEM. Further, Shapiro-Wilk and Shapiro-Francia tests suggest that the items used to form the latent constructs of this study are not distributed normally.234 PLS performs a series of ordinary least squares (OLS) regressions and in doing so iteratively estimates subsets of model parameters and “provides successive approximations for the estimates, subset by subset, of loadings and structural parameters” (Fornell & Bookstein 1982, p. 441).235 Given its OLS base, PLS coefficients can be interpreted as standardized regression coefficients.236 Since PLS does not make any assumptions regarding the distribution of the raw data, significance levels of the coefficients should be calculated via bootstrapping (Chin 1998b).237 Since PLS, in contrast to CBSEM, does not provide any global goodness-of-fit criteria, PLS models are typically evaluated in a two-step process which assesses partial model structures with a catalog of criteria initially developed by Chin (1998b).238 The first step of the process assesses the measurement models, investigating measurement reliability and validity of the constructs employed (Chapter 4.6.3).
Once constructs have been found to be sufficiently reliable and valid, the second step evaluates the structural model with its parameters estimated between the latent constructs (Chapter 4.6.4). As both Loch and Conger (1996) and Leonard et al. (2004) find that unethical behavior in slightly different scenarios can be caused by different determinants, the research model is evaluated independently for each of the three scenarios in the following. Of the survey participants with a background in commercial software development, 316 have completed questionnaires for scenario 1, 256 have responded to scenario 2 and 297 have taken the survey with scenario 3.
234 With the exception of items NORM1, CLIM4, CLIM5, CLIM7 and CLIM8, the hypothesis that the item is distributed normally is not supported with p<0.01 for all items. See Table 4-7 for a description of the single items.
235 See e.g. Chin (1998b), Herrmann et al. (2006) or Henseler et al. (2009) for a detailed technical review of the PLS algorithm.
236 I.e. the magnitudes of these coefficients can be compared directly.
237 For this study the number of samples in the bootstrapping procedure is set to 200.
238 This two-step process is closely related to the general approach to assess structural equation models proposed by Anderson and Gerbing (1998).
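The bootstrapping logic used to obtain significance levels for PLS coefficients can be sketched in a strongly simplified form. The example below is not the SmartPLS algorithm: it assumes a single predictor, for which the standardized regression coefficient equals Pearson's correlation, and it uses synthetic data. Only the number of resamples (200) is taken from the text; the function names, the data and the seeds are illustrative assumptions.

```python
import random
import statistics


def standardized_slope(x, y):
    """OLS slope after z-standardizing x and y (equals Pearson's r for one predictor)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sx, sy = statistics.pstdev(x), statistics.pstdev(y)
    n = len(x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n * sx * sy)


def bootstrap_t(x, y, n_samples=200, seed=1):
    """Estimate the coefficient, its bootstrap standard error and the t-statistic."""
    rng = random.Random(seed)
    n = len(x)
    estimate = standardized_slope(x, y)
    resampled = []
    for _ in range(n_samples):
        # Draw n cases with replacement and re-estimate the coefficient.
        idx = [rng.randrange(n) for _ in range(n)]
        resampled.append(standardized_slope([x[i] for i in idx], [y[i] for i in idx]))
    se = statistics.stdev(resampled)  # spread of the resampled estimates
    return estimate, se, estimate / se


# Synthetic data (an assumption for illustration): y depends moderately on x.
data_rng = random.Random(0)
x = [data_rng.gauss(0, 1) for _ in range(300)]
y = [0.5 * a + data_rng.gauss(0, 1) for a in x]

estimate, se, t_value = bootstrap_t(x, y)
print(f"coefficient={estimate:.3f}, bootstrap SE={se:.3f}, t={t_value:.2f}")
```

The t-statistic is then compared against the usual critical values; no distributional assumption about the raw data is needed because the standard error comes from the resampled estimates themselves.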
4.6.3. Measurement model assessment and descriptive statistics
In line with most other scholarly work on ethical behavior (e.g. Peace et al. 2003; Buchan 2005; Goles et al. 2008), all constructs of the research model of this study are operationalized with reflective measurement models.239 These measurement models and their respective items are presented and assessed in this chapter. For constructs measured reflectively, Chin (1998b) suggests assessing their reliability as well as their convergent and discriminant validity. Beyond these assessments, selected descriptive statistics regarding constructs and items are also presented at the end of the chapter.
Measurement models, construct reliability and convergent validity
Typically, multiple criteria are employed to assess the reliability of reflective measurement models. First, Cronbach’s α as a measure of the construct’s internal consistency should be greater than 0.7 (Nunnally 1978). Second, the construct’s composite reliability as another popular measure of internal consistency should also be greater than 0.7 (Chin 1998b).240 Third, the standardized loading of each item of the construct should be greater than 0.7 (Fornell & Larcker 1981).241 To ensure convergent validity,242 the average variance extracted (AVE) should be greater than 0.5 (Fornell & Larcker 1981). An AVE above 0.5 ensures that the construct is able to explain more than half of the variance of its items on average. In the following, the operationalization of each research model construct is presented and the resulting constructs are assessed for reliability and convergent validity (see Table 4-7 for the full text of each item and detailed data regarding constructs and their items). Unless noted otherwise, items and constructs meet the thresholds for reliability and convergent validity introduced above:
239 See e.g. Chin (1998b) or Henseler et al. (2009) for the differences between reflective and formative measurement models.
240 Similar to Cronbach’s α, composite reliability measures the internal consistency of a construct. However, it does not assume tau-equivalence. Because of this, it might be better suited for PLS models (Werts et al. 1974).
241 This ensures that the latent construct explains at least 50% of the variance of each of its items.
242 Convergent validity signifies that a construct is unidimensional, i.e. its set of items represents only one construct.
− Developers’ intention to reuse internet code in a way (potentially) violating resulting obligations of reused internet code is measured with three items (one of them reverse-scored) adapted from Limayem et al. (2004) who measure software piracy intention with them. The items through which developers indicate how likely it is that they will show a behavior similar to that described in the scenario are rated on a 7-point Likert scale (“strongly disagree” to “strongly agree”).243
− Developers’ attitude toward reusing internet code in a way (potentially) violating resulting obligations is measured with three items (one of them reverse-scored) adapted from Beck and Ajzen (1991) who measure attitudes toward various forms of unethical behavior with them.
− The subjective norm developers perceive is measured with four generic items (two of them reverse-scored) adapted from Beck and Ajzen (1991). Two of the items pertain to the normative beliefs developers hold regarding their friends (NORM1 and NORM2) while the other two items address developers’ colleagues within their firm (NORM3 and NORM4).
− Developers’ perceived behavioral control is measured with four generic items (one of them reverse-scored) adapted from Beck and Ajzen (1991) and Ajzen (2002). Two of the items refer to the “capability” portion of perceived behavioral control (CONT1 and CONT2) while the other two address the “controllability” part (CONT3 and CONT4). For scenarios 1 and 3 the standardized loadings of items CONT2 and CONT4 are slightly below the required value of 0.7. However, as the overall construct criteria meet the cut-off values in all three scenarios, all items are retained in the construct.244
− Both ethical work climate dimensions are captured with the original questionnaire items and scales of Victor and Cullen (1988). For the law and code dimension of ethical work climate, developers are presented with four items which describe their firm as very compliant with laws and codes. Developers are asked to indicate on a 6-point Likert scale (“completely false” to “completely true”) how accurately each of the items describes the work climate in their firm.245
243 Unless noted otherwise all items of the following constructs are gauged this way.
244 With a value of 0.69 in scenario 1, Cronbach’s α is marginally below the threshold value of 0.7.
245 As Victor and Cullen (1988) stress, this approach places survey participants in the role of observers reporting on and not evaluating the perceived work climate in their firm.
− Similar to the law and code dimension, the rules dimension of the ethical work climate in developers’ firms is measured with the four original items from Victor and Cullen (1988) on the original 6-point scale. These items describe following firm rules and procedures as very important.
− Since no previously validated construct was available to measure the usefulness of internet code as perceived by individual commercial software developers, new items were developed to measure this construct. The items are based on the review of scholarly work about code reuse in software development (see Chapter 3.2.2) and also reflect findings of the qualitative pre-study (see Chapter 4.3.2). Further, the construct was tested during the two rounds of survey pretests (see Chapter 4.4.3). The resulting construct consists of three items describing internet code reuse as helpful for the job of commercial software developers.
− The severity of time pressure which developers perceive in their job is determined with a three-item construct adapted from Coyle et al. (2009) who measure the negative consequences of music piracy with it. Two of the items are reverse-scored. In this study, the construct assesses how serious developers perceive the negative consequences of missing deadlines within their firm to be.
− To capture the cost of compliance, a new construct was developed applying the same process as for the development of the usefulness construct. For scenarios 1 and 2 the construct consists of two items describing checking for all potential obligations of internet code as difficult and time-consuming (COST1a and COST2a). For scenario 3 the construct comprises two different items which present discussions between the developer and her firm about complying with the identified obligations as difficult and time-consuming (COST1b and COST2b).
− The perceived punishment severity for the firm is measured with three items (one of them reverse-scored) which are based on a construct employed by Peace et al. (2003) to measure the consequences of software piracy in the workplace.
− Three items (two of them reverse-scored) adapted from Coyle et al. (2009) are employed to operationalize the perceived punishment severity for the developer. Coyle et al. (2009) use the items to measure punishment following music piracy.
− Finally, the perceived punishment certainty is captured with a self-developed construct consisting of three items (two of them reverse-scored). When developing this construct, the same process was applied as for the development of the usefulness construct. For scenario 3 the standardized loading of item CERT2 is slightly below the required value of 0.7. However, as CERT2 significantly exceeds the threshold of 0.7 in the other two scenarios and the overall construct criteria are met in all three scenarios, the item is retained in the construct.
Summarizing, all research model constructs seem to be reliable measures of their underlying concepts and exhibit convergent validity. As the last step in assessing the measurement model, discriminant validity is investigated in the following.
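The reliability and convergent validity criteria used in this chapter (Cronbach's α, composite reliability and AVE, each compared against its threshold) can be computed as in the following minimal sketch. It uses the standard textbook formulas, which is an assumption about how the reported values were obtained; the item scores and loadings are illustrative values, and the function names are hypothetical.

```python
import statistics


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (reverse-scored items already reversed)."""
    k = len(rows[0])
    item_vars = [statistics.variance([r[i] for r in rows]) for i in range(k)]
    total_var = statistics.variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings; error variance is 1 - loading^2."""
    squared_sum = sum(loadings) ** 2
    error = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error)


def average_variance_extracted(loadings):
    """AVE: mean squared standardized loading of a construct's items."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Illustrative 7-point item scores for six respondents and three items (not survey data).
scores = [[1, 2, 1], [2, 2, 3], [4, 5, 4], [6, 6, 7], [7, 6, 6], [3, 3, 2]]
# Illustrative standardized loadings of a four-item construct.
loadings = [0.87, 0.69, 0.76, 0.68]

alpha = cronbach_alpha(scores)
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"alpha={alpha:.2f} (>0.7?), CR={cr:.2f} (>0.7?), AVE={ave:.2f} (>0.5?)")
```

With these illustrative loadings, two items fall slightly below 0.7, yet composite reliability (about 0.84) and AVE (about 0.57) still clear their thresholds. This is the kind of situation in which the text retains all items of a construct.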
4.79 1.91 5.0 0.70
5.31 1.75 6.0 0.60
4.71 1.14 5.0 0.84 4.08 1.27 4.0 0.79 4.62 1.02 5.0 0.90
4.57 1.23 5.0 0.90 3.95 1.41 4.0 0.81 4.65 1.13 5.0 0.86
4.36 1.50 5.0 0.82
4.18 1.51 4.0 0.77
4.59 1.16 5.0 0.87
5.09 1.05 5.0 0.89
4.99 1.11 5.0 0.92
4.55 1.23 5.0 0.92
4.96 1.22 5.0 0.91
4.91 1.18 5.0 0.92
0.89 0.93 0.76
5.11 1.11 5.0 0.88
5.01 1.15 5.0 0.89
0.90 0.93 0.77
4.11 2.02 4.0 0.80
4.74 2.00 5.5 0.81
0.82 0.83 0.86 0.87
3.66 2.17 3.0 0.74
4.0 3.0 3.0 5.0
3.41 2.14 3.0 0.60
1.76 1.67 1.73 1.67
4.30 2.18 5.0 0.90
0.69 0.81 0.52
4.35 3.50 3.33 4.89
4.72 2.15 6.0 0.84
0.89 0.92 0.75
CONT1: Personally, I could easily do what Joe did if I wanted to
CONT2: Based on my knowledge and skills I would find it difficult to do what Joe did (R)
CONT3: There is nothing outside of my control which could prevent me from doing what Joe did
CONT4: It would be mostly up to me whether or not I do what Joe did
Ethical work climate: Law and code
CLIM1: People at YourCo are expected to comply with the law and professional standards over and above other considerations
CLIM2: At YourCo, the law and ethical codes are a major consideration
CLIM3: At YourCo, people are expected to strictly follow legal and professional standards
CLIM4: At YourCo, the first consideration is whether a decision violates any law
Ethical work climate: Rules
CLIM5: It is very important to follow the company's rules and procedures at YourCo
CLIM6: At YourCo, everyone is expected to stick by company rules and procedures
CLIM7: Successful people at YourCo go by the book
CLIM8: People at YourCo strictly obey the company policies
0.86 0.86 0.86 0.88
4.45 3.58 3.52 4.54
NORM1: Most of my friends would disapprove (R)
NORM2: Most of my friends would think that it is okay
NORM3: Most of my colleagues at YourCo would not mind
NORM4: Most of my colleagues at YourCo would disapprove (R)
Perceived behavioral control
4.5 4.0 3.0 5.0
2.71 1.66 2.0 0.83
3.09 1.92 2.0 0.85
1.76 1.71 1.84 1.83
2.40 1.55 2.0 0.87
2.53 1.71 2.0 0.90
0.81 0.89 0.73
2.11 1.40 2.0 0.91 5.45 1.72 6.0 0.84
2.28 1.43 2.0 0.91
INT3: It is likely that I will do what Joe did in the future
5.45 1.66 6.0 0.90
2.45 1.56 2.0 0.94
5.15 1.80 6.0 0.81
5.16 1.77 6.0 0.90
Attitude
ATT1: For me at YourCo, doing what Joe did would be foolish in a similar situation (R)
ATT2: For me at YourCo, doing what Joe did would be justified in a similar situation
ATT3: When doing what Joe did in a similar situation at YourCo, the benefits would outweigh the downsides for me
Subjective norm: What would other people say if they learned that while working at YourCo, you had done what Joe did in the scenario?
2.76 1.66 2.0 0.92
INT2: I would never do what Joe did (R)
0.87 0.91 0.72
0.90 0.93 0.77
0.80 0.87 0.62
0.87 0.91 0.72
0.81 0.89 0.72
1.68 1.64 1.82 1.72
5.0 3.0 3.0 5.0
0.84 0.85 0.87 0.87
3.99 1.34 4.0 0.79 4.59 1.02 5.0 0.91
4.60 1.13 5.0 0.88
4.53 1.17 5.0 0.83
4.12 1.51 4.0 0.75
5.03 1.07 5.0 0.88
4.93 1.13 5.0 0.89
5.05 1.07 5.0 0.89
5.18 1.86 6.0 0.68
4.74 1.98 5.0 0.76
3.33 2.15 2.0 0.69
4.51 2.18 5.0 0.87
4.84 3.18 3.45 4.82
2.73 1.73 2.0 0.88
2.25 1.49 2.0 0.89
5.29 1.76 6.0 0.77
2.19 1.39 2.0 0.93
5.33 1.76 6.0 0.88
2.43 1.55 2.0 0.95
0.89 0.91 0.73
0.88 0.92 0.73
0.75 0.84 0.57
0.88 0.92 0.73
0.81 0.89 0.72
Scenario 1 (N=316) Scenario 2 (N=256) Scenario 3 (N=297) Mean S.D. Med. λ C’s α CR AVE Mean S.D. Med. λ C’s α CR AVE Mean S.D. Med. λ C’s α CR AVE 0.90 0.94 0.83 0.91 0.94 0.84 0.91 0.94 0.84
INT1: I may do what Joe did in the future
Construct/Item Intention
202 Commercial software developers’ perspectives on internet code reuse
Table 4-7: Reliability, convergent validity and descriptive statistics of constructs
Notes: In the questionnaire developers were instructed to assume “YourCo” is the last firm for which they have been developing software; “[USE1]” denotes the name of the item as it is referred to in later analyses; item loadings for reverse coded items are depicted after reversing; Abbreviations: (R) = reverse coded item, S.D. = Standard Deviation, Med. = Median, λ = Loading of item on its designated construct, C’s α = Cronbach’s α, CR = Composite Reliability, AVE = Average Variance Extracted.
USE3: It would make it easier for me to do my job Severity of time pressure: How serious do you think would it be for you personally if you failed to deliver required functionality on time at YourCo? DEAD1: It would not hurt my career much (R) DEAD2: It would not affect my future much (R) DEAD3: There would be major negative consequences for me Cost of compliance: How easy would it be for you to check thoroughly for potential obligations that come with snippets/components from the internet that you want to integrate when working at YourCo? / How easy do you think would it be for you to discuss with YourCo about complying with the obligations of snippets that you want to integrate in your work?* COST1a: It would take very long for me to thoroughly check for all obligations that come with the snippets/components COST2a: It would be very difficult for me to check for all potential obligations of the snippets/components COST1b: Such discussions would take very long COST2b: Such discussions would be very difficult Punishment severity (firm): How serious do you think would be the consequences for YourCo if it became public that their software includes snippets/components from the internet, but does not fulfill the obligations of these snippets/components? SEV_FIRM1: There would be no or very low consequences for YourCo (R) SEV_FIRM2: YourCo would be in serious legal trouble SEV_FIRM3: YourCo would incur major financial losses Punishment severity (developer): How serious do you think would be the consequences for you personally if you were caught doing what Joe did in the scenario while working at YourCo? SEV_DEV1: It would not hurt my career much (R) SEV_DEV2: There would be major negative consequences for me SEV_DEV3: It would not affect my future much (R) Punishment certainty: How easy do you think would it be to detect that YourCo’s software contains snippets/components from the internet? 
CERT1: It would be very difficult for anybody to find out (R) CERT2: The probability that anybody would find out is very high CERT3: Scanning for snippets/components from the internet in YourCo's software is virtually impossible (R)
Construct/Item Usefulness of internet code: How useful do you think would it be for your work at YourCo to download snippets/components from the internet and integrate them into the software you are developing for YourCo? USE1: It would improve my job performance USE2: It would increase my productivity
3.80 2.00 4.0 0.89
4.25 1.97 5.0 0.83
3.45 1.76 3.0 0.89 4.64 1.81 5.0 0.91 3.40 1.77 3.0 0.92
4.04 1.88 4.0 0.92 3.46 1.80 3.0 0.86
0.78 0.87 0.70
0.90 0.94 0.84
2.88 1.77 2.0 0.92 5.13 1.68 6.0 0.91 4.68 1.72 5.0 0.89
4.94 1.73 5.0 0.87 2.74 1.61 2.0 0.80
3.78 1.86 4.0 0.93 4.16 1.87 4.0 0.88 3.77 1.82 4.0 0.93
3.35 1.77 3.0 0.87 4.68 1.78 5.0 0.91 4.03 1.78 4.0 0.90
2.93 1.67 2.0 0.95
0.87 0.92 0.80
3.15 1.73 3.0 0.94
4.35 1.57 5.0 0.93 4.35 1.59 5.0 0.93 3.51 1.53 3.0 0.86
5.06 1.68 5.0 0.93
2.97 1.65 2.0 0.94
0.87 0.94 0.89
0.88 0.93 0.81
4.93 1.66 5.0 0.96 5.14 1.59 5.0 0.96
3.14 1.70 3.0 0.95
4.28 1.55 5.0 0.92 4.27 1.55 5.0 0.94 3.70 1.56 4.0 0.83
4.98 1.59 5.0 0.93
4.75 1.68 5.0 0.94 4.86 1.66 5.0 0.97
0.95 0.96 0.90
0.87 0.92 0.79
0.89 0.93 0.82
0.89 0.93 0.82
0.88 0.94 0.89
0.89 0.93 0.83
0.95 0.97 0.90
4.34 1.91 5.0 0.90
4.76 1.82 5.0 0.89 2.87 1.65 2.0 0.62
3.64 1.79 4.0 0.95 4.52 1.82 5.0 0.92 3.65 1.82 3.0 0.94
3.03 1.76 3.0 0.90 5.00 1.74 5.0 0.89 4.34 1.76 5.0 0.88
3.60 1.87 3.0 0.87 3.22 1.88 3.0 0.97
4.41 1.57 5.0 0.95 4.46 1.62 5.0 0.93 3.57 1.58 3.0 0.86
4.88 1.66 5.0 0.97
4.78 1.66 5.0 0.95 4.89 1.66 5.0 0.92
0.78 0.85 0.67
0.93 0.96 0.88
0.87 0.92 0.79
0.85 0.92 0.85
0.90 0.94 0.84
0.94 0.96 0.89
Scenario 1 (N=316) Scenario 2 (N=256) Scenario 3 (N=297) Mean S.D. Med. λ C’s α CR AVE Mean S.D. Med. λ C’s α CR AVE Mean S.D. Med. λ C’s α CR AVE
Table 4-7: Reliability, convergent validity and descriptive statistics of constructs – continued
*For scenarios 1 and 2 the first version of the text is used while the second version is presented for scenario 3. Notes: In the questionnaire developers were instructed to assume “YourCo” is the last firm for which they have been developing software; “[USE1]” denotes the name of the item as it is referred to in later analyses; item loadings for reverse coded items are depicted after reversing; Abbreviations: (R) = reverse coded item, S.D. = Standard Deviation, Med. = Median, λ = Loading of item on its designated construct, C’s α = Cronbach’s α, CR = Composite Reliability, AVE = Average Variance Extracted.
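The composite reliability (CR) and average variance extracted (AVE) columns of Table 4-7 are derived from the item loadings λ. The following minimal sketch uses the commonly cited formulas for standardized loadings; the loadings in the example are hypothetical and are not taken from Table 4-7, and the function names are illustrative only.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of a construct's items."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 divided by itself plus the summed error variances.

    Assumes standardized loadings, so each item's error variance is 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Hypothetical loadings for a three-item construct (illustration only).
example_loadings = [0.90, 0.92, 0.91]
ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
print(f"AVE = {ave:.2f}, CR = {cr:.2f}")
```

The square root of the AVE computed this way is the quantity that the Fornell-Larcker criterion compares against the construct correlations.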
Discriminant validity
Complementary to convergent validity discussed above, discriminant validity indicates that there is a difference between the concepts measured by two constructs (Henseler et al. 2009). Two criteria are commonly employed to show discriminant validity. First, the Fornell-Larcker criterion (1981) demands that for each construct the square root of its AVE is greater than the construct’s highest correlation with any other construct.246 Second, the loading of each item on its a-priori construct should be greater than its highest loading on any other construct (Chin 1998b). Table 4-8 confirms that the Fornell-Larcker criterion holds for all constructs of the research model in all three scenarios. The constructs intention, attitude and subjective norm show relatively strong correlations with each other, as do the two ethical work climate constructs.247 However, for all constructs the square root of the AVE is higher than the maximum correlation with any other construct. Thus, discriminant validity is ensured. Beyond the Fornell-Larcker criterion, an analysis of item loadings and cross-loadings (see Table A-4 in the Appendix) also confirms the discriminant validity of the constructs since all items have their highest loading on their a-priori construct.
246
This ensures that each construct shares more variance with its items than with any other construct.
247
This result can also be found in other scholarly work. E.g. Peace et al. (2003) and Buchan (2005) report similarly high correlations among the original TPB constructs. Victor and Cullen (1988) and VanSandt et al. (2006) present comparable correlations regarding ethical work climate constructs.
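The Fornell-Larcker check described above can be sketched in a few lines. The example below uses the scenario 3 values from Table 4-8 (square roots of the AVE on the diagonal, construct correlations below it). Comparing absolute correlation values is an assumption of this sketch, and the function name is illustrative only.

```python
# Scenario 3 values from Table 4-8 as a lower-triangular matrix:
# the last entry of each row is the square root of the construct's AVE.
SCENARIO_3 = [
    [0.92],
    [0.77, 0.85],
    [0.52, 0.56, 0.86],
    [0.08, 0.08, 0.17, 0.75],
    [-0.27, -0.28, -0.22, -0.08, 0.85],
    [-0.11, -0.08, -0.13, -0.10, 0.65, 0.85],
    [0.17, 0.10, 0.07, 0.05, 0.03, 0.07, 0.94],
    [0.17, 0.15, 0.03, -0.06, 0.12, 0.24, 0.07, 0.91],
    [0.18, 0.15, 0.21, 0.04, 0.04, 0.01, -0.05, 0.14, 0.92],
    [-0.34, -0.39, -0.36, -0.18, 0.22, 0.20, -0.02, 0.10, -0.08, 0.89],
    [-0.38, -0.42, -0.50, -0.15, 0.22, 0.14, -0.14, 0.16, -0.07, 0.59, 0.94],
    [-0.10, -0.14, -0.18, -0.06, 0.03, 0.03, -0.02, 0.13, -0.18, 0.20, 0.14, 0.82],
]


def fornell_larcker_violations(lower):
    """Return the 1-based numbers of constructs whose sqrt(AVE) does not
    exceed their largest absolute correlation with any other construct."""
    n = len(lower)
    violations = []
    for i in range(n):
        sqrt_ave = lower[i][i]
        # Correlations of construct i: its own row plus column i of later rows.
        correlations = [abs(lower[i][j]) for j in range(i)]
        correlations += [abs(lower[k][i]) for k in range(i + 1, n)]
        if correlations and max(correlations) >= sqrt_ave:
            violations.append(i + 1)
    return violations


print(fornell_larcker_violations(SCENARIO_3))
```

An empty list means that every construct satisfies the criterion, which is the result reported for all three scenarios.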
Table 4-8: Construct correlations and discriminant validity
Scenario 1 (N=316)
Construct                               1     2     3     4     5     6     7     8     9    10    11    12
1. Intention                         0.91
2. Attitude                          0.63  0.85
3. Subjective norm                   0.53  0.68  0.87
4. Perceived behavioral control      0.19  0.23  0.27  0.72
5. Ethical work climate: Law & code -0.21 -0.21 -0.33 -0.16  0.88
6. Ethical work climate: Rules      -0.15 -0.15 -0.22 -0.19  0.73  0.87
7. Usefulness of internet code       0.19  0.11  0.07  0.01 -0.03  0.05  0.95
8. Severity of time pressure         0.17  0.16  0.14 -0.03  0.08  0.11  0.16  0.90
9. Cost of compliance                0.35  0.26  0.27  0.03 -0.15 -0.08  0.04  0.10  0.94
10. Punishment severity (firm)      -0.32 -0.43 -0.39 -0.22  0.27  0.19  0.01  0.03 -0.03  0.89
11. Punishment severity (developer) -0.36 -0.46 -0.53 -0.21  0.31  0.27  0.00  0.07 -0.07  0.55  0.91
12. Punishment certainty            -0.06 -0.10 -0.16 -0.20  0.00  0.03 -0.05  0.05 -0.02  0.28  0.27  0.84

Scenario 2 (N=256)
Construct                               1     2     3     4     5     6     7     8     9    10    11    12
1. Intention                         0.92
2. Attitude                          0.67  0.85
3. Subjective norm                   0.55  0.56  0.85
4. Perceived behavioral control      0.20  0.08  0.30  0.79
5. Ethical work climate: Law & code -0.26 -0.31 -0.25 -0.07  0.88
6. Ethical work climate: Rules      -0.17 -0.17 -0.17 -0.07  0.65  0.85
7. Usefulness of internet code       0.08  0.07  0.12  0.03  0.13  0.01  0.95
8. Severity of time pressure         0.11  0.17  0.09 -0.02  0.06  0.02  0.13  0.91
9. Cost of compliance                0.22  0.14  0.16  0.16  0.03  0.10 -0.06  0.08  0.95
10. Punishment severity (firm)      -0.32 -0.44 -0.41 -0.21  0.24  0.21 -0.06  0.11 -0.11  0.90
11. Punishment severity (developer) -0.28 -0.34 -0.41 -0.25  0.23  0.21  0.04  0.16 -0.17  0.59  0.91
12. Punishment certainty            -0.15 -0.22 -0.28 -0.26  0.11  0.12  0.08 -0.08 -0.08  0.27  0.33  0.89
Scenario 3 (N=297)
Construct                               1     2     3     4     5     6     7     8     9    10    11    12
1. Intention                         0.92
2. Attitude                          0.77  0.85
3. Subjective norm                   0.52  0.56  0.86
4. Perceived behavioral control      0.08  0.08  0.17  0.75
5. Ethical work climate: Law & code -0.27 -0.28 -0.22 -0.08  0.85
6. Ethical work climate: Rules      -0.11 -0.08 -0.13 -0.10  0.65  0.85
7. Usefulness of internet code       0.17  0.10  0.07  0.05  0.03  0.07  0.94
8. Severity of time pressure         0.17  0.15  0.03 -0.06  0.12  0.24  0.07  0.91
9. Cost of compliance                0.18  0.15  0.21  0.04  0.04  0.01 -0.05  0.14  0.92
10. Punishment severity (firm)      -0.34 -0.39 -0.36 -0.18  0.22  0.20 -0.02  0.10 -0.08  0.89
11. Punishment severity (developer) -0.38 -0.42 -0.50 -0.15  0.22  0.14 -0.14  0.16 -0.07  0.59  0.94
12. Punishment certainty            -0.10 -0.14 -0.18 -0.06  0.03  0.03 -0.02  0.13 -0.18  0.20  0.14  0.82
Notes: The diagonal entries (bolded in the original) are square roots of the average variance extracted (AVE) of the respective construct; the off-diagonal entries are correlations between constructs.
Descriptive statistics Before the next chapter evaluates the structural part of the research model, this chapter ends with selected descriptive statistics regarding the data employed in the research model. Mean, standard deviation and median of each individual construct item are depicted in Table 4-7. With item means between 2.11 and 2.84 (on a 7-point scale) developers have a rather low intention to reuse internet code in a way (potentially) violating resulting obligations.
This is consistent with their past behavior discussed in Chapter 4.5.4. Also in line with the prior findings is that developers apparently have a slightly stronger intention to reuse snippets and not check for resulting obligations thoroughly (scenario 1) than to do the same with components (scenario 2) (paired t-test, p=0.0364).248 Moreover, the intention not to check for obligations from snippet reuse is marginally stronger than the intention to knowingly ignore obligations from snippet reuse (paired t-test, p=0.0932). Similar to intention, developers also have a rather negative attitude toward reusing internet code in a way (potentially) violating resulting obligations. Differences between the three scenarios in attitude mirror the differences already discussed for intention. Regarding subjective norm, developers perceive slightly stronger normative beliefs not to violate obligations they are aware of than not to reuse internet code without having checked for obligations thoroughly.249 In general, developers either “slightly disagree” or respond in “neutral” fashion to statements describing friends and colleagues as tolerating internet code reuse in a way (potentially) violating resulting obligations. Interestingly, there are no consistently significant differences between developers’ normative beliefs from friends and those from colleagues. A possible explanation for this finding might be that developers’ friends are also software developers and thus have a similar stance on violating internet code obligations as their software developing colleagues. Relatively high scores in the perceived behavioral control items (ranging from 4.11 to 5.31) indicate that most developers feel confident about their abilities to reuse internet code and (potentially) violate obligations in doing so. Further, developers also do not see strong exogenous factors which could stop them from engaging in the behavior.
Comparing the different scenarios, developers perceive stronger control to reuse snippets and (potentially) violate resulting obligations (scenarios 1 and 3) than to do the same with components (scenario 2).250 This is consistent with results of the qualitative pre-study where one developer for example explains, “I think I could easily smuggle in 15-20 lines of Perl [i.e. a snippet], but I think it would be all but impossible to mass import 15-20 whole files [i.e. one or more components].”
248
Unless noted otherwise, all t-tests in this chapter refer to differences between indices built with the items of the respective construct.
249
Paired t-test, p=0.0207 and p=0.2166 for the comparison between scenarios 1 and 3 and the comparison between scenarios 2 and 3, respectively.
250
Paired t-tests, p=0.0004 and p=0.0040, for the comparison between scenarios 1 and 2 and the comparison between scenarios 2 and 3, respectively.
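The scenario comparisons reported here rest on paired t-tests between construct indices. The sketch below illustrates the mechanics under stated assumptions: the index is taken to be the mean of a construct's items after reverse coding the (R) items on the 7-point scale, and all respondent data are hypothetical. It computes only the t statistic and the degrees of freedom; a p-value additionally requires the t distribution (for example via `scipy.stats.ttest_rel`).

```python
import math
from statistics import mean, stdev


def reverse_code(score, scale_max=7):
    """Reverse a score on a 1..scale_max scale (7 -> 1, 6 -> 2, ...)."""
    return scale_max + 1 - score


def construct_index(item_scores, reversed_flags, scale_max=7):
    """Mean of a respondent's item scores after reversing the (R) items.

    Using the mean as the index is an assumption of this sketch.
    """
    adjusted = [
        reverse_code(score, scale_max) if is_reversed else score
        for score, is_reversed in zip(item_scores, reversed_flags)
    ]
    return mean(adjusted)


def paired_t_statistic(first, second):
    """t statistic and degrees of freedom for paired observations."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / math.sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical answers to INT1, INT2 (R), INT3 from five respondents.
INTENTION_REVERSED = (False, True, False)
scenario_1 = [(3, 6, 2), (2, 6, 2), (4, 4, 3), (1, 7, 1), (2, 5, 3)]
scenario_2 = [(2, 6, 2), (2, 7, 1), (3, 5, 3), (1, 7, 1), (2, 6, 2)]

index_1 = [construct_index(r, INTENTION_REVERSED) for r in scenario_1]
index_2 = [construct_index(r, INTENTION_REVERSED) for r in scenario_2]
t_value, degrees_of_freedom = paired_t_statistic(index_1, index_2)
print(f"t = {t_value:.2f}, df = {degrees_of_freedom}")
```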
On average, developers rate their firms high on both dimensions of ethical work climate indicating that complying with laws and codes as well as following firm rules are important in their organizations. Developers on average “slightly agree” to statements describing internet code reuse as useful and surprisingly do not perceive a significant difference in the usefulness of snippets and components.251 This differs from the opinion commonly formulated in scholarly work (see Chapter 3.2.2) and also contradicts the typical opinion of developers from the qualitative pre-study. One of these for example explains, “I doubt very much that copy and pasting code [i.e. snippets] found on the internet boosts anybody’s productivity except perhaps very poor programmers. Using good open source libraries [i.e. components] is something completely different.” Mean scores between 3 and 4 indicate that developers on average do not perceive a high severity of time pressure since they do not expect extremely strong negative consequences from missing deadlines. Further, developers on average also do not consider the cost of compliance as high, but rather deem checking thoroughly for obligations of internet code and ensuring that they are fulfilled as generally feasible in acceptable time.252 Finally, developers expect medium punishment severities both for their firms and themselves and consider punishment certainty to be rather low as they deem finding out about reused internet code in their firms’ software quite difficult. 
As expected, developers judge successfully scanning for reused components in their firms’ software as easier than scanning for reused snippets.253 Following the confirmation of acceptable properties of the measurement models with regard to reliability and validity and after presenting descriptive information about the constructs and their items, the structural models are analyzed for each scenario in the next chapter in order to gain insights into the determinants which lead commercial software developers to reuse internet code in a way (potentially) violating resulting obligations.
251
Paired t-tests, p=0.1678 and p=0.1464 for the comparison between scenarios 1 and 2 and the comparison between scenarios 2 and 3, respectively.
252
Interestingly, checking for obligations from snippets seems not to be more difficult than checking for obligations from components (paired t-test, p=0.8842). Due to the differences in content, comparisons of scenarios 1 and 2 with scenario 3 are not possible for the cost of compliance construct.
253
Paired t-test, p<0.0001 for the two comparisons between scenario 1 and 2 and between scenario 2 and 3.
4.6.4. Structural model assessment
This chapter presents the results of testing the structural models with their relationships between the research model constructs. The research model is evaluated once for each scenario. To assess structural models in PLS, Chin (1998b) and Henseler et al. (2009) suggest evaluating the models’ predictive power through the R² values of their dependent variables and analyzing the sign, significance and size of the paths between the model constructs.
Scenario 1: Not checking thoroughly for obligations from snippet reuse. With an R² value of 0.419 the research model “moderately” (Chin 1998b)254 explains developers’ intention to reuse internet snippets without checking thoroughly for potential obligations (see Figure 4-8). Of the TPB hypotheses, H1 and H2 are confirmed by the model results. Developers with a more positive attitude toward reusing internet snippets without checking thoroughly for potential obligations and those who perceive a more positive subjective norm regarding the behavior show a higher intention to engage in the behavior. The positive effect of a higher level of perceived behavioral control on intention as proposed in H3 is, however, not supported. Of the hypotheses resulting from ethical work climate theory, only H4c is supported, as developers working in firms with a stronger ethical work climate regarding compliance with laws and codes seem to perceive a less positive subjective norm regarding the reuse of internet snippets without checking thoroughly for potential obligations. Neither of the two dimensions of ethical work climate seems to exhibit the proposed direct influence on intention (H4a and H4b), and contrary to the laws and codes dimension, the rules dimension of ethical work climate appears not to affect developers’ subjective norm (H4d). In terms of effect sizes, the influence of attitude on intention is more than twice as strong as the effect of subjective norm.
All three hypotheses derived directly from expected utility theory are confirmed. Developers who see a higher usefulness of snippet reuse in their work (H5a), those who perceive the consequences of missing deadlines as more severe (H5b) and developers who consider the costs of checking for obligations from snippet reuse to be higher (H5c) have a more positive attitude toward reusing internet snippets without checking thoroughly for potential obligations. Finally, regarding the hypotheses inferred from deterrence theory,
254
Chin (1998b) considers R² values above 0.67, 0.33 and 0.19 as substantial, moderate and weak, respectively.
H6a and H6b are supported while H6c is rejected. As proposed, developers who perceive a higher punishment severity for their firm (H6a) or for themselves (H6b) seem to have a less positive attitude toward reusing internet snippets without checking thoroughly for potential obligations. Surprisingly, developers who deem it more likely that internet snippets can be found in their firm’s software appear not to hold a more negative attitude as proposed in the research model, but on the contrary give the impression of being more positive toward reusing internet snippets without checking thoroughly for potential obligations. However, this result is only significant at the 10% level and might as well be a statistical artifact. The two forms of punishment severity seem to exhibit the strongest effect on attitude. They are followed by cost of compliance and, at some further distance, by severity of time pressure. With a standardized coefficient of 0.085 the influence of usefulness of internet snippets is rather weak. All coefficients, signs and levels of significance remain qualitatively unchanged when including the social desirability scale (see Chapter 4.4.4) in the model to control for social desirability bias.
Figure 4-8: Structural model results for obligation violation model (scenario 1)
[Path diagram: usefulness of internet code, severity of time pressure, cost of compliance, punishment severity (firm), punishment severity (developer) and punishment certainty (grouped as determinants of benefits and determinants of costs) → attitude (R²=0.346); ethical work climate: law & code and ethical work climate: rules → subjective norm (R²=0.109); attitude, subjective norm, perceived behavioral control and both ethical work climate constructs → intention (R²=0.419). Significant and not significant paths are drawn differently.]
† While the coefficient is significant, its sign contradicts hypothesis H6c. * significant at 10%, ** significant at 5%, *** significant at 1% Notes: Significant coefficients are bolded; the expected coefficient sign for each hypothesis is indicated in parentheses; N=316.
Scenario 2: Not checking thoroughly for obligations from component reuse. With an R² value of 0.499 the research model also “moderately” (Chin 1998b) explains developers’ intention to reuse internet components without checking thoroughly for potential obligations (see Figure 4-9).
As in the model test with scenario 1, the TPB hypotheses H1 (attitude) and H2 (subjective norm) are also confirmed in the situation of scenario 2. Contrary to the results of scenario 1, however, in the case of potentially violating obligations from component reuse the proposed positive effect of a higher level of perceived behavioral control on intention (H3) is also supported. A more detailed analysis points out that the influence stems from the “capability” portion of perceived behavioral control while the “controllability” portion seems not to have a significant effect on intention.255 Thus, there seem to exist developers who are less confident of their own skills regarding internet component reuse and who because of that also have a lower intention to reuse internet components without checking thoroughly for potential obligations. Regarding the ethical work climate theory hypotheses, the results are similar to those of the scenario 1 model. There is the expected significant negative effect of an ethical work climate of complying with laws and codes on subjective norm (H4c) while the other three hypotheses are not supported. In terms of effect sizes, attitude again exhibits by far the strongest effect on intention, followed by subjective norm and perceived behavioral control with a rather weak influence.
Figure 4-9: Structural model results for obligation violation model (scenario 2)
[Path diagram: usefulness of internet code, severity of time pressure, cost of compliance, punishment severity (firm), punishment severity (developer) and punishment certainty (grouped as determinants of benefits and determinants of costs) → attitude (R²=0.263); ethical work climate: law & code and ethical work climate: rules → subjective norm (R²=0.063); attitude, subjective norm, perceived behavioral control and both ethical work climate constructs → intention (R²=0.499). Significant and not significant paths are drawn differently.]
* significant at 10%, ** significant at 5%, *** significant at 1% Notes: Significant coefficients are bolded; the expected coefficient sign for each hypothesis is indicated in parentheses; N=256.
Of the direct expected utility theory hypotheses, H5b, predicting a positive relationship between severe consequences of missing a deadline and a positive attitude toward reusing
255
In an additional PLS model with the two parts of perceived behavioral control as independent constructs of their own, the standardized coefficient of the “capability” portion is 0.086 (p=0.035) while the coefficient of the “controllability” portion is 0.013 (p=0.665).
internet components without checking thoroughly for potential obligations is confirmed (as in the scenario 1 model). In contrast to the first model, neither perceived usefulness of component reuse (H5a) nor the costs of compliance as seen by developers (H5c) seem to influence their attitude. Finally, as predicted by the deterrence theory hypotheses and as also confirmed for the scenario 1 model, a higher perceived punishment severity for the firm (H6a) and the developer (H6b) both lead to a less positive attitude toward reusing internet components without checking thoroughly for potential obligations. H6c, suggesting that a higher perceived punishment certainty negatively affects attitude, does not receive support. Punishment severity for the firm exhibits the strongest effect on attitude, followed by severity of time pressure. Contrary to the results for scenario 1, punishment severity for the developer is the weakest effect in the case of component reuse without checking thoroughly for potential obligations. Similar to scenario 1, including the social desirability scale (see Chapter 4.4.4) in the model does not qualitatively change any coefficients, signs or levels of significance.
Scenario 3: Knowingly ignoring obligations from snippet reuse. With an R² value of 0.610 the research model also “moderately” (Chin 1998b) explains developers’ intention to reuse internet snippets in a way violating resulting obligations (see Figure 4-10).
Figure 4-10: Structural model results for obligation violation model (scenario 3)
[Path diagram: usefulness of internet code, severity of time pressure, cost of compliance, punishment severity (firm), punishment severity (developer) and punishment certainty (grouped as determinants of benefits and determinants of costs) → attitude (R²=0.267); ethical work climate: law & code and ethical work climate: rules → subjective norm (R²=0.046); attitude, subjective norm, perceived behavioral control and both ethical work climate constructs → intention (R²=0.610). Significant and not significant paths are drawn differently.]
* significant at 10%, ** significant at 5%, *** significant at 1% Notes: Significant coefficients are bolded; the expected coefficient sign for each hypothesis is indicated in parentheses; N=297.
Once more, the model results confirm the TPB hypotheses H1 (attitude) and H2 (subjective norm). In line with the results of scenario 1 but contrary to scenario 2, H3
(perceived behavioral control) does not find support. Regarding the ethical work climate theory hypotheses, the results are similar to those of the two previous models. The expected significant negative effect of an ethical work climate of complying with laws and codes on subjective norm (H4c) seems to exist while the other three hypotheses are not supported. Again the effect of attitude on intention is stronger than that of subjective norm. Regarding both expected utility theory and deterrence theory hypotheses, the model results for scenario 3 are similar to those for scenario 2. H5b (severity of time pressure), H6a (punishment severity (firm)) and H6b (punishment severity (developer)) are supported while the three other hypotheses (usefulness of internet code, cost of compliance and punishment certainty) are not confirmed. In scenario 3 punishment severity for the developer exhibits the strongest influence on attitude while punishment severity for the firm and severity of time pressure are in the same range. As in scenarios 1 and 2, including the social desirability scale (see Chapter 4.4.4) in the model does not qualitatively change any coefficients, signs or levels of significance.
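The classification of the three intention models by their R² values can be reproduced with a small helper. The thresholds are those attributed to Chin (1998b) in this chapter; the function name and the label for values at or below 0.19 are assumptions of this sketch.

```python
def classify_r_squared(r_squared):
    """Label an R² value using the thresholds attributed to Chin (1998b)."""
    if r_squared > 0.67:
        return "substantial"
    if r_squared > 0.33:
        return "moderate"
    if r_squared > 0.19:
        return "weak"
    return "below weak"  # label chosen for this sketch only


# R² of intention as reported for the three scenarios.
intention_r_squared = {"scenario 1": 0.419, "scenario 2": 0.499, "scenario 3": 0.610}
for scenario, value in intention_r_squared.items():
    print(scenario, value, classify_r_squared(value))
```

All three intention models fall into the moderate band, with scenario 3 closest to the upper threshold.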
4.6.5. Discussion and summary After having presented the results of testing the research model hypotheses with the three different scenarios, this chapter aggregates and discusses the resulting findings and summarizes the testing of the research model (see Table 4-9).
Table 4-9: Summary of research model hypotheses testing
Confirmed? (✓ = confirmed, ✗ = not confirmed), given as Scenario 1 / Scenario 2 / Scenario 3

Theory of planned behavior model
H1 [✓ / ✓ / ✓] A more positive attitude toward reusing internet code in a way (potentially) violating resulting obligations will lead to greater intention of commercial software developers to engage in the behavior.
H2 [✓ / ✓ / ✓] A higher level of subjective norm supportive of reusing internet code in a way (potentially) violating resulting obligations will lead to greater intention of commercial software developers to engage in the behavior.
H3 [✗ / ✓ / ✗] A higher level of perceived behavioral control regarding the reuse of internet code in a way (potentially) violating resulting obligations will lead to greater intention of commercial software developers to engage in the behavior.

Ethical work climate theory
H4a [✗ / ✗ / ✗] An ethical work climate of complying with laws and codes will lead to a lesser intention of commercial software developers to reuse internet code in a way (potentially) violating resulting obligations.
H4b [✗ / ✗ / ✗] An ethical work climate of complying with firm rules will lead to a lesser intention of commercial software developers to reuse internet code in a way (potentially) violating resulting obligations.
H4c [✓ / ✓ / ✓] An ethical work climate of complying with laws and codes will have a negative effect on subjective norm supportive of reusing internet code in a way (potentially) violating resulting obligations.
H4d [✗ / ✗ / ✗] An ethical work climate of complying with firm rules will have a negative effect on subjective norm supportive of reusing internet code in a way (potentially) violating resulting obligations.

Expected utility theory
H5a [✓ / ✗ / ✗] Usefulness of internet code will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
H5b [✓ / ✓ / ✓] Severity of time pressure in the developer’s firm will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
H5c [✓ / ✗ / ✗] Cost of compliance will have a positive effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.

Deterrence theory
H6a [✓ / ✓ / ✓] Punishment severity for the developer’s firm will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
H6b [✓ / ✓ / ✓] Punishment severity for the developer will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
H6c [✗ / ✗ / ✗] Punishment certainty will have a negative effect on attitude toward reusing internet code in a way (potentially) violating resulting obligations.
Theory of planned behavior. Across all three scenarios H1 and H2 are confirmed. As predicted by TPB, developers who hold a more positive attitude toward reusing internet code in a way (potentially) violating resulting obligations (H1) and those who perceive a more positive subjective norm regarding the behavior (H2) have a higher intention to engage in the behavior. In contrast to that, H3 finds moderate support only in the case of component reuse without thoroughly investigating potential obligations (scenario 2). For snippet reuse (scenarios 1 and 3) H3 is not confirmed. Only in the situation described in scenario 2 do those developers who perceive more behavioral control also have a higher intention to engage in the behavior. On a more detailed level this effect is driven by the “capability” portion of perceived behavioral control. Apparently there do exist developers who doubt their own capability to reuse components and thus also have a lower intention to engage in reusing internet components without checking thoroughly for potential obligations.
Contrary to the “capability” portion, the “controllability” portion of perceived behavioral control which should among other things reflect the technical means which firms have implemented to prevent their developers from reusing internet code in problematic fashion has no significant effect on intention. Generally, the finding that perceived behavioral control has either no or only in parts an influence on developers’ intention is surprising. Other studies investigating digital piracy (of both software and music) as a behavior technically related to internet code reuse in a way (potentially) violating resulting obligations consistently find significant effects of perceived behavioral control on intention (e.g. Peace et al. 2003; d'Astous et al. 2005). The difference in the results between these studies and the present investigation is even more surprising as the items employed in this research are quite similar to the items employed by Peace et al. (2003) when investigating software piracy. Other studies of ethical behaviors which do not find a significant relationship between perceived behavioral control and intention (e.g. Randall & Gibson 1991; Kurland 1995) typically explain this finding by arguing that the analyzed behavior is under complete volitional control. In this case there should be no variance in the perceived behavioral control construct and thus also no effect on intention. Yet, this explanation seems not to be applicable to this study. As the high means and medians in the descriptive statistics of Table 4-7 point out, most developers are indeed confident of being able to reuse internet code in a way (potentially) violating resulting obligations if they want to, but with standard deviation values of above two there still seems to be variance in the data.256 Consequently, reusing internet code in a way (potentially) violating resulting obligations cannot be considered a behavior under developers’ complete volitional control. 
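The variance argument can be made concrete with a small calculation. For responses bounded by a 7-point scale, the Bhatia-Davis inequality limits the population variance to (7 - mean)(mean - 1), so a standard deviation above two is close to the largest value possible when the mean is high. The following Python sketch is an illustration only: the function name and the means are hypothetical and are not taken from Table 4-7.

```python
import math


def max_population_sd(mean, low=1.0, high=7.0):
    """Largest possible population standard deviation for responses bounded
    by [low, high] with the given mean (Bhatia-Davis inequality)."""
    if not low <= mean <= high:
        raise ValueError("mean must lie within the scale bounds")
    return math.sqrt((high - mean) * (mean - low))


# Hypothetical construct means on the 7-point scale (assumed for illustration,
# not the values reported in Table 4-7).
for hypothetical_mean in (4.0, 5.0, 5.5, 6.0):
    bound = max_population_sd(hypothetical_mean)
    print(f"mean={hypothetical_mean:.1f}  maximum SD={bound:.2f}")
```

Under this bound, a standard deviation above two at a mean of, say, 5.5 would lie near the theoretical ceiling, which is in line with the reading that responses on perceived behavioral control are far from uniform. The bound refers to the population standard deviation; the sample standard deviation can be marginally larger.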
Summarizing, developers’ attitude toward reusing internet code in a way (potentially) violating resulting obligations has the strongest influence on their behavioral intention. Subjective norm regarding the behavior exhibits a weaker effect and perceived behavioral control has a moderate effect only in the case of component reuse without thoroughly checking for potential obligations. For snippet reuse, perceived behavioral control does not significantly influence developers’ behavioral intention.
256 The items are measured on a 7-point scale.
Ethical work climate theory. Of the hypotheses derived from ethical work climate theory, H4c is consistently supported in all three scenarios while the null hypothesis cannot be rejected for the other three hypotheses (H4a, H4b and H4d).
As expected in H4c, an ethical work climate of complying with laws and codes exhibits a substantial influence on the subjective norm which developers perceive regarding the reuse of internet code in a way (potentially) violating resulting obligations. However, there seems to be no direct effect of the law and code dimension of ethical work climate on developers’ intention to reuse internet code in a way (potentially) violating resulting obligations (H4a). The latter result is similar to Buchan’s (2005) finding that the instrumental dimension of ethical work climate has no direct effect on intention. Apparently, developers are aware of the law and code dimension of the ethical work climate within their firm and consider it when communicating with their colleagues. This leads to the significant influence of the law and code dimension of ethical work climate on subjective norm. However, while developers seem to consider the law and code dimension of ethical work climate in their firm when talking to their peers, they appear not to walk the talk themselves, which is reflected in the non-significant relationship between the law and code dimension of ethical work climate and intention. Nonetheless, the law and code dimension of ethical work climate has an indirect effect on intention through the subjective norm developers perceive from their colleagues. Regarding the rules dimension of ethical work climate, a climate of complying with firm rules has a significant influence neither on the intention to reuse internet code in a way (potentially) violating resulting obligations (H4b) nor on perceived subjective norm (H4d). Two reasons possibly contribute to this result. First, only about one quarter of the developers surveyed work in firms with policies addressing internet code reuse and have also read these policies (see Chapter 4.5.2).
Since an ethical work climate of complying with firm rules can only influence intention and subjective norm if relevant firm rules do exist and are read by developers, the characteristics of the sample employed in this study may suppress the effect.257 Second, as a result of the high correlation258 between an ethical work climate of complying with law and code and an ethical work climate of complying with firm rules,259 the first dimension appears to
overshadow the latter one. In a model without the law and code dimension of ethical work climate there is a consistently significant negative effect of an ethical work climate of complying with firm rules on subjective norm as proposed by H4d and also the effect on intention is more pronounced, although still not significant.260 The non-significance of the effect on intention should have the same reason as discussed for the law and code dimension above.
257 However, neither H4b nor H4d is supported when including the existence of a firm-internal policy addressing the reuse of internet code by developers as a moderator for the relationship between an ethical work climate of complying with firm rules and both commercial software developers’ intention to reuse internet code in a way (potentially) violating resulting obligations and the subjective norm they perceive regarding the behavior.
258 The correlations between the law and code dimension of ethical work climate and the rules dimension are 0.73, 0.65 and 0.65 for scenarios 1, 2 and 3, respectively.
259 A high correlation between these two constructs is not surprising since firms for which complying with laws and codes is important should have implemented internal rules reflecting these laws and codes and demand that their employees stick to these internal rules.
Expected utility theory. Of the hypotheses derived directly from expected utility theory, H5b is confirmed consistently in all three scenarios. Developers who perceive a higher severity of time pressure in the form of negative consequences from missing deadlines have a more positive attitude toward reusing internet code in a way (potentially) violating resulting obligations. For scenario 1, describing the reuse of internet snippets without thoroughly investigating potential obligations, hypotheses H5a and H5c are also accepted. Thus, in the situation described in scenario 1, those developers who perceive a higher usefulness of internet snippets (H5a) and those who deem the costs of investigating obligations higher (H5c) have a more positive attitude toward reusing internet snippets without properly checking their potential obligations. Yet, in the case of component reuse without thoroughly investigating potential obligations (scenario 2) and knowingly violating obligations from snippet reuse (scenario 3), neither usefulness of internet code (H5a) nor cost of compliance (H5c) exhibits a significant effect on attitude. In the case of cost of compliance in scenario 3, the items employed to measure the construct (COST1b and COST2b) might not fully account for situations in which developers expect that their firms will reject their request to reuse internet code without detailed consideration.
Such situations might for example occur if developers want to reuse internet code under a specific license which their employer has banned categorically. The two items employed in the questionnaire capture whether discussing about reusing a particular piece of internet code with the firm would take long and be difficult. However, in the situation described above, the discussion would neither be long nor difficult. Yet, it seems likely that especially in such situations developers might hold a more positive
attitude toward reusing internet code in a way violating resulting obligations because then they do not have to interact with their firm regarding the topic.
260 In this reduced model the standardized regression coefficients of an ethical work climate of complying with firm rules on subjective norm are -0.215 (p<0.001), -0.166 (p=0.003) and -0.129 (p=0.011) for scenarios 1, 2 and 3, respectively. For the effect on intention the standardized regression coefficients are -0.030 (p=0.375), -0.036 (p=0.361) and -0.043 (p=0.207) for scenarios 1, 2 and 3, respectively.
Deterrence theory. Regarding the deterrence theory hypotheses, H6a and H6b are consistently supported. In all three scenarios developers who perceive a higher punishment severity either for their firm (H6a) or for themselves (H6b) hold a less positive attitude toward reusing internet code in a way (potentially) violating resulting obligations. Contrary to the effects of punishment severity, punishment certainty (H6c) does not exhibit a significant negative influence on developers’ attitude in any of the three scenarios. Two aspects might contribute to this lack of significance and help to partially explain it. First, the items employed to measure the punishment certainty construct aim at assessing how likely developers think it is that parties outside of their firm become aware of internet code reused in their firm’s software (see Table 4-7). A higher punishment certainty measured this way should only affect developers’ attitude if they believe that these outside parties will also take action upon finding violated obligations. Possibly a larger number of developers assume that copyright holders of internet code, especially if they are single individual OSS programmers, lack the motivation and financial resources required to claim the obligations of their code. For developers with this assumption it seems rational that a higher punishment certainty alone, as measured in the questionnaire, does not significantly affect attitude. Second, as a result of the medium sized correlations between punishment certainty and the two punishment severity constructs,261 the influence of punishment certainty appears to be overshadowed by the two other constructs.
In a model without the two punishment severity constructs, there is a moderate but consistently significant negative effect of punishment certainty on attitude as proposed by H6c.262
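The overshadowing argument can be illustrated with the textbook formula for standardized coefficients in a regression with two correlated predictors. The sketch below is a simplified illustration, not the model estimated in this study: it considers only two predictors, the function and variable names are hypothetical, and while the correlation of 0.28 between punishment certainty and punishment severity (firm) is the scenario 1 value reported in this chapter, the two correlations with attitude are assumed values chosen for demonstration.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized OLS coefficients of two predictors, computed from the
    pairwise correlations: r_y1 and r_y2 are the correlations of predictor 1
    and predictor 2 with the outcome, r_12 the correlation between them."""
    denom = 1.0 - r_12 ** 2
    if denom <= 0.0:
        raise ValueError("predictors must not be perfectly correlated")
    beta1 = (r_y1 - r_12 * r_y2) / denom
    beta2 = (r_y2 - r_12 * r_y1) / denom
    return beta1, beta2


# Correlation between the two predictors: scenario 1 value from the text.
r_severity_certainty = 0.28
# Correlations with attitude: hypothetical values, assumed for illustration.
r_severity_attitude = -0.35
r_certainty_attitude = -0.12

beta_severity, beta_certainty = standardized_betas(
    r_severity_attitude, r_certainty_attitude, r_severity_certainty
)

# With a single standardized predictor the coefficient equals the correlation,
# so the "reduced model" value for certainty is r_certainty_attitude itself.
print(f"certainty alone:         {r_certainty_attitude:.3f}")
print(f"certainty with severity: {beta_certainty:.3f}")
print(f"severity with certainty: {beta_severity:.3f}")
```

With these assumed inputs, the coefficient of punishment certainty shrinks from -0.12 when it is the only predictor to roughly -0.02 once punishment severity enters the model, which mirrors the pattern described above.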
4.7. Conclusion
This part of the dissertation has quantitatively analyzed the ad-hoc internet code reuse of individual software developers in commercial firms with a special focus on how they deal with the obligations potentially resulting from this behavior. Its nature of being available for free download but not necessarily being free of obligations makes reusable internet code an interesting example to investigate the value appropriation dimension of knowledge reuse. Similar to the important role individual developers play in leveraging the positive value creation effects of knowledge reuse (see Chapter 3), individual developers are also the key to avoiding problems such as value appropriation issues from knowledge reuse since it is their behavior which strongly influences whether their firms may face value appropriation risks or other issues as a consequence of knowledge reuse or not. On a more detailed level, as the first large-scale quantitative investigation focusing on individual developers and violated obligations, this part of the dissertation has also furthered scholarly work on internet code reuse in commercial software development.
261 The correlations between punishment certainty and punishment severity (firm) are 0.28, 0.27 and 0.20 for scenarios 1, 2 and 3, respectively. Between punishment certainty and punishment severity (developer) the correlations are 0.27, 0.33 and 0.14 for scenarios 1, 2 and 3, respectively.
262 In this reduced model the standardized regression coefficients of punishment certainty on attitude are -0.095 (p=0.041), -0.205 (p<0.001) and -0.139 (p=0.006) for scenarios 1, 2 and 3, respectively.
Summary and theoretical contributions. With its findings this part of the dissertation contributes to two streams of literature: literature on code reuse in commercial software development as a particular instance of knowledge reuse and literature on ethical behavior, particularly in the IS context.
First, regarding scholarly work on code reuse in commercial software development, this part of the dissertation represents the first quantitative large-scale investigation of ad-hoc internet code reuse in commercial software development on the level of individual developers. The results point out that after a period of constant rather low importance, reusing internet code in ad-hoc fashion has become increasingly important for individual commercial software developers in recent years. Today the majority of developers consider ad-hoc internet code reuse at least as “somewhat important” for their work and only 16% seem not to be reusing internet code at all. Especially OSS savvy developers leverage internet code for their individual work. Further factors leading to more internet code reuse are greater latitude in structuring the software being developed (as e.g. project managers or architects possess it) and certain programming languages such as Ruby or Python.
Somewhat surprisingly, neither the type of developers’ projects (e.g. internal-use software vs. software to be sold) nor developers’ familiarity with internet code reuse obligations significantly influences the extent of ad-hoc internet code reuse practiced. Despite the high importance of ad-hoc internet code reuse for their work, many developers seem not well prepared to deal with the obligations potentially resulting. Nearly a quarter of them have never received any form of training or information on internet code reuse and only 20% have been educated on the topic in their firms. Similarly, only 17% of the developers have been confronted with information on internet code reuse during their education. As a result of this, developers’ main source of information about internet code reuse is the internet. Surprisingly, the internet is also the only effective source of information. Trainings and information provided in commercial firms or in universities seem not to address the relevant topics and appear not to increase developers’ knowledge about internet code reuse obligations. Given this setting, it is not surprising that many developers seem to have some gaps in their knowledge regarding internet code reuse obligations. Unfortunately, they appear not to be well aware of this and overestimate their own proficiency. As expected, developers with OSS experience not only reuse more internet code in their work as commercial software developers in ad-hoc fashion, but are also more familiar with the potentially resulting obligations. In this situation where many developers seem to lack detailed knowledge about internet code reuse obligations it is further surprising that only about one third of the developers work in firms which provide guidance regarding internet code reuse through policies. Especially smaller and mid-sized firms as well as firms with a main business activity outside of software development appear not to employ this mechanism to prevent their developers from (potentially) violating obligations of reused internet code. An additional concern with regard to policies is the fact that about one quarter of the developers working for firms with such policies have not read them. This is especially true for developers who are not happy with their current job and developers working on internal or single-customer software development projects. As a consequence of these circumstances between 15 and 20 percent of the developers have either actually violated or potentially violated obligations of reused internet code in the past. These findings augment existing research on code reuse in commercial software development which, when dealing with internet code, is mostly conceptual or qualitative, focuses on systematic reuse rather than ad-hoc reuse by individual developers and addresses the topic of obligations only in the margin.
Complementing this existing body of research, the study presented in this part of the dissertation offers a detailed picture based on large-scale quantitative data of the ad-hoc internet code reuse of individual software developers in commercial settings, focuses on obligations potentially resulting from this behavior and investigates how these obligations are dealt with. Second, with its findings regarding the determinants of ad-hoc internet code reuse in a way (potentially) violating obligations and thereby potentially endangering value appropriation, this part of the dissertation in addition to contributing to code reuse in commercial software development literature also extends scholarly work on ethical behavior, particularly ethical behavior in the IS context.
By developing and partially confirming hypotheses regarding developers’ intention to reuse internet code in a way (potentially) violating resulting obligations this part of the dissertation identifies determinants of a form of unethical behavior in the IS context not investigated so far. In doing so, this study helps to respond to Winter et al.’s (2004, p. 298) call to “[…] understand and manage the many [IS] ethical issues as they raise.” The investigation of determinants of commercial developers’ intention to reuse internet code in a way (potentially) violating resulting obligations identifies TPB as a good explanatory structure and finds that developers’ intention is strongly influenced by their attitude toward the behavior and to a lesser degree by subjective norm regarding the behavior. Perceived behavioral control plays a role only in component reuse where some developers apparently do not trust in their own skills to reuse components from the internet. Beyond TPB also the influence of the ethical work climate in developers’ firms on their intention to reuse internet code in a way (potentially) violating resulting obligations was tested. Thus, this study is also a response to Flannery and May’s (2000, p. 656) call to “[…] examine the direct effect of organizational climate on individual ethical decision making.” Supplementing existing research which has investigated the effect of the instrumental dimension of ethical work climate (Flannery & May 2000; Buchan 2005), this study is – to the best knowledge of the author – the first to analyze the effect of the law and code dimension and the rules dimension of ethical work climate. A direct relationship between these two dimensions of ethical work climate and developers’ intention to reuse internet code in a way (potentially) violating resulting obligations could however not be identified. 
Yet, ethical work climate seems to substantially influence the subjective norm developers perceive and thus has an indirect effect on intention. The relatively stronger effect of attitude on intention makes it important to determine the antecedents of attitude. Expected utility theory and deterrence theory prove to be suitable theories for this purpose, suggesting several determinants, most of which could be confirmed either fully or at least partially. As suggested by expected utility theory, developers who perceive stronger consequences from missing deadlines in their firms hold a more positive attitude toward reusing internet code in a way (potentially) violating resulting obligations. Partially confirmed are the hypotheses that developers who consider internet code reuse as more helpful and those who deem the costs of compliance in the form of identifying potential obligations and convincing their firms to comply with the obligations as higher have a more positive attitude. These two effects can however only be observed in the case of reusing internet snippets and not checking for potentially resulting
obligations thoroughly. Lastly, in line with deterrence theory, developers who perceive more severe consequences from (potentially) violating obligations from internet code reuse either for themselves or for their firm hold a less positive attitude toward internet code reuse (potentially) violating resulting obligations. With these determinants of individual developers’ intention to reuse internet code in a way (potentially) violating resulting obligations this part of the dissertation sheds light on the role of individual developers in potential issues resulting from knowledge reuse such as endangering firm value appropriation. A better understanding of this role of individual developers should help firms to devise strategies to tap into the value creation benefits of knowledge reuse without potentially having to face value appropriation risks and other issues. Managerial implications. Beyond their scholarly implications, the findings of this study are also of relevance to managerial practice. Implications can be clustered into two groups. First, the findings are relevant for the organization of software development in firms. Second, the results of this study should be considered on the more strategic level of mergers and acquisitions as well as financial transactions in the software context. Regarding the organization of software development in firms, the outcomes of this part of the dissertation can help firms developing software to leverage internet code reuse by tapping into its benefits while avoiding its potential issues such as value appropriation risks or injunctions preventing them from selling their products. As a general implication in this domain, the results presented should serve as a wakeup call to the “[…] many companies […] unaware that their developers are [re]using open source [and other internet code] or the extent to which it is being [re]used” (Bennett & Ivers 2008, p. 
3).263 Firms need to acknowledge the ad-hoc reuse of internet code by their developers and also that some developers do violate obligations from reused internet code. This is highly important since in the wake of the recent Jacobsen v. Katzer ruling supportive of enforcing OSS licenses by the U.S. Court of Appeals for the Federal Circuit, copyright holders of internet code might become more likely to request compliance with the obligations of their code (Bennett & Ivers 2008; Knoll 2009). One approach for firms to react upon realizing this situation would be to follow Rosen’s (2004, p. 289) suggestion to “[…] make sure their employees don’t have access to
preexisting software, and […] train their employees not to copy other software.” And indeed, in doing so firms would avoid value appropriation risks and the other issues which may follow internet code reuse when resulting obligations are not accounted for properly. Yet, in doing so firms would also forego the attractive value creation advantages which internet code reuse offers. Thus, it seems to be the better choice to allow and even proactively manage internet code reuse while making sure that individual developers do not violate the potentially resulting obligations.
263 Similarly to Bennett and Ivers (2008), other authors (e.g. Palamida 2005; Waters 2006) also claim that many firms are not aware of the internet code reuse by their developers.
OSS savvy developers can play a central role in this strategy. Based on their access to local search, such developers appear to make more use of ad-hoc internet code reuse in their commercial software development activities. Moreover, they are also more knowledgeable regarding the potential obligations which may result from internet code reuse and should thus be less likely to violate these obligations. Consequently, firms should encourage and support developers who want to be involved in OSS projects or even actively seek them in the labor market. As another step to increase internet code reuse and thus value creation, firms should carefully consider the programming languages of their software development projects. Working with Ruby or Python as programming languages should be a lever to increase the internet code reuse in a project. Addressing the question of how to avoid the violation of obligations from reused internet code, multiple levers can be derived from this study. Very generally, this topic needs to be positioned more prominently on firms’ agendas. Firms need to play an active role in making developers aware of the issues which may result from reusing internet code without accommodating the resulting obligations.
They should aim to supplement the internet as developers’ main channel of information on the topic of internet code reuse by offering useful mandatory trainings and by providing practical information on the topic to their developers. Beyond that, firms should also lobby institutions of education such as universities to include this topic into the curricula of all degree programs training potential software developers in a way which touches on the issues relevant to applied software development. Furthermore, all firms not small enough to facilitate fast unbureaucratic communication between management and developers should establish easy to read and understand policies providing guidance to their employees on how to deal with internet code. This is equally important to firms developing software as their main business and those developing software to support other activities. Once having established such policies, firms need to make sure that developers are aware of them, read them and understand them. In this, firms should not differentiate between developers working on
projects for multiple external customers and developers working on projects deemed as less critical such as internal-use or custom-built software development projects. Since there is a chance that such projects are offered to multiple external customers at a later point in time, firms should strive to be prepared for this option from the beginning. Beyond these general implications and suggestions, the results of the research model evaluation also provide some very tangible suggestions to firms which want to combat the potential or actual violation of obligations when their developers reuse internet code in ad-hoc fashion. First, firms need to establish an understanding among their developers of the severity of the consequences for both firm and individual developer which can result from violating obligations of reused internet code. In communicating these consequences it is the perception of severe punishments for both firm and developer which is important. If individual developers perceive high levels of these two factors, the study indicates their intention to reuse internet code in a way (potentially) violating resulting obligations will decrease. Regarding the perception of punishment for the firm, companies should use credible real-life situations such as the Cisco/Linksys case (see Section 1.1) to convincingly illustrate to their developers that violated obligations of internet code are no peccadillo but can seriously harm the firm. Regarding punishment for the developer, firms should use the policies describing how to deal with internet code reuse to also clearly state significant penalties which developers have to face when violating the policies. Furthermore, simply having such penalties on the books will do little to create change if rules are not enforced.
Second, the study results suggest that developers in firms in which missing deadlines has less severe negative consequences tend to reuse internet code in a way (potentially) violating resulting obligations less. A general policy of not enforcing internal deadlines is most likely not realistic in commercial firms which themselves face external deadlines enforced by customers (Austin 2001). Yet, firms should reconsider how much pressure they need to apply to their developers in order to find a balance between efficient work and work conducted without problematic shortcuts. In addition to that, firms should strive to set realistic deadlines because if developers consider themselves on time with their work, considerations regarding the consequences of missing their deadline should not be relevant for their decision making at all. Lastly, in situations where time pressure cannot be reduced and where missing deadlines needs to be sanctioned severely to ensure meeting external commitments, firms should be aware of the higher probability of reused internet code with
violated obligations and meet this situation with extensive use of tools scanning for reused internet code in the particular software being developed. Third, firms should deploy measures to reduce developers’ costs of compliance, especially the time and effort necessary to identify the obligations which come with a certain piece of internet code. To achieve this, firms might want to appoint one or more internet code experts within each group of developers (Bennett & Ivers 2008; Olson 2008). These experts can then be addressed by other developers in unbureaucratic fashion and can help them to quickly determine the obligations of internet code and assess whether these obligations are compatible with firm policies. Additionally, firms might want to maintain an internal database, e.g. in the form of Wikis and FAQs, which can serve as a first point of information for developers who want to find out about obligations from certain internet code licenses for instance (Ruffin & Ebert 2004). Addressing decisions about whether complying with obligations from internet code is acceptable or not, firms should strive to deploy a fast and transparent process. Ingredients of such a process could be a workflow integrated into firms’ existing software development tools (Olson 2008), white-lists which permit certain internet code obligations without further management involvement and rejections communicated well and in explanatory fashion to the developer who wanted to reuse internet code not in line with the firm’s policies. Fourth and finally, firms should harness the study finding that developers “[…] look outside themselves for cues about what is right (appropriate) behavior and what is wrong (inappropriate) behavior” (Trevino 1986, p. 608)264 and try to make all employees somehow related to software development aware of the potential issues which may come with reusing internet code in a way (potentially) violating resulting obligations.
The more employees are aware of the issue the stronger individual developers should perceive peer norms not to engage in the behavior. A general work climate emphasizing ethical behavior can further contribute to this effect. Beyond the implications discussed above, all of which relate to the internal management of software development activities within commercial firms, the results presented also hold a further implication in a wider context. Even if firms, by applying the above suggestions, manage internet code reuse well internally, they are still not immune from the potential consequences of violated obligations of internet code such as value appropriation risks. This is because the inducement to reuse internet code and potentially
264 Trevino (1986) originally referred to employees in general and not just software developers.
or actually violate resulting obligations also applies to developers working for firms’ software suppliers, systems integrators and outsourcing partners. In the Cisco/Linksys case (see Section 1.1) the developer reusing the critical piece of OSS code and not taking into consideration the resulting obligations was actually not a Linksys employee, but an outsourced developer hired by Broadcom which itself was a supplier to Linksys (Egger & Hogg 2006; Olson 2008). Thus, even if the violation of an internet code obligation has happened outside of the focal firm, it may still be the focal firm which has to bear the resulting consequences. Responding to this situation, firms should urge their suppliers, partners and outsourced developers to follow policies and processes of the same level of sophistication as employed internally. Additionally, firms should develop contractual conditions which allocate damages from violated internet code obligations to the party which has actually caused them. All of these measures address levers identified in this study to help firms utilize internet code reuse in a way not resulting in issues from violated obligations. Thus, these measures should ultimately maintain or even enhance firm value creation while not reducing value appropriation. The second group of implications addresses a more strategic level. Given the degree of internet code reuse and also the extent of violated obligations in the course of this, both firms and financing entities such as VC funds also need to consider internet code obligations when engaging in mergers, acquisitions and other financial transactions (Davidson 2006; Barraclough 2008). Acquirers and financing parties should insist on clear documentation of the measures employed to avoid violated internet code obligations, conduct a rigorous technical due diligence on the code base of the target firm and include relevant considerations into their contractual terms. 
On the flipside, firms and especially start-up companies aiming to be acquired or in need of financing in the future should be particularly careful not to violate internet code obligations for the sake of their valuation.265 Future research. Both the results of this part of the dissertation and its limitations suggest multiple avenues for future research. First, the external validity of this study needs to be verified. The use of software developers active in newsgroups as survey population provided access to a group of commercial software developers with heterogeneous backgrounds, e.g. in terms of geography or the firms they work for. Since this study was
265 Olson (2008, p. 7) reports having “[…] seen companies suffer serious “haircuts” in their acquisition price (as much as 40%) when unreported open source software was discovered in their code base during due diligence.”
the first quantitative investigation of ad-hoc internet code reuse on the level of individual commercial software developers, this heterogeneity was important in order to derive a broad set of conclusions. However, there is reason to believe that the developers surveyed are not representative of the average commercial software developer, especially in terms of their OSS experience and their software development skills. Thus, in order to calibrate the findings of this study it should be repeated in a more homogeneous context. The ideal setting would be a number of firms from which all software developers participate in the study. Second, future work should try to capture the dependent variable of the research model (reuse of internet code in a way (potentially) violating resulting obligations) in objective fashion and not through self-reporting by individual developers. Given the ethical nature of the topic at stake and the identification of mild social desirability effects in the current study such an objective measure would add robustness to the research model. Objective data regarding the dependent variable could be gathered from commercial firms scanning their software for reused internet code with software tools such as Black Duck Software’s Protex.266 Through the use of such tools objective accounts of the frequencies of obligation violations can be created. Extending this data with information gathered from the developers involved in the respective software development projects as independent variables would create a new dataset which could be employed to further test the research model. In addition to capturing the dependent variable in objective fashion such an approach would also allow predicting actual behavior instead of intention to engage in the behavior and could thus provide further proof for the research model. 
Finally, extending the value appropriation perspective on knowledge reuse beyond the context of commercial software development merits closer attention. There should be other situations in which knowledge reuse may impede value appropriation and thus firm profitability. Rigby and Zook (2002), for example, report that the reuse of existing external knowledge by Disney’s employees when creating a new theme park reduced the amount of value Disney could capture by $240 million because the obligations attached to this particular piece of knowledge initially did not permit reuse. More generally, interesting points of departure to investigate situations in which knowledge reuse impedes value appropriation could be R&D joint ventures and collaborations in which knowledge developed in a joint effort by multiple parties comes with obligations which restrict exclusive reuse by only one party. In addition, knowledge created in commercial firms with the help of government funding might also come with limitations regarding its reuse in other branches of the firms. Given the breadth and general applicability of the theories embedded in the research model of this study (TPB, expected utility theory, deterrence theory), it should be possible to transfer the model from the context of software development to the new situations identified and test it there.
266 http://www.blackducksoftware.com/protex, last accessed 31.01.2010.
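The white-list idea from the managerial implications and the tool-based counting of potential obligation violations proposed as an objective measure can be sketched together in a few lines. The following is only an illustrative sketch under stated assumptions: the `Finding` record, the `triage` and `summarize` functions, and the contents of `WHITELIST` are hypothetical and do not correspond to the output format of Protex or any other scanning tool. A finding marked `needs_review` is not necessarily a violation; it only signals that the license is not pre-approved.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical firm policy: licenses approved for reuse without management sign-off.
# The strings are SPDX license identifiers; which ones belong on the list is a
# policy decision for each firm, not something the study prescribes.
WHITELIST = frozenset({"MIT", "BSD-3-Clause", "Apache-2.0"})


@dataclass(frozen=True)
class Finding:
    """One piece of reused internet code as reported by a (hypothetical) scan."""
    component: str
    license_id: Optional[str]  # None when the scan could not determine a license


def triage(finding: Finding) -> str:
    """Classify a finding as 'approved', 'needs_review', or 'unknown_license'."""
    if finding.license_id is None:
        return "unknown_license"
    if finding.license_id in WHITELIST:
        return "approved"
    return "needs_review"


def summarize(findings: Iterable[Finding]) -> Counter:
    """Count findings per triage outcome, e.g. as a per-project tally."""
    return Counter(triage(f) for f in findings)


if __name__ == "__main__":
    # Made-up example data, not results from the study.
    scan = [
        Finding("json-parser", "MIT"),
        Finding("net-utils", "GPL-2.0-only"),
        Finding("copied-snippet", None),
    ]
    for outcome, count in sorted(summarize(scan).items()):
        print(f"{outcome}: {count}")
```

Per-project tallies of this kind could serve as the objectively measured dependent variable, with developer-level survey data merged in as independent variables.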
5. Conclusion

The research objectives of this dissertation were motivated by the Cisco/Linksys case
presented in Section 1.1. The case describes how a developer created software for Linksys’ WRT54G router which later became part of Cisco’s product portfolio when Linksys was acquired. Two aspects of this software development activity are noteworthy. First, the developer did not create all of the software from scratch, but reused existing OSS code freely available on the internet for parts of it. Second, the reuse of this existing OSS code later required Cisco to make the source code of the whole router software available and allow others to modify and pass on this software without having to obtain Cisco’s approval for doing so. The consequence of these terms was that some technology savvy consumers did not buy expensive enterprise-class routers from Cisco anymore but instead purchased the cheap WRT54G router and improved it with free new software which massively extended the functionality of the router. With code reuse the Cisco/Linksys case describes a particular instance of knowledge reuse and showcases that there is a value creation side and a value appropriation side to knowledge reuse. Value creation refers to establishing the “size of the pie” (Gulati & Wang 2003, p. 209) of both monetary (i.e. profits) and non-monetary (i.e. consumer surplus) benefits which firms create with their products and services. Additional value can be created if either the use value of a product or service as perceived by customers is increased or the opportunity costs required to create the product or service are reduced.267 In the Cisco/Linksys case the developer most likely created additional value by reducing opportunity costs because she saved time and consequently development budget by reusing existing code and not implementing the respective functionality herself. Furthermore, in case she reused a popular piece of OSS code she probably also increased use value because she might not have been able to create software of such high quality herself in equal time. 
Once value has been created, value appropriation determines who is able to capture which “share of the pie” (Gulati & Wang 2003, p. 209) and describes how profits and consumer surplus are distributed among the actors involved in value creation. For each party the amount of value it is able to capture is determined by its bargaining power versus the other contestants. In the Cisco/Linksys case the terms under which Cisco had to make
267 Obviously, both levers can also be applied simultaneously.
M. Sojer, Reusing Open Source Code, DOI: 10.1007/978-3-8349-6135-8_5, © Gabler Verlag | Springer Fachmedien Wiesbaden GmbH 2011
available the source code of the whole router software as a consequence of the developer’s code reuse reduced its bargaining power versus its customers and allowed them to capture a larger share of value while Cisco itself had to give up this portion of value. Besides pointing to the value creation and the value appropriation side of knowledge reuse, the Cisco/Linksys case also stresses that in the case of knowledge reuse it is individual developers who with their actions affect value creation and value appropriation of their firms. Only if individual developers choose to reuse existing knowledge can their firms tap into the resulting positive value creation effects, and if individual developers reuse existing knowledge in a careless way, their firms may be faced with the resulting value appropriation issues. Given this importance of individual developers in successfully exploiting knowledge reuse for firms and the lack of large-scale quantitative scholarly work focusing on this topic, this dissertation aimed at investigating the perspectives of individual developers on knowledge reuse from a value creation angle (in Chapter 3) and from a value appropriation view (in Chapter 4). Both analyses were conducted in the context of code reuse as one particular instance of knowledge reuse and chose the reuse of OSS and other code available on the internet as the empirical setting. Besides the general relevance of the findings for knowledge reuse research, the results of this thesis also contribute to scholarly work on OSS development and research addressing the reuse of internet code in commercial software development.

Value creation perspective on knowledge reuse. To investigate the perspectives of individual developers on the value creation side of knowledge reuse, the code reuse behavior of developers in public OSS projects was analyzed.
Code reuse in OSS development is an interesting context for this endeavor, first because the particularities of OSS and especially its licenses explicitly allow code reuse and second because much is known about OSS development processes and OSS developers. This knowledge provided a solid base for this research to build upon. Beyond contributing to knowledge reuse research in general, the findings of this analysis also further scholarly work on the development of OSS which has so far rather focused on the provision of OSS code but largely neglected its reuse. The specific goal of the analysis was to understand if, how and why individual OSS developers leverage existing code or not and what determines the extent to which they reuse existing code.
Based on 12 qualitative interviews and a large-scale survey with 684 participants the study presented in Chapter 3 of this dissertation paints a rich picture of the code reuse behavior of individual OSS developers. First, code reuse seems to be important for OSS developers’ work. About 30% of the functionality they contribute to their projects is based on reused code. The primary motivations for developers to reuse existing code are the resulting efficiency and effectiveness benefits while developers see a potential loss of control over their work as the main issue of code reuse. Interestingly, neither license nor programming language conflicts are perceived as major impediments to code reuse by OSS developers. On a more fine-grained level, OSS developers reuse both components and snippets with snippets accounting for about ten percent of the lines of code developers submit to their projects. Whether developers favor component or snippet reuse is among other reasons influenced by their motivations to contribute to their project (e.g. whether challenge seeking or creative pleasure is more important for them). Interestingly, a large number of developers (about 50%) modify the components they reuse, partly for skill improvement reasons. Finally, OSS developers most frequently turn to general purpose search engines, OSS repositories and code example web pages when searching for existing code to reuse. Yet, they consider means of local search such as their personal networks or other OSS projects they have been involved in as more efficient. The investigation of determinants of OSS developers’ code reuse behavior finds that developers with better access to local search due to a larger personal network or more exposure to different projects reuse more, presumably because their costs of searching for, understanding, adapting and integrating existing knowledge are lower. 
Further, developers convinced of the benefits of code reuse (efficiency and effectiveness gains, enhanced software quality, and the chance to work on preferred tasks) practice more code reuse, as do developers who can use code reuse to support their goal of serving the OSS community. Moreover, developers see code reuse as a means to kick-start new projects as it helps them deliver a “plausible promise”. Lastly, the study finds partial support for the hypothesis that those developers who desire to solve technical problems for the satisfaction of it rather refrain from reuse and, thus, make their projects less efficient and effective than they could be. In addition to contributing to research on knowledge reuse in general and on OSS development, the findings presented also hold managerial implications through which firms can increase their value creation. First, the high level of code reuse within the OSS community should motivate firms to explore reusing OSS code in their own software
development activities. In order to do so they might want to encourage and support their employees to build personal OSS networks and become involved in various OSS projects to enhance their access to local search for OSS code. Second and also applicable beyond the scope of software development, firms should consider modifying the incentive structures of their developers to encourage them to reuse existing knowledge. Levers in this course might be allowing developers to select tasks themselves according to preference, compensating developers according to the results delivered and not based on time spent at work and requiring the delivery of “credible promises” in the early phase of new projects. Third, firms should train their developers to reuse existing knowledge and create an engineering culture which endorses knowledge reuse. Finally, in order to accommodate developers’ desire to tackle difficult technical challenges firms should consider job enrichment (Herzberg 1968) as a means to integrate such challenges into developers’ work which are in the best interest of the firm. Value appropriation perspective on knowledge reuse. In order to study the perspectives of individual developers on the value appropriation side of knowledge reuse, the ad-hoc reuse of internet code by software developers in commercial firms was analyzed. Ad-hoc reusing of internet code in commercial software development is an interesting context in this endeavor because internet code is typically free to access and reuse but may still come with obligations which need to be complied with. These obligations may affect value appropriation in such a way that they demand that other code originally proprietary to a firm has to be made available for free modification and redistribution. 
Beyond contributing to knowledge reuse research in general, the findings of this analysis also further scholarly work on code reuse in commercial software development which, when dealing with internet code, has so far rarely focused on individual developers and how they deal with the obligations from internet code reuse. The specific goal of the analysis was to understand how important ad-hoc reusing existing internet code is for individual commercial software developers, how well aware they are of the potentially resulting obligations and what determines whether commercial software developers run the risk of reusing internet code in a way potentially violating resulting obligations. Based on 20 qualitative interviews and a large-scale survey with 1,133 participants the study presented in Chapter 4 of this dissertation paints a rich picture of the ad-hoc internet code reuse behavior of individual commercial software developers with a special focus on how they deal with obligations potentially resulting from this. First, reusing internet code
in ad-hoc fashion seems to have grown in recent years and has become an inherent part of commercial software developers’ work today. Yet, commercial software developers’ knowledge regarding the obligations which may come with internet code reuse appears not to have kept pace with the growing importance of ad-hoc internet code reuse. Many commercial software developers are not fully aware of some of the potentially resulting obligations. Additionally, providing developers with information on the topic seems not to be high on the agendas of firms developing software and educational institutions. As a result, developers mainly rely on information they find on the internet regarding internet code reuse obligations. Further, only a minority of the firms developing software offer their developers guardrails in the form of policies which provide specific instructions on how to deal with internet code. As a consequence of this overall situation, between 15 and 20 percent of the developers surveyed have potentially violated obligations of internet code in their past work for their firms. The investigation of determinants leading commercial software developers to reuse internet code in a way potentially or actually violating resulting obligations finds that developers with a more positive attitude toward the behavior and those who feel that their social environment is more positive toward the behavior appear more likely to intend to engage in the behavior. Moreover, in the specific setting of reusing components from the internet, developers who are less confident of their capabilities to do so seem less inclined to reuse internet components in a way potentially violating resulting obligations.
On a more detailed level, developers who perceive more severe consequences from missing deadlines in their firms and, in part, those who find it more difficult to investigate potential obligations of internet code find it more acceptable to reuse it in a way potentially or actually violating resulting obligations. Further, those developers who perceive stronger negative consequences for their firms and themselves from violating obligations of internet code hold a less positive attitude toward reusing internet code in a way potentially or actually violating resulting obligations. Finally, developers working in firms with a more ethical work climate of complying with laws and codes perceive their social environment as more negative toward reusing internet code in a way potentially or actually violating resulting obligations. In addition to contributing to research on knowledge reuse in general and internet code reuse in commercial software development, the findings presented also hold managerial implications through which firms can avoid value appropriation issues from the ad-hoc reuse of internet code by their developers. First, the high level of ad-hoc internet code
reuse by individual commercial software developers combined with developers’ partial lack of knowledge regarding the potentially resulting obligations from this behavior should serve as a wake-up call to all those firms developing software that have been ignoring this topic so far.268 Reacting to this call, firms should strive to proactively manage the ad-hoc internet code reuse of their developers. Means in this effort should be mandatory firm-internal training and information for developers but also engaging in dialogue with educational institutions to help them integrate the practical issues of internet code reuse into the curricula of potential future software developers. Furthermore, and on the internal side again, all firms developing software with more than a handful of developers should also introduce mandatory easy-to-read policies providing guidance to their developers on how to deal with internet code and the potentially resulting obligations. Second and more specifically, firms should manipulate the levers identified to influence commercial software developers’ intention to reuse internet code in a way potentially or actually violating resulting obligations. In order to influence developers’ attitude toward this behavior, firms should strive to establish an understanding among their developers of the severity of the consequences for both firm and individual developers which can result from violating obligations of internet code. In the course of this, firms might need to define specific internal sanctions for developers who endanger their firm by reusing internet code in a careless way. As a further lever to influence developers’ attitude, firms should try to set realistic deadlines for their developers and find a balance between pressuring developers to meet deadlines at all costs and the potential issues resulting from high pressure.
In addition, the deployment of measures to reduce developers’ costs of investigating and complying with obligations from internet code reuse should also help to tip developers’ attitude in the right direction. Such measures could be the appointment of an internet code expert within each developer group or the provision of an internal database in the form of Wikis and FAQs which can support developers trying to find out how to deal with a particular piece of internet code. Beyond the factors influencing developers’ attitude, firms should also attempt to affect developers’ social environment (especially their colleagues) in such a way that developers perceive negative peer norms toward reusing internet code in a way potentially or actually violating resulting obligations. Third, in addition to the implications for firms’ internal software development activities, companies should also be aware of internet code reuse without accounting for
268 Qualitative evidence from the interviews conducted in the course of the study as well as existing literature (e.g. Palamida 2005; Waters 2006; Bennett & Ivers 2008) suggests that the number of firms which have not yet acknowledged this fact is rather high.
the resulting obligations outside of their own boundaries and react accordingly. This implies urging software suppliers, partners and outsourced developers to follow processes and policies of the same level of sophistication as employed internally. Moreover, firms should also make use of legal means to allocate damages from violated internet code obligations they have to bear to those outside parties actually responsible for the issue. Finally, on a more strategic level, both buyers and sellers in financial transactions such as mergers or acquisitions which involve software need to be aware of the risks potentially hidden in code bases and act accordingly. Future research. This dissertation set out to shed light on the perspectives of individual developers on the value creation and value appropriation side of knowledge reuse. By analyzing this topic in the context of code reuse as one particular instance of knowledge reuse and in the empirical setting of reusing OSS and other code available on the internet this thesis has also contributed to scholarly work on the development of OSS and the stream of research dealing with the reuse of internet code in commercial software development. Both the answers to the research objectives pursued in this dissertation and the limitations of the two studies presented suggested new research avenues which merit exploration. While the detailed suggestions for further research are discussed in the conclusions of Chapters 3 and 4, two general topics which apply to both studies of this dissertation are raised in the following. First, in both studies of this dissertation all variables required to test the respective research models were captured in questionnaires in which developers self-reported their behaviors and beliefs. 
While the data collected do not appear to be strongly influenced by common method bias, the robustness of the results presented would benefit from repeating the analyses with some variables captured in objective fashion. For both studies it seems feasible to employ objective measures for the dependent variables. In the case of code reuse in OSS projects the share of reused code could be assessed by analyzing the code base of the developers’ work. In the second study the number of obligations violated in commercial software development projects could be captured with the output of tools firms employ to scan their software projects for reused internet code. Second, both studies investigated code reuse as one particular instance of knowledge reuse and chose the reuse of OSS and other code available on the internet as the empirical setting. Building on the resulting findings, extending the perspectives which individual
developers hold on the value creation and value appropriation side of knowledge reuse beyond this context should merit closer attention since also in other situations it should be individual developers who with their knowledge reuse behavior heavily influence how much value firms create and which share of this value firms can appropriate. Given the breadth of the theories applied to build the two research models employed in this dissertation, it should be possible to transfer these models to new contexts and leverage them there.
Appendix
A.1. Code reuse in open source software development
A.1.1. Survey questionnaire among OSS developers
A.1.2. Multivariate analyses of determinants of code reuse
A.2. Code reuse in commercial software development
A.2.1. ACM code of ethics and professional conduct: IP case
A.2.2. Survey questionnaire among commercial software developers
A.2.3. Internet code reuse quiz
A.2.4. Discriminant validity of model constructs
M. Sojer, Reusing Open Source Code, DOI: 10.1007/978-3-8349-6135-8, © Gabler Verlag | Springer Fachmedien Wiesbaden GmbH 2011
A.1. Code reuse in open source software development
A.1.1. Survey questionnaire among OSS developers
Figure A-1: OSS developer survey questionnaire
A.1.2. Multivariate analyses of determinants of code reuse
Table A-1: Standardized coefficients of OSS developer code reuse models

                                1) ImpRePast        5) ReuseSharePast   9) ImpReFut
                                bStdXY    Std. Err. bStdXY    Std. Err. bStdXY    Std. Err.
Attitude toward code reuse (research model group A)
BenefitEffectiveness (H1a)      0.10***   0.03      0.09***   0.03      0.08**    0.03
BenefitEfficiency (H1b)         0.31***   0.04      0.21***   0.04      0.28***   0.04
BenefitQuality (H1c)            0.14***   0.04      0.07*     0.04      0.13***   0.03
BenefitTaskSelection (H1d)      0.07**    0.04      0.12***   0.04      0.07**    0.03
IssueControlLoss (H1e)          -0.02     0.03      -0.01     0.04      0.00      0.03
Access to local search (research model group D)
DevOSSNetsize (log) (H2a)       0.07*     0.04      0.09**    0.04      0.12***   0.04
DevOtherProjects (H2b)          0.05*     0.03      0.07**    0.03      0.09***   0.03
Project maturity (research model group E)
ProjPhase (H3)                  -0.08**   0.04      -0.13***  0.04      -0.13***  0.04
Compatibility with developers’ goals (research model group F)
MotChallenge (H4a)              -0.07*    0.04      -0.10**   0.04      -0.04     0.04
MotCreaPleasure (H4b)           0.04      0.04      0.01      0.04      0.02      0.04
MotLearning (H4c)               0.01      0.04      -0.05     0.04      0.00      0.04
MotCommunity (H4d)              0.08**    0.04      0.07*     0.04      0.08**    0.04
MotOSSReputation (H4e)          0.00      0.04      0.01      0.04      0.06      0.04
MotSignaling (H4f)              -0.04     0.04      0.02      0.04      0.01      0.04
Subjective norm (research model group B)
DevNorm                         0.07**    0.04      0.10**    0.04      0.13***   0.04
Perceived behavioral control (research model group C)
DevSkill                        -0.03     0.04      0.00      0.04      -0.01     0.04
ProjPolSupport                  0.09**    0.04      0.02      0.04      0.08**    0.04
ProjPolDiscourage               -0.08**   0.04      -0.03     0.03      -0.11***  0.04
ConditionLack                   -0.21***  0.04      -0.16***  0.04      -0.17***  0.04
ConditionLicense                0.05      0.04      0.01      0.04      0.01      0.04
ConditionLanguage               0.02      0.04      -0.01     0.04      0.04      0.04
ConditionArchitecture           0.02      0.04      0.03      0.04      0.02      0.04
Additional control variables (research model group G)
ProjSize                        -0.01     0.02      -0.04**   0.02      -0.04**   0.02
ProjComplexity                  0.05      0.04      0.08*     0.05      0.01      0.05
ProjStandalone                  0.05      0.04      0.02      0.04      0.07*     0.04
DevOSSExperience                0.03      0.04      0.00      0.04      0.01      0.04
DevProjTime                     0.08**    0.04      -0.01     0.04      0.05      0.04
DevProjShare                    0.05      0.04      0.05      0.05      0.02      0.04
DevProf                         0.01      0.04      -0.01     0.04      0.03      0.04
DevEduReuse                     -0.04     0.03      -0.02     0.04      -0.07**   0.03
DevProfEduReuse                 0.08**    0.04      0.07*     0.04      0.06*     0.03
Residence-N. America            -0.03     0.04      -0.05     0.04      0.03      0.04
Residence-S. America            0.02      0.03      -0.03     0.03      0.00      0.03
Residence-Asia & RoW            -0.02     0.04      0.02      0.04      -0.01     0.04
Observations                    632                 632                 632

* significant at 10%, ** significant at 5%, *** significant at 1%
Note: Reported standard errors are robust standard errors.
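As a reading aid for the column label bStdXY: the label matches the notation commonly used for fully standardized coefficients (for example in the output of the `listcoef` command for Stata), although the appendix does not state which software produced the table, so this reading is an assumption. Under that assumption, each raw coefficient b_k is rescaled by the standard deviations of its regressor and of the outcome; the symbols below are introduced here for illustration and are not the author's notation.

```latex
b_k^{\mathrm{StdXY}} \;=\; b_k \cdot \frac{s_{x_k}}{s_{y}}
```

Here s_{x_k} is the sample standard deviation of regressor x_k and s_y that of the outcome; for censored models, the standard deviation of the latent outcome is typically used. A value of 0.31 would then mean that a one-standard-deviation increase in the regressor is associated with a 0.31-standard-deviation increase in the outcome, holding the other variables constant.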
Table A-2: Marginal effects of OSS developer code reuse models (Marginal effects conditional on being uncensored)
Variable | 1) ImpRePast: dy/dx | Std. Err. | 5) ReuseSharePast: dy/dx | Std. Err. | 9) ImpReFut: dy/dx | Std. Err.
Attitude toward code reuse (research model group A)
BenefitEffectiveness (H1a) | 0.14*** | 0.04 | 1.75*** | 0.64 | 0.12** | 0.05
BenefitEfficiency (H1b) | 0.41*** | 0.05 | 3.95*** | 0.75 | 0.39*** | 0.05
BenefitQuality (H1c) | 0.19*** | 0.05 | 1.22* | 0.68 | 0.18*** | 0.05
BenefitTaskSelection (H1d) | 0.10** | 0.05 | 2.31*** | 0.68 | 0.10** | 0.05
IssueControlLoss (H1e) | -0.02 | 0.04 | -0.26 | 0.68 | 0.01 | 0.05
Access to local search (research model group D)
DevOSSNetsize (log) (H2a) | 0.09* | 0.05 | 1.54** | 0.74 | 0.16*** | 0.05
DevOtherProjects (H2b) | 0.01* | 0.01 | 0.24** | 0.11 | 0.02*** | 0.01
Project maturity (research model group E)
ProjPhase (H3) | -0.09** | 0.05 | -2.09*** | 0.67 | -0.16*** | 0.05
Compatibility with developers’ goals (research model group F)
MotChallenge (H4a) | -0.09* | 0.05 | -1.79** | 0.76 | -0.05 | 0.06
MotCreaPleasure (H4b) | 0.05 | 0.05 | 0.20 | 0.73 | 0.03 | 0.05
MotLearning (H4c) | 0.01 | 0.05 | -0.76 | 0.70 | 0.00 | 0.05
MotCommunity (H4d) | 0.11** | 0.05 | 1.35* | 0.75 | 0.12** | 0.05
MotOSSReputation (H4e) | 0.00 | 0.04 | 0.10 | 0.49 | 0.05 | 0.03
MotSignaling (H4f) | -0.04 | 0.04 | 0.22 | 0.54 | 0.01 | 0.04
Subjective norm (research model group B)
DevNorm | 0.08** | 0.04 | 1.49** | 0.60 | 0.14*** | 0.04
Perceived behavioral control (research model group C)
DevSkill | -0.05 | 0.06 | -0.09 | 0.83 | -0.02 | 0.06
ProjPolSupport | 0.25** | 0.11 | 0.94 | 1.80 | 0.24** | 0.12
ProjPolDiscourage | -0.70** | 0.32 | -3.21 | 3.15 | -0.98*** | 0.33
ConditionLack | -0.15*** | 0.03 | -1.61*** | 0.42 | -0.13*** | 0.03
ConditionLicense | 0.04 | 0.03 | 0.14 | 0.40 | 0.01 | 0.03
ConditionLanguage | 0.02 | 0.04 | -0.10 | 0.51 | 0.04 | 0.04
ConditionArchitecture | 0.02 | 0.03 | 0.40 | 0.48 | 0.02 | 0.03
Additional control variables (research model group G)
ProjSize | 0.00 | 0.00 | -0.02** | 0.01 | 0.00** | 0.00
ProjComplexity | 0.06 | 0.06 | 1.54* | 0.88 | 0.01 | 0.06
ProjStandalone | 0.15 | 0.12 | 0.83 | 1.67 | 0.23* | 0.12
DevOSSExperience | 0.01 | 0.01 | 0.02 | 0.18 | 0.00 | 0.01
DevProjTime | 0.01** | 0.00 | -0.01 | 0.06 | 0.01 | 0.00
DevProjShare | 0.00 | 0.00 | 0.02 | 0.02 | 0.00 | 0.00
DevProf | 0.02 | 0.11 | -0.37 | 1.61 | 0.10 | 0.11
DevEduReuse | -0.11 | 0.10 | -0.94 | 1.45 | -0.21** | 0.10
DevProfEduReuse | 0.33** | 0.13 | 3.75* | 2.21 | 0.26** | 0.13
Residence-N. America | -0.09 | 0.11 | -2.22 | 1.58 | 0.09 | 0.11
Residence-S. America | 0.15 | 0.20 | -2.25 | 2.57 | 0.00 | 0.19
Residence-Asia & RoW | -0.07 | 0.14 | 0.96 | 2.09 | -0.05 | 0.14
Observations | 632 | | 632 | | 632 |
* significant at 10%, ** significant at 5%, *** significant at 1%
Notes: Reported standard errors are robust standard errors; due to the censored nature of Tobit models marginal effects cannot be calculated for all observations simultaneously (Greene 1999; Cong 2001). Because of that the data reported here shows the marginal effects of the explanatory variables on the dependent variables under the condition that the observation has not been censored.
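The notes to Table A-2 refer to marginal effects conditional on the observation being uncensored. As a minimal sketch of what such a quantity looks like, assume a Tobit model that is left-censored at zero (the censoring limits of the models in Table A-2 are not stated here, so this is an assumption). Under that assumption the textbook expression is dE[y | y > 0, x]/dx_j = beta_j * (1 - lambda * (alpha + lambda)), with alpha = x'beta / sigma and lambda = phi(alpha) / Phi(alpha). The function names and the input values below are hypothetical and are not taken from the estimated models.

```python
import math


def std_normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_marginal_effect(beta_j: float, xb: float, sigma: float) -> float:
    """Marginal effect of x_j on E[y | y > 0, x].

    Assumes a Tobit model left-censored at zero (an assumption of this sketch).
    beta_j: coefficient of the regressor of interest
    xb:     linear index x'beta at the evaluation point
    sigma:  standard deviation of the latent error term
    """
    alpha = xb / sigma  # standardized linear index
    lam = std_normal_pdf(alpha) / std_normal_cdf(alpha)  # inverse Mills ratio
    return beta_j * (1.0 - lam * (alpha + lam))


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    print(conditional_marginal_effect(beta_j=0.5, xb=1.2, sigma=1.0))
```

The adjustment factor in parentheses lies strictly between 0 and 1, so under this assumption the conditional marginal effect is always smaller in magnitude than the raw Tobit coefficient and depends on the point at which it is evaluated.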
Appendix
A.2. Code reuse in commercial software development
A.2.1. ACM code of ethics and professional conduct: IP case
Case 1: Intellectual property of the ACM’s 1992 code of ethics and professional conduct as presented by Anderson et al. (1993, p. 99): “Jean, a statistical database programmer, is trying to write a large statistical program needed by her company. Programmers in this company are encouraged to write about their work and to publish their algorithms in professional journals. After months of tedious programming, Jean has found herself stuck on several parts of the program. Her manager, not recognizing the complexity of the problem, wants the job completed within the next few days. Not knowing how to solve the problems, Jean remembers that a coworker had given her source listings from his current work and from an early version of a commercial software package developed at another company. On studying these programs, she sees two areas of code which could be directly incorporated into her own program. She uses segments of code from both her coworker and the commercial software, but does not tell anyone or mention it in the documentation. She completes the project and turns it in a day ahead of time.”
M. Sojer, Reusing Open Source Code, DOI: 10.1007/978-3-8349-6135-8, © Gabler Verlag | Springer Fachmedien Wiesbaden GmbH 2011
A.2.2. Survey questionnaire among commercial software developers
Figure A-2: Commercial software developer survey questionnaire – scenario 1 (Full questionnaire with scenario 1: Not checking thoroughly for snippet reuse obligations)269
269 See Figure A-3 and Figure A-4 for the other scenarios employed.
Figure A-3: Commercial software developer survey questionnaire – scenario 2 (Questionnaire excerpt with scenario 2: Not checking thoroughly for component reuse obligations)270
Figure A-4: Commercial software developer survey questionnaire – scenario 3 (Questionnaire excerpt with scenario 3: Knowingly ignoring obligations from snippet reuse)271
270 The other parts of the questionnaire do not differ from those presented in Figure A-2.
271 The other parts of the questionnaire do not differ from those presented in Figure A-2.
A.2.3. Internet code reuse quiz
The quiz was designed after the qualitative pre-study (see Chapter 4.3.1) and reflects typical situations which commercial software developers encounter when reusing internet code.
Table A-3: Quiz on commercial software developers’ internet code knowledge
Quiz questions and answers | Percentage
Which open source license(s) could in certain situations require a developer who integrates code under this/these license(s) into proprietary code to also make available the proprietary code as open source?
  GNU General Public License (GPL)* | 55%
  Berkeley Software Distribution (BSD) License | 1%
  Mozilla Public License (MPL) | 1%
  Both GPL and MPL | 27%
  None of the licenses listed above | 1%
  Do not know | 15%
Which open source license(s) demand(s) that every software product that has integrated its/their code includes its/their license text(s)?
  GNU General Public License (GPL) | 17%
  Berkeley Software Distribution (BSD) License | 3%
  Mozilla Public License (MPL) | 0%
  GPL, BSD and MPL | 59%
  None of the licenses listed above | 1%
  Do not know | 20%
Which open source license demands that its code is only used in private or academic software development?
  GNU General Public License (GPL) | 5%
  Berkeley Software Distribution (BSD) License | 10%
  None of the licenses listed above | 67%
  Do not know | 18%
Somebody posts a code snippet in the newsgroups or on a tutorial website. Under which conditions is it completely safe to integrate this snippet?
  If the poster does not mention any obligations that come with the snippet | 9%
  If the poster explicitly declares that he does not demand any obligations from using the snippet | 39%
  If the snippet is not part of any program | 1%
  If any one of the conditions above mentioned is true, integration would be safe | 15%
  None of the conditions mentioned above would be enough | 20%
  Do not know | 16%
If open source code available on the internet violates a patent, can the patent holder only sue the original developer of the open source code or also other parties that have integrated this code into their products?
  Only the original developer | 6%
  Original developer and other parties that have integrated the code | 52%
  Nobody can be sued because most open source licenses deter patent infringement law suits | 3%
  Do not know | 39%
*While this answer is not fully correct, developers did still receive 0.5 credits for it in the calculation of the quiz scores.
Notes: Correct answers are bolded; N=869.
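The partial-credit rule in the footnote to Table A-3 can be illustrated with a short sketch. It assumes that "Both GPL and MPL" is the fully correct answer to the first question (the footnote implies this, but the bold marking is not visible here), that it earns 1.0 credit, that "GPL" alone earns 0.5, and that every other answer earns nothing. The names `Q1_SHARES`, `Q1_CREDITS`, `credit`, and `mean_credit` are hypothetical; only the response shares are taken from the table.

```python
# Shares of respondents per answer to the first quiz question (Table A-3).
Q1_SHARES = {
    "GNU General Public License (GPL)": 0.55,
    "Berkeley Software Distribution (BSD) License": 0.01,
    "Mozilla Public License (MPL)": 0.01,
    "Both GPL and MPL": 0.27,
    "None of the licenses listed above": 0.01,
    "Do not know": 0.15,
}

# Assumed credit scheme: full credit for the assumed correct answer,
# half credit for the answer flagged with * in the table, zero otherwise.
Q1_CREDITS = {
    "Both GPL and MPL": 1.0,
    "GNU General Public License (GPL)": 0.5,
}


def credit(answer: str, credits: dict) -> float:
    """Credit awarded for one answer; unlisted answers earn nothing."""
    return credits.get(answer, 0.0)


def mean_credit(shares: dict, credits: dict) -> float:
    """Average credit per respondent implied by the response shares."""
    return sum(share * credit(answer, credits) for answer, share in shares.items())


if __name__ == "__main__":
    print(round(mean_credit(Q1_SHARES, Q1_CREDITS), 3))
```

Under these assumptions the average credit on the first question is 0.27 x 1.0 + 0.55 x 0.5 = 0.545 per respondent.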
A.2.4. Discriminant validity of model constructs
Table A-4: Loadings of internet code reuse model items
Columns within each scenario block, from left to right: Intention; Attitude; Subjective norm; Perceived behavioral control; Ethical work climate: Law & code; Ethical work climate: Rules; Usefulness of internet code; Severity of time pressure; Cost of compliance; Punishment severity (firm); Punishment severity (developer); Punishment certainty.
Item | Scenario 1 (N=316) | Scenario 2 (N=256) | Scenario 3 (N=297)
INT1 | 0.92 0.56 0.45 0.18 -0.18 -0.14 0.15 0.11 0.32 -0.32 -0.34 -0.08 | 0.94 0.63 0.51 0.22 -0.22 -0.13 0.07 0.10 0.23 -0.29 -0.25 -0.19 | 0.95 0.74 0.49 0.11 -0.22 -0.10 0.14 0.15 0.20 -0.33 -0.34 -0.09
INT2 | 0.90 0.58 0.48 0.20 -0.20 -0.15 0.16 0.17 0.28 -0.27 -0.30 -0.03 | 0.90 0.58 0.54 0.17 -0.28 -0.22 0.09 0.06 0.10 -0.33 -0.27 -0.13 | 0.88 0.68 0.47 0.02 -0.28 -0.13 0.20 0.14 0.13 -0.30 -0.38 -0.12
INT3 | 0.91 0.58 0.52 0.15 -0.20 -0.12 0.22 0.18 0.36 -0.28 -0.36 -0.07 | 0.91 0.62 0.47 0.16 -0.21 -0.11 0.06 0.15 0.26 -0.27 -0.25 -0.10 | 0.93 0.70 0.46 0.09 -0.25 -0.09 0.14 0.18 0.17 -0.31 -0.31 -0.08
ATT1 | 0.49 0.81 0.49 0.22 -0.22 -0.15 0.05 0.13 0.24 -0.38 -0.37 -0.06 | 0.52 0.84 0.49 0.09 -0.37 -0.20 0.04 0.10 0.10 -0.43 -0.33 -0.18 | 0.53 0.77 0.38 0.03 -0.25 -0.03 0.04 0.17 0.15 -0.24 -0.29 -0.06
ATT2 | 0.61 0.90 0.66 0.14 -0.18 -0.12 0.17 0.19 0.24 -0.35 -0.36 -0.08 | 0.61 0.87 0.52 0.10 -0.28 -0.16 0.04 0.13 0.13 -0.38 -0.26 -0.18 | 0.70 0.89 0.56 0.07 -0.23 -0.07 0.16 0.16 0.13 -0.39 -0.40 -0.14
ATT3 | 0.51 0.85 0.58 0.24 -0.14 -0.12 0.05 0.09 0.19 -0.38 -0.44 -0.11 | 0.56 0.83 0.42 0.02 -0.14 -0.08 0.10 0.21 0.14 -0.29 -0.28 -0.20 | 0.71 0.88 0.47 0.09 -0.23 -0.09 0.05 0.07 0.10 -0.34 -0.37 -0.15
NORM1 | 0.48 0.58 0.86 0.17 -0.21 -0.11 0.08 0.15 0.20 -0.27 -0.41 -0.12 | 0.42 0.40 0.82 0.19 -0.12 -0.08 0.11 0.04 0.10 -0.28 -0.29 -0.25 | 0.43 0.44 0.84 0.13 -0.10 -0.04 0.01 0.00 0.17 -0.26 -0.40 -0.17
NORM2 | 0.47 0.61 0.86 0.19 -0.20 -0.13 0.06 0.15 0.28 -0.30 -0.43 -0.16 | 0.46 0.44 0.83 0.22 -0.13 -0.10 0.08 0.06 0.12 -0.26 -0.36 -0.24 | 0.46 0.49 0.85 0.10 -0.11 -0.03 0.00 0.02 0.17 -0.29 -0.40 -0.14
NORM3 | 0.43 0.58 0.86 0.28 -0.38 -0.26 0.06 0.09 0.22 -0.41 -0.49 -0.13 | 0.46 0.49 0.86 0.33 -0.26 -0.15 0.08 0.06 0.20 -0.38 -0.38 -0.25 | 0.45 0.51 0.87 0.19 -0.26 -0.17 0.12 0.02 0.20 -0.32 -0.44 -0.15
NORM4 | 0.46 0.58 0.88 0.26 -0.34 -0.24 0.04 0.10 0.23 -0.36 -0.50 -0.13 | 0.51 0.54 0.87 0.26 -0.30 -0.22 0.11 0.14 0.11 -0.43 -0.34 -0.22 | 0.43 0.47 0.87 0.16 -0.26 -0.19 0.10 0.05 0.20 -0.35 -0.45 -0.16
CONT1 | 0.18 0.24 0.22 0.84 -0.11 -0.12 -0.01 -0.06 0.06 -0.19 -0.16 -0.12 | 0.18 0.07 0.26 0.90 -0.06 -0.04 0.07 -0.02 0.20 -0.15 -0.20 -0.22 | 0.08 0.09 0.14 0.87 -0.09 -0.10 0.02 -0.01 0.02 -0.09 -0.09 0.03
CONT2 | 0.11 0.13 0.08 0.60 -0.10 -0.13 0.02 -0.01 -0.08 -0.03 -0.09 -0.03 | 0.15 0.05 0.20 0.74 -0.06 -0.03 0.02 -0.03 0.09 -0.21 -0.19 -0.11 | 0.04 0.04 0.07 0.69 -0.02 -0.03 -0.01 0.01 0.10 -0.08 -0.05 0.08
CONT3 | 0.16 0.17 0.27 0.81 -0.14 -0.20 0.01 0.04 0.05 -0.20 -0.18 -0.24 | 0.18 0.10 0.23 0.80 -0.03 -0.05 0.01 0.01 0.11 -0.16 -0.21 -0.25 | 0.05 0.07 0.15 0.76 -0.06 -0.06 0.09 -0.09 0.04 -0.21 -0.19 -0.12
CONT4 | 0.09 0.10 0.17 0.60 -0.12 -0.11 0.04 -0.04 0.02 -0.24 -0.20 -0.21 | 0.10 0.03 0.27 0.70 -0.11 -0.12 -0.05 -0.02 0.09 -0.14 -0.22 -0.25 | 0.04 0.02 0.16 0.68 -0.03 -0.09 0.08 -0.12 0.00 -0.20 -0.15 -0.21
CLIM1 | -0.19 -0.17 -0.30 -0.13 0.89 0.61 -0.07 0.10 -0.20 0.24 0.27 -0.03 | -0.22 -0.27 -0.23 -0.06 0.88 0.52 0.11 0.06 -0.02 0.19 0.22 0.09 | -0.27 -0.26 -0.24 -0.03 0.89 0.54 0.00 0.05 0.00 0.19 0.19 0.03
CLIM2 | -0.21 -0.18 -0.32 -0.15 0.92 0.64 -0.05 0.04 -0.16 0.23 0.30 0.00 | -0.25 -0.30 -0.22 -0.07 0.91 0.61 0.12 0.02 0.02 0.22 0.21 0.09 | -0.24 -0.23 -0.15 -0.06 0.89 0.52 0.05 0.11 0.05 0.15 0.16 0.03
CLIM3 | -0.21 -0.24 -0.31 -0.15 0.92 0.68 -0.04 0.05 -0.09 0.25 0.28 0.03 | -0.21 -0.25 -0.22 -0.07 0.89 0.60 0.10 0.00 -0.01 0.17 0.18 0.09 | -0.23 -0.24 -0.20 -0.11 0.88 0.67 0.03 0.13 0.03 0.21 0.19 -0.01
CLIM4 | -0.12 -0.14 -0.21 -0.12 0.77 0.63 0.08 0.10 -0.03 0.24 0.22 -0.02 | -0.21 -0.25 -0.21 -0.05 0.82 0.55 0.11 0.12 0.12 0.24 0.18 0.11 | -0.18 -0.21 -0.12 -0.07 0.75 0.51 0.02 0.15 0.09 0.20 0.21 0.05
CLIM5 | -0.13 -0.13 -0.20 -0.18 0.65 0.92 0.07 0.09 -0.05 0.13 0.22 0.03 | -0.13 -0.13 -0.06 -0.03 0.58 0.87 -0.02 0.11 0.16 0.18 0.16 0.06 | 0.00 0.02 -0.05 -0.07 0.58 0.83 0.08 0.31 0.11 0.17 0.14 0.00
CLIM6 | -0.14 -0.13 -0.17 -0.13 0.60 0.90 0.05 0.13 -0.11 0.13 0.25 0.06 | -0.15 -0.08 -0.10 -0.04 0.56 0.84 -0.06 0.02 0.04 0.19 0.14 -0.03 | -0.06 -0.06 -0.12 -0.12 0.57 0.88 0.06 0.26 0.07 0.19 0.13 0.09
CLIM7 | -0.13 -0.14 -0.18 -0.18 0.59 0.81 0.02 0.10 -0.02 0.18 0.23 -0.01 | -0.11 -0.14 -0.14 -0.04 0.49 0.79 0.06 -0.01 0.09 0.14 0.24 0.14 | -0.07 -0.04 -0.07 0.00 0.51 0.79 0.10 0.24 0.05 0.12 0.12 0.00
CLIM8 | -0.12 -0.13 -0.21 -0.18 0.67 0.86 0.02 0.06 -0.11 0.21 0.24 0.04 | -0.18 -0.20 -0.21 -0.09 0.59 0.90 0.03 -0.01 0.08 0.21 0.17 0.17 | -0.15 -0.11 -0.15 -0.11 0.59 0.91 0.04 0.15 -0.07 0.19 0.12 0.00
USE1 | 0.16 0.08 0.06 -0.03 -0.06 0.02 0.94 0.14 0.01 0.02 0.01 -0.02 | 0.08 0.07 0.10 0.05 0.13 -0.01 0.96 0.13 -0.05 -0.07 0.02 0.08 | 0.16 0.09 0.08 0.06 0.02 0.05 0.95 0.06 -0.06 -0.04 -0.17 -0.02
USE2 | 0.20 0.13 0.08 0.02 0.00 0.07 0.97 0.17 0.06 0.00 -0.02 -0.06 | 0.06 0.08 0.12 0.01 0.11 0.02 0.96 0.13 -0.07 -0.05 0.03 0.08 | 0.12 0.04 0.06 0.08 0.03 0.07 0.92 0.09 -0.01 -0.01 -0.15 0.00
USE3 | 0.18 0.09 0.05 0.05 -0.05 0.02 0.93 0.13 0.03 0.01 0.01 -0.05 | 0.10 0.03 0.11 0.01 0.13 0.01 0.93 0.08 -0.06 -0.03 0.06 0.05 | 0.18 0.12 0.06 0.03 0.03 0.08 0.97 0.06 -0.05 -0.01 -0.10 -0.03
DEAD1 | 0.16 0.14 0.14 -0.03 0.06 0.09 0.15 0.92 0.06 0.04 0.06 0.04 | 0.11 0.18 0.06 0.02 0.07 0.00 0.11 0.93 0.05 0.08 0.17 -0.03 | 0.19 0.16 0.02 -0.06 0.11 0.24 0.05 0.95 0.11 0.07 0.13 0.13
DEAD2 | 0.16 0.18 0.12 0.02 0.08 0.12 0.14 0.94 0.11 0.02 0.07 0.03 | 0.08 0.15 0.09 -0.02 0.08 0.04 0.15 0.93 0.06 0.09 0.16 -0.07 | 0.15 0.15 0.03 -0.04 0.10 0.23 0.04 0.93 0.15 0.08 0.14 0.14
DEAD3 | 0.13 0.09 0.12 -0.09 0.07 0.07 0.14 0.83 0.10 0.00 0.03 0.09 | 0.11 0.14 0.10 -0.05 -0.01 0.02 0.09 0.86 0.12 0.14 0.11 -0.12 | 0.12 0.11 0.03 -0.06 0.12 0.18 0.11 0.86 0.13 0.13 0.17 0.07
COST1* | 0.34 0.26 0.25 0.02 -0.14 -0.07 0.02 0.09 0.95 -0.01 -0.05 0.01 | 0.19 0.13 0.10 0.14 0.06 0.13 -0.02 0.11 0.94 -0.08 -0.15 -0.03 | 0.13 0.08 0.15 0.03 0.11 0.05 -0.03 0.14 0.87 -0.01 0.00 -0.13
COST2* | 0.31 0.23 0.25 0.04 -0.14 -0.09 0.05 0.10 0.94 -0.06 -0.08 -0.04 | 0.22 0.14 0.19 0.16 0.00 0.07 -0.10 0.05 0.95 -0.12 -0.17 -0.11 | 0.19 0.17 0.22 0.05 0.00 -0.01 -0.05 0.13 0.97 -0.11 -0.09 -0.18
SEV_FIRM1 | -0.28 -0.39 -0.34 -0.20 0.22 0.15 -0.02 0.00 -0.05 0.87 0.48 0.25 | -0.33 -0.44 -0.37 -0.19 0.18 0.17 -0.05 0.09 -0.17 0.92 0.57 0.26 | -0.39 -0.41 -0.38 -0.11 0.19 0.15 0.01 0.02 -0.10 0.90 0.56 0.22
SEV_FIRM2 | -0.28 -0.39 -0.35 -0.20 0.21 0.16 -0.02 0.03 -0.01 0.91 0.49 0.24 | -0.29 -0.38 -0.37 -0.19 0.22 0.21 -0.05 0.09 -0.05 0.91 0.48 0.21 | -0.26 -0.31 -0.28 -0.20 0.22 0.19 -0.04 0.11 -0.01 0.89 0.49 0.13
SEV_FIRM3 | -0.30 -0.37 -0.36 -0.20 0.30 0.20 0.06 0.04 -0.03 0.90 0.49 0.26 | -0.25 -0.35 -0.37 -0.19 0.24 0.21 -0.06 0.12 -0.06 0.89 0.56 0.25 | -0.24 -0.28 -0.26 -0.18 0.17 0.21 -0.04 0.15 -0.11 0.88 0.53 0.18
SEV_DEV1 | -0.35 -0.43 -0.49 -0.24 0.30 0.25 -0.02 0.07 -0.07 0.50 0.93 0.27 | -0.27 -0.28 -0.38 -0.27 0.21 0.18 0.00 0.21 -0.18 0.51 0.89 0.26 | -0.39 -0.44 -0.48 -0.14 0.22 0.14 -0.15 0.14 -0.06 0.56 0.95 0.15
SEV_DEV2 | -0.33 -0.43 -0.49 -0.14 0.26 0.21 0.00 0.02 -0.05 0.49 0.88 0.20 | -0.26 -0.34 -0.34 -0.20 0.19 0.21 0.06 0.09 -0.12 0.54 0.91 0.31 | -0.34 -0.36 -0.49 -0.15 0.22 0.15 -0.11 0.13 -0.06 0.59 0.92 0.14
SEV_DEV3 | -0.32 -0.39 -0.47 -0.20 0.29 0.28 0.02 0.10 -0.08 0.51 0.93 0.28 | -0.24 -0.30 -0.39 -0.23 0.22 0.18 0.02 0.16 -0.17 0.56 0.92 0.33 | -0.32 -0.37 -0.43 -0.15 0.17 0.11 -0.14 0.17 -0.06 0.52 0.94 0.12
CERT1 | -0.04 -0.07 -0.14 -0.13 0.04 0.04 -0.04 0.14 -0.02 0.23 0.23 0.87 | -0.14 -0.19 -0.24 -0.25 0.10 0.12 0.08 -0.10 -0.08 0.22 0.29 0.92 | -0.09 -0.12 -0.17 -0.10 0.02 0.05 0.00 0.09 -0.20 0.22 0.14 0.89
CERT2 | -0.07 -0.08 -0.16 -0.25 0.03 0.03 -0.06 0.04 0.00 0.26 0.27 0.80 | -0.10 -0.19 -0.27 -0.28 0.15 0.13 0.07 -0.06 -0.03 0.26 0.33 0.86 | 0.01 -0.03 -0.14 -0.14 0.05 0.05 0.06 0.10 -0.12 0.09 0.05 0.62
CERT3 | -0.04 -0.09 -0.09 -0.12 -0.06 0.02 -0.04 -0.03 -0.02 0.21 0.18 0.83 | -0.17 -0.20 -0.23 -0.16 0.04 0.05 0.06 -0.04 -0.10 0.24 0.26 0.89 | -0.11 -0.14 -0.15 0.01 0.02 0.00 -0.05 0.13 -0.13 0.16 0.12 0.90
*For scenarios 1 and 2 COST1a and COST2a are displayed, for scenario 3 COST1b and COST2b are used.
Notes: Figures in bold and with gray shading are loadings on a-priori constructs; see Table 4-7 for the full text of the respective items; item loadings for reverse coded items are depicted after reversing.
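The way Table A-4 is typically read for discriminant validity (each item should load more strongly on its a-priori construct than on any other construct) can be expressed as a simple check. The sketch below uses three scenario 1 rows from the table (NORM1, SEV_FIRM1, SEV_DEV1). The column order in `CONSTRUCTS` is an assumption inferred from where the high loadings fall, and the function name is hypothetical.

```python
# Assumed column order of the constructs within one scenario block.
CONSTRUCTS = [
    "Intention",
    "Attitude",
    "Subjective norm",
    "Perceived behavioral control",
    "Ethical work climate: Law & code",
    "Ethical work climate: Rules",
    "Usefulness of internet code",
    "Severity of time pressure",
    "Cost of compliance",
    "Punishment severity (firm)",
    "Punishment severity (developer)",
    "Punishment certainty",
]

# Item -> (a-priori construct, scenario 1 loadings as printed in Table A-4).
LOADINGS = {
    "NORM1": ("Subjective norm",
              [0.48, 0.58, 0.86, 0.17, -0.21, -0.11, 0.08, 0.15, 0.20, -0.27, -0.41, -0.12]),
    "SEV_FIRM1": ("Punishment severity (firm)",
                  [-0.28, -0.39, -0.34, -0.20, 0.22, 0.15, -0.02, 0.00, -0.05, 0.87, 0.48, 0.25]),
    "SEV_DEV1": ("Punishment severity (developer)",
                 [-0.35, -0.43, -0.49, -0.24, 0.30, 0.25, -0.02, 0.07, -0.07, 0.50, 0.93, 0.27]),
}


def loads_highest_on_own_construct(own_construct: str, row: list) -> bool:
    """True if the loading on the a-priori construct exceeds every cross-loading in absolute value."""
    own_idx = CONSTRUCTS.index(own_construct)
    own_loading = abs(row[own_idx])
    return all(own_loading > abs(value) for idx, value in enumerate(row) if idx != own_idx)


if __name__ == "__main__":
    for item, (construct, row) in LOADINGS.items():
        print(item, construct, loads_highest_on_own_construct(construct, row))
```

Comparing absolute values is a design choice of this sketch: it treats a strong negative cross-loading as just as problematic as a strong positive one.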
Bibliography
Abernathy, W. J., Utterback, J. M. (1978): Patterns of Industrial Innovation. Technology Review 80(7), p. 40-47.
Adner, R., Zemsky, P. (2006): A Demand-Based Perspective on Sustainable Competitive Advantage. Strategic Management Journal 27(3), p. 215-239.
Ajila, S. A., Wu, D. (2007): Empirical Study of the Effects of Open Source Adoption on Software Development Economics. Journal of Systems and Software 80(9), p. 1517-1529.
Ajzen, I. (1985): From Intentions to Actions: A Theory of Planned Behavior. In: Kuhl, J., Beckmann, J. (Ed.), Action Control: From Cognition to Behavior. Springer, Heidelberg, p. 11-39.
Ajzen, I. (1988): Attitudes, Personality, and Behavior. Open University Press, Milton Keynes.
Ajzen, I. (1991): The Theory of Planned Behavior. Organizational Behavior and Human Decision Processes 50(2), p. 179-211.
Ajzen, I. (2002): Constructing a TpB Questionnaire: Conceptual and Methodological Considerations. Manuscript. Retrieved 29.06.2009, from http://people.umass.edu/aizen/pdf/tpb.measurement.pdf.
Ajzen, I., Fishbein, M. (1980): Understanding Attitude and Predicting Behavior. Prentice-Hall, Englewood Cliffs, NJ.
Ajzen, I., Madden, T. J. (1986): Predicting of Goal Directed Behavior: Attitudes, Intentions, and Perceived Behavioral Control. Journal of Experimental Social Psychology 22(5), p. 453-474.
Alavi, M., Leidner, D. E. (1999): Knowledge Management Systems: Issues, Challenges and Benefits. Communications of the AIS 1(1), p. 1-37.
Alexy, O. (2009): Free Revealing: How Firms Can Profit from Being Open. Gabler, Wiesbaden.
Alvarez, S. A., Barney, J. B. (2004): Organizing Rent Generation and Appropriation: Toward a Theory of the Entrepreneurial Firm. Journal of Business Venturing 19(5), p. 621-635.
Amabile, T. M. (1983): The Social Psychology of Creativity. Springer, New York, NY.
Amabile, T. M. (1996): Creativity in Context. Westview Press, Boulder, CO.
Amabile, T. M., Hill, K. G., Hennessey, A., Tighe, E. M. (1994): The Work Preference Inventory: Assessing Intrinsic and Extrinsic Motivational Orientations. Journal of Personality and Social Psychology 66(5), p. 950-967.
Amit, R., Schoemaker, P. J. H. (1993): Strategic Assets and Organizational Rent. Strategic Management Journal 14(1), p. 33-46.
Amit, R., Zott, C. (2001): Value Creation in E-Business. Strategic Management Journal 22(6/7), p. 493-520.
Anderson, J. C., Gerbing, D. W. (1998): Structural Equation Modeling in Practice: A Review and Recommended Two-Step Approach. Psychological Bulletin 103(3), p. 411-423.
Anderson, R. E., Johnson, D. G., Gotterbarn, D., Perrolle, J. (1993): Using the New ACM Code of Ethics in Decision Making. Communications of the ACM 36(2), p. 98-107.
Andreoli, N., Lefkowitz, J. (2009): Individual and Organizational Antecedents of Misconduct in Organizations. Journal of Business Ethics 85(3), p. 309-332.
Appleyard, M. M. (1996): How Does Knowledge Flow? Interfirm Patterns in the Semiconductor Industry. Strategic Management Journal 17(Winter), p. 137-154.
Apte, U., Sankar, C. S., Thakur, M., Turner, J. E. (1990): Reusability-Based Strategy for Development of Information Systems: Implementation Experience of a Bank. MIS Quarterly 14(3), p. 421-433.
Argote, L., Ingram, P. (2000): Knowledge Transfer: A Basis for Competitive Advantage in Firms. Organizational Behavior and Human Decision Processes 82(1), p. 150-169.
Argote, L., Ingram, P., Levine, J. M., Moreland, R. L. (2000): Knowledge Transfer in Organizations: Learning from the Experience of Others. Organizational Behavior and Human Decision Processes 82(1), p. 1-8.
Armitage, C., Conner, M. (2001): The Theory of Planned Behavior. British Journal of Social Psychology 40(4), p. 471-499.
Armstrong, J. S., Overton, T. S. (1977): Estimating Nonresponse Bias in Mail Surveys. Journal of Marketing Research 14(3), p. 396-402.
Arne, P. H. (2008): Jacobsen v. Katzer - Open Source License Validation: How Far Does It Go? The Computer & Internet Lawyer 25(11), p. 27-31.
Arrow, K. (1962): Economic Welfare and the Allocation of Resources for Invention. In: Nelson, R. R. (Ed.), The Rate and Direction of Inventive Activity. Princeton University Press, Princeton, NJ, p. 609-625.
Austin, R. D. (2001): The Effects of Time Pressure on Quality in Software Development. Information Systems Research 12(2), p. 195-207.
Backhaus, K., Erichson, B., Plinke, W., Weiber, R. (2008): Multivariate Analysemethoden. Springer-Verlag, Berlin, 12th Edition.
Bagozzi, R. P., Baumgartner, H. (1994): The Evaluation of Structural Equation Models and Hypothesis Testing. In: Bagozzi, R. P. (Ed.), Principles of Marketing Research. Blackwell Publishers, Cambridge, MA, p. 386-422.
Bagozzi, R. P., Dholakia, U. M. (2006): Open Source Software User Communities: A Study of Participation in Linux User Groups. Management Science 52(7), p. 1099-1115.
Bagozzi, R. P., Yi, Y. (1988): On the Evaluation of Structural Equation Models. Journal of the Academy of Marketing Science 16(1), p. 74-94.
Baldwin, C. Y., Clark, K. B. (2006): The Architecture of Participation: Does Code Architecture Mitigate Free Riding in the Open Source Development Model? Management Science 52(7), p. 1116-1127.
Bandura, A. (1997): Self-Efficacy: The Exercise of Control. W. H. Freeman, New York, NY.
Banerjee, D., Cronan, T. P., Jones, T. W. (1998): Modeling IT Ethics: A Study in Situational Ethics. MIS Quarterly 22(1), p. 31-60.
Banker, R. D., Kauffman, R. J., Zweig, D. (1993): Repository Evaluation on Software Reuse. IEEE Transactions on Software Engineering 19(4), p. 379-389.
Barnes, B. H., Bollinger, T. B. (1991): Making Reuse Cost-Effective. IEEE Software 8(1), p. 13-24.
Barney, J. B. (1986): Strategic Factor Markets: Expectations, Luck, and Business Strategy. Management Science 32(10), p. 1231-1241.
Barney, J. B. (1991): Firm Resources and Sustained Competitive Advantage. Journal of Management 17(1), p. 99-120.
Barney, J. B. (2001): Is the Resource-Based View a Useful Perspective for Strategic Management Research? Yes. Academy of Management Review 26(1), p. 41-56.
Barney, J. B. (2003): Gaining and Sustaining Competitive Advantage. Pearson Education, Upper Saddle River, NJ, 3rd Edition.
Barraclough, E. (2008): Beware the Open Source Software Risks. Managing Intellectual Property, February 2008, p. 20-22.
Becerra, M. (2008): A Resource-Based Analysis of the Conditions for the Emergence of Profits. Journal of Management 34(6), p. 1110-1126.
Beck, L., Ajzen, I. (1991): Predicting Dishonest Actions Using the Theory of Planned Behavior. Journal of Research in Personality 25(3), p. 285-301.
Bennett, M. P., Ivers, K. K. (2008): Open Source Software: Your Company's Legal Risks. Retrieved 15.06.2009, from http://www.linuxinsider.com/story/64378.html.
Bergquist, M., Ljungberg, J. (2001): The Power of Gifts: Organising Social Relationships in Open Source Communities. Information Systems Journal 11(4), p. 305-320.
Besanko, D., Dranove, D., Shanley, M. (2000): Economics of Strategy. John Wiley & Sons, New York, 2nd Edition.
Bessen, J. (2002): What Good Is Free Software. In: Hahn, R. W. (Ed.), Government Policy toward Open Source Software. Brookings Institution Press, Washington, DC.
Black Duck Software (2007): The Quest for an “Open Source Genome”. Retrieved 07.01.2009, from http://www.blackducksoftware.com/media/_wp/Open-SourceGenome.pdf.
Black Duck Software (2009a): Black Duck Open Source Resource Center. Retrieved 06.10.2009, from http://www.blackducksoftware.com/oss/licenses#top20.
Black Duck Software (2009b): Black Duck Software Analysis of Open Source Reveals Reuse of Code Representing 316,000 Staff Years. Retrieved 24.04.2009, from http://www.blackducksoftware.com/news/releases/2009-03-30.
Black Duck Software (2009c): Estimating the Development Cost of Open Source Software. Retrieved 02.02.2010, from http://www.blackducksoftware.com/development-cost-of-open-source.
Blankenhorn, D. (2005): Cisco's Rejection of Open Source Success. Retrieved 09.11.2009, from http://blogs.zdnet.com/open-source/?p=491.
Blyler, M., Coff, R. W. (2003): Dynamic Capabilities, Social Capital, and Rent Appropriation: Ties That Split Pies. Strategic Management Journal 24(7), p. 677-686.
Boehm, B. W. (1981): Software Engineering Economics. Prentice-Hall, Englewood Cliffs, NJ.
Boh, W. (2008): Reuse of Knowledge Assets from Repositories: A Mixed Methods Study. Information & Management 45(6), p. 365-375.
Bommer, M., Gratto, C., Gravander, J., Tuttle, M. (1987): A Behavioral Model of Ethical and Unethical Decision Making. Journal of Business Ethics 6(4), p. 265-280.
Bonaccorsi, A., Rossi, C. (2003): Why Open Source Software Can Succeed. Research Policy 32(7), p. 1243-1258.
Bortz, J., Döring, N. (2003): Forschungsmethoden und Evaluation für Human- und Sozialwissenschaftler. Springer, Berlin, 3rd Edition.
Bowman, C., Ambrosini, V. (2000): Value Creation Versus Value Capture: Towards a Coherent Definition of Value in Strategy. British Journal of Management 11(1), p. 1-15.
Bowman, C., Ambrosini, V. (2001): "Value" In the Resource-Based View of the Firm: A Contribution to the Debate. Academy of Management Review 26(1), p. 501-502.
Boyle, J. (2009): What Intellectual Property Law Should Learn from Software. Communications of the ACM 52(9), p. 71-76.
Brandenburger, A., Stuart, H. W. (1996): Value-Based Business Strategy. Journal of Economics & Management Strategy 5(1), p. 5-24.
Brewer, M. B. (1979): Ingroup Bias in the Minimal Intergroup Situation: A Cognitive-Motivational Analysis. Psychological Bulletin 86(2), p. 307-324.
Brooks, F. P. (1975): The Mythical Man-Month: Essays on Software Engineering. Addison-Wesley Publishing, Reading, MA.
Brooks, F. P. (1987): No Silver Bullet: Essence and Accidents of Software Engineering. IEEE Computer 20(4), p. 10-19.
Brown, A. W., Booch, G. (2002): Reusing Open-Source Software and Practices: The Impact of Open-Source on Commercial Vendors. In: Gacek, C. (Ed.), Software Reuse: Methods, Techniques, and Tools. Springer, Berlin / Heidelberg, p. 123-136.
Buchan, H. F. (2005): Ethical Decision Making in the Public Accounting Profession: An Extension of Ajzen's Theory of Planned Behavior. Journal of Business Ethics 61(2), p. 165-181.
Card, D., Comer, E. (1994): Why Do So Many Reuse Programs Fail? IEEE Software 11(5), p. 114-115.
Carver, B. W. (2005): Share and Share Alike: Understanding and Enforcing Open Source and Free Software Licenses. Berkeley Technology Law Journal 20, p. 443-481.
Castanias, R. P., Helfat, C. E. (1991): Managerial Resources and Rents. Journal of Management 17(1), p. 155-171.
Cavanagh, G. F., Fritzsche, D. J. (1985): Using Vignettes in Business Ethics Research. In: Preston, L. E. (Ed.), Research in Corporate Social Performance and Policy. JAI Press, Greenwich, CT, p. 279-293.
Chang, H.-F. A., Mockus, A. (2008): Evaluation of Source Code Copy Detection Methods on FreeBSD. International Working Conference on Mining Software Repositories, Leipzig, Germany.
Chang, M. K. (1998): Predicting Unethical Behavior: A Comparison of the Theory of Reasoned Action and the Theory of Planned Behavior. Journal of Business Ethics 17(16), p. 1825-1834.
Chen, W., Li, J., Ma, J., Conradi, R., Ji, J., Liu, C. (2008): An Empirical Study on Software Development with Open Source Components in the Chinese Software Industry. Software Process Improvement and Practice 13(1), p. 89-100.
Chesbrough, H. W. (2003): Open Innovation. The New Imperative for Creating and Profiting from Technology. Harvard Business School Press, Boston, MA.
Chin, W. W. (1998a): Issues and Opinion on Structural Equation Modeling. MIS Quarterly 22(1), p. vii-xvi.
Chin, W. W. (1998b): The Partial Least Squares Approach to Structural Equation Modeling. In: Marcoulides, G. A. (Ed.), Modern Methods for Business Research. Lawrence Erlbaum, Mahwah, NJ, p. 295-358.
Chin, W. W., Newsted, P. R. (1999): Structural Equation Modeling Analysis with Small Samples Using Partial Least Squares. In: Hoyle, R. H. (Ed.), Statistical Strategies for Small Sample Research. Sage Publications, Thousand Oaks, CA, p. 308-337.
Churchill, G. A. (1979): A Paradigm for Developing Better Measures of Marketing Constructs. Journal of Marketing Research 16(1), p. 64-73.
Cisco Systems (2003): Cisco Systems Announces Agreement to Acquire the Linksys Group, Inc. Retrieved 09.11.2009, from http://newsroom.cisco.com/dlls/corp_032003.html.
Coff, R. W. (1999): When Competitive Advantage Doesn't Lead to Performance: The Resource-Based View and Stakeholder Bargaining Power. Organization Science 10(1), p. 119-133.
Cohen, D. V. (1998): Moral Climate in Business Firms: A Conceptual Framework for Analysis and Change. Journal of Business Ethics 17(11), p. 1211-1226.
Cohen, J., Cohen, P., West, S. G., Aiken, L. S. (2002): Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Lawrence Erlbaum, Mahwah, NJ, 3rd Edition.
Cohen, W. M., Levinthal, D. A. (1990): Absorptive Capacity: A New Perspective on Learning and Innovation. Administrative Science Quarterly 35(1), p. 128-152.
Cohn-Sfetcu, S., Mayer, R. (2009): Practicing Safe Software. Linux+(1), p. 60-61.
Coleman, J. S. (1990): Foundations of Social Theory. Harvard University Press, Cambridge, MA.
Collis, D. J., Montgomery, C. A. (1995): Competing on Resources: Strategy in the 1990s. Harvard Business Review 73(4), p. 118-128.
Cong, R. (2001): Marginal Effects of the Tobit Model. Stata Technical Bulletin 10(56), p. 27-34.
Conner, K. R. (1991): A Historical Comparison of Resource-Based Theory and Five Schools of Thought within Industrial Organisation Economics: Do We Have a New Theory of the Firm? Journal of Management 17(1), p. 121-154.
Conner, K. R., Prahalad, C. K. (1996): A Resource Based Theory of the Firm: Knowledge Versus Opportunism. Organization Science 7(5), p. 477-501.
Corbet, J. (2007): Who Wrote 2.6.23. Retrieved 22.09.2009, from http://lwn.net/Articles/247582/.
Couper, M. P. (2000): Web Surveys: A Review of Issues and Approaches. Public Opinion Quarterly 64(4), p. 464-494.
Coyle, J. R., Gould, S. J., Gupta, P., Gupta, R. (2009): "To Buy or to Pirate": The Matrix of Music Consumers' Acquisition-Mode Decision-Making. Journal of Business Research 62(10), p. 1031-1037.
Cronan, T. P., Al-Rafee, S. (2008): Factors That Influence the Intention to Pirate Software and Media. Journal of Business Ethics 78(4), p. 527-545.
Crowne, D. P., Marlowe, D. (1960): A Scale of Social Desirability Independent of Psychopathology. Journal of Consulting Psychology 24(4), p. 349-354.
Crowston, K., Li, Q., Wei, K., Eseryel, U. Y., Howison, J. (2007): Self-Organization of Teams for Free/Libre Open Source Software Development. Information and Software Technology 49(6), p. 564-575.
Crowston, K., Scozzi, B. (2008): Bug Fixing Practices within Free/Libre Open Source Software Development Teams. Journal of Database Management 19(2), p. 1-30.
Crowston, K., Wei, K., Howison, J., Wiggins, A. (2009): Free/Libre Open Source Software Development: What We Know and What We Do Not Know (07.07.2009). Working Paper. Retrieved 10.09.2009, from http://floss.syr.edu/StudyP/Review%20Paper_070709.pdf.
Csíkszentmihályi, M. (1975): Beyond Boredom and Anxiety. Jossey-Bass, San Francisco, CA.
Csíkszentmihályi, M. (1990): Flow: The Psychology of Optimal Experience. Harper and Row, New York, NY.
Culnan, M. (1993): How Did You Get My Name? An Exploratory Investigation of Consumer Attitudes toward Secondary Information. MIS Quarterly 17(3), p. 341-363.
Cusumano, M. (1991): Japan's Software Factories. Oxford University Press, New York, NY.
Cusumano, M., Kemerer, C. (1990): A Quantitative Analysis of U.S. and Japanese Practice in Software Development. Management Science 36(11), p. 1384-1406.
Cusumano, M., MacCormack, A., Kemerer, C. F., Crandall, B. (2003): Software Development Worldwide: The State of the Practice. IEEE Software 20(6), p. 28-34.
d'Astous, A., Colbert, F., Montpetit, D. (2005): Music Piracy on the Web - How Effective Are Anti-Piracy Arguments? Evidence from the Theory of Planned Behavior. Journal of Consumer Policy 28(3), p. 289-310.
Davenport, T. H., Prusak, L. (1997): Working Knowledge: How Organizations Manage What They Know. Harvard Business School Press, Boston, MA.
David, P. A., Rullani, F. (2008): Dynamics of Innovation in an “Open Source” Collaboration Environment: Lurking, Laboring, and Launching Floss Projects on Sourceforge. Industrial and Corporate Change 17(4), p. 647-710.
David, P. A., Shapiro, J. S. (2008): Community-Based Production of Open-Source Software: What Do We Know About the Developers Who Participate? Information Economics and Policy 20(4), p. 364-398.
Davidson, S. J. (2006): Open Source Technical Due Diligence. The Computer & Internet Lawyer 23(8), p. 1-4.
Davies, S. P. (1989): Skill Level and Strategic Differences in Plan Comprehension and Implementation in Programming. 5th Conference of the British Computer Society Human-Computer Interaction Specialist Group, Nottingham, UK.
Davis, F. D., Bagozzi, R. P., Warshaw, R. P. (1989): User Acceptance of Computer Technology: A Comparison of Two Theoretical Models. Management Science 35(8), p. 982-1002.
de Laat, P. B. (2005): Copyright or Copyleft?: An Analysis of Property Regimes for Software Development. Research Policy 34(10), p. 1511-1532.
de Pay, D. (1995): Informationsmanagement von Innovationen. Gabler, Wiesbaden.
Deci, E. L., Ryan, R. M. (1985): Intrinsic Motivation and Self-Determination in Human Behavior. Plenum, New York, NY.
DeConinck, J. B., Lewis, W. F. (1997): The Influence of Deontological and Teleological Considerations and Ethical Climate on Sales Managers’ Intentions to Reward or Punish Sales Force Behavior. Journal of Business Ethics 16(5), p. 497-506.
Deshpande, A., Riehle, D. (2008): The Total Growth of Open Source. 4th International Conference on Open Source Systems, Milan, Italy, p. 197-209.
Desouza, K. C., Awazu, Y., Tiwana, A. (2006): Four Dynamics for Bringing Use Back into Software Reuse. Communications of the ACM 49(1), p. 96-100.
DiBona, C. (2005): Open Source and Proprietary Software Development. In: DiBona, C., Cooper, D., Stone, M. (Ed.), Open Source 2.0: The Continuing Evolution. O'Reilly Media, Sebastopol, CA.
DiBona, C., Ockman, S., Stone, M. (1999): Introduction. In: DiBona, C., Ockman, S., Stone, M. (Ed.), Open Sources: Voices of the Open Source Revolution. O'Reilly & Associates, Sebastopol, CA, p. 1-17. Dierickx, I., Cool, K. (1989): Asset Stock Accumulation and the Sustainability of Competitive Advantage. Management Science 35(12), p. 1504-1511. Dillman, D. A. (1978): Mail and Telephone Surveys. Wiley, New York, NY. Dixon, N. M. (2000): Common Knowledge: How Companies Thrive by Sharing What They Know. Harvard Business School Press, Boston, MA. Dosi, G. (1982): Technological Paradigms and Technological Trajectories. Research Policy 11(3), p. 147-162. Dougherty, C. (2002): Introduction to Econometrics. Oxford University Press, New York, NY, 2nd Edition. Ducheneaut, N. (2005): Socialization in an Open Source Software Community: A Socio-Technical Analysis. Computer Supported Cooperative Work 14(4), p. 323-368. Due, R. T. (1995): The Economics of Reuse. Information Systems Management 12(1), p. 70-74. Eagly, A., Chaiken, S. (1996): Attitude Structure and Function. In: Gilbert, D., Fiske, S., Lindzey, G. (Ed.), The Handbook of Social Psychology. McGraw-Hill, New York, NY, p. 269-322. Ebert, C. (2008): Open Source Software in Industry. IEEE Software 25(3), p. 52-53. Egger, D., Hogg, M. (2006): Open Source Software IP Risk Audits: The Emerging Due Diligence Standard for Technology M&A Transactions. Retrieved 13.01.2009, from http://www.osriskmanagement.com/downloads/Open%20Source%20Software%20IP%20Risk%20Audits.pdf. Ehrlich, I. (1973): Participation in Illegitimate Activities: A Theoretical and Empirical Investigation. Journal of Political Economy 81(3), p. 521-565. Ehrlich, I. (1996): Crime, Punishment, and the Market for Offenses. Journal of Economic Perspectives 10(1), p. 43-67. Fafchamps, D. (1994): Organizational Factors and Reuse. IEEE Software 11(5), p. 31-41. Fauscette, M. (2009): Worldwide Open Source Software 2009-2013 Forecast. Market Analysis. 
Fershtman, C., Gandal, N. (2007): Open Source Software: Motivation and Restrictive Licensing. International Economics and Economic Policy 4(2), p. 209-225. Fershtman, C., Gandal, N. (2009): R&D Spillovers: The 'Social Network' of Open Source (16.05.2009). Working Paper. Retrieved 01.10.2009, from http://www.tau.ac.il/~gandal/OSS.pdf. Fichman, R. G., Kemerer, C. F. (1997): The Assimilation of Software Process Innovations: An Organizational Learning Perspective. Management Science 43(10), p. 1345-1363. Fichman, R. G., Kemerer, C. F. (2001): Incentive Compatibility and Systematic Software Reuse. Journal of Systems and Software 57(1), p. 45-60.
Fischer, T., Henkel, J. (2009): Patent Trolls on Markets for Technology - an Empirical Analysis of Trolls' Patent Acquisitions (14.12.2009). Working Paper. Retrieved 30.04.2010, from http://ssrn.com/paper=1523102. Fishbein, M., Ajzen, I. (1975): Belief, Attitude, Intention, and Behavior: An Introduction to Theory and Research. Addison-Wesley, Reading, MA. Fishburn, P. (1970): Utility Theory for Decision Making. John Wiley & Sons, New York, NY. Fitzgerald, B., Bassett, G. (2005): Legal Issues Relating to Free and Open Source Software. In: Fitzgerald, B., Bassett, G. (Ed.), Legal Issues Relating to Free and Open Source Software. Queensland University of Technology, Brisbane, QLD, p. 11-36. Flannery, B. L., May, D. R. (2000): Environmental Ethical Decision Making in the U.S. Metal-Finishing Industry. Academy of Management Journal 43(4), p. 642-662. Fleming, L., Waguespack, D. M. (2007): Brokerage, Boundary Spanning, and Leadership in Open Innovation Communities. Organization Science 18(2), p. 165-180. Fornell, C., Bookstein, F. L. (1982): Two Structural Equations Models: LISREL and PLS Applied to Consumer Exit-Voice Theory. Journal of Marketing Research 19(4), p. 440-452. Fornell, C., Larcker, D. F. (1981): Evaluating Structural Equation Models with Unobservable Variables and Measurement Error. Journal of Marketing Research 18(1), p. 39-50. Forrest, E. (2003): Internet Marketing Research: Resources and Techniques. McGraw-Hill, Sydney. Frakes, W. B., Fox, C. J. (1995): Sixteen Questions About Software Reuse. Communications of the ACM 38(6), p. 75-87. Frakes, W. B., Fox, C. J. (1996): Quality Improvements Using a Software Reuse Failure Modes Model. IEEE Transactions on Software Engineering 22(4), p. 274-279. Frakes, W. B., Isoda, S. (1994): Success Factors of Systematic Reuse. IEEE Software 11(5), p. 14-19. Frakes, W. B., Kang, K. (2005): Software Reuse Research: Status and Future. IEEE Transactions on Software Engineering 31(7), p. 529-536. Fredrickson, J. W. 
(1986): An Exploratory Approach to Measuring the Perceptions of Strategic Decision Process Constructs. Strategic Management Journal 7(5), p. 473-483. Free Software Foundation (2009a): Frequently Asked Questions About the GNU Licenses. Retrieved 18.12.2009, from http://www.fsf.org/licensing/licenses/gpl-faq.html#GPLAndPlugins. Free Software Foundation (2009b): The GNU Operating System. Retrieved 05.10.2009, from http://www.gnu.org/. Frey, B. (1997): Not Just for the Money: An Economic Theory of Personal Motivation. Edward Elgar, Brookfield, VT. Frey, B., Meier, S. (2004): Pro-Social Behavior in a Natural Setting. Journal of Economic Behavior and Organization 54(1), p. 65-68.
Garlan, D., Allen, R., Ockerbloom, J. (1995): Architectural Mismatch: Why Reuse Is So Hard. IEEE Software 12(6), p. 17-26. Garlan, D., Allen, R., Ockerbloom, J. (2009): Architectural Mismatch: Why Reuse Is Still So Hard. IEEE Software 26(4), p. 66-69. Gefen, D., Straub, D. W., Boudreau, M.-C. (2000): Structural Equation Modelling and Regression: Guidelines for Research Practice. Communications of the Association for Information Systems 4(7), p. 1-78. German, D. M. (2005): Software Engineering Practices in the GNOME Project. In: Feller, J., Fitzgerald, B., Hissam, S. A., Lakhani, K. R. (Ed.), Perspectives on Free and Open Source Software. MIT Press, Cambridge, MA, p. 211-225. German, D. M. (2007): Using Software Distributions to Understand the Relationship among Free and Open Source Software Projects. 4th International Workshop on Mining Software Repositories, Minneapolis, MN. German, D. M., Gonzalez-Barahona, J. M. (2009): An Empirical Study of the Reuse of Software Licensed under the GNU General Public License. 5th International Conference on Open Source Systems, Skövde, Sweden, p. 185-198. German, D. M., Hassan, A. E. (2009): License Integration Patterns: Dealing with License Mismatches in Component-Based Development. 31st International Conference on Software Engineering, Vancouver, Canada, p. 188-198. Ghemawat, P. (1991): Commitment: The Dynamic of Strategy. Free Press, New York, NY. Ghosh, R. A. (1998): What Motivates Free Software Developers? Retrieved 22.09.2009, from http://outreach.lib.uic.edu/www/issues/issue3_3/torvalds/index.html. Ghosh, R. A., Glott, R., Krieger, B., Robles, G. (2002): Free/Libre and Open Source Software: Survey and Study - Deliverable D18: Final Report - Part IV: Survey of Developers. Retrieved 18.03.2009, from http://www.infonomics.nl/FLOSS/report/FLOSS_Final4.pdf. Gibbs, W. W. (1994): Software's Chronic Crisis. Scientific American 271(3), p. 86-95. Giuri, P., Rullani, F., Torrisi, S. 
(2008): Explaining Leadership in Virtual Teams: The Case of Open Source Software. Information Economics and Policy 20(4), p. 305-315. Goles, T., Jayatilaka, B., George, B., Parsons, L., Chambers, V., Taylor, D., Brune, R. (2008): Softlifting: Exploring Determinants of Attitude. Journal of Business Ethics 77(4), p. 481-499. Goodman, P. S., Darr, E. D. (1998): Computer-Aided Systems and Communities: Mechanisms for Organizational Learning in Distributed Environments. MIS Quarterly 22(4), p. 417-430. Grant, R. M. (1996): Toward a Knowledge-Based Theory of the Firm. Strategic Management Journal 17(Winter), p. 109-122. Greene, J. C., Caracelli, V. J., Graham, W. F. (1989): Toward a Conceptual Framework for Mixed-Method Evaluation Designs. Educational Evaluation and Policy Analysis 11(3), p. 255-274. Greene, W. H. (1999): Marginal Effects in the Censored Regression Model. Economics Letters 64(1), p. 43-49.
Greene, W. H. (2003): Econometric Analysis. Prentice Hall, Upper Saddle River, NJ, 5th Edition. Griss, M. L. (1995): Software Reuse: Objects and Frameworks Are Not Enough. Object Magazine 5(2), p. 77-87. Gruber, M., Henkel, J. (2005): New Ventures Based on Open Innovation - an Empirical Analysis of Start-up Firms in Embedded Linux. International Journal of Technology Management 33(4), p. 354-372. Gulati, R., Wang, L. O. (2003): Size of the Pie and Share of the Pie: Implications of Network Embeddedness and Business Relatedness for Value Creation and Value Appropriation. Research in the Sociology of Organizations 20, p. 209-242. Haefliger, S., von Krogh, G., Spaeth, S. (2008): Code Reuse in Open Source Software. Management Science 54(1), p. 180-193. Haines, R., Leonard, L. N. K. (2007): Individual Characteristics and Ethical Decision-Making in an IT Context. Industrial Management & Data Systems 107(1), p. 5-20. Hair, J. F., Jr., Tatham, R. L., Anderson, R. E., Black, W. (2006): Multivariate Data Analysis. Pearson Prentice Hall, Upper Saddle River, NJ. Hallberg, N. L. (2009): Towards a Resource-Based Theory of Value Appropriation: An Appropriation Factor Framework. Working Paper. Retrieved 18.05.2009, from http://uk.cbs.dk/content/download/108796/1385107/file/Presentation%20%20Towards%20a%20resource-based%20theory%20of%20value%20appropriation%20090422%20.pdf. Hann, I., Roberts, J. A., Slaughter, S. A., Fielding, R. T. (2002): Economic Incentives for Participating in Open Source Software Projects. International Conference on Information Systems, Barcelona, Spain. Hardgrave, B. C., Davis, F. D., Riemenschneider, C. K. (2003): Investigating Determinants of Software Developers' Intentions to Follow Methodologies. Journal of Management Information Systems 20(1), p. 123-151. Hardgrave, B. C., Johnson, R. A. (2003): Toward an Information Systems Development Acceptance Model: The Case of Object-Oriented Systems Development. 
IEEE Transactions on Engineering Management 50(3), p. 322-336. Hars, A., Ou, S. (2002): Working for Free? Motivations for Participating in Open-Source Projects. International Journal of Electronic Commerce 6(3), p. 25-39. Hauben, M., Hauben, R. (1997): Netizens: On the History and Impact of Usenet and the Internet. IEEE Computer Society Press, Los Alamitos, CA. Hemel, A. (2008): The GPL Compliance Engineering Guide. Retrieved 22.12.2008, from http://www.loohuis-consulting.nl/downloads/compliance-manual.pdf. Henkel, J. (2006): Selective Revealing in Open Innovation Processes: The Case of Embedded Linux. Research Policy 35(7), p. 953-969. Henkel, J. (2007): Offene Innovationsprozesse - Die Kommerzielle Entwicklung von Open-Source-Software. Deutscher Universitäts-Verlag, Wiesbaden. Henkel, J. (2009): Champions of Revealing - the Role of Open Source Developers in Commercial Firms. Industrial and Corporate Change 18(3), p. 435-471.
Henkel, J., Baldwin, C. Y. (2009): Modularity for Value Appropriation: Drawing the Boundaries of Intellectual Property (March 2009). Working Paper. Henley, M. (2009): Jacobsen V Katzer and Kamind Associates - an English Legal Perspective. International Free and Open Source Software Law Review 1(1), p. 41-44. Henseler, J., Ringle, C. M., Sinkovics, R. (2009): The Use of Partial Least Squares Modeling in International Marketing. Advances in International Marketing 20, p. 277-320. Herrmann, A., Huber, F., Kressmann, F. (2006): Varianz- und Kovarianzbasierte Strukturgleichungsmodelle - Ein Leitfaden zu deren Spezifikation und Beurteilung. Zeitschrift für betriebswirtschaftliche Forschung 58(2), p. 34-66. Hertel, G., Niedner, S., Hermann, S. (2003): Motivation of Software Developers in the Open Source Projects: An Internet-Based Survey of Contributors to the Linux Kernel. Research Policy 32(7), p. 1159-1177. Herzberg, F. (1968): One More Time: How Do You Motivate Employees? Harvard Business Review 46(1), p. 53-62. Herzberg, F. (1982): The Managerial Choice. Olympus, Salt Lake City, UT. Hogle, S. (2008): Update: Jacobsen v. Katzer Represents a Major Victory for Open Source. The Computer & Internet Lawyer 25(10), p. 1-3. Homburg, C., Baumgartner, H. (1995): Beurteilung von Kausalmodellen: Bestandsaufnahme und Anwendungsempfehlungen. Marketing - Zeitschrift für Forschung und Praxis 17(3), p. 162-176. Homburg, C., Giering, A. (1996): Konzeptualisierung und Operationalisierung Komplexer Konstrukte - Ein Leitfaden für die Marketingforschung. Marketing - Zeitschrift für Forschung und Praxis 18(1), p. 5-24. Hoopes, D. G., Madsen, T. L., Walker, G. (2003): Guest Editors' Introduction to the Special Issue: Why Is There a Resource-Based View? Toward a Theory of Competitive Heterogeneity. Strategic Management Journal 24(10), p. 889-902. Howison, J., Conklin, M., Crowston, K. (2006): FLOSSmole: A Collaborative Repository for Floss Research Data and Analyses. 
International Journal of Information Technology and Web Engineering 1(3), p. 17-26. Hummel, O., Janjic, W., Atkinson, C. (2008): Code Conjurer: Pulling Reusable Software out of Thin Air. IEEE Software 25(5), p. 45-52. IPX (2004): Intellectual Property Risk Management. Retrieved 21.12.2009, from http://www.ipxco.com/pdf/Risk_Management.pdf. Isoda, S. (1995): Experience of a Software Reuse Project. Journal of Systems and Software 30, p. 171-186. Jacobides, M. G., Knudsen, T., Augier, M. (2006): Benefiting from Innovation: Value Creation, Value Appropriation and the Role of Industry Architectures. Research Policy 35(8), p. 1200-1221. Jaeger, T. (2010): Enforcement of the GNU GPL in Germany and Europe. Journal of Intellectual Property, Information Technology & E-Commerce Law 1(1), p. 34-39.
Jones, C. (2003): Variations in Software Development Practices. IEEE Software 20(6), p. 22-27. Jones, T. M. (1991): Ethical Decision Making by Individuals in Organizations: An Issue-Contingent Model. Academy of Management Review 16(2), p. 366-395. Joos, R. (1994): Software Reuse at Motorola. IEEE Software 11(5), p. 42-47. Kaiser, H., Rice, J. (1974): Little Jiffy, Mark IV. Educational and Psychological Measurement 34(1), p. 111-117. Kaneshige, T. (2008): Are You in Violation of Open Source Licenses? Retrieved 15.06.2009, from http://www.itbusiness.ca/it/client/en/home/DetailNewsPrint.asp?id=49969. Katz, R., Allen, T. J. (1982): Investigating the Not Invented Here (NIH) Syndrome: A Look at the Performance, Tenure, and Communication Patterns of 50 R&D Project Groups. R&D Management 12(1), p. 7-20. Kim, J., Mahoney, J. T. (2002): Resource-Based and Property Rights Perspectives on Value Creation: The Case of Oil Field Unitization. Managerial and Decision Economics 23(4/5), p. 225-245. Kim, Y. E., Stohr, E. A. (1992): Software Reuse: Issues and Research Directions. 25th Hawaii International Conference on System Sciences, Kauai, HI, p. 612-623. Kim, Y. E., Stohr, E. A. (1998): Software Reuse: Survey and Research Directions. Journal of Management Information Systems 14(4), p. 113-147. Klein, B., Crawford, R. G., Alchian, A. A. (1978): Vertical Integration, Appropriable Rents, and the Competitive Contracting Process. Journal of Law and Economics 21(2), p. 297-326. Knoll, A. (2009): Packungsbeilage Immer Aufmerksam Lesen! Markt & Technik 14, p. 12-14. Kogut, B., Zander, U. (1992): Knowledge of the Firm, Combinative Capabilities, and the Replication of Technology. Organization Science 3(3), p. 383-397. Kogut, B., Zander, U. (1993): Knowledge of the Firm and the Evolutionary Theory of the Multinational Corporation. Journal of International Business Studies 24(4), p. 625-645. Kohlberg, L. (1969): Stage and Sequence: The Cognitive Development Approach to Socialization. 
In: Goslin, D. (Ed.), Handbook of Socialization. Rand McNally, Chicago, IL, p. 347-480. Koohgoli, M. (2008): Practicing Safe Software: Good Software Record. Retrieved 21.12.2009, from http://www.talentfirstnetwork.org/wiki/images/d/dd/Practicing_safe_software_June_18.pdf. Kotler, P. (1991): Marketing Management. Prentice-Hall, Englewood Cliffs, NJ, 7th Edition. Krebs, D. L. (1970): Altruism - Examination of Concept and a Review of Literature. Psychological Bulletin 73(4), p. 258-302. Krijnen, W. P., Dijkstra, T. K., Gill, R. D. (1998): Conditions for Factor (In)determinacy in Factor Analysis. Psychometrika 63(4), p. 359-367.
Krueger, C. W. (1992): Software Reuse. ACM Computing Surveys 24(2), p. 131-183. Kurland, N. (1995): Ethical Intentions and the Theories of Reasoned Action and Planned Behavior. Journal of Applied Social Psychology 25(4), p. 297-313. Lakhani, K. R., Wolf, R. G. (2005): Why Hackers Do What They Do: Understanding Motivation and Effort in Free/Open Source Software Projects. In: Feller, J., Fitzgerald, B., Hissam, S., Lakhani, K. R. (Ed.), Perspectives on Free and Open Source Software. MIT Press, Cambridge, MA, p. 3-22. Langlois, R. N. (1999): Scale, Scope, and the Reuse of Knowledge. In: Dow, S. C., Earl, P. E. (Ed.), Economic Organization and Economic Knowledge. Edward Elgar, Cheltenham, UK, p. 239-254. Lau, K.-K., Wang, Z. (2007): Software Component Models. IEEE Transactions on Software Engineering 33(10), p. 709-724. Lavie, D. (2007): Alliance Portfolios and Firm Performance: A Study of Value Creation and Appropriation in the U.S. Software Industry. Strategic Management Journal 28(12), p. 1187-1212. Lee, G. K., Cole, R. E. (2003): From a Firm-Based to a Community-Based Model of Knowledge Creation: The Case of the Linux Kernel Development. Organization Science 14(6), p. 633-649. Lee, N.-Y., Litecky, C. R. (1997): An Empirical Study of Software Reuse with Special Attention to Ada. IEEE Transactions on Software Engineering 23(9), p. 537-549. Lefkowitz, J. (2006): The Constancy of Ethics Amidst the Changing World of Work. Human Resource Management Review 16(2), p. 245-268. Leonard, L. N. K., Cronan, T. P., Kreie, J. (2004): What Influences IT Ethical Behavior Intentions - Planned Behavior, Reasoned Action, Perceived Importance, or Individual Characteristics? Information & Management 42(1), p. 143-158. Lepak, D. P., Smith, K. G., Taylor, M. S. (2007): Value Creation and Value Capture: A Multilevel Perspective. Academy of Management Review 32(1), p. 180-194. Lerner, J., Tirole, J. (2002): Some Simple Economics of Open Source. The Journal of Industrial Economics 50(2), p. 197-234. 
Lerner, J., Tirole, J. (2005): The Scope of Open Source Licensing. The Journal of Law, Economics, and Organization 21(1), p. 20-56. Levi, S. D., Woodard, A. (2004): Open Source Software: How to Use It and Control It in the Corporate Environment. Computer & Internet Lawyer 21(8), p. 8-13. Liao, C., Lin, H.-N., Liu, Y.-P. (2009): Predicting the Use of Pirated Software: A Contingency Model Integrating Perceived Risk with the Theory of Planned Behavior. Journal of Business Ethics (in press). Lichtenthaler, U., Ernst, H. (2006): Attitudes to Externally Organising Knowledge Management Tasks: A Review, Reconsideration and Extension of the NIH Syndrome. R&D Management 36(4), p. 367-386. Lim, W. C. (1994): Effects of Reuse on Quality, Productivity, and Economics. IEEE Software 11(5), p. 23-30.
Limayem, M., Khalifa, M., Chin, W. W. (2004): Factors Motivating Software Piracy: A Longitudinal Study. IEEE Transactions on Engineering Management 51(4), p. 414-425. Lin, C.-P., Ding, C. G. (2003): Modeling Information Ethics: The Joint Moderating Role of Locus of Control and Job Insecurity. Journal of Business Ethics 48(4), p. 335-346. Lippman, S. A., Rumelt, R. P. (2003a): A Bargaining Perspective on Resource Advantage. Strategic Management Journal 24(11), p. 1069-1086. Lippman, S. A., Rumelt, R. P. (2003b): The Payments Perspective: Micro-Foundations of Resource Analysis. Strategic Management Journal 24(10), p. 903-927. Loch, K. D., Conger, S. (1996): Evaluating Ethical Decision Making and Computer Use. Communications of the ACM 39(7), p. 74-83. Lovgren, R. H., Racer, M. J. (2000): Group-Dynamics in Projects: Don't Forget the Social Aspects. Journal of Professional Issues in Engineering Education and Practice 126(4), p. 156-165. Lynex, A., Layzell, P. J. (1997): Understanding Resistance to Software Reuse. 8th IEEE International Workshop on Software Technology and Engineering Practice, London, UK, p. 339-349. Lynex, A., Layzell, P. J. (1998): Organisational Considerations for Software Reuse. Annals of Software Engineering 5(1), p. 105-124. Lyons, D. (2003): Linux's Hit Men. Retrieved 09.11.2009, from http://www.forbes.com/2003/10/14/cz_dl_1014linksys.html. MacCormack, A., Rusnak, J., Baldwin, C. Y. (2006): Exploring the Structure of Complex Software Designs: An Empirical Study of Open Source and Proprietary Code. Management Science 52(7), p. 1015-1030. MacDonald, G., Ryall, M. D. (2004): How Do Value Creation and Competition Determine Whether a Firm Appropriates Value? Management Science 50(10), p. 1319-1333. Madanmohan, T. R., De, R. (2004): Open Source Reuse in Commercial Firms. IEEE Software 21(6), p. 62-69. Maiden, N. A., Sutcliffe, A. G. (1993): People-Oriented Software Reuse: The Very Thought. 
Proceedings of the 2nd International Workshop on Software Reusability (IWSR-2), Los Alamitos, CA, p. 176-185. Majchrzak, A., Cooper, L. P., Neece, O. P. (2004): Knowledge Reuse for Innovation. Management Science 50(2), p. 174-188. Makadok, R. (2001): A Pointed Commentary on Priem and Butler. Academy of Management Review 26(4), p. 498-499. Makadok, R., Coff, R. (2002): The Theory of Value and Value of Theory: Breaking New Ground Versus Reinventing the Wheel. Academy of Management Review 27(1), p. 10-12. Mäki-Asiala, P., Matinlassi, M. (2006): Quality Assurance of Open Source Components: Integrator Point of View. 30th Annual International Computer Software and Applications Conference, COMPSAC '06, Chicago, IL, p. 189-194.
Margono, J., Lindsey, L. (1991): Software Reuse in the Air Traffic Control Advanced Automation System. Software Reuse and Reengineering Conference, Washington, D.C. Markus, M. L. (2001): Towards a Theory of Knowledge Reuse: Types of Knowledge Reuse Situations and Factors in Reuse Success. Journal of Management Information Systems 18(1), p. 57-93. Markus, M. L., Manville, B., Agres, C. E. (2000): What Makes a Virtual Organization Work? Sloan Management Review 42(1), p. 13-26. Marshall, K. P. (1999): Has Technology Introduced New Ethical Problems? Journal of Business Ethics 19(1), p. 81-90. Maslow, A. H. (1987): Motivation and Personality. Harper, New York, NY. Mason, R. O. (1986): Four Ethical Issues of the Information Age. MIS Quarterly 10(1), p. 5-12. Mathieson, K. (1991): Predicting User Intentions: Comparing the Technology Acceptance Model with the Theory of Planned Behavior. Information Systems Research 2(3), p. 173-191. McGhee, D. D. (2007): Free and Open Source Software Licenses: Benefits, Risks, and Steps toward Ensuring Compliance. Intellectual Property & Technology Law Journal 19(11), p. 5-9. McIlroy, M. D. (1968): Mass Produced Software Components. In: Naur, P., Randell, B. (Ed.), Software Engineering; Report on a Conference by the Nato Science Committee. NATO Scientific Affairs Division, Brussels, Belgium, p. 138-150. Meeker, H. J. (2008): The Open Source Alternative. John Wiley & Sons, Hoboken, NJ. Mehrwald, H. (1999): Das 'Not Invented Here'-Syndrom in Forschung und Entwicklung. Deutscher Universitäts-Verlag, Wiesbaden. Mellarkod, V., Appan, R., Jones, D. R., Sherif, K. (2007): A Multi-Level Analysis of Factors Affecting Software Developers' Intention to Reuse Software Assets: An Empirical Investigation. Information & Management 44(7), p. 613-625. Menon, T., Pfeffer, J. (2003): Valuing Internal vs. External Knowledge: Explaining the Preference for Outsiders. Management Science 49(4), p. 497-513. Mertzel, N. J. 
(2008): Copying 0.03 Percent of Software Code Base Not "De Minimis". Journal of Intellectual Property Law & Practice 9(3), p. 547-548. Michailova, S., Husted, K. (2003): Knowledge-Sharing Hostility in Russian Firms. California Management Review 45(3), p. 59-77. Michlmayr, M., Hill, B. M. (2003): Quality and the Reliance on Individuals in Free Software Projects. 2nd Workshop on Open Source Software Engineering. Orlando, FL. Miles, M. B., Huberman, A. M. (1994): Qualitative Data Analysis. SAGE Publications, Thousand Oaks, CA. Mili, A., Mili, R., Mittermeir, R. T. (1998): A Survey of Software Reuse Libraries. Annals of Software Engineering 5(1), p. 349-414.
Mili, A., Yacoub, S., Addy, E., Mili, H. (1999): Toward an Engineering Discipline of Software Reuse. IEEE Software 16(5), p. 22-31. Mili, H., Mili, F., Mili, A. (1995): Reusing Software: Issues and Research Directions. IEEE Transactions on Software Engineering 21(6), p. 528-562. Miller, G. A. (1956): The Magical Number Seven Plus or Minus Two: Some Limits on Our Capacity for Processing Information. Psychological Review 63(2), p. 81-97. Mocciaro Li Destri, A., Dagnino, G. B. (2005): The Development of the Resource-Based Firm between Value Appropriation and Value Creation. Advances in Strategic Management 22, p. 153-188. Mockus, A. (2007): Large-Scale Code Reuse in Open Source Software. 1st International Workshop on Emerging Trends in FLOSS Research and Development, Minneapolis, MN. Mockus, A., Fielding, R. T., Herbsleb, J. D. (2005): Two Case Studies of Open Source Software Development: Apache and Mozilla. In: Feller, J., Fitzgerald, B., Hissam, S. A., Lakhani, K. R. (Ed.), Perspectives on Free and Open Source Software. MIT Press, Cambridge, MA, p. 163-209. Montgomery, C. A., Wernerfelt, B. (1988): Diversification, Ricardian Rents, and Tobin's Q. The RAND Journal of Economics 19(4), p. 623-632. Moore, G. C., Benbasat, I. (1991): Development of an Instrument to Measure the Perceptions of Adopting an Information Technology Innovation. Information Systems Research 2(3), p. 192-222. Moores, T. T., Chang, J. C.-J. (2006): Ethical Decision Making in Software Piracy: Initial Development and Test of a Four-Component Model. MIS Quarterly 30(1), p. 167-180. Morad, S., Kuflik, T. (2005): Conventional and Open Source Software Reuse at Orbotech - an Industrial Experience. IEEE International Conference on Software - Science, Technology and Engineering, Herzelia, Israel, p. 110-117. Moran, P., Ghoshal, S. (1999): Markets, Firms, and the Process of Economic Development. Academy of Management Review 24(3), p. 390-412. Morisio, M., Ezran, M., Tully, C. 
(2002): Success and Failure Factors in Software Reuse. IEEE Transactions on Software Engineering 28(4), p. 340-357. Morisio, M., Tully, C., Ezran, M. (2000): Diversity in Reuse Processes. IEEE Software 17(4), p. 56-63. Moskin, J., Wettan, H. (2009): The Little License That Could – Dangers of Using Open Source Code after Jacobsen v. Katzer. The Intellectual Property Strategist 15(7). Murray, G. F. (2009): Categorization of Open Source Licenses: More Than Just Semantics. Computer & Internet Lawyer 26(1), p. 1-11. Nahapiet, J., Ghoshal, S. (1998): Social Capital, Intellectual Capital, and the Organizational Advantage. Academy of Management Review 23(2), p. 242-266. Naur, P., Randell, B. (1968): Software Engineering; Report on a Conference by the Nato Science Committee. NATO Scientific Affairs Division, Brussels, Belgium.
Nederhof, A. (1985): Methods of Coping with Social Desirability Bias: A Review. European Journal of Social Psychology 15(3), p. 263-280. Nelson, R. R. (2006): Reflections of David Teece's "Profiting from Technological Innovation...". Research Policy 35(8), p. 1107-1109. Net Applications (2010): Top Browser Share Trend. Retrieved 02.02.2010, from http://marketshare.hitslink.com/browser-market-share.aspx?qprid=1. Netcraft (2010): January 2010 Web Server Survey. Retrieved 02.02.2010, from http://news.netcraft.com/archives/web_server_survey.html. Neter, J., Kutner, M. H., Wasserman, W., Nachtsheim, C. J. (1996): Applied Linear Regression Models. Irwin, Homewood, IL, 3rd Edition. Nonaka, I. (1994): A Dynamic Theory of Organizational Knowledge Creation. Organization Science 5(1), p. 14-37. Norris, J. S. (2004): Mission-Critical Development with Open Source Software: Lessons Learned. IEEE Software 21(1), p. 42-49. Nunnally, J. C. (1978): Psychometric Theory. McGraw-Hill, New York, NY. O'Dell, C., Grayson, C. J. (1998): If Only We Knew What We Know: Identification and Transfer of Internal Best Practices. California Management Review 40(3), p. 154-174. O'Fallon, M. J., Butterfield, K. D. (2005): A Review of the Empirical Ethical Decision-Making Literature. Journal of Business Ethics 59(4), p. 375-413. O'Mahony, S. (2003): Guarding the Commons: How Community Managed Software Projects Protect Their Work. Research Policy 32(7), p. 1179-1198. Ofek, E., Sarvary, M. (2001): Leveraging the Customer Base: Creating Competitive Advantage through Knowledge Management. Management Science 47(11), p. 1441-1456. Olson, G. (2008): Open Source Software Intellectual Property Management. Retrieved 13.01.2009, from https://fossbazaar.org/filemanager/active?fid=45. Open Source Initiative (2009a): History of the OSI. 
Retrieved 05.10.2009, from http://www.opensource.org/history. Open Source Initiative (2009b): The License Review Process. Retrieved 02.10.2009, from http://www.opensource.org/approval. Open Source Initiative (2009c): The Open Source Definition. Retrieved 02.10.2009, from http://www.opensource.org/docs/osd. Open Source Initiative (2010): Licenses by Name. Retrieved 13.04.2010, from http://www.opensource.org/licenses/alphabetical. Oreg, S., Nov, O. (2008): Exploring Motivations for Contributing to Open Source Initiatives: The Roles of Contribution Context and Personal Values. Computers in Human Behavior 24(5), p. 2055–2073.
Pace, C. R. (1939): Factors Influencing Questionnaire Returns from Former University Students. Journal of Applied Psychology 23(June), p. 388-397. Palamida (2005): Best Practices of Securing Your Software Intellectual Property. Retrieved 08.12.2009, from http://factpoint.com/IPManagementBestPractices.pdf. Palamida (2008): Weathering the Economic Crisis in Engineering. Retrieved 08.12.2009, from http://www.palamida.com/themes/resources/Palamida-WhitePaper-WeatheringTheCrisis.pdf. Paradice, D. B. (1990): Ethical Attitudes of Entry-Level MIS Personnel. Information & Management 18(3), p. 143-151. Paté-Cornell, M. E. (1990): Organizational Aspects of Engineering System Safety: The Case of Offshore Platforms. Science 30, p. 1210-1217. Peace, A. G., Galletta, D. F., Thong, J. Y. L. (2003): Software Piracy in the Workplace: A Model and Empirical Test. Journal of Management Information Systems 20(1), p. 153-177. Penrose, E. T. (1959): The Theory of the Growth of the Firm. Oxford University Press, Oxford. Perens, B. (1999): The Open Source Definition. In: DiBona, C., Ockman, S., Stone, M. (Ed.), Open Sources: Voices of the Open Source Revolution. O'Reilly & Associates, Sebastopol, CA, p. 171-189. Perrow, C. (1997): Organizing for Environmental Destruction. Organization & Environment 10(1), p. 66-72. Perugini, M., Bagozzi, R. P. (2001): The Role of Desires and Anticipated Emotions in Goal-Directed Behavior: A Model of Goal-Directed Behavior. British Journal of Social Psychology 40(1), p. 79-98. Peteraf, M. A. (1993): The Cornerstones of Competitive Advantage: A Resource-Based View. Strategic Management Journal 14(3), p. 179-191. Peteraf, M. A. (1994): The Two Schools of Thought in Resource-Based Theory. In: Shrivastava, P., Huff, A. S., Dutton, J. (Ed.), Advances in Strategic Management. JAI Press, Greenwich, CT. Peteraf, M. A., Barney, J. B. (2003): Unravelling the Resource-Based Tangle. Managerial and Decision Economics 24(4), p. 309-323. Pierce, M. A., Henry, J. W. 
(2000): Judgements About Computer Ethics: Do Individual, Co-Worker, and Company Judgements Differ? Do Company Codes Make a Difference? Journal of Business Ethics 28(4), p. 307-322. Pisano, G. (2006): Profiting from Innovation and the Intellectual Property Revolution. Research Policy 35(8), p. 1122-1130. Pisano, G. P., Teece, D. J. (2007): How to Capture Value from Innovation: Shaping Intellectual Property and Industry Architecture. California Management Review 50(1), p. 278-296. Pitelis, C. (2008): Value Capture from Organisational Advantages and Sustainable Value Creation (06/2008). Working Paper. Retrieved 12.05.2009, from http://www.jbs.cam.ac.uk/research/working_papers/2008/wp0806.pdf.
Pliskin, N., Balaila, I., Kenigshtein, I. (1991): The Knowledge Contribution of Engineers to Software Development: A Case Study. IEEE Transactions on Engineering Management 38(4), p. 344-348.
Podsakoff, P. M., MacKenzie, S. B., Lee, J., Podsakoff, N. P. (2003): Common Method Biases in Behavioral Research: A Critical Review of the Literature and Recommended Remedies. Journal of Applied Psychology 88(5), p. 879-903.
Porter, M. (1985): Competitive Advantage: Creating and Sustaining Superior Performance. Free Press, New York, NY.
Poulin, J. S. (1995): Populating Software Repositories: Incentives and Domain-Specific Software. Journal of Systems and Software 30(3), p. 187-199.
Priem, R. L. (2001): "The" Business-Level RBV: A Great Wall or Berlin Wall? Academy of Management Review 26(4), p. 499-501.
Priem, R. L. (2007): A Consumer Perspective on Value Creation. Academy of Management Review 32(1), p. 219-235.
Priem, R. L., Butler, J. E. (2001a): Is the Resource-Based "View" A Useful Perspective for Strategic Management Research? Academy of Management Review 26(1), p. 22-40.
Priem, R. L., Butler, J. E. (2001b): Tautology in the Resource-Based View and the Implications of Externally Determined Resource Value. Academy of Management Review 26(1), p. 57-66.
Protecode (2009): Developer IP Assistant User Guide. Retrieved 21.12.2009, from http://update.protecode.com/userguide/DA/Protecode_UG_DA_3.2.pdf.
Randall, D., Fernandes, M. F. (1991): The Social Desirability Response Bias in Ethics Research. Journal of Business Ethics 10(11), p. 805-817.
Randall, D., Gibson, A. (1990): Methodology in Business Ethics Research: A Review and Critical Assessment. Journal of Business Ethics 9(6), p. 457-471.
Randall, D., Gibson, A. (1991): Ethical Decision Making in the Medical Profession: An Application of the Theory of Planned Behavior. Journal of Business Ethics 10(2), p. 111-122.
Ravichandran, T., Rothenberger, M. A. (2003): Software Reuse Strategies and Component Markets. Communications of the ACM 46(8), p. 109-114.
Raymond, E. S. (1999a): A Brief History of Hackerdom. In: DiBona, C., Ockman, S., Stone, M. (Ed.), Open Sources: Voices from the Open Source Revolution. O'Reilly & Associates, Sebastopol, CA, p. 19-29.
Raymond, E. S. (1999b): The Revenge of the Hackers. In: DiBona, C., Ockman, S., Stone, M. (Ed.), Open Sources: Voices from the Open Source Revolution. O'Reilly & Associates, Sebastopol, CA, p. 207-219.
Raymond, E. S. (2001): The Cathedral and the Bazaar. O'Reilly & Associates, Sebastopol, CA, 2nd Edition.
Reed, R., DeFillippi, R. J. (1990): Causal Ambiguity, Barriers to Imitation, and Sustainable Competitive Advantage. Academy of Management Review 15(1), p. 88-102.
Rest, J. R. (1986): Moral Development: Advances in Research and Theory. Praeger, New York, NY.
Riemenschneider, C. K., Hardgrave, B. C. (2001): Explaining Software Development Tool Use with the Technology Acceptance Model. Journal of Computer Information Systems 41(4), p. 1-8.
Riemenschneider, C. K., Hardgrave, B. C., Davis, F. D. (2002): Explaining Software Developer Acceptance of Methodologies: A Comparison of Five Theoretical Models. IEEE Transactions on Software Engineering 28(12), p. 1135-1145.
Rigby, D., Zook, C. (2002): Open-Market Innovation. Harvard Business Review 80(10), p. 80-89.
Rine, D. C., Sonneman, R. M. (1998): Investments in Reusable Software: A Study of Software Reuse Investment Success Factors. Journal of Systems and Software 41(1), p. 17-32.
Ringle, C. M., Wende, S., Will, S. (2005): SmartPLS 2.0 (M3). Retrieved 18.01.2010, from http://www.smartpls.de.
Roberts, J. A., Hann, I., Slaughter, S. A. (2006): Understanding the Motivations, Participation, and Performance of Open Source Software Developers: A Longitudinal Study of the Apache Projects. Management Science 52(7), p. 984-999.
Robles, G., Gonzalez-Barahona, J. M., Michlmayr, M., Amor, J. J. (2006): Mining Large Software Compilations over Time: Another Perspective on Software Evolution. MSR 2006 International Workshop on Mining Software Repositories, Shanghai, China.
Rosen, L. (2004): Open Source Licensing: Software Freedom and Intellectual Property Law. Prentice-Hall, Englewood Cliffs, NJ.
Rothenberger, M. A., Dooley, K. J., Kulkarni, U. R., Nada, N. (2003): Strategies for Software Reuse: A Principal Component Analysis of Reuse Practices. IEEE Transactions on Software Engineering 29(9), p. 825-837.
Ruffin, C., Ebert, C. (2004): Using Open Source Software in Product Development: A Primer. IEEE Software 21(1), p. 82-86.
Rumelt, R. P. (1984): Towards a Strategy Theory of the Firm. In: Lamb, B. (Ed.), Competitive Strategic Management. Prentice-Hall, Englewood Cliffs, NJ.
Rumelt, R. P. (1987): Theory, Strategy, and Entrepreneurship. In: Teece, D. J. (Ed.), The Competitive Challenge. Ballinger, Cambridge, MA, p. 137-158.
Ryan, R. M., Deci, E. L. (2000): Intrinsic and Extrinsic Motivations: Classic Definitions and New Directions. Contemporary Educational Psychology 25(1), p. 54-67.
Sambamurthy, V., Subramani, M. (2005): Special Issue on Information Technologies and Knowledge Management. MIS Quarterly 29(1), p. 1-7.
Samuelson, P. (1990): Benson Revisited: The Case against Patent Protection for Algorithms and Other Computer Program-Related Inventions. Emory Law Journal 39(4), p. 1025-1154.
Sarvary, M. (1999): Knowledge Management and Competition in the Consulting Industry. California Management Review 41(2), p. 95-107.
Savage, L. J. (1972): The Foundations of Statistics. Dover Publications, New York, NY.
Scacchi, W. (2002): Understanding the Requirements for Developing Open Source Software Systems. IEE Proceedings Software 149(1), p. 24-39.
Scacchi, W. (2004): Free and Open Source Development Practices in the Game Community. IEEE Software 21(1), p. 59-66.
Schneider, B. (1975): Organizational Climate: An Essay. Personnel Psychology 28(4), p. 447-479.
Schnell, R., Hill, P. B., Esser, E. (2005): Methoden der Empirischen Sozialforschung. Oldenbourg-Verlag, München, 5th Edition.
Schoemaker, P. J. H. (1982): The Expected Utility Model: Its Variants, Purposes, Evidence and Limitations. Journal of Economic Literature 20(2), p. 529-563.
Schumpeter, J. A. (1942): Capitalism, Socialism, and Democracy. Harper & Row, New York, NY.
Scozzi, B., Crowston, K., Eseryel, U. Y., Li, Q. (2008): Shared Mental Models among Open Source Software Developers. 41st Annual Hawaii International Conference on System Sciences, Waikoloa, HI.
SecuritySpace (2008): Mail (MX) Server Survey. Retrieved 02.02.2010, from http://www.securityspace.com/s_survey/data/man.200801/mxsurvey.html.
Sen, A. (1997): The Role of Opportunism in the Software Design Reuse Process. IEEE Transactions on Software Engineering 23(7), p. 418-436.
Sen, R., Subramaniam, C., Nelson, M. L. (2008): Determinants of the Choice of Open Source Software License. Journal of Management Information Systems 25(3), p. 207-239.
Senyard, A., Michlmayr, M. (2004): How to Have a Successful Free Software Project. 11th Asia-Pacific Software Engineering Conference, Busan, South Korea.
Shah, S. (2006): Motivation, Governance, and the Viability of Hybrid Forms in Open Source Software Development. Management Science 52(7), p. 1000-1014.
Sherif, K., Appan, R., Lin, Z. (2006): Resources and Incentives for the Adoption of Systematic Software Reuse. International Journal of Information Management 26(1), p. 70-80.
Sherif, K., Vinze, A. (2003): Barriers to Adoption of Software Reuse: A Qualitative Study. Information & Management 41(2), p. 159-175.
Sirmon, D. G., Hitt, M. A., Ireland, R. D. (2007): Managing Firm Resources in Dynamic Environments to Create Value: Looking inside the Black Box. Academy of Management Review 32(1), p. 273-292.
Sojer, M., Alexy, O., Henkel, J. (2010): Ethical Considerations in Internet Code Reuse: A Model and Empirical Test. Working Paper.
Sojer, M., Henkel, J. (2010a): Code Reuse in Open Source Software Development: Quantitative Evidence, Drivers, and Impediments (09.03.2010). Working Paper. Retrieved 09.03.2010, from http://ssrn.com/paper=1489789.
Sojer, M., Henkel, J. (2010b): License Risks from Ad-Hoc Reuse of Code from the Internet: An Empirical Investigation (22.04.2010). Working Paper. Retrieved 22.04.2010, from http://ssrn.com/paper=1594641.
Soloway, E., Ehrlich, K., Bonar, J., Greenspan, J. (1982): What Do Novices Know About Programming? In: Shneiderman, B., Badre, A. (Ed.), Directions in Human-Computer Interaction. Ablex, Norwood, NJ.
SourceForge.net (2009): About. Retrieved 13.04.2010, from http://sourceforge.net/about.
SourceForge.net (2010): Sourceforge.Net: Software Search. Retrieved 12.04.2010, from http://sourceforge.net/softwaremap/?.
Spaeth, S., Stuermer, M., Haefliger, S., Von Krogh, G. (2007): Sampling in Open Source Software Development: The Case for Using the Debian GNU/Linux Distribution. 40th Annual Hawaii International Conference on System Sciences, Waikoloa, HI.
Spence, J. T., Robbins, A. S. (1992): Workaholism: Definition, Measurement, and Preliminary Results. Journal of Personality Assessment 58(1), p. 160-178.
Spender, J. C. (1996): Making Knowledge the Basis of a Dynamic Theory of the Firm. Strategic Management Journal 17(Winter), p. 45-62.
Spinellis, D., Szyperski, C. (2004): How Is Open Source Affecting Software Development? IEEE Software 21(1), p. 28-33.
St. Laurent, A. M. (2004): Understanding Open Source and Free Software Licensing. O'Reilly Media, Sebastopol, CA.
Stabell, C. B., Fjeldstad, Ø. D. (1998): Configuring Value for Competitive Advantage: On Chains, Shops, and Networks. Strategic Management Journal 19(5), p. 413-437.
Stallman, R. (1999): The GNU Operating System and the Free Software Movement. In: DiBona, C., Ockman, S., Stone, M. (Ed.), Open Sources: Voices from the Open Source Revolution. O'Reilly & Associates, Sebastopol, CA, p. 53-70.
Stewart, K. J., Gosain, S. (2006): The Impact of Ideology on Effectiveness in Open Source Software Teams. MIS Quarterly 30(2), p. 291-314.
Strahan, R., Gerbasi, K. C. (1972): Short, Homogeneous Versions of the Marlowe-Crowne Social Desirability Scale. Journal of Clinical Psychology 28(2), p. 191-193.
Straub, D. W. (1990): Effective IS Security: An Empirical Study. Information Systems Research 1(3), p. 255-276.
Strod, E. (2009): Hybrid Software Development: Producing Results. Retrieved 16.01.2009, from http://openwebdeveloper.sys-con.com/node/792921.
Stroustrup, B. (1996): Language-Technical Aspects of Reuse. 4th International Conference on Software Reuse, Orlando, FL.
Subramanyam, R., Xia, M. (2008): Free/Libre Open Source Software Development in Developing and Developed Countries: A Conceptual Framework with an Exploratory Study. Decision Support Systems 46(1), p. 173-186.
Szulanski, G. (1996): Exploring Internal Stickiness: Impediments to the Transfer of Best Practice within the Firm. Strategic Management Journal 17(Winter Special Issue), p. 27-43.
Szulanski, G. (2000): The Process of Knowledge Transfer: A Diachronic Analysis of Stickiness. Organizational Behavior and Human Decision Processes 82(1), p. 9-27.
Tajfel, H., Turner, J. (1986): Social Identity Theory of Intergroup Behavior. In: Worchel, S., Austin, W. G. (Ed.), Psychology of Intergroup Relations. Nelson-Hall, Chicago, IL.
Takeishi, A. (2002): Knowledge Partitioning in the Inter-Firm Division of Labor: The Case of Automotive Product Development. Organization Science 13(3), p. 321-338.
Teece, D. J. (1986): Profiting from Technological Innovation: Implications for Integration, Collaboration, Licensing and Public Policy. Research Policy 15(6), p. 285-305.
Teece, D. J. (2006): Reflections on "Profiting from Innovation". Research Policy 35(8), p. 1131-1146.
Teece, D. J., Pisano, G., Shuen, A. (1997): Dynamic Capabilities and Strategic Management. Strategic Management Journal 18(7), p. 509-533.
Tetlock, P. E. (1985): Accountability: The Neglected Social Context of Judgement and Choice. In: Cummings, L. L., Staw, B. M. (Ed.), Research in Organizational Behavior. JAI Press, Greenwich, CT.
Thompson, E. R., Phua, F. T. T. (2005): Reliability among Senior Managers of the Marlowe-Crowne Short-Form Social Desirability Scale. Journal of Business and Psychology 19(4), p. 541-554.
Thong, J. Y. L., Yap, C.-S. (1998): Testing an Ethical Decision-Making Theory: The Case of Softlifting. Journal of Management Information Systems 15(1), p. 213-237.
Thornton, D., Gunningham, N. A., Kagan, R. A. (2005): General Deterrence and Corporate Environmental Behavior. Law & Policy 27(2), p. 262-288.
Tittle, C. R. (1980): Sanctions and Social Deviance: The Question of Deterrence. Praeger, New York, NY.
Tracz, W. (1995): Confessions of a Used Program Salesman: Institutionalizing Software Reuse. Addison-Wesley, Reading, MA.
Trevino, L. (1986): Ethical Decision Making in Organizations: A Person-Situation Interactionist Model. Academy of Management Review 11(3), p. 601-617.
Trevino, L., Butterfield, K. D., McCabe, D. L. (1998): The Ethical Context in Organizations: Influences on Employee Attitudes and Behaviors. Business Ethics Quarterly 8(3), p. 447-477.
Vallerand, R. J., Deshaies, P., Cuerrier, J., Pelletier, L. G., Mongeau, C. (1992): Ajzen and Fishbein's Theory of Reasoned Action as Applied to Moral Behavior: A Confirmatory Analysis. Journal of Personality and Social Psychology 62(1), p. 98-109.
VanSandt, C. V., Shepard, J. M., Zappe, S. M. (2006): An Examination of the Relationship between Ethical Work Climate and Moral Awareness. Journal of Business Ethics 68(4), p. 409-432.
Ven, K., Mannaert, H. (2008): Challenges and Strategies in the Use of Open Source Software by Independent Software Vendors. Information and Software Technology 50(9-10), p. 991-1002.
Venkatesh, V., Morris, M. G., Davis, G. B., Davis, F. D. (2003): User Acceptance of Information Technology: Toward a Unified View. MIS Quarterly 27(3), p. 425-478.
Victor, B., Cullen, J. B. (1987): A Theory and Measure of Ethical Climate in Organizations. In: Frederick, W. C., Preston, L. E. (Ed.), Research in Corporate Social Performance and Policy. JAI Press, Greenwich, CT, p. 51-71.
Victor, B., Cullen, J. B. (1988): The Organizational Bases of Ethical Work Climates. Administrative Science Quarterly 33(1), p. 101-125.
Vilares, M. J., Almeida, M. H., Coelho, P. S. (2009): Comparison of Likelihood and PLS Estimators for Structural Equation Modeling: A Simulation with Customer Satisfaction Data. In: Esposito Vinzi, V., Chin, W. W., Henseler, J., Wang, H. (Ed.), Handbook of Partial Least Squares: Concepts, Methods, and Applications. Springer, Berlin (in print).
Vixie, P. (1999): Software Engineering. In: DiBona, C., Ockman, S., Stone, M. (Ed.), Open Sources: Voices from the Open Source Revolution. O'Reilly & Associates, Sebastopol, CA, p. 91-101.
VMware (2008): Quarterly Report (June 30, 2008) - Form 10-Q (08.08.2008). SEC Filing. Retrieved 14.12.2009, from http://ccbn.10kwizard.com/xml/download.php?repo=tenk&ipage=5820653&format=PDF.
von Hippel, E., Von Krogh, G. (2003): Open Source Software and the 'Private Collective' Innovation Model: Issues for Organization Science. Organization Science 14(2), p. 209-223.
von Krogh, G., Spaeth, S., Haefliger, S. (2005): Knowledge Reuse in Open Source Software: An Exploratory Study of 15 Open Source Projects. 38th Annual Hawaii International Conference on System Sciences, Big Island, HI.
von Krogh, G., Spaeth, S., Haefliger, S., Wallin, M. (2008): Open Source Software: What We Know (and Do Not Know) About Motives to Contribute (April 2008). Working Paper. Retrieved 03.09.2009, from http://www.dime-eu.org/files/active/0/WP38_vonKroghSpaethHaefligerWallin_IPROSS.pdf.
von Krogh, G., Spaeth, S., Lakhani, K. R. (2003): Community, Joining, and Specialization in Open Source Software Innovation: A Case Study. Research Policy 32(7), p. 1217-1241.
von Krogh, G., Von Hippel, E. (2006): The Promise of Research on Open Source Software. Management Science 52(7), p. 975-983.
Walton, R. E. (1975): The Diffusion of New Work Structures: Explaining Why Success Didn't Take. Organizational Dynamics 3(3), p. 3-21.
Waters (2006): Taming the Open-Source Monster. Retrieved 03.03.2009, from http://www.watersonline.com/public/showPage.html?page=331247.
Watson, S., Hewett, K. (2006): A Multi-Theoretical Model of Knowledge Transfer in Organizations: Determinants of Knowledge Contribution and Knowledge Reuse. Journal of Management Studies 43(2), p. 141-173.
Weber, S. (2004): The Success of Open Source. Harvard University Press, Cambridge, MA.
Weiss, A. (2005): The Open Source WRT54G Story. Retrieved 09.11.2009, from http://www.wi-fiplanet.com/tutorials/article.php/3562391.
Wenger, E. (1998): Communities of Practice: Learning, Meaning and Identity. Cambridge University Press, Cambridge, UK.
Wernerfelt, B. (1984): A Resource-Based View of the Firm. Strategic Management Journal 5(2), p. 171-180.
Werts, C. E., Linn, R. L., Jöreskog, K. G. (1974): Intraclass Reliability Estimates: Testing Structural Assumptions. Educational and Psychological Measurement 34(1), p. 25-33.
West, J. (2003): How Open Is Open Enough? Melding Proprietary and Open Source Platform Strategies. Research Policy 32(7), p. 1259-1285.
West, J., Gallagher, S. (2006): Challenges of Open Innovation: The Paradox of Firm Investments in Open-Source Software. R&D Management 36(3), p. 319-331.
Wierenga, B., Oude Ophuis, P. A. M. (1997): Marketing Decision Support Systems: Adoption, Use, and Satisfaction. International Journal of Research in Marketing 14(3), p. 275-290.
Winter, S. J., Stylianou, A. C., Giacalone, R. A. (2004): Individual Differences in the Acceptability of Unethical Information Technology Practices: The Case of Machiavellianism and Ethical Ideology. Journal of Business Ethics 54(3), p. 275-296.
Wold, H. (1985): Partial Least Squares. In: Kotz, S., Johnson, N. L. (Ed.), Encyclopedia of Statistical Sciences. Wiley, New York, NY, p. 581-591.
Wold, H. (1989): Introduction to Second Generation Multivariate Analysis. In: Wold, H. (Ed.), Theoretical Empiricism. Paragon House, New York, NY, p. VII-XI.
Woolley, D. J., Eining, M. M. (2006): Software Piracy among Accounting Students: A Longitudinal Comparison of Changes and Sensitivity. Journal of Information Systems 20(1), p. 49-63.
Wu, C.-G., Gerlach, J. H., Young, C. E. (2007): An Empirical Analysis of Open Source Software Developers' Motivations and Continuance Intentions. Information & Management 44(3), p. 253-262.
Wyld, D. C., Jones, C. A. (1997): The Importance of Context: The Ethical Work Climate Construct and Models of Ethical Decision Making – an Agenda for Research. Journal of Business Ethics 16(4), p. 465-472.
Ye, Y., Fischer, G. (2005): Reuse-Conducive Development Environments. Automated Software Engineering 12(2), p. 199-235.
Zander, U., Kogut, B. (1995): Knowledge and the Speed of the Transfer and Imitation of Organizational Capability: An Empirical Test. Organization Science 6(1), p. 76-92.
Zeitlyn, D. (2003): Gift Economies in the Development of Open Source Software: Anthropological Reflections. Research Policy 32(7), p. 1287-1291.