An Introduction To Open Source Software Development Steffen Evers Berlin, 13th August 2000
Copyright © 2000 by Steffen Evers
This is a revision of my diploma thesis with the original title “Development Environments For Open Source Software”. It was supervised by Prof. Dr. Bernd Mahr and Dipl.-Inf. Magnus Niemann and handed in on June 22nd, 2000 at the
Technische Universität Berlin Fachbereich Informatik Fachgebiet Formale Modelle, Logik und Programmierung (FLP) Franklinstr. 28-29 D - 10587 Berlin.
Any kind of feedback is welcome. Please send it to: Steffen Evers
Contents

1 Introduction
2 The Open Source Phenomenon
  2.1 Historical Overview
    2.1.1 Unix
    2.1.2 The GNU Project
  2.2 Intellectual Property
    2.2.1 Copyright
    2.2.2 Trade Secret
    2.2.3 Patents
    2.2.4 Contract
  2.3 Fundamentals
    2.3.1 Free Software
    2.3.2 Open Source Software
    2.3.3 Open Source Software Licenses
  2.4 Theories about Open Source Software
  2.5 Approaching the Open Source Community
    2.5.1 Economy
    2.5.2 Software Engineering
    2.5.3 Power, Trust and Observation
  2.6 Open Source Projects
    2.6.1 Developers
    2.6.2 Typical History of a Project Start
    2.6.3 Examples
3 Organization of Open Source Projects
  3.1 Resources
    3.1.1 Unrestricted Information and Software
    3.1.2 Restricted Software
    3.1.3 Restricted Information
    3.1.4 Individual Resources
    3.1.5 Computer Equipment
    3.1.6 Human Actors
    3.1.7 Financial Resources
    3.1.8 Service and Infrastructure
    3.1.9 Summary
  3.2 Coordination
    3.2.1 Shared Resources
    3.2.2 Producer/Consumer Relationships
    3.2.3 Simultaneity Constraints
    3.2.4 Task/Subtask Dependencies
    3.2.5 Summary
  3.3 Identified Structures
    3.3.1 Activities
    3.3.2 Basic Roles
    3.3.3 Objects and Data Organization
    3.3.4 Project Relations
    3.3.5 Procedures
4 Basic Concepts of Open Source Software
  4.1 Computer Systems
    4.1.1 Major Components of a Computer System
    4.1.2 Task Processing
    4.1.3 Usage
    4.1.4 Dependencies and the Complexity of Software Systems
  4.2 Improvement of Open Source Software Components
    4.2.1 Major Processes
    4.2.2 Balance between Delivery and Payback
  4.3 A Simple Framework for Open Source Software
    4.3.1 Objects
    4.3.2 Roles
    4.3.3 Relations
  4.4 Software Production
5 Technical Support for Open Source Projects
  5.1 Weak Points of Former Support Systems
    5.1.1 Language
    5.1.2 Mailing Lists
    5.1.3 Multiple Work
    5.1.4 Pending Tasks
    5.1.5 No Written Conventions
    5.1.6 Release Versions
  5.2 Basic Support Tasks
    5.2.1 Protection of Stored Data and Performed Actions
    5.2.2 Observation of Actions
    5.2.3 Version Management of Stored Data
    5.2.4 Online-Offline Management
    5.2.5 Distribution of New Information
    5.2.6 Communication between Participants
    5.2.7 Issue Administration
    5.2.8 Automation of Frequent Processes
  5.3 Technical Support for Distributed Development
    5.3.1 Groupware: Tools for Traditional Collaboration
    5.3.2 Present Means of Communication
    5.3.3 Support for Open Source Software
6 Conclusion
Acronyms
Bibliography
German Abstract
1 Introduction
Open source software is defined by its attached license, which waives essential rights that copyright law grants to the original creator. This gives anyone the opportunity to redistribute and modify any open source software they receive. The aim here is to create a model that can serve as a basis for technical support of open source software. This thesis intends to provide a first approach to this undertaking by identifying the structures of the parties and processes involved.

My starting point was the observation that the technical environment for the usage and development of open source software is insufficient. To help improve the current situation, I first tried to identify suitable software tools. After investigating discussion forums, Groupware and other communication and collaboration technologies, it became clear that well-founded results could only be achieved with a detailed prior analysis of the underlying structures of open source software.

The study of existing models turned out to be indispensable in order to gain a theoretical background for a successful identification of these abstract structures. ODP¹, TINA², FEST³, ISO 9000⁴ and proprietary software production have been the most important ones. Although many common features were found, the direct applicability of any of these models could not be proven based on the available information about open source software. It was therefore necessary to acquire basic knowledge about the open source phenomenon and then to validate the collected information. For this purpose, philosophy of science, structuralism, global models, sociology and various other theories about Darwinian evolution, chaos, coordination and human action were investigated. All this material helped to establish a solid base for examining the following four aspects of open source software:

1. Social Background: First of all, I investigated whether the open source movement as a whole is actually based on common structures or whether it is a chaotic collection of many different persons, parties and entities that can hardly be considered organized. Additionally, special features of open source software are presented, covering history, law, economy, software engineering and others. Furthermore, open source projects are introduced as the basic organizational unit.
2. Project Organization: The work of open source projects is investigated and the major resulting organizational structures are illustrated.
3. Involved Processes: Firstly, computer systems are presented as the technical context of (open source) software. Secondly, various identified processes of the development, deployment and usage of open source software are illustrated.
4. Technical Support: Currently used and emerging software tools and support services are investigated, covering experienced problems, major tasks and some examples of special Internet services.

¹ ODP stands for ’Open Distributed Processing’. See [ODP P1], [ODP P2], [ODP P3] and [ODP P4] for details about ODP.
² TINA stands for ’Telecommunications Information Networking Architecture’. See [TINA95] and [TINA97] for details.
³ FEST stands for ’Framework for European Services in Telemedicine’. See [FEST95] for details.
⁴ ISO 9000 is a collection of international standards. See [ISO9000-1], [ISO9000-3], [ISO9004-1] and [ISO9004-2] for details.
2 The Open Source Phenomenon
This chapter presents special features of open source in the field of history, law, economy, software engineering and others. Furthermore open source projects are introduced as the basic organizational unit.
2.1 Historical Overview

When people talk about open source software, they normally refer to the operating system GNU/Linux and its applications. This system has a long history that goes back to the creation of Unix (1969) and beyond. Therefore the historical context of open source software is important to understand the present situation.

2.1.1 Unix
The following quotations, unless stated otherwise, are taken from the book “A Quarter Century of Unix” by Peter H. Salus [Salus94], which gives an excellent picture of the history of Unix.

In 1965, the Massachusetts Institute of Technology (MIT), General Electric Company (GE) and Bell Telephone Labs (BTL) started the MULTICS (Multiplexed Information and Computing Service) project, which had the objective of developing a new interactive, multiuser operating system [Goodheart94]. In 1969, BTL withdrew its resources from the MULTICS project as success seemed to be getting out of reach. “[T]he problem was the increasing obviousness of the failure of Multics to deliver promptly any sort of usable system” [Ritchie79]. Although the project seemed to be a failure, the researchers learned “lots of fundamental things” from it:

1. “a tree structure file system
2. a separate, identifiable program to do command interpretation[:] the ’shell’ [...]
3. [...] the structure of files [...]
4. [the nature of] text files [...]
5. the semantics of I/O operations [...]”

Some of the BTL researchers who had participated in the abandoned project did not want to give up all of the comfortable computing environment that MULTICS had promised. “[They] didn’t want to lose the pleasant niche [they] occupied, because no similar ones were available” [Ritchie79]. So,
they began trying to find an alternative in several ways. First, they proposed the purchase of a new medium-scale computer for which they promised to write the operating system; this proposal was finally denied. Second, they developed the basic design for a file system. Third, K. Thompson started to write some programs for a computer that was available at that time, a GE645. When it became clear that the machine would be removed from BTL within the following months, he stopped that work [Ritchie79].

At that time, K. Thompson had developed a game called ‘Space Travel’, which was originally written for MULTICS and then ported to GECOS, the operating system used on the GE645. The program did not run very well on this machine and was also expensive to run, because a single game cost 75 US$ in CPU time. For this reason he wanted to get it running on another machine. It did not take him long to find a little-used PDP-7 that was suitable for this project. So D. Ritchie and K. Thompson rewrote the game on the PDP-7 and thereby got an introduction to preparing programs for the machine. Soon after, Thompson implemented the already designed file system and continued with all the other requirements for a working operating system [Ritchie79].

By promising to develop a text processing tool for the system, they got a new PDP-11 computer in 1970 but had to share it with others. “With several BTL staff members from outside the research group using the typesetting facilities of the PDP-11, the need to document the operating system grew. The result was the first Unix Programmer’s Manual by Thompson and Ritchie, which was dated November 3, 1971.” [Salus94b]. The manual gave the first complete release its name: “First Edition”. It introduced most of the fundamental ideas of the operating system that we call Unix. Many commands like ’mv’, ’su’ or ’find’ were invented at that time, nearly thirty years ago, and are still used today in modified versions.
“Inside certain parts of Bell Telephone Laboratories, Unix was a success” between 1971 and 1973. “[T]he number of people spending an appreciable amount of time writing Unix software [had] increased.” The main improvements were the invention of the C programming language, the porting of the whole operating system to C and the invention of pipes¹. One result of these efforts was the ’Third Edition’ that was released in February 1973. Since its kernel was written in C, Unix was one of the first operating systems not written in a hardware-dependent assembly language. This was an important step towards portability, which made it interesting for people with different hardware. Another attraction was the set of powerful tools that emerged out of the “Unix Philosophy”, which McIlroy, the inventor of pipes, formulated as follows:

1. “Write programs that do one thing and do it well.
2. Write programs to work together.
3. Write programs that handle text streams, because that is a universal interface.”

¹ A pipe is the connection between the standard output of one program and the standard input of another. This method allows the combination of many commands in one line without having to transfer the data between them manually. McIlroy, one of the developers of Unix, got this idea by thinking about connecting different data streams [Salus94].

“The [third edition] had been installed on 16 sites (all within AT&T/Western Electric)” in 1973, and the development work had been going on for four years at that time. However, it was not well known outside AT&T. When Thompson presented a paper developed by the research group at the “Symposium on Operating Systems Principles” in Yorktown Heights, NY, in October 1973, Unix became popular in the field of computer science. “Within six months of [the paper’s] delivery, the number of installations
had trebled.” They also published a revised version in the July 1974 issue of the Communications of the ACM, which “caused an explosion in demand for the fledgling operating system” [Salus94b][Ritchie78].

In 1949, the Antitrust Division of the US Department of Justice had accused AT&T of violating the ’Sherman Antitrust Act’. This suit was settled by a ’consent decree’ in 1956; however, it was not the final decision. Roughly speaking, the decree prohibited AT&T from entering any business other than telephone or telegraph services. This had a very special effect on the development of Unix. After people outside the company had started to get interested in the operating system, AT&T had to avoid any conflict with the decree. Its policy was to license the software (which the decree allowed) but not to pursue software as a business. So, Unix was provided “[a]s is, no support, payment in advance!” or, in the words of Andy Tannenbaum at a USENIX/Uniforum meeting:

1. “no advertising
2. no support
3. no bug fixes
4. payment in advance”

Without support and bug fixes, the growing community of Unix users was forced to help itself. So, they started “to share with one another. They shared ideas, information, programs, bug fixes, and hardware fixes.” User groups were created wherever Unix was introduced. Among them were universities in many countries besides the USA, such as Australia, the United Kingdom, Germany and Japan. “At ten years of age, Unix was genuinely being used worldwide.” It was a paradoxical situation. The researchers at BTL had invented a great operating system, but the company’s management could neither legally enter this business nor let the copyrights go.
On the other hand, there was a growing community of users that wanted to use the system and “the spirit that bonded Unix users together in the 1970s and which continues today had its roots in an ’us-against-them’ attitude combined with a sense of humor.” In the following years users and developers of Unix cooperated very closely:

“Something was created at BTL. It was distributed in source form. A user in the UK created something from it. Another user in California improved on both the original and the UK version. It was distributed to the community at cost [mainly for the distribution effort]. The improved version was incorporated into the next BTL release. There was no way the AT&T’s patent-and-licensing office could control this, and the system just got better and more widely used all the time.” [Salus94b]

The “Seventh Edition” of the system, dated January 1979, was the first portable Unix and already ran on computers produced by DEC, IBM and Interdata. “Portability was born” [Salus94b]. Unfortunately, the new license “prohibited the source code from being studied in courses” [Salus94b], and many universities simply dropped the study of Unix. Additionally, the development became more and more divergent from this point on. The Berkeley Software Distribution V3 (3BSD) emerged out of this situation. It was a complete, improved software system based on a tool set named 2BSD and a port for the DEC VAX workstation derived from the Seventh Edition of Unix, and it has been constantly developed until today in some form.
Over the years, many Unix derivatives followed. Some of them were based on original AT&T versions, others on the BSD line, but all of them required costly licenses from AT&T. At the beginning of the 1990s, other systems such as BSDI, 386/BSD and NetBSD appeared that did not require these licenses. All this was accompanied by long-lasting legal battles, and users were confused and annoyed by the dozens of incompatible Unix versions and an uncertain future. Today, most of these systems no longer have a significant market share, and for a while it looked as if the days of Unix and similar systems were numbered, but the situation is changing at the time of this writing and the future is looking a little brighter for these systems. However, the history of Unix teaches us that many circumstances influence the rise and fall of software systems.
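The pipe mechanism and the “Unix Philosophy” described in this section can be illustrated with a minimal sketch. The following Python program connects the standard output of one process to the standard input of another, so that no data has to be transferred between them manually. The two small child programs and all names in the code are illustrative assumptions and not historical Unix tools; Python's standard subprocess module stands in here for the '|' operator of a Unix shell.

```python
import subprocess
import sys

# Hypothetical producer: a child process that writes three unsorted lines
# to its standard output (it stands in for any text-producing command).
PRODUCER_SRC = "print('pear'); print('apple'); print('fig')"

# Hypothetical consumer: reads text lines from standard input and writes
# them back in sorted order (it stands in for a tool like 'sort').
CONSUMER_SRC = "import sys; sys.stdout.writelines(sorted(sys.stdin))"


def run_pipeline():
    """Run 'producer | consumer' and return the consumer's output as text."""
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER_SRC],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", CONSUMER_SRC],
        stdin=producer.stdout,  # the pipe: producer's stdout feeds consumer's stdin
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close the parent's copy of the pipe's read end so only the consumer holds it.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


if __name__ == "__main__":
    print(run_pipeline(), end="")
```

Each child program does one thing and exchanges plain text lines, so either side could be replaced by another program without changing the other. This is the kind of combination that the three rules quoted above aim for.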
2.1.2 The GNU Project
The following quotations are taken from [DiBona99, The GNU Operating System and the Free Software Movement].

When Richard Stallman started his job at the MIT Artificial Intelligence Lab in 1971, he joined a “software-sharing community that had existed for many years”: any time you stumbled over an interesting program, you asked the creator for the source code and read it, changed it or used parts of it to write a new program. At that time, the lab used the Digital PDP-10 series as its computer system. Unfortunately, the series was discontinued in the early 1980s. All the programs that had been created became unusable, as they were written in machine-dependent assembly language, and the modern computers of that era had their own proprietary operating systems. “[Y]ou had to sign a nondisclosure agreement even to get an executable copy”². A cooperative community was not possible anymore.

² See section 2.2.4 on page 19 for details about nondisclosure agreements.

According to Stallman, the rule behind proprietary software was “If you share with your neighbor, you are a pirate. If you want any changes, beg us to make them.” This situation forced Stallman to make “a stark moral choice” between the following three options:

1. He could “join the proprietary software world, signing nondisclosure agreements and promising not to help [his] fellow [programmer]”. This would mean spending his life “building walls to divide people”.
2. He could “leave the computer field” to avoid the misuse of his skills, but they would be wasted, too, and someone else would ‘build the walls’.
3. He could look for a possibility to be a programmer and work on the establishment of a new cooperative community.

He chose the last option and decided that the crucial component was an operating system, as you cannot use a computer without one. Fortunately, he had already been an “operating system developer” and therefore was the right person to do this job. Stallman describes the spirit behind this decision with the words of Hillel:
“If I am not for myself, who will be for me? If I am only for myself, what am I? If not now, when?”

He chose to make his new system compatible with Unix, which was owned and controlled by AT&T at that time. Therefore he called his new project “GNU”, which stands for the recursive acronym “GNU is Not Unix”, and coined the term “free software”³ in opposition to proprietary software. In order to speed up the project, he decided to adapt existing components of free software wherever it was possible, e.g. TeX as text formatter and X as window system.

In January 1984 he gave up his job at MIT and focused entirely on the GNU project. To his delight, other people soon started to help him with his first step, GNU Emacs. It became clear that simply producing software for the public domain⁴ would not serve the primary goal of the project to “give users freedom”, as the programs could be slightly modified (e.g. by porting them to a specific machine) and turned into proprietary software. Therefore they wrote the ’GNU General Public License’ (GPL). This license is based on a method that is called “Copyleft”⁵. It uses copyright law to keep software free.

In 1985, Stallman and other people engaged in the GNU project decided to found a tax-exempt charity called “Free Software Foundation” (FSF) to handle the business side of the project, such as donations, selling copies of free software or offering other related services.

Although the original intention was to complete the system first and then release it as a whole, people started using single finished components on the various compatible Unix systems. This process had the advantage of improving the software and extending the user community, but “probably delayed completion of a minimal working system by several years”. Another problem was the project’s choice to base Hurd, the ’heart’ of the system (the kernel), on Mach⁶, because the developers had to wait for the Mach technology to be finished and their own part of the implementation turned out to be much more difficult than expected as well. So “[b]y 1990, the GNU system was almost complete; the only major missing component was the kernel.” And today, ten years later, Hurd still has not been finished.

Fortunately, Linus Torvalds, then an unknown Finnish university student, started to develop his own Unix-compatible kernel “Linux”⁷ in 1991. It used a different approach to the problem that was considered outdated at that time⁸ but was able to deliver usable results much faster. An important aspect was that Torvalds provided Linux under copyleft and invited anyone to help him develop and improve the kernel. The developer community grew quickly and advances were made incredibly fast with the help of this extraordinary development model and the infrastructure it used, the Internet.

³ See section 2.3.1 on page 20 for details about the term free software.
⁴ Public domain material is not protected by intellectual property laws. Anyone can do what they want with the corresponding material.
⁵ See section 2.3 on page 19 and [Copyleft] for details about Copyleft and the GPL.
⁶ Mach is a micro kernel developed at Carnegie Mellon University and then at the University of Utah.
⁷ More details about the history of Linux can be found in [Prasad99, A Brief History of Linux], [Thies99], [LinuxInt, Linux History] and [DiBona99, The Linux Edge].
⁸ Linus Torvalds’ kernel Linux is a so-called ’monolithic kernel’. The idea of micro kernels had already been developed and was considered to be superior then, but history taught us a different lesson. See [DiBona99, Appendix A - The Tanenbaum-Torvalds Debate] for details on this issue.
“Around 1992, combining Linux with the not-quite-complete GNU system resulted in a complete free operating system.” Heavily improved and extended versions of the Linux kernel⁹ and the GNU software tools have been released since then, millions of people have joined the GNU/Linux community and today it is the fastest growing operating system.

⁹ 1994: Linux 1.0, 1996: Linux 2.0, 1999: Linux 2.2

2.2 Intellectual Property

When we think of property, we normally think of things like land, money, equipment, cars and other physical entities, but there is also another kind: intellectual property. This legal term is defined as “intangible property that is the result of creativity, such as patents, copyrights, etc.” [Oxford98, intellectual property] and grants individuals or groups certain control over valuable information. This is explained by a special feature of information: it can be reproduced at low cost, whereas its original creation usually consumes a significant amount of resources of some kind. This is not only an issue in the software field, but also in many kinds of creative work like writing, movies, music and many others, as they all show similar qualities. Therefore most societies have laws that protect the investment of the original creator.

The cost of copying a video tape is relatively small compared with the several million dollars spent on the production of the actual movie. However, it still takes some resources. You need the equipment and the storage media for the transfer, the cost of which normally is significant compared with the regular price of a legal copy. Besides, the duplication process normally has some undesired side effects like loss of quality. This is not the case with digital information. The duplicate normally cannot even be distinguished from the original afterwards and the cost is often negligible. For this reason, intellectual property laws have an important effect on the value of software and thereby on its development.

There are no uniform international laws about intellectual property, but most countries have treaties with each other in order to provide a minimum protection for their citizens. Still, an accurate description of the legal framework would have to concentrate on single countries and their laws instead of examining the global situation. However, as I only want to illustrate the legal background of creative work, I just give a rough picture. There are four legal instruments to protect intellectual property:

Patents: “a government grant of the exclusive right to make, use, or sell an invention, usually for a limited period. Patents are granted to new and useful machines, manufactured products, and industrial processes and to significant improvements of existing ones. Patents are also granted to new chemical compounds, foods, and medicinal products, as well as to the processes for producing them. Patents can even be granted to new plant or animal forms developed through genetic engineering.” [Britannica, patents]

Copyright: “the exclusive, legally secured right to publish, reproduce, and sell the matter and form of a literary, musical, dramatic, or artistic work.” [Britannica, copyright]

Trade Secret: “information, including a formula, pattern, compilation, program, device, method, technique, or process that derives independent economic value from not being generally known and not being readily ascertainable and is subject to reasonable efforts to maintain secrecy.” (Uniform Trade Secrets Act)

Trademarks: “any visible sign or device used by a business enterprise to identify its goods and distinguish them from those made or carried by others. Trademarks may be words or groups of words, letters, numerals, devices, names, the shape or other presentation of products or their packages, color combinations with signs, combinations of colors, and combinations of any of the enumerated signs.” [Britannica, trademark]

Software itself cannot be protected by trademarks. In the recent past several patents have been granted in the field of information technology, but software is normally governed by copyright and trade secret laws. In the case of proprietary software, the source code is usually considered a trade secret and the binary code is protected by special copyright laws for computer programs, such as the Software Act of 1980 in the USA [Breslow86][Burk94].

2.2.1 Copyright
The best example of the use of copyright laws is books, which is where the law originally comes from. When you buy a book you buy the actual physical item (the paper) and not its contents, but with the purchase of the information medium you obtain the legal right to use the contents. The text printed in the book is protected by copyright laws, which make it illegal to produce duplicates. Although things like software and music differ from text in various aspects, these laws work for them as well. Therefore buying a CD-ROM with software is legally not very different from the action described above. The following is a short summary of the laws’ contents.

Copyright laws give the creator five exclusive rights over his work:

Reproduction Right: The right to duplicate the work in fixed form.
Modification Right: The right to modify the work to create something new. The result is called ’derivative work’.
Distribution Right: The right to distribute copies of the work to the public (e.g. sale or rental).
Public Performance Right: The right to play, dance, act, or show the work at a public place or to transmit it to the public.
Public Display Right: The right to show a copy of the work at a public place or to transmit it to the public.

These rights have certain limitations:

Idea: The idea that is expressed by the creative work is not protected. Copyright laws only cover the specific expression, not the ideas themselves. For example, the recipes in a cookbook can be used without permission, but copying the text is prohibited.
Facts: Analogous to ideas, the facts of copyrighted material are not protected either.
Independent Creation: If an exact duplicate of the work is produced independently by someone else, it is not considered a copy and thereby does not violate copyright laws.
Fair Use: The ’fair use’ of creative work is not a violation of copyright laws even when it includes some duplication. Although the term is not defined precisely, it covers things like news reporting, research and criticism (for details see [FairUse99]).

International protection is guaranteed by several treaties. The two most important ones are the Berne Convention (1886) and the Universal Copyright Convention (1952). Both agreements automatically grant foreign authors the same copyrights in the participating countries as local citizens. Today, most countries are members of at least one of these conventions [Brinson96, Copyright Law] [Britannica, copyright].
2.2.2 Trade Secret
Another possibility to protect intellectual property is trade secrecy. It mainly works by keeping the creative work confidential. When the information becomes publicly available without a criminal action (e.g. by independent discovery) the trade secret protection is lost. As many nations do not even have laws of trade secrecy or are not enforcing them resolutely there is also little protection in the case of criminal action. Therefore the actual protection of creative work is normally achieved by making sure that unauthorized persons cannot access it [Burk94]. An important aspect of trade secrets is reverse engineering as it is legal to take products apart and investigate them in order to obtain confidential information about the item and its creation [Burk94].
2.2.3 Patents
Patents are granted by governments in return for the disclosure of an invention. Unlike copyright law, patent law therefore protects ideas, not specific expressions of them. This difference is important as you do not even have to know about a patent to commit an infringement10. So the use of a similar, independently achieved invention is still a violation of patent law. For this reason, patents are a powerful instrument for protecting your ideas. However, patents become a threat when they are misused, especially in the area of software development, as some developments simply depend on certain ideas. To give an example: suppose someone were granted a patent on basic arithmetical operations like + and -. What should the man at the checkout counter do when I give him a hundred dollar bill? He would not be allowed to calculate the change without permission from the patent holder. Therefore there are requirements and regulations for patents. In order to be granted a patent, the invention must normally be new, useful and non-obvious. An application must be filed, followed by a complex process to decide whether the patent should be granted. Once the patent is granted the owner has the right to exclude others from making, using or selling his idea for a certain period of time in the specific country. The duration is between sixteen and twenty years in most countries. There are a lot of different jurisdictions (more than 100) and no international law. A patent has to be filed separately in each country although there are some treaties to simplify this process [Brinson96, Patent Law][Britannica, patent].
10 derived from infringe (verb): “actively break the terms of (a law, agreement, etc.)” [Oxford98, infringe]
2.2.4 Contract
Contracts11, as enforceable promises, are a very important instrument in our economic system. As in other business areas, they are also used in software development and marketing. There are two major categories that are commonly used:
License
Producers of creative work who are not satisfied with the legal regulation of copyright laws ask you to accept different conditions before they give you a copy of their work. This is done by a contract normally called a ’license’12. In most cases, its purpose is to place more restrictions on the use of the creative work than the law does. Therefore licenses are usually an advantage for the copyright holder and a disadvantage for the person who wants to use the material compared to the regular legal conditions. As we will see later, this is different with open source licenses.
(Non-)Disclosure Agreement
The subject of non-disclosure agreements13 (NDAs) is trade secrets. As described above, the protection for confidential information is mainly secrecy. Therefore a company has to make sure that its partners keep its information secret when it shares it with them for some reason. Roughly speaking, an NDA is the promise to keep the provided information secret. Although this seems very simple at first glance, it is not at all, and the contracts are sometimes very complex, as in many cases the usage of the information requires the publication of some parts of it [Hoffmann00, Disclosure Agreements]. NDAs are usually not acceptable for open source projects because they want to publish their results together with the source code. Since most information that was used to produce the software is somehow visible in the source code, its publication would usually be a violation of such an agreement.
11 contract (noun): “in the simplest definition, a promise enforceable by law. The promise may be to do something or to refrain from doing something. The making of a contract requires the mutual assent of two or more persons, one of them ordinarily making an offer and another accepting. If one of the parties fails to keep the promise, the other is entitled to legal recourse.” [Britannica, contract]
12 license (noun): “a permit from an authority to own or use something, do a particular thing, or carry on a trade” [Oxford98, licence]
13 Non-disclosure agreements and disclosure agreements are actually the same thing although the terms seem to be contradictory. This is explained by the two perspectives that can be taken as the information is disclosed and promised to be not disclosed to someone else.
2.3 Fundamentals
This section gives a definition of open source software and related terms.
2.3.1 Free Software
This term was coined by Richard M. Stallman. He demands that the user is granted the following four kinds of freedom in order to call it ’free software’ [FreeSoftware]:
1. “The freedom to run the program, for any purpose.”
2. “The freedom to study how the program works, and adapt it to your needs.”
3. “The freedom to redistribute copies so you can help your neighbor.”
4. “The freedom to improve the program, and release your improvements to the public, so that the whole community benefits.”
Freedoms 2 and 4 require the availability of the entire source code as a precondition. It is important to understand that the term ’free’ is about freedom and not about price. The definition does not exclude charging money for the distribution of your software [GNUProject, Selling Free Software], but the paying customer must still have the described freedom, otherwise it would not be right to call it free software. The term is quite close to the following term ’open source software’, but they are not identical14.
2.3.2 Open Source Software
The term ’open source software’ is often considered to be equivalent to ’access to the source code’, but this is not true. The Open Source Initiative has clearly defined the criteria that the distribution terms of software have to comply with in order to be called ’Open Source’ [OpenSource, The Open Source Definition (V1.7)]:
Free Redistribution Anyone who received the software legally can share all of it with anyone he likes without additional payments.
Source Code The source code of the software must be distributed as well or be available at reasonable reproduction cost.
Derived Works The modification of the software and the distribution of this derived work must be allowed.
Integrity of the Author’s Source Code The distribution of modified source code must be allowed, although restrictions that make it possible to distinguish the original source code from the derived work are tolerated, e.g. a requirement of different names.
No Discrimination Against Persons or Groups “The license must not discriminate against any person or group of persons.”
No Discrimination Against Fields of Endeavor The license must not forbid the usage of the software in a specific field of endeavor, e.g. business or genetic research.
Distribution of License “The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.”
License Must Not Be Specific to a Product The rights given by the license must not be different for the original distribution and any other one even when it takes place in a totally different context.
License Must Not Contaminate Other Software The license must not impose any condition on the software distributed along with the licensed software, e.g. ’distribution only with other open source software’ is not allowed.
14 See [GNUProject, Why "Free Software" is better than "Open Source"] for details about the differences between the two terms.
2.3.3 Open Source Software Licenses
There are many different software licenses in use. Various companies have created their own special license with sophisticated features representing their individual business model. Therefore it is very helpful to know which of them actually qualify as open source or free software. Both the GNU project and the Open Source Initiative give some orientation in the license jungle (see [GNUProject, Various Licenses and Comments about Them] and [OpenSource, The Approved Licenses] for details). The following licenses are certified by the Open Source Initiative to conform to the Open Source Definition [OpenSource, The Approved Licenses]: GNU General Public License (GPL), GNU Library or ‘Lesser’ Public License (LGPL), BSD license, MIT license, Artistic license, Mozilla Public License (MPL), Q Public License (QPL), IBM Public License, MITRE Collaborative Virtual Workspace License (CVW License), Ricoh Source Code Public License, Python license, zlib/libpng license.
In order to illustrate the general concept behind open source licenses I will give some more detailed information about major implementations:
GNU General Public License – GPL15: “The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software–to make sure the software is free for all its users.” [GNUProject, GPL - Preamble] The GPL is the most important open source license as most open source software is distributed under its terms. A major reason for its popularity is its ’virus effect’ which requires anything linked16 with the corresponding software to be distributed as free software as well. The motivation of this strategy is to prevent people from exploiting free software resources without paying (back) to the community by providing their achievements as free software, too. This method is also called copyleft17. E.g. the members of the Linux kernel project use this license for their developments.
GNU Lesser General Public License – LGPL18: This license is more or less the same as the GPL except for one important feature: it permits linking with non-free modules. It was originally designed for standard libraries to speed up the adoption of free software since such licensed libraries provide an opportunity for proprietary software to run in a free software system. For this reason it is sometimes also called the ’GNU Library General Public License’.
MIT License – X Consortium license19: The MIT license does not really restrict the software or its handling. The only condition is to include the copyright and permission notice in all copies. The Berkeley license (BSD license) is similar to this one. For instance, the XFree8620 project uses the MIT license for its work.
Q Public License – QPL21: This is an open source license written by Trolltech AS22 which prohibits development of proprietary software based on the software licensed under the QPL. Anyone can make modifications and redistribute them in the form of patches23 along with the original source code, as modifications must be distinct from the original. Generated binaries are allowed to have the same name as the original, which is important for dynamic libraries and similar components. Additionally, it forces the author of a modification to grant the original producer the right to distribute the changes under any other license as well, e.g. a proprietary one. The business model behind this license is interesting. Trolltech AS has two different licenses for the same software: an open source license (QPL, Qt Free Edition) and a proprietary one (Qt Professional Edition). Therefore they operate in both fields with the same library, but under different legal conditions.
15 See http://www.gnu.org/copyleft/gpl.html for details about the GNU General Public License.
16 ’Linking’ describes a special process taking place while generating an executable program out of source code (compiling). Roughly speaking, anything that is used explicitly and directly to get the program running needs to be linked with a piece of software.
17 Any software distributed under the terms of the GPL is protected by ’copyleft’. The term was introduced by Richard Stallman in contrast to the term ’copyright’. See [Copyleft] for details.
18 See http://www.gnu.org/copyleft/lesser.html for details about the GNU Lesser General Public License.
License Compatibility
Software systems consist of many components produced by different persons. As these modules are often distributed under different licenses, it is sometimes impossible to legally use them in combination. In this case the two licenses are called incompatible. This is a real problem which is becoming worse at the moment, as many new companies start to participate in the open source movement and create their own license instead of using an existing one. The major problem is the combination of the most popular license, the GPL, with material that has other special restrictions like the QPL. Any material distributed under the unmodified GPL cannot be linked legally with any material licensed under the QPL because of its incompatibility [GNUProject, Various Licenses and Comments about Them]. Since these conflicts only serve the opponents of open source software, such incompatibilities should be avoided as far as possible. Therefore it is normally much better for all participating parties to use an existing license than to create another one. The GPL and LGPL provide the best protection against misuse and are compatible with most open source software already released, as they are the most common licenses.
19 See http://www.opensource.org/licenses/mit-license.html for a copy of the MIT license.
20 See http://www.xfree86.org/ for details about the XFree86 project.
21 See http://www.trolltech.com/products/download/freelicense/license.html for details about the Q Public License.
22 The company Trolltech AS is the producer of the Qt library which can be used to build graphical user interfaces. The famous K Desktop Environment (KDE) is based on this library.
23 Patches are files that describe differences between two files and can be used to reconstruct the second file having the first one and the patch file.
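The notion of a patch described in the footnote above (a record of the differences between two files from which the second file can be rebuilt out of the first) can be illustrated with a small sketch. It assumes a deliberately simplified patch format, a list of (start, end, replacement lines) entries, and not the format produced by the usual diff and patch tools; the function names make_patch and apply_patch are hypothetical and chosen only for this illustration.

```python
from difflib import SequenceMatcher


def make_patch(old, new):
    """Record only the regions where `new` differs from `old`.

    Each entry is (i1, i2, replacement): lines old[i1:i2] are to be
    replaced by `replacement`. This is a simplified, hypothetical format.
    """
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old, patch):
    """Rebuild the second file from the first file plus the patch."""
    result, pos = [], 0
    for i1, i2, replacement in patch:
        result.extend(old[pos:i1])   # unchanged lines before this change
        result.extend(replacement)   # lines taken from the new version
        pos = i2                     # skip the replaced or deleted old lines
    result.extend(old[pos:])         # unchanged tail of the old file
    return result


# Invented example content: an original file and a modified version.
original = ["int main() {", "  return 1;", "}"]
modified = ["int main() {", '  puts("hello");', "  return 0;", "}"]

patch = make_patch(original, modified)
assert apply_patch(original, patch) == modified
```

The patch contains only the changed lines, which is why a modification distributed this way stays clearly distinct from the original source code.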
                     Proprietary Software              Open Source Software
Model                cathedral                         bazaar
Resources            known                             unknown
Period of Planning   whole project                     step by step
User                 paying customer                   co-developer
Objective            fulfill contract/specification    solve problem
Enforcement          strong                            weak
Progress             private                           public
Collaboration        face to face                      via Internet
Quality Assurance    management                        competition, peer review
Table 2.1: Open Source vs. Proprietary Software Production
2.4 Theories about Open Source Software
I have tried to identify the so-called ’Open Source Model’, but could not find any existing, consistent model for my purpose. However, I found many theoretical approaches to this subject that turned out to be helpful. Table 2.1 on page 23 gives a short introduction to the open source method by comparing it to proprietary software production. Eric Raymond describes the open source community and their method of writing software in his book “The Cathedral & the Bazaar” [Raymond99]. The title is a simile: proprietary software production as the carefully planned building of a cathedral and open source software production as the (chaotic) interactions of the participants of an oriental bazaar. Although this analogy might be too extreme, it hints at a major difference between the two types of software creation: strong, powerful management on one side and loosely related developers and users organized in several thousand seemingly independent projects on the other. In the following I will quote the points of Raymond’s that I found most helpful for understanding the process of open source software development:
1. “Quality was maintained not by rigid standards or autocracy but by the naively simple strategy of releasing every week and getting feedback from hundreds of users within days, creating a sort of rapid Darwinian selection on the mutations introduced by developers.” [Raymond99, page 24]
2. “Linus Torvalds’ style of development [is:] release early and often, delegate everything you can, be open to the point of promiscuity” [Raymond99, page 30]
3. “Users are wonderful things to have, and not just because they demonstrate that you’re serving a need, that you’ve done something right. Properly cultivated, they can become co-developers.” [Raymond99, page 36]
4. “It is not only debugging that is parallelizable; development and (to a perhaps surprising extent) exploration of design space is, too. When your development mode is rapidly iterative, development and enhancement may become special cases of debugging–fixing ‘bugs of omission’ in the original capabilities or concept of the software.” [Raymond99, page 51]
5. “I don’t think it’s a coincidence that the gestation period of Linux coincided with the birth of the World Wide Web, and that Linux left its infancy during the same period in 1993-1994 that saw the takeoff of the [Internet service provider] industry and the explosion of mainstream interest in the Internet. Linus was the first person who learned how to play by the new rules that pervasive Internet made possible.” [Raymond99, page 63]
6. “The [Open Source] world behaves in many respects like a free market or an ecology, a collection of selfish agents attempting to maximize utility which in the process produces a self-correcting spontaneous order more elaborate and efficient than any amount of central planning could have achieved.” [Raymond99, page 64]
7. “[I]n a world of cheap PCs and fast Internet links, we find pretty consistently that the only really limiting resource is skilled attention. Open-source projects [...] die only when the developers themselves lose interest. That being the case, it’s doubly important that open-source [developers] organize themselves for maximum productivity by self-selection–and the social milieu selects ruthlessly for competence.” [Raymond99, page 71]
8. “A happy programmer is one who is neither underutilized nor weighed down with ill-formulated goals and stressful process friction. Enjoyment predicts efficiency.” [Raymond99, page 75]
9. “It may well turn out that one of the most important effects of open source’s success will be to teach us that play is the most economically efficient mode of creative work.” [Raymond99, page 75]
Many of Raymond’s theses are research topics of philosophy and other arts subjects. The idea expressed in point 9 is closely associated with the theory of the “homo ludens” (the playing human) in addition to the “homo faber” (the acting human) and the “homo sapiens” (the thinking human). Johan Huizinga introduced the important role of the playing human and the resulting “play element of culture” in his book “Homo Ludens” [Huizinga38].
Nikolai Bezroukov considers the presented bazaar model as “a too simplistic view of the open source software development process”. Instead he “tries to explore links between open source software development and academic research as a better paradigm [...]” and thinks it “should be better viewed as a special case of academic research.” [Bezroukov99, Abstract] (See also [Bezroukov99b].)
Considering the open source phenomenon as academic research leads to the wide field of philosophy of science. When searching this discipline for suitable material I found the master’s thesis of John S. Wilkins titled “Evolutionary models of scientific theory change”: “[S]ociocultural or conceptual evolution [like in science] is not merely analogous to biological evolution, but is exactly the same process acting on a different kind of entity and environment. [A]s in biological evolution, there need be no role for teleological or intentional explanation in order to account for the evolution of concepts and that there is arguably no more than local progress (and that only in terms of the local ecology of social and intellectual functions and opportunities)” [Wilkins95, Abstract]
All the investigated theoretical approaches helped to understand open source software development better and provided a theoretical base to start the work. Additionally, the combination of the “rapid Darwinian selection” mentioned in point 1, Bezroukov’s parallel to scientific research, Huizinga’s homo ludens and Wilkins’ theory about Darwinian theory change might lead to fascinating achievements in the theory of software development, scientific research or creative work in general. However, as this is not a subject of a thesis in computer science, I have to leave these questions for someone else to answer.
2.5 Approaching the Open Source Community
The following section illustrates several practical aspects of open source software and involved parties.
2.5.1 Economy
Many people claim that open source software does not fit in our economic system because of its distribution terms. Although making profits might be more difficult today with open source software than with proprietary software, recent times have shown that you can earn your living with open source software. A good example is the Free Software Foundation, which has existed for quite a while. Besides, there is no proof that the situation will remain the way it is now, as a lot of money has been invested in open source software and a certain ’hype’ can be observed these days, too. For a better understanding of the economic background I will present some ideas in this section on how to take direct or indirect economic advantage of open source software.
Total Cost of Ownership
In order to understand how selling prices for software come about we have to look at related costs as well. Before a business manager decides to introduce new software for his company he would like to know how much money this decision would cost in total. This amount is called ’total cost of ownership’ (TCO) and covers not only the selling price of the software, but any cost that is caused by this decision. In particular, when there are several alternatives to choose from, this sum is an important criterion for comparing the different options. I want to give some examples of what has to be considered to calculate the TCO:
System Preparation Several additional components are usually required in order to get the new software running, e.g. hardware devices, infrastructure or other software.
Operation Efficiency All the different phases24 of usage of a software component require costly additional resources like time of human actors or hardware devices. When the operation process has a low efficiency for some reason total costs rise tremendously. This could be caused by a non-intuitive or too complex user interface, standard incompatibility of used data formats or bugs in the software.
Failures Since software operates in an environment, failures could cause damage to entities that are not part of the regular process. For example a robot could smash a window or data could be erased, to mention only minor incidents.
Training As in the business area the users are paid employees with certain skills, introducing new software normally means spending extra money on improving the skills of your clerks, e.g. on books or courses.
Service Sometimes even the best training cannot transform a secretary into a computer expert in a fortnight and additional services are required for certain situations like a system crash.
Updates Although proprietary software is sold as a ’working product’ it normally contains a lot of bugs. Additionally, related software components or data formats change frequently in the proprietary software world. Therefore frequent software updates are required in order to have a (better) ’working product’. They are relevant for costs as you usually pay extra for them.
Purchase Finally, you have to pay the selling price to get permission to use a copy of a software product in the proprietary software world.
24 See section 4.1.3 on page 63 for more details about the use phases of software components.
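The calculation behind the categories above is a plain sum, which a short sketch can make concrete. All figures below are invented purely for illustration and say nothing about real products; the class name TcoEstimate and its field names are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TcoEstimate:
    """One cost figure per category, all in the same currency unit."""
    system_preparation: float
    operation_efficiency: float
    failures: float
    training: float
    service: float
    updates: float
    purchase: float

    def total(self) -> float:
        # TCO is the sum over every category, not just the selling price.
        return sum(getattr(self, f.name) for f in fields(self))


# Hypothetical alternatives; the numbers are assumptions, not measurements.
option_a = TcoEstimate(
    system_preparation=5000, operation_efficiency=12000, failures=1000,
    training=4000, service=3000, updates=2500, purchase=8000,
)
option_b = TcoEstimate(
    system_preparation=5000, operation_efficiency=12000, failures=1000,
    training=6000, service=5000, updates=0, purchase=0,
)

for name, option in (("A", option_a), ("B", option_b)):
    print(f"Option {name}: purchase {option.purchase}, total {option.total()}")
```

With these assumed numbers the purchase price makes up less than a quarter of option A's total, which is the point of the TCO view: a comparison based on the selling price alone can be misleading in either direction.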
Giving Away Software for Free
It is a common business strategy to recover the expenses for provided products through related goods instead. The consumer does not have to pay directly for the corresponding products and these companies still make profits. For instance, you do not have to pay for watching commercials, but the advertising company has to pay for producing and transmitting them. They are free for consumers, and companies do not stop producing and paying for them because they bring another advantage than direct profits, e.g. popularity of their products. In the information technology business proprietary software is still sold, but related costs are much more important for the mentioned TCO. Therefore a good company could easily give its software away as long as it makes enough profit with related services and products. Its business is then based on the knowledge it gained producing the software and its popularity as the original creator. Nobody has a better knowledge about software than its creator, at least at the beginning. As the software business changes very fast, this lead in knowledge gives a company an enormous advantage over its competitors. Therefore open source software production might even be profitable using similar business models as proprietary software producers. R. A. Ghosh goes even further and presents a new model for an Internet economy [Ghosh98]. However, the question is whether (open source) software development really is a business case. Considering it as “a special case of academic research” might be a more suitable approach [Bezroukov99] and gives another alternative to integrate open source software development in our society.
Making Money with Open Source Software
Nevertheless even software developers have to eat. Therefore I will give some examples of how present companies profit from the development of open source software:
Software Distributions Distributors simply sell copies of open source software. This business is based on the idea that the regular user of open source software is willing to pay a small amount for comfortable access to the software.
Service There are many different services in the software field as described in the section about TCO above. Examples are support, training or simply paid bug-fixing.
Hardware As hardware devices cannot be used without the appropriate software, vendors usually spend a remarkable amount of financial resources on the production of driver software. It is the usual procedure to make this software available for free, but without source code. However, more and more companies also start to participate in open source projects to assure the compatibility and support for their products.
Information Books, magazines and news services provide required information about open source software for a reasonable price, e.g. nicely printed manuals.
2.5.2 Software Engineering
Ian Sommerville describes software engineering as follows: “The specification, development, management and evolution of [...] software systems make up the discipline of software engineering” [Sommerville96, Preface] There are several thousand open source projects (OSPs), they are more or less independent and nobody can force them to use a certain method to develop their software. Therefore, some projects use sophisticated strategies of software engineering while others just start working without any planning at all. Although there is no general method of producing software in OSPs, their special open distribution policy raises several interesting questions. Since the software engineering aspects of open source could easily fill a long book, I will only address the prejudice that OSS lacks certain features in comparison to proprietary software.
Security
Many people claim that open source software is not secure. Normally, this opinion is based on the theory that the availability of source code makes software less secure in any case. I cannot follow this argumentation as nobody could provide me with any reasonable proof for this theory. Additionally, there are some issues that should be considered when discussing this question:
1. Software does not become more secure merely because the producing team is the only party who knows how it works. The source code could only reveal already integrated weaknesses of the algorithm which should be fixed anyway.
2. Trusting the creating party without appropriate observation of the result requires a large amount of trust in each single person and party that has access to the (secret) source code. Only one black sheep would destroy the protection provided by secrecy. Besides, you might not even know that you have already lost your protection.
3. The already mentioned parallels to academic research raise another interesting question: What about science? Does a scientist use formulas that are based on unavailable secret information that was not challenged by comprehensive peer review, e.g. for constructing a nuclear power station? I do not think so.
I want to finish this point with a quotation from Jim Livermore25 : “It may seem a paradox, but one of the vital elements of security is the absence of secrecy. By this I mean that open access to algorithms, and to the source code that implement those algorithms, is essential if anyone is to rely on the Internet to be a safe forum for business” [Livermore00] Reliability
I could not find any reasonable relation between provision of the source code and bad reliability of the corresponding software, but I found a study of the University of Wisconsin about the “Reliability of UNIX Utilities and Services”: “[T]he reliability of the freelydistributed GNU and Linux software was surprisingly good, and noticeably better than the commercially produced software.” [Miller00, Conclusion] This examination does not mean that OSS is more reliable in general, but proves that at least sometimes OSS is more reliable than its commercial competitors. Reusability
The evolutionary process made possible by the permissive licenses helps to shape reusable software components. It is therefore more reasonable to regard the distribution of the source code and the granted right to modify the software as factors that stimulate the reusability of software components.

Compatibility and Standards
Why should developers use existing standards? The members of OSPs tend to choose the most effective option they can find without much effort in advance. Standards suit this strategy very well: they embody a lot of theoretical work, so developers can concentrate on the actual task instead of spending most of their time on theoretical frameworks and on a foundation that provides consistency and compatibility with other parts of the related software system. Well-designed standards are shortcuts for open source developers, and there is no reason for them to protect their developments by making them incompatible with other software, as they do not want to sell their software anyway. Actually, they want as many people as possible to use their developments. Therefore standard compliance is a natural choice for OSPs.

25 Jim Livermore is the corporate president of Freemont Avenue Software, Inc. They sell firewalls (a security tool for the Internet) and related services. The software is distributed under the QPL. See http://www.opensourcefirewall.com for details.

The other question is: Who is willing to spend the resources required to create standards for open source software (OSS)? First of all, many suitable standards are not made for OSS in particular, but can be used for it as well. Secondly, many companies based on OSS have come into being during the last decade, and many traditional software companies have more or less joined the open source community, so a lot of financial resources are available today for such undertakings. Finally, I want to give some examples of concrete standardization efforts. Most of them are affiliated with each other rather than competing:

Free Standards Group – FSG26: “[They] are a non-profit corporation organized to accelerate the use
and acceptance of open source technologies through the application, development and promotion of standards.”

Linux Standard Base – LSB27: “The goal of the Linux Standard Base is to develop and promote a set of standards that will increase compatibility among Linux distributions and enable software applications to run on any compliant Linux system. In addition, the LSB will help coordinate efforts to recruit software vendors to port and write products for Linux.” The LSB is a member of the FSG. Current members of this project are Caldera Inc., Corel Corporation, Debian Project, delix Computer GmbH, Enhanced Software Technologies Inc., IBM, LinuxCare, Linux for PowerPC, Mandrake Soft, Metro Link Inc., Turbolinux Inc., Red Hat Software, Software in the Public Interest Inc., SuSE GmbH, VA Linux, WGS Inc., SGI.

Linux Internationalization Initiative – LI18NUX28: “Li18nux is a voluntary working group, consisting of Linux and Open Source related contributors who are working on Globalization, a combination of Internationalization and Localization. The organization was formed in August 1999. The ultimate goal of the organization is to achieve software/application portability and interoperability in the International context for Linux and other open source projects. Its activities are focused on the internationalization of a core set of APIs29 and components of Linux distributions to achieve a common Linux environment. This will allow an internationalized Linux application to be executed regardless of different flavor of distributions are used. The results of the working group will be open to everyone, and be proposed for adoption to the Free Standards Group”

X Desktop Group – Freedesktop.org30: “The X Desktop Group is a free software project to work on interoperability and shared technology among desktop environments for the X Window System. The most famous X desktops are GNOME and KDE.”

Filesystem Hierarchy Standard – FHS31: “FHS defines a common arrangement of the many files and directories in Unix-like systems (the filesystem hierarchy) that many different developers and groups have agreed to use. [...] The FHS specification is used by the implementors of Linux distributions and other Unix-like operating systems, application developers, and open-source writers. In addition, many system administrators and users have found it to be a useful resource. FHS [...] is currently implemented by most major Linux distributions, including Debian, Red Hat, Caldera, SuSE, and more.”

26 See http://www.freestandards.org/ for details about the Free Standards Group.
27 See http://www.linuxbase.org/ for details about the Linux Standard Base.
28 See http://www.li18nux.net/ for details about the Linux Internationalization Initiative.
29 API stands for ’Application Programming Interface’.
30 See http://www.freedesktop.org/ for details about the X Desktop Group.
31 See http://www.pathname.com/fhs/ for details about the Filesystem Hierarchy Standard.

Austin Common Standards Revision Group – CSRG32: “The Austin Common Standards Revision Group (CSRG) is a joint technical working group established to consider the matter of a common revision of ISO/IEC 9945-1, ISO/IEC 9945-2, IEEE Std 1003.1, IEEE Std 1003.2 and the appropriate parts of the Single UNIX Specification. The approach to specification development is ’write once, adopt everywhere’, with the deliverables being a set of specifications that will carry both the IEEE POSIX designation and The Open Group’s Technical Standard designation, and if adopted an ISO/IEC designation. The new set of specifications will form the core of the Single UNIX Specification Version 3, with delivery in [the second quarter] 2001”

Debian Policy Manual – Policies of the Debian Distribution33: “This manual describes the policy requirements for the Debian GNU/Linux distribution. This includes the structure and contents of the Debian archive, several design issues of the operating system, as well as technical requirements that each package must satisfy to be included in the distribution.”

Linux Based Open Standards for Embedded Development – EL/IX34: “The EL/IX pages are dedicated to the development and standardization of Linux for embedded devices.”
2.5.3 Power, Trust and Observation
Software is not just another new technology; it has a deep impact on our society. The virtual space that computer systems form is often called cyberspace35. Software is special because it controls everything that happens in cyberspace. This was true from the very beginning of software, but the virtual computer world used to be very limited. At first it consisted only of mathematics and calculations; then text entered cyberspace, and other things followed. Today, this development reaches a new dimension with the Internet. Many things that used to take place in the real world are now ’moving’ into cyberspace. People are trading, communicating, arguing, and voting via the Internet, and this seems to be only the beginning. Software contains the laws of nature of cyberspace that govern these activities. Therefore, giving someone the power to control software in the real world gives him omnipotent power in the virtual world of cyberspace; for example, it is easy to ’kill’ the virtual appearance of a user by closing his connection to the rest of the world.

32 See http://www.opengroup.org/austin/ for details about the Austin Common Standards Revision Group.
33 See http://www.debian.org/doc/debian-policy/ for details about the Debian Policy Manual.
34 See http://sourceware.cygnus.com/elix/ for details about the EL/IX standard effort.
35 cyberspace (noun): “the notional environment in which communication over computer networks occurs.” [Oxford98, cyberspace] I use this term in an extended sense, as I also include the virtual space on a local computer with or without any connection to other ones.
Observation
Since software controls a lot of valuable goods, it is essential that it works correctly. It is therefore important to observe the software in use as well as possible. A user who chooses proprietary software has to trust the producing company to do its very best to provide him with a high-quality product, because he cannot really check it: the source code is secret, and software is too complex to be observed by empirical testing alone. Here are some aspects of software that people might want to check:

1. whether the software does exactly what it should do
2. whether the software does only what it should do and nothing else that the user does not want
3. whether all appropriate precautions are included for the control of potentially dangerous processes
4. whether data and resources are sufficiently protected against undesired actions of ’hostile’ attackers

The source code of open source software is available by definition, and anyone can check whatever he wants or pay someone else to observe the software for him.

Independence
Whether we like it or not, software is becoming more and more important in our lives. This is true for the business world in particular. As long as there are many different compatible software systems to choose from, this is not a threat to our independence, but making software systems compatible has turned out to be very difficult. Therefore, even today, replacing installed software systems is difficult, and in the future it may be impossible without spending enormous financial resources. Choosing a proprietary software system is thus a far-reaching decision and means losing a large part of your sovereignty, as from then on you depend on the producing software company and its future products. It should be a really trustworthy company, as you may otherwise find yourself bankrupt because of technical difficulties. Although many members of the open source community claim otherwise, open source projects and companies do not automatically provide the better technology. However, because the software is open source, you can leave your service provider at any time and keep using the same software. Then you can look for someone else to do a better job. Iceland and other nations with too few inhabitants for a profitable software market have already experienced such problems when they were looking for native language support, even though they were willing to pay for the translation effort. Companies will not find it easier to satisfy their needs in the future. Many countries and companies do not like to depend that much on private companies and therefore look for alternatives. Open source software might turn out to be the best choice.
2.6 Open Source Projects
The open source movement is very hard to investigate as an abstract social phenomenon. It is difficult to decide who is part of it and who is not. Fortunately, the actual productive ’units’, open source projects, can be observed and analyzed because of their presence on the Internet and their public communication there. Although such an examination can only cover a small part of open source phenomena, it provides a good starting point for further investigations.
Definition
Any group of people developing software and providing their results to the public under an open source license constitutes an open source project (OSP).
2.6.1 Developers
The first question that arises is who actually participates in OSPs. The following section covers the major categories of participating parties.

Educational Institutions: Universities and other institutions produce a lot of software for educational and research purposes. Although some of it becomes proprietary software once it is finished, many developments are released under legal terms that conform to the open source definition.

Research Institutions: Research is often closely associated with educational or public institutions through its work, personnel or financing. Additionally, many projects are based on intensive collaboration between many different organizations. Releasing the research results under a permissive license is therefore often a natural choice because it allows every involved party to use them. Besides, such a license is sometimes also a condition for financial sponsorship.

Software Distributors: The producers of open source software distributions normally participate in some way in several open source projects. Their motivation is normally to increase their user base, as users often demand specific features, e.g. sound support or a word processor, and do not use distributions that cannot provide them.

Commercial Companies: Aside from distributors, any other business36 based on open source software might participate in one way or another in OSPs. Today, this includes most of the large information technology companies, e.g. IBM, Intel or Hewlett Packard.

Corporate Users: Considering the enormous financial resources that many companies and governmental administrations spend on their software systems (usually several million dollars for licenses alone), sponsoring open source projects is often much cheaper than paying the license fees of proprietary software.

Private Users: Anyone using open source software is interested in improving it, as this gives him a direct benefit. For this reason, many users participate in one way or another in open source projects. Since many of them also work in the IT business, their participation is often essential for a project.

Governments: Several governments in the world are concerned about the software business, as the future economy depends to a substantial degree on information technology; the present financial market, for example, is not conceivable without computers. They therefore prefer to have at least one alternative for their economy in case present strategies turn out to be a failure. Besides, they do not like their economy to depend on single (foreign) companies. A good example is the sponsoring of the GNU Privacy Guard (GPG) by the German Ministry of Economics37.

36 See also ’Making Money with Open Source Software’ in section 2.5.1 on page 25 for commercial companies.
2.6.2 Typical History of a Project Start
Although there is no standardized process or document on ’How to start an open source project’, most OSPs show many interesting parallels. This section describes a typical beginning of a project.

1. A person has a certain concern and thinks about good solutions for the matter.
2. He asks some friends and colleagues what they know about the issue. Some of them have similar problems, but no solution either.
3. All interested persons start to exchange their knowledge on the topic and thereby create a vague picture of the central issue of the group.
4. Interested people who are willing to spend some resources on finding a solution to the problem create an informal project, and the other persons leave the group. As a result, the central issue covers the concerns of all participating persons.
5. The project members work on the issue until they achieve some presentable results.
6. They make their work publicly available at a place where many people can access it. They may also announce their project in places like mailing lists, newsgroups or online news services.
7. Other persons recognize some of their own concerns in the project and are interested in a convenient solution, too. Therefore they review the project’s results (e.g. by using them). As they look at the matter from a different perspective, they sometimes have good suggestions for improvement, start communicating with the project and thereby join it.
8. The project grows, and a lot of feedback helps to get a better understanding of the problem and of possible strategies to solve it.
9. New information and resources are integrated into the research process.
10. The research cycle is closed and goes back to point 5.
11. The project’s community is established and will react to future changes in the same way as it emerged from society.

37 See http://www.gnupg.org for details.
2.6.3 Examples
Open source is not only theory: many OSPs have already started and have been developing software for many years. Since there are thousands of very different projects, it is impossible to present them all, but I have selected several large projects that differ in several respects in order to illustrate the current variety of open source development.

XFree86 – XFree86 Project, Inc: “XFree86 is a freely redistributable implementation of the X Window System that runs on UNIX(R) and UNIX-like operating systems (and OS/2). The XFree86 Project has traditionally focused on Intel x86-based platforms (which is where the ‘86’ in our name comes from), but our current release also supports other platforms. One of our current goals is to increase the range of platforms that XFree86 runs on.” [XFree86Org]

KDE – K Desktop Environment: “KDE is a powerful graphical desktop environment for Unix workstations. It combines ease of use, contemporary functionality and outstanding graphical design with the technological superiority of the Unix operating system. KDE is an Internet project and truly open in every sense. Development takes place on the Internet [...]. No single group, company or organization controls the KDE sources. [...] All KDE sources are [...] subject to the well known GNU licenses. [...] KDE has developed a high quality development framework for Unix, which allows for rapid and efficient application development. Applications developed with this framework include KOffice, a full-featured Office Suite, KDevelop, a C/C++ IDE (Integrated Development Environment), and many others.” [KDE] (See also [Dalheimer99].)

The Gimp – GNU Image Manipulation Program: “The GIMP [...] is a freely distributed piece of software suitable for such tasks as photo retouching, image composition and image authoring. [The GIMP home page] contains information about downloading, installing, using, and enhancing GIMP [and] serves as a distribution point for the latest releases, patches, plugins, and scripts. We also try to provide as much information about the GIMP community and related projects as possible.” [Gimp] (See also [Hackvän99].)

Apache – Apache HTTP Server: “The Apache Project is a collaborative software development effort aimed at creating a robust, commercial-grade, featureful, and freely-available source code implementation of an HTTP (Web) server. The project is jointly managed by a group of volunteers located around the world, using the Internet and the Web to communicate, plan, and develop the server and its related documentation. These volunteers are known as the Apache Group. In addition, hundreds of users have contributed ideas, code, and documentation to the project.” “In February of 1995, the most popular server software on the Web was the public domain HTTP daemon developed by [NCSA].” Development of that software had stalled, and a small group of webmasters gathered together for the purpose of coordinating their private changes. They put together a mailing list and shared information space for the core developers. “By the end of February, eight core contributors formed the foundation of the original Apache Group”. “[W]e added all of the published bug fixes and worthwhile enhancements we could find, tested the result on our own servers, and made the first official public release (0.6.2) of the Apache server in April 1995.” After a new design for the server architecture, extensive beta testing, many ports to several platforms, new documentation, and many additional features, Apache 1.0 was released on December 1, 1995. “Less than a year after the group was formed, the Apache server passed NCSA’s
[software] as the [number one] server on the Internet. The survey by Netcraft38 shows that Apache is today more widely used than all other web servers combined.” [Apache] (See also [Coar00].)

Linux – Linux Kernel: One of the most famous open source projects is the Linux kernel. Linus Torvalds started this project in 1991 and has been leading it ever since. The source code package passed the 70 megabyte mark a while ago and is still growing. “Linux is a clone of the operating system Unix, written from scratch by Linus Torvalds with assistance from a loosely-knit team of hackers across the Net. It aims towards POSIX compliance. It has all the features you would expect in a modern fully-fledged Unix, including true multitasking, virtual memory, shared libraries, demand loading, shared copy-on-write executables, proper memory management, and TCP/IP networking. Linux was first developed for x86-based PCs (386 or higher). These days it also runs on Compaq Alpha AXP, Sun SPARC, Motorola 68000 machines (like Atari ST and Amiga), MIPS, PowerPC, ARM and SuperH. Additional ports are in progress, including PA-RISC and IA-64.” [LinuxKernel, What is Linux?]

Mozilla – Mozilla.org: On January 23rd, 1998, Netscape Communications announced that they would release a version of their product ’Netscape Communicator’ as free software, and the first developer source code was released to the public on March 31st. They named the new project ’Mozilla’. “Mozilla is an open-source web browser, designed for standards compliance, performance and portability. [Netscape Communications] coordinate the development and testing of the browser by providing discussion forums, software engineering tools, releases and bug tracking.” [Mozilla] (See also [Unknown00].)

WorldForge: “The WorldForge project evolved out of a desire for better internet role playing games. Where Multi-User Dungeons have had a great degree of success through the early years of the internet, games like Ultima Online have failed to build upon these successes to produce an advanced virtual world of any merit. Avinash Gupta and a collective of perhaps a dozen enthusiastic gamers began to share their vision of what a truely fantastic gaming experiance should be. [...] Though some may find our willingness to give away our work a little odd, remember that our goal is to have the best gaming experiances available. Commercial ventures are limited by time and funding, if it’s not done on time and within budget, the companies responsible will go broke. For us it’s just about having fun, and it will be done when it’s done.” [WorldForge]

For anyone who would like to look at more projects, Freshmeat39 is a good starting point. This site is the most complete directory of open source projects I have found so far.

38 See http://www.netcraft.com/survey/ for the Netcraft survey. They investigated the responses of over 15 million reachable web servers in their May 2000 survey and more than sixty percent were powered by Apache.
39 See http://www.freshmeat.net/ for details about Freshmeat.
3 Organization of Open Source Projects
This chapter examines the work of open source projects and tries to find basic relations and roles and illustrates the resulting organizational structures.
3.1 Resources
Although open source projects sometimes seem to work without any resources1, this is not true at all. They depend on them as much as any other development work does. However, there are several interesting differences from traditional software development. The following section tries to give an understanding of the supply, allocation, usage and removal of resources in OSPs.
3.1.1 Unrestricted Information and Software
We have laws2 in our societies to regulate many parts of our social life. These rules also apply to the use of any kind of property, e.g. you must not sell a rented car. The same holds for resources used in projects, and laws play an important role when we talk about information as a resource because its usage is mainly limited by legal restrictions. The usage of information is governed by intellectual property3 laws, which protect the producer of creative work by giving him several special rights over his work. However, creators sometimes forgo some of these privileges and thereby remove the corresponding restrictions from their work. Any creative work distributed under open source licenses4 can be considered (legally) unrestricted in this context, as the license allows distribution and modification of the material under conditions that all OSPs meet by definition. Additionally, creative work is inexhaustible5 because of its intangible nature. For these two reasons, unrestricted information is an unlimited resource for OSPs.

Software and information are both results of creative work. Software is actually only a special kind of information, so it is not necessary to examine it separately. It is merely used differently from other kinds of information. For this reason there are different laws for software, and the legal situation is actually slightly different. However, for unrestricted software this aspect does not matter.

1 resource (noun): “a stock or supply of money, materials, staff, and other assets that can be drawn on by a person or organization in order to function effectively” [Oxford98, resource]
2 law (noun): “the system of rules which a particular country or community recognizes as regulating the actions of its members and which it may enforce by the imposition of penalties” [Oxford98, law]
3 A detailed description of the general legal situation of intellectual property can be found in section 2.2 on page 16.
4 See section 2.3 on page 19 for details about open source licenses.
5 inexhaustible (adjective): “(of an amount or a supply of something) unable to be used up because existing in abundance” [Oxford98, inexhaustible]
3.1.2 Restricted Software
OSPs try to avoid software that is not distributed under open source licenses wherever possible because it introduces several problems. Before investigating these difficulties, however, it is necessary to examine the software itself, as three different groups of software resources have to be distinguished:

Tools: All software that participants use in order to produce software. The user of the project’s result does not need these resources in order to run the produced software, e.g. a user does not need the version management tool the developers use for administration.

Components: Any software component that is necessary in order to run the developed software later on but is developed and distributed separately by someone else.

Integrated Material: Material that you integrate into your software, e.g. you take a certain portion of the provided source code and insert it into your program. However, it is illegal and a violation of intellectual property law to integrate any kind of restricted material into software and redistribute it. Therefore this group is not investigated further.

Every participant can use any tools he wants in order to work on the project, but the project itself provides no restricted tools. Such tools usually require one license per user, which simply does not work because the number of persons working on a project changes frequently and rapidly. Besides, such tools would be too expensive for most projects. When participants do use their own restricted tools for development, these tools belong to the category of individual resources.

Although developers prefer open source components, it is sometimes unavoidable to use restricted components when certain rules have to be obeyed, e.g. rules given by a client. This is a compromise that many, but not all, open source developers are willing to make. One has to be careful, however, about combining restricted with unrestricted software because of things like the well-known ’virus effect’ of the popular GNU General Public License6.

Although open source projects sometimes use restricted software components, they very rarely use software that is not available at low or no cost. A good example is Sun’s Java. Many people do not want to ignore Java merely because of its license. It is restricted software because it is not available as source code, yet it can be downloaded free of charge from Sun’s web site. Therefore even restricted software components are normally not a limited resource, as they are available for free.
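The distinctions drawn in this section can be read as a small decision rule. The following Python sketch is only an illustration under assumptions: the names `Kind`, `SoftwareResource` and `handling` are hypothetical and do not come from any project, and the returned strings merely paraphrase the treatment described above.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The three groups of software resources distinguished in this section."""
    TOOL = "tool"
    COMPONENT = "component"
    INTEGRATED_MATERIAL = "integrated material"


@dataclass(frozen=True)
class SoftwareResource:
    # Hypothetical record type, introduced only for this illustration.
    name: str
    kind: Kind
    open_source: bool  # True if distributed under an open source license


def handling(resource: SoftwareResource) -> str:
    """Paraphrase how an OSP typically treats a software resource."""
    if resource.open_source:
        # Unrestricted software is an unlimited resource (section 3.1.1).
        return "unrestricted: an unlimited resource for the project"
    if resource.kind is Kind.INTEGRATED_MATERIAL:
        return "excluded: restricted material must not be integrated and redistributed"
    if resource.kind is Kind.TOOL:
        return "individual resource: not provided by the project, each participant uses his own"
    # Remaining case: a restricted component.
    return "tolerated compromise: usable if unavoidable, beware of license conflicts"


if __name__ == "__main__":
    examples = [
        SoftwareResource("open source version management tool", Kind.TOOL, True),
        SoftwareResource("proprietary development tool", Kind.TOOL, False),
        SoftwareResource("Sun's Java", Kind.COMPONENT, False),
    ]
    for r in examples:
        print(f"{r.name}: {handling(r)}")
```

The order of the checks mirrors the argument of the text: the license is decisive first, and only for restricted software does the group (tool, component or integrated material) determine how the resource is handled.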
3.1.3 Restricted Information
Information always needs a medium in which it is stored. This can be a book, a sheet of paper, a CD-ROM or a more abstract entity like the Internet. As the Internet is a very special medium, it is reasonable to treat it separately and to divide restricted information into two groups: online (published via the Internet) and offline (other media) information.

6 See section 2.3 on page 19 for details about the GNU General Public License and its ’virus effect’.
Online
There are a lot of manuals, reports and other documents published via the Internet that are freely accessible to everyone. As this is normally copyrighted material, it is illegal to duplicate or redistribute it; even printing it might be a violation of copyright laws, although the authors normally do not mind. However, it is normally good enough to use it online, so knowing its location is sufficient, and its Internet address can be distributed legally. Additionally, everybody can access it online at the same time. Therefore such resources are in some respect unlimited as long as they are available at all.

Offline
Offline media are things like books, magazines or CD-ROMs. Although it is possible to transfer the information they contain onto new media (e.g. by photocopying), this is normally not done for various reasons:

Laws: All this material is protected by intellectual property laws (mostly copyright laws). For this reason it is normally illegal to reproduce the work without permission unless the action is covered by ’fair use’.

Effort: Reproducing creative work is sometimes just not worth the effort, and it is more reasonable simply to buy a new copy, e.g. original digital video disks are cheaper than blank media at the moment.

Quality: When information is transferred from one medium to another, the result is normally not of the same quality as the original, e.g. when you photocopy a book you end up with only single sheets of paper.

It makes no difference whether it is a copy or the original medium: either has to be transported to the other participant when he wants to use it. However, this is normally not done because the members are geographically distributed, shipping costs are high, transportation takes a long time, and there are similar inconveniences. Offline information can therefore normally be considered an individual resource, even when it is bought with the project’s money.

3.1.4 Individual Resources
It is very common to use private equipment, or equipment that a third party such as an employer or a university allows to be used, for the work in OSPs. Actually, most projects do not even have shared property apart from their developments, a mailing list and the project’s home page. The developments are not a resource but a result, and more or less public domain because of their license. The communication channels and the Internet sites are normally sponsored by an educational institution or another organization that has some interest in the project. This service would not cost much anyway, but it is convenient not to have to deal with legal matters like contracts.

Another problem is the legal ownership of shared resources. Of course, the ownership can be assigned to several individuals, but this procedure requires frequent reassignment, as project members come and go while the project continues. The best solution to this problem is a legal person in the form of a non-profit organization or something similar; however, this normally means more inconvenience than the project members are willing to accept. Therefore, people try to avoid this unpleasant extra work and simply use the resources they can get from their own sources. Nevertheless, some large projects have shared resources, and a few have founded a non-profit organization. But even then, the individual resources far outnumber those owned by the project. So, in most cases the major resources are individual ones.

The handling of individual resources is relatively simple and more or less obvious: they are supplied by a person joining the project, the new member is the only one allowed to use them, and he takes them with him when he leaves the project. He might do someone a favor and permit that person to use them, but this is an exception. Since these resources are tied to the project member, they can be seen as a part of the participant. Most of the resources used therefore belong to another resource type: human actors. This perspective simplifies the further examination a lot, as I can exclude these individual resources here and consider them only when investigating human resources.

3.1.5 Computer Equipment
Hardware is one of the limited resources OSPs depend on. Without basic computer equipment the members cannot even communicate. Thus, it is an essential resource, but most of it belongs to the category of individual resources, too. Therefore it is necessary to draw a line between these two categories of resources. As it is not possible to look at every device that might be used, I will give different groups of similar entities. There are several aspects that can be used for this purpose:
Owner
The owner might be an individual, a legal person like a company, or the project itself. The owner has the authority to decide what happens with a device: who is allowed to use it, at what time, where, how, etc.
Location
As hardware is a physical entity, it is always situated at some place at a given point in time, and it takes resources (e.g. an actor’s time, money) to transfer it from one place to another. Considering the costs of worldwide transportation, the sensitivity of computer devices and the low prices of the equipment, it is not sensible to move it around frequently. Thus, the location can be considered static in most cases.
Physical Access
Some hardware requires a person to be at the location of the device in order to use it, e.g. scanners: you cannot use a scanner without feeding it the image you want to scan. This is normally because some of its input or output is a physical entity or requires one as a medium (e.g. paper for a printer).
Remote Use
Devices that only process information are at least theoretically capable of being used without physical access (hard disk drive, processor, memory). However, in order to do this, a connection must be established between the user and the device. The general solution is to connect the hosting computer to the Internet.
Administrator
Hardware that is connected to the Internet is always controlled by some software in order to prevent misuse (firewall, operating system). Therefore every user needs access rights to use the equipment. The authority that decides who gets them and who does not is called the administrator. This person might be the owner himself or someone authorized by him.
Considering these aspects four groups can be distinguished:
Individual Hardware
The project does not own a device and the administrator only allows certain members to use it, e.g. the private computer of a participant or a workstation at an office. Therefore this group is not considered hardware, but part of individual resources.
General Available Hardware
The following conditions have to be true for a device to belong to this group:
1. It must be capable of remote usage. This excludes all devices that need physical access for operation.
2. The hosting computer system is permanently connected to the Internet.
3. The administrator is a participant (maybe only for this job) and is willing to grant access rights for the hardware to anyone who is taking part in the project, provided it is necessary for the project’s work.
All hardware meeting these conditions can be used by any participant via the Internet. So, the major problem of this equipment is its physical limits, such as CPU time and storage space. However, because of the fast development of hardware and the steep fall in prices of computer products, there is normally sufficient capacity to satisfy moderate requirements of the entire project. Nevertheless, these resources have to be allocated to individual users. This task is normally done by multiuser operating systems (e.g. GNU/Linux), which handle it sufficiently well in most cases.
Partially Available Hardware
Some hardware cannot be used remotely because it requires physical access or because it is only occasionally connected to the Internet over a voice phone line. Although these devices can be used with the help of others, it is normally not reasonable to take this option considering the helper’s time and/or the transport of the medium (e.g. floppy disk, image). So, this equipment can only be used at its actual location by people who have direct access to it.
Other Equipment
I do not consider equipment that is used, but not controlled, by a project member as hardware. However, such equipment is not excluded from this examination, as it is investigated in other parts like service or infrastructure. For example, a sponsor might provide a mirror site of the project’s data. I consider this a service of the sponsor, not hardware.
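The four groups can also be read as a small decision procedure. The following Python sketch is only an illustration of that reading: the class, the field names and the order in which the conditions are checked are assumptions made for this example and are not part of the original examination.

```python
from dataclasses import dataclass
from enum import Enum


class HardwareGroup(Enum):
    INDIVIDUAL = "individual hardware"
    GENERALLY_AVAILABLE = "general available hardware"
    PARTIALLY_AVAILABLE = "partially available hardware"
    OTHER = "other equipment"


@dataclass
class Device:
    # All field names are hypothetical; they mirror the aspects listed above.
    name: str
    controlled_by_member: bool   # is the administrator a project participant?
    remote_capable: bool         # usable without physical access?
    permanently_online: bool     # host permanently connected to the Internet?
    open_to_all_members: bool    # does the administrator admit any participant?


def classify(device: Device) -> HardwareGroup:
    """Assign a device to one of the four groups (one possible reading)."""
    if not device.controlled_by_member:
        # Used by the project but controlled by someone else, e.g. a sponsor.
        return HardwareGroup.OTHER
    if not device.open_to_all_members:
        # Only certain members may use it: part of the individual resources.
        return HardwareGroup.INDIVIDUAL
    if device.remote_capable and device.permanently_online:
        return HardwareGroup.GENERALLY_AVAILABLE
    # Open to everyone in principle, but only usable at its location.
    return HardwareGroup.PARTIALLY_AVAILABLE
```

For example, a scanner that any member may use, but only on site, would be classified as partially available hardware, while a mirror site run by a sponsor falls under other equipment.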
3.1.6 Human Actors
Participants take part voluntarily in OSPs without getting paid by the project for their work, though some of them get paid by companies that are interested in the project’s work for some reason, e.g. SuSE GmbH sponsors the two major developers of the ALSA7 project [ALSA99]. Still, there is no central authority inside the project that has enough power to force someone to do something he does not like, as the hardest punishment is removal from the project, and this action often harms the project more than the participant. Of course, you can ’ask’ developers whether they are willing to handle a certain task, but it is preferred to wait until someone declares himself responsible for it. This procedure best secures his motivation to actually do the job, and without executive power it is much more important to have someone working on a task than to be able to blame him for a failure. However, failures and irresponsible actions always have consequences for participants: they lose the others’ trust and respect, and this is a real loss for them, as it requires hard work to (re)gain them.
Bottlenecks sometimes arise because some jobs have simply waited too long for someone to take care of them. This situation becomes annoying when others have to wait because of the pending task, but annoyance is again a good motivation, and one of the affected developers will probably step forward and solve it. If not, there are different strategies to handle this situation. A typical method is to draw the logical conclusion from the dilemma and freeze all work that depends on the pending task. Normally this motivates enough actors to step forward and take care of it. If not, the affected part or even the entire project is suspended until it is revived. As it is open source, there are no legal problems for totally different persons to take over the project, and this is a common procedure.
When working in a company, one normally only provides one’s labor. Any material that is needed to do the job is usually supplied by the employer. In the profession of software development this covers computer equipment, rooms, software, furniture, office supplies, telephone, Internet connection and many other things. Besides, the employer has a certain interest in keeping his employees fit for work. Therefore many employers provide training, health insurance, sickness benefits and so on. None of this exists in OSPs. Project members use their own individual resources (see above) and have to care for themselves. These developers are much more like subcontractors than employees and coordinate and optimize themselves independently as far as possible.
The lack of payment raises the question of why someone takes part in a project without any obvious benefit. The answer is very simple: they want to have the job done according to their personal ideas. Therefore they help to develop or improve the project’s software and receive the corresponding second-level benefits like honor, experience, know-how or simply suitable software. Investigating this issue further leads to complex social and psychological questions which are not part of this thesis8.
3.1.7 Financial Resources
OSPs normally do not have any income or expenses, as most things are handled by the individual members instead of the project as an organization, and the developed software is normally provided for free download. Therefore money is usually not an important resource. In the rare situations when projects win an award or receive donations, deciding what to do with the money sometimes even seems to be a small problem for them.
Although OSPs have learned to live without large amounts of money, this does not mean that they would not like some more. There are many things the participants would like to do but cannot afford: exhibitions, congresses, project meetings in real life, expensive hardware or anything else that consumes a significant amount of money. Developers are willing to spend their time, use the facilities they already have available and also ’donate’ small amounts of money for their development work (e.g. for an Internet connection), but attending a congress on the other side of the globe at your own expense is a different matter. So, donations and sponsorship are always welcome, though the incoming money has to reach a certain level in order to be a significant help because of the number of participants. For this reason, in most cases only the ’core team’9 benefits from the money.
7 ALSA stands for ’Advanced Linux Sound Architecture’. See http://www.alsa-project.org for details.
8 Section 2.6.1 on page 32 gives some examples of more detailed motivations of developers.
3.1.8 Service and Infrastructure
As members of OSPs are normally spread all over the world, they depend on a communication infrastructure: the Internet. You have to pay your Internet service provider (ISP) for connecting you to this network. There are two possibilities: a permanent or a temporary connection. As the latter is usually cheaper, most developers choose this alternative, though this leads to a problem. In order to collaborate, it is very useful to have a central storage point for the project’s data, but this requires hardware that is online all the time, as coordination would become too complicated otherwise. Fortunately, many participants have access to such computer equipment and are allowed by the governing authority (e.g. a university) to use it for the project to some extent. This might be email accounts, home pages, storage capacity or entire computer systems.
Since these sponsored services are often associated with a specific project member, they have to be considered an individual resource. Therefore the withdrawal of the resource is normally unavoidable when this participant leaves the project. As all persons interested in the project have to be notified whenever essential services change, this solution is inconvenient. Additionally, there are always some services which a certain project cannot obtain in this way, e.g. a bug tracking system. Nevertheless, this situation is bearable when the associated participant stays on the project for a long time.
Still, there seems to have been a demand for special services for this kind of project for a while, because the SourceForge10 project is attracting a lot of developers. It is a virtual development center designed especially for OSPs and provides a central access point to them and several useful services. The project space is not associated with a specific person, and SourceForge does not charge anything for it so far.
3.1.9 Summary
I have presented several important resources in the last section, but most of them turned out to be very similar in some respect. Although the division of resources I have just presented is reasonable for traditional software development, it seems to be inappropriate for current OSPs. I found the following categories much more suitable for the structure of OSPs, and they give a much better picture of the situation as well:
Human Actors
They are the members of the project with all their individual resources as described in section 3.1.6 on page 42.
Support Systems
Any service or tool used by the project to support its work belongs to this category. This covers the communication infrastructure and development tools like a versioning system. Since the very beginning of OSS there has been some kind of technical support that helped the different developers to collaborate. At first, it was only the Internet itself; then software tools like the revision control system (RCS) were developed. Today, the infrastructure and the tools merge and are provided to the developers in the convenient form of a service. Therefore it is more accurate to speak of a whole ’support system’ than of tools and infrastructure.
OSPs use any sensible resource that someone provides, but do not depend on it. The only limited resources they require are the participants’ individual resources, which the participants control themselves (e.g. Internet access). OSPs are extraordinarily flexible and adaptable to any kind of change and react mostly as individuals instead of as a large-scale organization. Therefore their work is very productive measured by the consumption of resources, and for this reason relatively efficient compared with other kinds of software production. Besides, the participants are very inventive in making the best of the provided resources and do not spend too much time and energy on their procurement, as the main focus is on getting the work done.
Unlimited Information
This category covers any informational resource that is not covered by the other two categories and can be used without limit by anyone connected to the Internet. In this context, using software also means modifying it to suit your personal requirements. These resources are mostly used as ’input’ for the actual production process: components the produced software will depend on, standards it is meant to be compatible with, source code that is integrated in order to speed up the project, etc.
9 Most projects have a small group of developers that participate continuously over a longer period of time, and they are usually the most enthusiastic members, too. These people are often called the “core team” of the project.
10 See section 5.3.3 on page 88 for details about the SourceForge project.
3.2 Coordination
Coordination11 is very important in all parts of our life. It is present in every movement of our body, in traffic, music, computers and many other things. However, we are normally not aware of it, as coordination usually only becomes visible when it is lacking, e.g. when we miss a bus. Preventing such incidents is an important aspect of coordination. In OSPs, coordination should help the different elements of the project to work together efficiently like one big organism, give the participants’ actions a direction and avoid problems inside the project. Therefore the definition given by Malone and Crowston is suitable in this context:
“coordination is managing dependencies between activities” [Malone93]
They distinguish the following categories of dependencies:
11 coordination (noun): “the organization of the different elements of a complex body or activity so as to enable them to work together effectively” [Oxford98, coordination]
3.2.1 Shared Resources
“Whenever multiple activities share some limited resource (e.g., money, storage space, or an actor’s time), a resource allocation process is needed to manage the interdependencies among these activities.” [Malone93]
I have investigated the resources of OSPs in section 3.1 on page 37 and identified three important categories:
Human Actors
As participants12 take part voluntarily in OSPs, there is no central authority that has the enforcement power to simply ’allocate’ the project members like paid employees. All project members decide for themselves how they spend their time and all other resources under their personal control. However, this does not answer the question of how actors are coordinated. It is a very complex question closely associated with a person’s social relations to his environment. For this reason I can only provide an abstract and a technical answer: the coordination is done by social conventions that are often not expressed in any abstract form but learned by experience13. Although there is no general technical solution, it is one of the major tasks of the support systems described in the next section to help the human actors manage the dependencies between their work; a simple example is the administration of a ’to-do list’.
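To make the idea of such a to-do list a little more concrete, here is a minimal Python sketch. It is purely illustrative: the class and method names are invented for this example, and the only property it tries to capture is that tasks are never assigned by an authority but claimed by volunteers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Task:
    title: str
    claimed_by: Optional[str] = None  # None means nobody is responsible yet


@dataclass
class TodoList:
    # Hypothetical structure; tasks are kept in the order they were added.
    tasks: Dict[str, Task] = field(default_factory=dict)

    def add(self, title: str) -> None:
        """Anyone may put a task on the list."""
        self.tasks.setdefault(title, Task(title))

    def claim(self, title: str, volunteer: str) -> bool:
        """A participant declares himself responsible for a task.

        Returns False if somebody else has already claimed it.
        """
        task = self.tasks[title]
        if task.claimed_by is not None and task.claimed_by != volunteer:
            return False
        task.claimed_by = volunteer
        return True

    def open_tasks(self) -> List[str]:
        """Tasks that are still waiting for a volunteer."""
        return [t.title for t in self.tasks.values() if t.claimed_by is None]


todo = TodoList()
todo.add("fix mixer bug")
todo.add("write documentation")
todo.claim("fix mixer bug", "alice")
print(todo.open_tasks())  # the documentation task is still unclaimed
```

There is deliberately no method for assigning a task to somebody else, which reflects the absence of enforcement power described above.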
Support Systems
Support systems14 are based on several limited resources (e.g. computer equipment) and for this reason they have some limits, e.g. the storage capacity of the web server used for the project’s data. However, the control and thereby the allocation process is mostly not the responsibility of the project’s ’management’, but the job of the sponsoring service provider. The support system is distributed over the provided server (e.g. an email server) and the client of each individual user (e.g. his email tool). Therefore the individual resources of the developers have to be investigated as well, but they are under their private control and coordinated by the participants independently. In the rare cases when some kind of project authority is responsible for the coordination of limited resources, it uses simple software tools to handle it (e.g. the operating system or quota software). Normally, OSPs do not spend much energy on such issues, but accept the given situation.
12 See section 3.1.6 on page 42 for a more detailed examination of human actors in OSPs.
13 See section 5.1 on page 77 for the difficulties with unexpressed conventions.
14 See chapter 5 on page 77 for details about technical support for software production.
Unlimited Information
As this category is only about unlimited resources, there are no dependencies between the different activities that use them, and no allocation process is needed. They can be used by as many persons as desired for as long as they want.
3.2.2 Producer/Consumer Relationships
“[...] a situation where one activity produces something that is used by another activity.” [Malone93]
We have to distinguish two different producers: another project or a fellow project member.
Relations to Other Projects
Many OSPs utilize a lot of components produced by other projects, but they only use already released results and therefore do not have to wait. Of course, it is better to base your work on the latest release, and it is advisable to make sure that no better release will be available soon. So, sometimes projects actually do wait for others, but it is considered offensive to even ask a project (or rather its developers) to speed up their work without having contributed a significant amount. They allow you to use their work for free, but they are not your supplier and do not want to be treated as such. For this reason the usual answer to such a question is: “If you want to speed it up, you should better start working!”
Nevertheless, as many developers take part in several projects over time and much information and experience is exchanged, there are a lot of friendly relations between OSPs, even when they work on concurrent issues. Therefore they are often called the ’Open Source Community’. Still, the usual behavior is to simply ’ignore’ dependent projects, as the participants work as fast as they can anyway. This seems plausible, as you need unused available resources in order to speed up your work, and OSPs use anything they can get.
Sometimes there are hundreds of somehow related projects, as you can imagine when looking at the dependencies of the XFree86 package15 shown in figure 4.1 on page 65. Because of their number and complexity, any attempt to coordinate these relations would be hopeless anyway.
Inside the Project
As mentioned16, participants decide for themselves what they work on. It is the regular procedure to wait until someone declares himself responsible for a certain task. Therefore bottlenecks sometimes arise because some jobs have simply waited too long for someone to take care of them. With increasing pressure, one of the affected developers normally steps forward. If not, the logical conclusion is to suspend the affected part or even the entire project until it is revived. As it is open source, there are no legal problems for different people to take over the project later on.
15 XFree86 is the most popular graphical user interface of the operating system Linux. It is derived from the X11 window system of Unix. A package is software that is integrated in a system of many other packages. The software is a certain release normally produced by an OSP.
16 I have described the general coordination of human actors in section 3.1.6 on page 42.
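The suspension of everything that depends on a pending task, as described for work inside the project, amounts to following the producer/consumer relations transitively. The Python sketch below shows one way to compute the affected part. The function name, the task names and the representation of dependencies as a dictionary are assumptions made for this illustration only.

```python
from typing import Dict, List, Set


def frozen_tasks(depends_on: Dict[str, List[str]], pending: Set[str]) -> Set[str]:
    """Return every task that directly or indirectly depends on a pending task.

    depends_on maps each task to the tasks whose results it consumes.
    pending is the set of tasks nobody has taken care of yet.
    """
    # Invert the mapping: for each task, which tasks are waiting for it?
    waiting_for: Dict[str, List[str]] = {}
    for task, prerequisites in depends_on.items():
        for prerequisite in prerequisites:
            waiting_for.setdefault(prerequisite, []).append(task)

    frozen: Set[str] = set()
    stack = list(pending)
    while stack:
        current = stack.pop()
        for dependent in waiting_for.get(current, []):
            if dependent not in frozen:  # also protects against cycles
                frozen.add(dependent)
                stack.append(dependent)
    return frozen


# Hypothetical example project
dependencies = {
    "release": ["installer", "documentation"],
    "installer": ["build script"],
    "documentation": [],
    "build script": [],
}
print(frozen_tasks(dependencies, {"build script"}))
```

In this example the pending build script freezes the installer and, through it, the release, while the documentation can continue. If the frozen set covers everything that still matters, the entire project is effectively suspended.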
3.2.3 Simultaneity Constraints
“Another common kind of dependency between activities is that they need to occur at the same time (or cannot occur at the same time).” [Malone93]
Although there might be more constraints, I consider it sufficient to discuss the two major ones in order to illustrate the general strategy for solving such problems:
Real Time Communication
Meetings, conferences, phone calls, discussions on Internet Relay Chat channels and other kinds of real time communication must take place at the same time for all participants. There are no general rules on how to handle this issue, but a common procedure is to propose, discuss and finally announce a certain point in time. Projects mostly use mailing lists for this purpose.
Document Modification
Conflicts arising out of parallel modifications of documents are undesirable, as they result in extra integration work that often cannot be done automatically. For this reason most projects use some kind of version management system to coordinate these activities efficiently. This tool is part of the support system17. Sometimes extra integration work is still unavoidable, but accepting it seems to be the best solution for OSPs considering the effort that would be required to ’plan’ the processing.
3.2.4 Task/Subtask Dependencies
“[...] a group of activities [that] are all ’subtasks’ for achieving some overall goal.” [Malone93]
Several persons working on the same task must coordinate their activities somehow in order to make sure the results of their activities fit together in the end. I distinguish three different methods to handle this dependency.
Top-Down Goal Decomposition
“[...] an individual or group decides to pursue a goal, and then decomposes this goal into activities (or subgoals) which together will achieve the original goal.” [Malone93]
In this case, the team facing the goal is established from the beginning and is not supposed to change a lot. Significant resources are used to produce a plan of the future actions before the working process even starts; subtasks are created by some kind of management and assigned to specific team members. This method emphasizes the corresponding responsibility. Most commercial companies use this procedure for their coordination. OSPs sometimes use this method for administrative issues. In the area of actual software production it is not very popular, as it introduces several problems:
1. There is no real management with enforcement power.
2. Plans rapidly become outdated because of changes (persons leaving or joining the project, new technologies, etc.).
3. Ideas do not come on command.
4. Creativity does not stop at the limits of a person’s responsibility.
17 See chapter 5 on page 77 for details about technical support of OSPs.
Bottom-Up Goal Identification
“[...] several actors realize that the things they are already doing (with small additions) could work together to achieve a new goal.” [Malone93]
Considering the large number of OSPs in the world and the philosophy of free knowledge exchange they are based on, it is only natural that new projects emerge from existing ones and that new goals are achieved by combining their resources. Actually, this is one of the fundamental ideas of the open source philosophy. Sometimes developers even discover that they are working on the same problem and merge their projects, or simply integrate the others’ work into their own software.
Concurrent Task Processing
Another possibility18 to coordinate goals and their subtasks is to let people work concurrently. This method is close to the bottom-up type, as the collaboration is a loose combination of team and individual work, too. This way, members have a lot of freedom, keep control over their resources and still improve their productivity by collaboration. A common procedure is the following: everyone starts working and produces some results; the group then compares the (partial) solutions achieved, discusses them and starts a new concurrent working cycle with the parts they agree to be the most promising.
3.2.5 Summary
Although I have not investigated all dependencies OSPs have to deal with, the most important ones are covered and the general method of coordination is sufficiently illustrated by the given categories. The examination reveals some special features of the coordination of OSPs:
1. Central coordination requires a strong management authority, like an employer, who has the power to establish and enforce rules. Since OSPs consist of voluntary developers, they normally do not have such an authority. Therefore a lot of dependencies are regulated by the establishment of social conventions and social interaction. The majority of these activities happen unconsciously, like many other things in our daily social life outside the Internet.
2. Developers seem to avoid dependencies wherever possible. Unavoidable ones are accepted as a given fact, and every participant is himself responsible for finding a way to live and work with them.
3. When certain dependencies become too disturbing for several developers of the open source community and good ideas emerge out of the experienced problem, a new project is founded and people try to find a suitable solution to improve the situation in general.
4. Generally, the method of open source software production seems to produce only a few dependencies. The special handling of resources, the high independence of open source developers and their acting on their own authority are probably major reasons for this phenomenon.
18 These more complex dependencies have not been investigated separately by Malone and Crowston. I have decided to add this group because of its frequent use in OSPs.
3.3 Identified Structures
By investigating several OSPs, I was able to identify some structures that help to understand their organization and illustrate the model of a typical OSP.
3.3.1 Activities
This section originates from the question of what OSPs actually do. In order to answer this question I identified several major activities. They are continuously running processes which interact with each other. Although the picture given by these activities alone might not be complete, it should at least illustrate the fundamental structure of the development work.
Communication
The available information of a project is distributed among its members. In order to work together and to achieve the best results, they need to exchange a lot of this information with each other. For this reason a project needs to provide an opportunity for its participants to share their knowledge with each other efficiently. This leads to the following questions:
1. Which knowledge should be transferred?
2. Who should provide the knowledge that should be transferred?
3. Who should receive the transferred knowledge?
4. How should the knowledge be transferred?
The answer to the fourth question is communication19 itself because of its definition as “the imparting or exchange of information by speaking, writing, or by using some other medium” [Oxford98]. Therefore the purpose of this activity is to share the relevant knowledge of the project between all participants to support their work.
Since there is no general answer to the other three questions, I have tried to find something that is close to the situation in current OSPs. I do not claim that the following answers are the only or the best ones, but they are suitable for this context.
Everyone who requires certain information notifies the other project members. This might be expressed by a call for comments, petitions, questions or just by presenting material with the convention that any missing information should be supplied. Making working results of any kind available to others might be the most important and most fundamental request for information in OSPs, as the publication of the material normally implies an invitation to review and comment on it, provided that the reader is qualified for this job. Being aware of the situation of the requesting participant, other project members can help him if they have some knowledge which they consider helpful for the presented issue. Having identified some useful information, they transfer it to the other person through some kind of communication (e.g. email).
Coming back to the questions above: requested knowledge should be transferred (first question) from everyone who thinks he has something to contribute to the specified topic (second question) to the participant who announced the information request (third question).
The important aspect is the active request for information. This is also what distinguishes communication from the documentation activity, which is done without such a request. In that case, nobody is really asking for the information at the time it is produced, but it might be useful later on for purposes which may even be unknown at the time of writing.
19 The specific means of communication used by OSPs have been investigated in section 5.3.2 on page 87.
Decision-Making
Setting up goals or priorities for the future, choosing between contradicting ideas and changing the project’s policies are all part of this activity. One person, a group or all members might be involved in such a process. It can be done formally, following rules the members agreed on beforehand, or informally in a simple conversation. The result might be considered final or temporary. All this is done by decision-making processes inside the project.
Actually making the decision is not much work: a single word is normally enough. But in order to find the best alternative, good preparation is indispensable, and that can be a lot of work: collecting all required information, working out all alternatives, informing everyone who should know that the decision is about to be made, and so on. Although most of this is covered by the other activities, someone has to start, stop and direct them. For instance, the collection of the required information is part of the communication activity, but as mentioned, someone has to specify the information that is needed in a request for the communication to start. This is done by decision-making, maybe by working out a document describing the issue; this in turn is done by the work activity, which again is controlled by a coordination activity that manages dependencies with other activities.
Human beings do many things without thinking about them, for example because they are used to them. Considering the large number of situations in which decisions are necessary, it would simply paralyze us if we had to make all of them consciously. As this activity is focused mainly on decision-making, it does not make sense to cover these situations. So, I consider only those decisions as part of this activity that fulfill the following conditions:
1. There have to be two or more different options to choose from.
2. The person has to be aware of his possibility to choose.
3. The decision is not obvious to him.
Coordination
The task of the coordination20 activity is to make the different elements of the project collaborate as if they were one big organism. All activities of an OSP need some kind of coordination, as they all depend on their context, including other activities. These relations between the project’s activities have to be considered somehow in the performance of actions. However, a lot of these dependencies are hidden in the work activity, as the coordination process can only handle relations that are visible at the level of detail used. Besides, many communication actions can be considered part of the coordination process, too, because in order to manage relations one needs to actually know them, and communication is about sharing information.
Since OSPs consist of voluntary developers, they normally do not have a strong management authority that is capable of a centralized coordination process. Therefore most dependencies are regulated by social conventions and interactions. The method of open source software production seems to produce only a few dependencies. The special handling of resources, the high independence of open source developers and their acting on their own authority are probably major reasons for this phenomenon.
Documentation
The work activity is close to this one. I have chosen to split them because they are actually two different tasks, although the boundary between them is fluid. The purpose of documentation is to provide any interested party with a detailed description of the development process. The primary goal of the work activity is to get the job done. Therefore the documentation activity slows down the project’s productivity at first glance, but becomes valuable in the long term. Documentation of the development process should not be confused with a manual in this context. It is supposed to store information about how the results were achieved and why things were done this way. The collected data is not intended to help someone use the final result, but to understand the construction work. The goal of this activity is to enable another expert to understand and improve the developments. Although this is aimed at people other than the original developer, it might even be valuable for him, as people tend to forget things and it is nice to have something that helps to bring back the memories. Additionally, such information is useful for various other reasons. I just want to give some examples:
Manuals
Someone other than the developer can write manuals based on the stored information.
²⁰ coordination (noun): “managing dependencies between activities” [Malone93]. Section 3.2 on page 44 contains a detailed examination of OSP’s coordination.
Bugs
It is helpful to have a detailed change history to track down bugs. Maybe you can even find out the point in time at which the bug was introduced into the system.
Compatibility
Any piece of software is situated in a software system²¹. Compatibility requires avoiding contradictions with other software that is installed on that system at the same time. Investigations at source code level might be sufficient in the short term, but they are ineffective, as there is simply a large and complex amount of material to look at. This is especially true considering the rapid and frequent changes of software.
Every little note that describes the idea behind the source code helps to avoid problems and to reduce the work to be done when this part of the system is changed, as the idea often remains longer than the code representing it. Even documents like simple log messages often contain important information for this purpose, although they were originally written for other reasons. E.g. the change of a standard library to a newer version broke Star Office 5.0²², as it used so-called ’undocumented functions’ which are intended for internal use only. They had to update their product in order to make it compatible with the newer version: although they had used the old library code correctly, they had not considered the idea behind the source code, which was to use only documented functions outside the library.
Work
All other activities I have investigated in this section take place on a meta level. “Work” is the actual production of the software. Therefore it is the most important activity, and all the others only serve the purpose of optimizing this one; but the way the work activity is done depends on the processing of the other activities as well. It is something like a small sub-project and internally requires all described activities again on a lower level: decision-making, coordination, communication, documentation and even work itself. All these sub-projects have a certain issue that they want to handle. This might be a large thing like the creation of a system kernel or a small one like the fixing of a certain bug. Although this matter might be specified precisely, it is usually expressed vaguely, e.g. by a prototype, a short description or an example. Some people like to collaborate without splitting the task into smaller sub-tasks, which requires more communication, and others want to work more separately and have clear boundaries between their responsibilities. The observer’s focus is scalable, and it is up to him whether he wants to analyze the next lower level in detail as well. There are many different perspectives and you choose the level of detail that is suitable for your purpose. Therefore even the project itself might be considered a regular work activity of a higher level undertaking. The number of people participating in a work activity may vary from one to several hundred.
²¹ See section 4.1.1 on page 60 for a closer examination of software systems.
²² I am talking about the upgrade of the ’glibc’ library from version 2.0 to 2.1.
3.3.2 Basic Roles
The activities of OSPs are normally not planned much in advance, and there is no central authority which has the power to assign certain responsibilities to someone. Therefore it is rather difficult to identify abstract roles in specific projects and even harder to find suitable roles for OSPs in general. For this reason, concentrating on specific tasks of the project work seemed to be the most promising strategy for dividing the performed actions into different categories. Following this idea I have identified five major roles:
Developer: He participates mostly in the work and documentation activities. His actions are normally intended to process a given issue.
Manager: A manager directs the project, and therefore decision-making and coordination are his major activities.
Maintainer: A maintainer is responsible for any kind of issue concerned with a specific component the project is based on. Additionally, he is the interface to the project of this component. For this reason his major activities are work and communication.
Administrator: An administrator is something like a maintainer. He is responsible for the support system, the software it is based on and the services the project uses.
Commentor: Anyone who wants to provide some kind of feedback and does not integrate it into the research work himself is called a commentor. This could be a developer usually working on a different part of the same project, another participant, a maintainer of another project responsible for the component provided by this project, or a regular user.
Real participants are always somewhere in between these abstract roles and perform actions belonging to several different roles. This does not contradict the given structure, but shows that real persons can have more than one job in a project. The same is true the other way around: one role is normally occupied by several persons. Fig. 3.1 on page 54 shows the four active roles and their major interactions. Developers and maintainers provide the required software, managers manage it, and commentors review all activities and give feedback. The administrator is the invisible agent who supports all these activities.
3.3.3 Objects and Data Organization
As OSPs act in the virtual space of the Internet, the most important entity is information in any form. Therefore an investigation of the organization of projects’ data helps to understand OSPs and how they work. Three different categories of data objects can be distinguished:
Work Objects: Manuals, source code and all other first level data belong to this category.
Meta Objects: Issues, bugs, comments, work status, (basic) documentation of the work process and other meta-data of the project’s research are considered meta objects.
Administrative Objects: Roles, developer’s data (e.g. subscription to a mailing list), instances of a role, an object’s ownership or an object’s access rights are all administrative objects. This group contains all kinds of data of the support system. For this reason, they have only administrative value and are somehow associated with its technical environment.
Figure 3.1: Basic roles in open source projects with some interactions. [Diagram elements: Developer, Manager, Maintainer, Commentor; interactions labeled “Manage” and “Comment”.]
As no central authority exists, there are no general rules saying how projects’ data should be organized. However, when investigating OSPs you can usually find a simple data structure. This might be the result of the projects’ focus on their central task: developing software. Fig. 3.2 on page 55 shows a generic, hierarchical structure which seemed to be suitable for describing most organization types: A project consists of one major section. Each section in turn consists of a meta domain (for meta objects), a work domain (for work objects), sub-projects (0 or more) and sub-sections (0 or more). A sub-section consists of one section. A sub-project consists of another project. Since administrative objects are part of the underlying support system, they are not represented in the figure.
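The hierarchy described above can be written down as a small recursive data structure. The following Python sketch is only an illustration: the class and field names are my own choice, a sub-section is modelled directly as a nested section (a simplification of "a sub-section consists of one section"), and the example names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Section:
    """A section: meta domain, work domain, sub-projects and sub-sections."""
    name: str
    meta: list[str] = field(default_factory=list)  # meta objects (issues, bugs, ...)
    work: list[str] = field(default_factory=list)  # work objects (source code, manuals, ...)
    sub_projects: list[Project] = field(default_factory=list)  # 0 or more
    sub_sections: list[Section] = field(default_factory=list)  # 0 or more


@dataclass
class Project:
    """A project consists of exactly one major section."""
    name: str
    section: Section


def count_sections(project: Project) -> int:
    """Count all sections reachable from a project, including nested ones."""
    def walk(section: Section) -> int:
        total = 1
        total += sum(walk(sub) for sub in section.sub_sections)
        total += sum(walk(sub.section) for sub in section.sub_projects)
        return total
    return walk(project.section)


# Hypothetical example: a project with one sub-section and one sub-project.
kernel = Project("kernel", Section("kernel-main", work=["sched.c"]))
distro = Project(
    "distro",
    Section(
        "main",
        meta=["open issue (hypothetical)"],
        sub_sections=[Section("docs", work=["manual.txt"])],
        sub_projects=[kernel],
    ),
)
print(count_sections(distro))
```

Administrative objects are deliberately left out of the sketch, in line with the remark that they belong to the underlying support system.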
3.3.4 Project Relations
There are a lot of relations to other organizations or individuals. Some use the produced software (user), some provide components the project bases its work on (component provider) and others provide software tools or services the administrator uses to provide the support system for project members (support system provider). Fig. 3.3 on page 55 illustrates these relations.
Figure 3.2: The structure of the project space. [Diagram elements: Project, Section, Meta, Work, Sub-Section, Sub-Project.]
Figure 3.3: Relations of open source projects. [Diagram elements: Project, Administrator, Manager, Commentor, Developer, Maintainer, Support System Provider, User, Component Provider.]
3.3.5 Procedures
The materials given in this section so far are pieces of one big picture. They show OSPs seen from different perspectives: activities, roles, objects and relations. In order to demonstrate how the described structure could be used to understand the organization of real projects, I have mapped some procedures onto the given structure. However, they are only abstract examples and might work differently in reality. For instance, a person might have both roles in a communication process (e.g. find a bug in his own work) and therefore leave some steps out, since he would not communicate with himself by email.
Process a Regular Issue

No  Action                                              Activity       Actor’s role
1   declare yourself responsible for a certain issue    coordination   developer
2   work on issue                                       work           developer
3   present your results and all available information  communication  developer
4   review presented material                           work           commentor
5   discuss material                                    communication  developer, commentor

Create New Issue

No  Action                                 Activity         Actor’s role
1   find a problem, specify it accurately  work             commentor
2   propose problem as a potential issue   decision-making  manager
3   discuss potential issue                communication    manager, commentor
4   accept or cancel issue                 decision-making  manager

Release

No  Action                                   Activity         Actor’s role
1   propose preparation of a public release  decision-making  manager
2   discuss release issue                    communication    manager, commentor
3   accept or suspend release issue          decision-making  manager
4   prepare release                          work             developer
5   declare release ready                    decision-making  manager
6   prepare public download                  work             administrator
Process Maintainer Issue

No  Action                                               Activity       Actor’s role
1   realize issue of your responsibility and declare it  coordination   maintainer
2   pass issue to maintained project                     communication  commentor, maintainer
3   initiate procedure for new issue                     communication  commentor

Process Manager Issue

No  Action                                               Activity         Actor’s role
1   realize issue of your responsibility and declare it  coordination     manager
2   pass the issue to your section                       communication    manager
3   discuss presented issue                              communication    manager, commentor
4   confirm, dismiss or return issue                     decision-making  manager
5   process issue                                        work             developer
6   present results to other project members             communication    manager
7   review presented material                            work             commentor
8   discuss presented material                           communication    manager, commentor
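Since each procedure in this section is a sequence of (action, activity, actor’s role) steps, it can also be stored as plain data and queried. The following Python sketch transcribes the procedure “Process a Regular Issue”; the type `Step` and the two helper functions are illustrative assumptions of mine, not part of the described structure.

```python
from __future__ import annotations

from typing import NamedTuple


class Step(NamedTuple):
    """One row of a procedure table."""
    action: str
    activity: str
    roles: tuple[str, ...]


# "Process a Regular Issue", transcribed from this section.
PROCESS_REGULAR_ISSUE = [
    Step("declare yourself responsible for a certain issue", "coordination", ("developer",)),
    Step("work on issue", "work", ("developer",)),
    Step("present your results and all available information", "communication", ("developer",)),
    Step("review presented material", "work", ("commentor",)),
    Step("discuss material", "communication", ("developer", "commentor")),
]


def steps_for(procedure, actor_roles):
    """Return the steps in which a person holding `actor_roles` takes part."""
    wanted = set(actor_roles)
    return [step for step in procedure if wanted & set(step.roles)]


def activities_by_role(procedure):
    """Map each role to the activities it performs, in procedure order."""
    result = {}
    for step in procedure:
        for role in step.roles:
            result.setdefault(role, []).append(step.activity)
    return result


for step in steps_for(PROCESS_REGULAR_ISSUE, {"commentor"}):
    print(step.activity, "-", step.action)
```

Such a representation makes it easy to see which activities a role is involved in for a given procedure; whether a real project follows the steps in exactly this order is a separate question, as noted above.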
4 Basic Concepts of Open Source Software
Firstly, computer systems are presented as the technical context of (open source) software. Secondly, various identified processes of the development, deployment and usage of open source software are illustrated.
4.1 Computer Systems
We are investigating the production of software in this chapter, and the result of this process is an abstract, complex entity: a piece of software. It cannot be used in isolation; it always needs a special environment for usage. There is an enormous number of dependencies between computer programs¹ and their surrounding system, and they all have to be considered somehow during the development process. Therefore, in order to understand software production, it is necessary to investigate the system a program will later be used in. All computer programs run in a larger software system that includes all software components interacting with them directly or indirectly. There are fluid boundaries to other systems like hardware devices, but drawing a sharp line between them is rather difficult and not necessary for the purpose of this section. The main focus of this section is the investigation of software systems. Furthermore, an overview of the components they are based on will be given for a better understanding.
4.1.1 Major Components of a Computer System
Software always needs a suitable computer system in order to be used and is useless at least for regular users without the suitable equipment. Today, computers are used as a multi-functional tool for many different tasks like communication, image manipulation, word processing or software production. Several sub-systems are necessary to do these different jobs. It depends on the user’s wishes and the planned tasks which components are required or optional. The elements of a computer system can be grouped into different major categories:
Hardware Devices
Any kind of computer equipment belongs to this category. Examples are main board, hard disk, scanner, printer or modem.
¹ I will use ’(computer) program’ as a synonym for ’piece of software’ further on, though some definitions of computer program might contradict this usage, as they sometimes exclude things like system software (e.g. the operating system).
Software System
Software uses other components (e.g. hardware devices) to provide the user or a depending component with a certain functionality. There are three different layers that can be distinguished:
Operating system: “[S]oftware that controls the many different operations of a computer and directs and coordinates its processing of programs. An operating system is a remarkably complex set of instructions that schedules the series of jobs (user applications) to be performed by the computer and allocates them to the computer’s various hardware systems [...]” [Britannica, operating system]
Middleware: “Software that functions as a conversion or translation layer. It is also a consolidator and integrator. Custom-programmed middleware solutions have been developed for decades to enable one application to communicate with another that either runs on a different platform or comes from a different vendor or both. Today, there is a diverse group of products that offer packaged middleware solutions [...]” [TechWeb, middleware]
Applications: Programs designed for end users. They sit on top of middleware and operating system software because they are unable to run without them. Any program that processes data for the user belongs to this category. It includes generic productivity software (e.g. word processors) as well as custom and packaged programs (e.g. for billing). [TechWeb, application program][Britannica, computer program][Webopedia, application]
Documentation
In traditional computer systems documentation is considered part of the corresponding component. This is reasonable for selling isolated products, but inefficient for their usage in computer systems. You only require the information suitable for your personal computer system, and it should be as accurate as possible. Most users know the problem of two thousand pages of manuals that say nothing about their actual problems. There are many different ways to solve this issue. One is to eliminate the need for information by preventing any kind of difficulty. Unfortunately, all strategies to reach this goal that I know of have more or less failed. A more realistic strategy is to separate the documentation from the corresponding component and treat it as a component of its own, although it is then very important to keep components and their documentation synchronized. This could be handled by additional dependencies. However, the following advantages compensate for the extra effort:
1. By this procedure different documentation could be provided for the different cases: one for developers, one for technical users, one for non-technical users, one for children, etc.
2. Another advantage is that documentation could sometimes be written for complete sets of components instead of many isolated ones.
3. As other skills are required to write documentation, it is naturally a different (but related) task.
4. The developers of the corresponding component could concentrate more on their work and only support the documentation process with their knowledge.
5. Since documentation has often been neglected anyway, separate documentation projects would give unskilled users the opportunity to contribute and thereby raise additional resources.
6. Users who have not been involved in the production are less system-blind and understand the problems of inexperienced persons much better.
User Data
The job of computer systems is to process any kind of information. Therefore user data is the essential part of those systems, as everything else only serves the purpose of performing the demanded actions on the data, e.g. manipulation, transportation or archiving.
Support Service
Computer systems are very complex and even experts often cannot handle them without the help of others. For this reason an increasing number of companies offer some kind of support to help users get their computer systems to work efficiently. On the other hand, more and more persons depend on these services to operate their computer. Sometimes it seems to be the regular case that unskilled computer users are totally helpless without such support. Therefore support service has to be considered an important part of a complete computer system. Different kinds can be distinguished:
Support Hotline: Phone support by an independent service provider. It is used in the case of urgent problems of some kind. Often these services specialize in certain components or issues, e.g. they only provide services for one particular operating system or application. There are different forms of payment, like billing via the telephone company.
Home Service: Other companies improved the service by sending technical staff out to their customers. Anyone who has tried to explain technical issues to inexperienced computer users knows why: it is very hard to talk about a computer issue on the phone without sharing the basic technical terms. The disadvantage is the high cost of such a service, though it is usually worth the extra money.
Product Support: Another idea is to help customers use the things they have just bought. The product could belong to any of the described categories: hardware, infrastructure, operating system, application or data. This service is offered as part of the purchase by the producer or the merchant. It might be included in the sales price; in other cases you have to pay extra.
Integrated Service
For any standalone computer that is not connected to a computer network the other categories would be sufficient, but today most computers are somehow linked to computer networks.
Integrated services are the best solution I could find in order to map connected computer systems into the description of a local one. Considering remote computer systems in full leads to a much more complex system, while excluding the corresponding components from the system results in a loss of essential functionality. Therefore the necessity of a (temporary or permanent) connection to another computer system is a required condition for this category. Integrated services provide certain functionality based on software and hardware components distributed over several linked computers. Everything that is not part of the local computer system, but supplies some kind of functionality like storage or calculations, thereby provides an integrated service. The local component that interacts with the remote component is called an integrated service interface. As there are different (local) network layers, there might also be different levels of interfaces.
4.1.2 Task Processing
Today, computers are used as a multi-functional tool for many different tasks like communication, image manipulation, word processing or software production. Private users often do not know what they are going to do with their equipment at the time they buy it. Some people are curious about the new technology, some think they are expected to own a computer, some want to play with it a little like a toy and others might even consider it a nice piece of furniture. They do not plan the future usage of their equipment, but even these people want to use their computer system at a certain point in time to support a certain activity like sending an email, playing an adventure game or writing a letter. In the field of business, computers are mostly used as a tool to help someone do specific jobs like writing an essay, sending information to a customer, storing and retrieving data or doing calculations. Although these jobs might not be specified in advance, users expect computers to do exactly what they want during a job’s execution. So, it is of major interest that computer systems process a task according to the user’s personal demand and do this perfectly. Computer systems should do what the user demands. But what is the normal situation today? Users do what the system forces them to do, and the results of our hardest efforts are sometimes unsatisfying although we have precisely obeyed all ’commands’ of our computer system. We invest a lot of energy in order to get it working:
1. read manuals and other documentation
2. spend hours on the configuration of the software
3. do extra work to recover lost data or generate it a second time
4. accept the loss of important data that cannot be recovered
5. spend a lot of time on preparation for the case of a system failure
6. repair our installed system when it gets corrupted
7. accept bad results because the production of better ones failed for too long
8. spend a lot of money on expensive support services to fix mysterious problems we simply cannot handle
Considering all these obstacles, it sometimes seems questionable whether it makes any sense at all to use a computer. Therefore it would be reasonable to reduce the amount of work that has to be invested in order to get things done properly. This includes all efforts caused by any kind of system failure like crashes, virus attacks or undesired actions.
4.1.3 Usage
When we talk about the usage of computers we think of a system that is ready to go, but this perspective leaves out all the other important activities that have to take place in order to get and keep the system running. Although several decades have passed since the invention of computers, it is still a challenging undertaking to operate them properly. If you want to enable your computer system to process a new task, you always have to integrate some new components into your system. Although the required second level components might belong to other categories, there is always a top level software component that has to be integrated into the software system, and I will use it to give a short overview of the relevant phases of usage that usually have to be passed:
System Preparation: You have to make sure that all required second level components are installed. They could belong to any of the described categories: hardware devices, infrastructure, software system or service².
Installation: The software system has to be adjusted to host the new piece of software, and the component has to be integrated into the software system.
Customization: As most components have a generic functionality, it is normally necessary to configure them to the individual requirements of the user.
Daily Usage: When we are lucky, we are now able to process our new task exactly the way we want it, everything is fine and the phase of daily usage starts.
Problem Solving: Unfortunately, in most cases some difficulties arise sooner or later. We can accept the unsatisfying situation or try to solve the problem somehow. Brave users do this on their own, others ask for service, and hopefully both get it resolved. Sometimes you only have to change a little bit of your configuration, but the corresponding option might be hard to find.
Update: Frequent updates of software components are a usual procedure to avoid and fix problems. Although some people think that they can buy a computer system and never change it thereafter, most of them give up this idea as soon as they encounter the first severe bug.
Replacement: Sometimes we do not only want to improve the installed components, but want to use a different one to process the same task for some reason. It might have fewer bugs or be more suitable for our personal needs.
Removal: The removal of components is often not considered when you start using them, though you will find out that it is an important phase as soon as you want to uninstall some components that are not prepared for removal. Sometimes the only way to remove a component completely is to rebuild your system from scratch.
² Some people only have little knowledge and many systems have little protection against inappropriate usage. Therefore I sometimes think unskilled persons should not even turn on their computer without support service.
4.1.4 Dependencies and the Complexity of Software Systems
Dependencies play an important role in computer systems and software production, as these systems consist of many small components. All these different modules have to co-exist and collaborate in their shared environment. Additionally, software components are not developed for one system only, and a lot of differences between the various systems exist. There are several categories of dependencies. Some examples:
Conflict: The integration of one component excludes the integration of another particular one. This could be caused by a shared resource, for example.
Requirement: One component is based on another one and therefore requires its installation to work properly. The absence of the required resource results in a total failure, e.g. the program does not even start.
Optional: Certain functionality of a component is based on another one and therefore demands its installation to function correctly. The absence of the optional resource results in a partial failure of the component, as certain tasks cannot be processed.
Choice: Two components provide the same functionality. Although they have no problem co-existing, users have to choose the one the system should use for a certain task, e.g. transporting emails.
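The four dependency categories can be illustrated with a small checker that inspects a set of installed components. This is only a sketch under my own assumptions: the `Component` fields, the function `check` and all component names are hypothetical and do not describe how any real package manager works.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A software component with its declared dependencies (illustrative model)."""
    name: str
    requires: set = field(default_factory=set)   # total failure if missing
    optional: set = field(default_factory=set)   # partial failure if missing
    conflicts: set = field(default_factory=set)  # must not be installed together
    provides: set = field(default_factory=set)   # functionality offered to the system


def check(installed):
    """Return (category, message) findings for a list of installed components."""
    names = {c.name for c in installed}
    findings = []
    for comp in sorted(installed, key=lambda c: c.name):
        for other in sorted(comp.requires - names):
            findings.append(("requirement", f"{comp.name} cannot work without {other}"))
        for other in sorted(comp.optional - names):
            findings.append(("optional", f"{comp.name} loses some functionality without {other}"))
        for other in sorted(comp.conflicts & names):
            findings.append(("conflict", f"{comp.name} conflicts with {other}"))
    # Choice: several installed components provide the same functionality.
    providers = {}
    for comp in installed:
        for function in comp.provides:
            providers.setdefault(function, []).append(comp.name)
    for function, who in sorted(providers.items()):
        if len(who) > 1:
            findings.append(("choice", f"{function} is provided by {', '.join(sorted(who))}"))
    return findings


# Hypothetical system: names are made up for illustration.
installed = [
    Component("editor", requires={"libfoo"}, optional={"spellcheck"}),
    Component("mailer-a", provides={"mail-transport"}),
    Component("mailer-b", provides={"mail-transport"}),
    Component("oldlib", conflicts={"editor"}),
]
for category, message in check(installed):
    print(f"{category}: {message}")
```

The sketch treats a conflict as something declared by one side only; a real system would probably have to decide whether conflicts are symmetric and how a choice between equivalent providers is recorded.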
Another important group of dependencies emerges from the social life of the developers, as it always has some impact on the produced software systems and thereby creates social dependencies for the users of the software³. These integrated social dependencies might be accepted or refused by the users. The original research team has often not been conscious of the issue and has not known about alternatives, e.g. many tools are developed under the assumption that you write from left to right, but in some cultures people write the other way around, or from top to bottom. Many developers are not aware of conflicts and their consequences before they are told. Therefore the relations between software components are not only technical problems, but involve social dependencies as well. Figure 4.1 on page 65 illustrates the major technical dependencies of one fundamental component of a Linux based computer system: the XFree86 package. Most technical relations to other components of the computer system (e.g. hardware, depending applications) are not shown, some indirect dependencies are cut off and pure social relations are not considered. Still, the displayed dependencies do not seem manageable with known methods, considering that all these components are modified at the same time and not one after the other. Someone may think that developers make the computer system more complex by separating all these components, but in most cases this is not true. Larger components often only hide relations inside and tend to enlarge the total amount of source code. For this reason, the extra code produces unpredictable conflicts, as all components finally have to use the same resources anyway, e.g. one keyboard. Nevertheless, it is true that components should not become too small, but this is not the case in the given example.
³ See [Grinter96] for a more detailed examination of social dependencies in software development.
Figure 4.1: Dependencies of Debian X-Window packages generated by Debian’s package administration tool (see [Gunthorpe00] for details). [Diagram: dependency graph of Debian packages, among them xfree86-common, xserver-common, xlib6g, libc6, gcc and perl.]
4.2 Improvement of Open Source Software Components
The easiest way to improve a computer system is to change some components of its software system. The major problems of this undertaking are the described technical and social dependencies that have to be considered. Although developers do their best to produce good software, it is the regular case that software contains a lot of bugs and insufficiencies. Many methods have been tried to improve software systems, but there are still many bugs to be fixed and a lot of modifications to be made in order to support the users’ tasks in the most efficient way. Users often focus on single components instead of looking at the software system as a complex entity. It does not help to have a perfect word processor when the keyboard driver is unusable or the operating system frequently messes up your data. Therefore it is necessary to locate every problem in the installed software that affects the task at hand and get it resolved. However, this undertaking seems to be impossible considering the amount of source code that has to be investigated. The open source movement has found its own solution for this problem. The following is a description of the procedure normally used for the improvement of software that is integrated into computer systems based on open source software like GNU/Linux.
4.2.1 Major Processes
Figure 4.2 on page 66 shows the three processes that illustrate the parallel activities a software component is involved in and their major interactions. Basically, the user process specifies the component, the research process commits the changes and the support system provides the working tools.
4. Basic Concepts of Open Source Software
Figure 4.2: Improving Systems Based on Open Source Software. [Diagram: the User Process, the Research Process and the Support System Process are connected through the Component and the Support System. Labeled flows: Payback for Component, Payback for Support System, and additional input and output for each of the three processes.]
User Process
The goal of a user is to get his task done. In order to achieve this goal he procures the component C that is provided by a project. The project has specified a set of other components that are required in order to use C. Each component of the set is provided by another independent project. The user procures the required components of the set, integrates them together with C into his software system and tries to process his original task. When the user is completely satisfied with the execution of the task, everything is fine and there is nothing left to be done. However, the user usually finds something that he dislikes and would prefer to have changed. It could be a bug, a missing feature, a small extension or some other modification. So he sends a description of his problem back to the project and hopes that it will be resolved. When the issue is of general interest (e.g. a severe bug) or a developer likes the idea, the comment is likely to be considered and the component modified accordingly. When no project member is really interested, the issue will be ignored or suspended until nothing more important is left to be done. When a user has an important individual problem with the component, the project is not interested in it and no better equivalent component is available, nothing will happen until the user acts himself. He then has two major possibilities to achieve his goal: he can try to convince another person to process the issue (e.g. by payment), or he can join the project and do it himself. As an active project member he will also have much better chances of finding other participants to help him with his problem, but then this work is part of the research process.
Research Process
A project is established⁴ and starts to work on a software component that helps to process certain tasks. These tasks are normally not formally specified in detail, but described in various forms such as text documents or software prototypes. Another way to learn about the idea the project is based on is communication with developers and users. Once in a while the project decides to collect some of the results of its research work and releases them to the public as a new version of the corresponding software component. The time interval between two releases can range from hours to years. However, users have to distinguish different types of releases⁵: bug fix releases, for example, are published whenever they are required, whereas new major versions with a set of added features are usually only released when the new functionality can reasonably be used. Besides, there might be different release branches for different software systems, e.g. an experimental version for developers and a stable one for productive work. It is therefore sometimes hard work for an inexperienced user to identify the version that is best for his system. Another service most OSPs provide is free read access to most of their research material and communication channels (e.g. mailing lists). This openness gives non-members the opportunity to learn more about the project’s work and reduces the communication effort spent on unproductive questions and comments. Each time a version is released, the developers receive important hints in return about problems, bugs and ignored dependencies on other components. Users are therefore essential as testers and inspectors for OSPs, even though their own intention is simply to get their tasks done. Sometimes feedback is not about technical problems, but about ideas to optimize the component’s functionality in order to support certain (user) tasks.
Such comments are not necessarily valuable for developers, but could give them a better understanding of the component’s job in the different computer systems. Besides, users can be a good source of new ideas, as they often think differently and have a different perspective than the regular developer. Since many OSPs have already been established all around the world, a project is likely to find existing components that reduce its own work. All these used components are usually specified somewhere to a certain degree in order to help users get the component running. However, it is difficult to identify all required components because of the general complexity of software systems⁶ and the (indirect) dependencies involved. Developers use a specific support system to speed up their work. Editors, version management tools, communication systems and FTP servers are examples of its parts. Basically, anything used to work on the software belongs to the support system. Anything that becomes part of the resulting software component or is required for its usage is not considered part of the support system. When developers experience any kind of difficulty, they report it to the provider of the corresponding part of the support system and hope that it will be resolved.
⁴ See section 2.6.2 on page 33 for the typical creation of a project.
⁵ See section 5.1.6 on page 81 for details about release versions.
⁶ See section 4.1.4 on page 64 for details about the complexity of software systems.
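The difficulty of identifying all required components, including the indirect ones, can be illustrated with a minimal Python sketch. The dependency table is hypothetical: the package names are borrowed from the Debian packages of Figure 4.1, but the edges between them are assumptions made for illustration only and do not reproduce Debian’s actual dependency data.

```python
from collections import deque

# Hypothetical direct dependencies. The names are Debian package names,
# the edges are assumed for illustration only.
DIRECT_DEPS = {
    "xserver-mach64": ["xserver-common"],
    "xserver-common": ["xfonts-base", "libc6"],
    "xfonts-base": [],
    "libc6": ["ldso"],
    "ldso": [],
}


def required_components(component, direct_deps):
    """Return every component needed to use `component`, direct or indirect."""
    required = set()
    queue = deque(direct_deps.get(component, []))
    while queue:
        dep = queue.popleft()
        if dep in required:
            continue  # already visited; this also protects against cycles
        required.add(dep)
        queue.extend(direct_deps.get(dep, []))
    return required


print(sorted(required_components("xserver-mach64", DIRECT_DEPS)))
```

Only one dependency is declared directly for the hypothetical `xserver-mach64` entry, yet four components have to be present before it can be used. This is the effect that makes a complete specification of required components hard to maintain by hand.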
Support System Process
The subject of the support system is any functionality developers use in order to work on the project. It is provided by local software components on their personal computer systems and by integrated services supplied through the connecting computer network, the Internet. It has to be considered that all integrated services used by the individual project members form one very complex service system that is based on a certain information infrastructure. This infrastructure in turn depends on the Internet or is even identical with it. This definition includes the project’s interaction with its users and the corresponding integrated services. Single software components of a local computer system are maintained by the individual developer, who is then considered a regular user of the corresponding project. Some integrated services are not supplied by any project member, but by separate service providers. How much the project can influence the efficiency of such a service depends on the provider’s motivation. Nevertheless, failing services are not useful and are likely to be replaced if there is no possibility to improve them.
4.2.2 Balance between Delivery and Payback
User-Research Relation
When the user is completely satisfied with the execution of a task, he might pay the project something back to make sure it will provide a good solution for this issue the next time, too. This might be simple positive feedback, some money or anything else that the project can appreciate. Although most people do not give any payback, some users do. The special situation of the Internet helps to keep things balanced⁷. As providing something for download requires the same effort no matter how many persons actually use it, projects do not mind downloads without payback as long as the total payback is worth the effort. When the project members are not satisfied with the payback, they lose interest and stop maintaining the project, and it is up to the next user to help himself.
Research-Support-System Relation
OSPs actually produce valuable goods. Although they are normally not paid directly by anyone, they receive many indirect benefits. One of them is the provision of several free integrated services. The sponsors’ commitment might have several reasons, such as advertising income, a better social image or personal usage of the produced component. Experience shows that the provision of free services is a good deal for many organizations. Other services are simply paid for by someone, e.g. the Internet service provider of a private developer.
4.3 A Simple Framework for Open Source Software
This section illustrates the main model of open source software by presenting its surrounding framework. It is closely associated with the structures of OSPs identified in section 3.3 on page 49.
⁷ See [Ghosh98] to learn more about economic theories of the Internet.
4.3.1 Objects
Objects represent passive elements of the model.
Source Code
Source code⁸ objects contain a human-readable form of a software component. Basic information explaining how the software works, e.g. comments, should be included. A skilled developer should therefore be able to understand each step of the program and to modify it according to his needs, provided he has sufficient knowledge of the program’s application domain (e.g. accounting). However, source code alone often demands a lot of extra work in order to understand the concepts and ideas behind the code. For this reason additional information is usually stored in documentation objects. Most source code contains instructions and procedures for building an executable. These objects could therefore be considered source code packages with little meta information and a very simple packaging system, but source code is a developer’s object, whereas packages are intended for users. It is therefore sometimes a question of perspective whether an item is a package or a source code object.
Package
Packages are software components prepared for integration into a suitable computer system. The preparation could support any of the usage phases⁹: system preparation, installation, customization, daily usage, problem solving, update, replacement or removal. Packages therefore include certain meta data in addition to their main contents. The notation and contents of the meta information depend on the packaging system used. Stored information could be dependencies, scripts¹⁰ and signatures. Two different types can be distinguished:
Source Code Packages: The main content is source code. Although it has to be translated into binary code before usage, there are several reasons why some people prefer this type. As the code is compiled on and for a certain computer system, it is possible to optimize the result for the local machine. Besides, anyone who considers modifying parts of the component depends on the source code. The major disadvantage of this type is the extra work, time and know-how required for compiling.
Binary Packages: The main content is executable code (binary code). Most regular users prefer this type, as in most cases it provides the desired functionality with the least effort, provided that everything was prepared correctly. However, problems are sometimes encountered because of bad compilation or incompatibilities with the user’s system. Such problems cannot be resolved independently without the source code, and the difficulties are sometimes tiny, but fatal (e.g. a wrong search path).
⁸ source code (noun): “a text listing of commands to be compiled or assembled into an executable computer program” [Oxford98, source code]
⁹ See section 4.1.3 on page 63 about the different phases of component usage.
¹⁰ Scripts are simple command sequences. Normally, they are used for short tasks and require an interpreter in order to run, as there is no binary code.
Besides the actual software component, packages normally contain a lot of meta data. The stored information depends on the packaging system used. It is important for supporting the integration of the component. In the following, I have chosen a list of entries I consider important. Although it might not be complete, it should give a good understanding of the subject.
1. Information about the contained software component
   a) Dependencies¹¹
   b) Short description of the functionality
   c) Release data: version, date, work status (stable/experimental, alpha/beta/final), etc.
   d) Information about the producing person or organization
   e) License
2. Information about the package itself
   a) Extra package dependencies (e.g. required for the integration procedure)
   b) Release data: version, date, etc.
   c) Information about the packager
3. Procedures for integration, update and removal
Documentation
In most cases there is a lot of additional information available for a package or source code. It could be a text document, figures, pictures, videos or anything else that is helpful for the corresponding subject. The contents might explain how something was done, how it works, how it should be used, etc. Good examples are online manuals or HOWTOs¹².
Distribution
It is hard work to collect several thousand packages provided via the Internet and to build a complete, efficient software system out of them. Users may therefore prefer to pay someone else to do most of this job for them. Distributions are thus pre-selected collections of packages that fit together. They are administered with the help of packaging systems. Normally, the included software components do not require any other components that are not provided as well. The various available distributions differ mainly in the contained packages, the release versions of the components and the packaging system used. Each distribution is therefore aimed at a different user group. Some examples:
1. Software developers want precise access to most parts of their system and may want to modify some of them.
2. A secretary would like to have a comfortable distribution for efficient work without knowing or accessing any of the technical details.
3. Computer systems used for the control or observation of critical processes require the highest possible reliability.
4. Game players require good multimedia capabilities (e.g. excellent 3D graphics) and support for special devices (e.g. a force-feedback joystick).
It might be possible to produce a generic distribution for most of these users, but it would result in low satisfaction for each single task. Today, computer systems are too complex for comfortable handling anyway, and more generic systems make this situation even worse. However, it might be possible in the future.
¹¹ See section 4.1.4 on page 64 for details about dependencies.
¹² HOWTOs is a common term for documents that explain ’HOW TO do something’, e.g. ’How to configure your sound card XYZ’.
Packaging System
This is a method to partially automate the administration of software components in a computer system. In order to handle available and installed packages, the various packaging systems normally include tools to process the following tasks on packages:
Installation: The transfer of the software to a storage medium of the computer system, its integration and configuration.
Update: When a new version of the component becomes available, it is normally not necessary to follow all steps of the installation again, and doing so is sometimes harmful when the corresponding procedures replace user data with default data.
Removal: Cleaning up all references, software, data, etc. that were established by the software component. It is also an important task of these tools to prevent the removal of essential components without replacement, as it would result in an unusable system.
Observation of Integrity: As the installation, update and removal of packages require some superuser privileges, these procedures are dangerous. The integrity and the origin of a package should therefore be observable and checked before usage in order to prevent severe damage.
Issue
People who are engaged with open source software often have ideas they want to share with others. I call such an idea an issue. Normally, issues make a reference to another object. The referenced object might be of any of the given object types, including issues. Examples for the subject of an issue:
Bug: Someone discovered a strange behavior of a software component.
Improvement: A user or developer has a good idea to improve a component in some respect at processing a certain task.
New Feature: A suggestion to integrate some new functionality.
Question: A person requires some information about a certain matter.
Comment: Someone wants to share certain information with another involved person (e.g. an important project has started).
4.3.2 Roles
Roles represent human actors in the model. They are defined by a specific abstract task.
Developer
The developer role represents open source projects. Although the main interest might be executable software, other subjects like documentation or data collections could be the subject as well. The following activities belong to this role:
1. Produce source code for a software component
2. Document the development process
3. Document the usage of software components of other projects
4. Release source code and documentation to packager
5. Resolve issues passed by consultant
6. Communicate with consultant about issues (arising, e.g. with used packages, and resolved)
Packager
A packager collects the results of a developer and prepares their integration into a specific computer system by producing one or several packages from the collected material. Sometimes it is reasonable to split the provided material into different parts, as projects may release several components combined together that could be used separately (e.g. application, used library and documentation). On the other hand, different components could be combined in one package, too. The following activities belong to this role:
1. Select a packaging system
2. Collect source code and/or documentation
3. Collect and produce required meta data
4. Write procedures for installation, update and removal
5. Build package(s)
6. Resolve issues passed by consultant
7. Communicate with consultant about issues (arising and resolved)
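The meta data a packager collects and produces (activity 3), following the list of entries given for packages in section 4.3.1, could be modelled roughly as follows. This is only a sketch in Python: the class names, field names and all example values are hypothetical and do not correspond to the format of any real packaging system.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentInfo:
    """Information about the contained software component."""
    name: str
    description: str
    version: str
    date: str
    status: str      # e.g. "stable" or "experimental"
    producer: str
    license: str
    dependencies: list = field(default_factory=list)


@dataclass
class PackageInfo:
    """Information about the package itself."""
    version: str
    date: str
    packager: str
    extra_dependencies: list = field(default_factory=list)


@dataclass
class Package:
    component: ComponentInfo
    package: PackageInfo
    # Procedures for integration, update and removal (hypothetical script names).
    procedures: dict = field(default_factory=dict)

    def all_dependencies(self):
        """Dependencies of the component plus those needed for integration."""
        combined = set(self.component.dependencies)
        combined |= set(self.package.extra_dependencies)
        return sorted(combined)


# Hypothetical example values.
example = Package(
    component=ComponentInfo(
        name="example-editor", description="A small text editor",
        version="1.2", date="2000-06-22", status="stable",
        producer="Example Project", license="GPL",
        dependencies=["libc6", "xlib6g"],
    ),
    package=PackageInfo(
        version="1.2-3", date="2000-07-01", packager="Example Packager",
        extra_dependencies=["perl-base"],
    ),
    procedures={"install": "postinst", "remove": "prerm"},
)
print(example.all_dependencies())
```

Keeping the component information separate from the package information mirrors the distinction made in the list: a new package version may be released without any change to the component it contains.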
Distributor
Distributors collect the different available packages for their purposes and combine them in a distribution. Normally, each distributor aims for particular goals, e.g. high security or ease of use. According to these goals they choose the components and the released packages. The following activities belong to this role:
1. Determine primary goals for the distribution
2. Select a packaging system
3. Search for suitable packages
4. Compare found packages and choose the best solutions
5. Collect required packages (dependencies)
6. Observe and control included packages
7. Provide tools to administrate and configure the distribution
8. Simplify customization of packages for the usage in this specific distribution
9. Resolve issues passed by consultant
10. Communicate with consultant about issues (arising and resolved)
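Collecting the required packages (activity 5) amounts to making sure that no included package depends on something the distribution does not provide. The following Python sketch shows one way such a check could look. The function name and the sample data are hypothetical, and the representation of a distribution as a plain mapping is an assumption made for brevity.

```python
def missing_dependencies(distribution):
    """Map each package to the dependencies the distribution does not provide.

    `distribution` maps a package name to the list of packages it requires.
    An empty result means the collection is self-contained.
    """
    provided = set(distribution)
    missing = {}
    for name, deps in distribution.items():
        absent = sorted(dep for dep in deps if dep not in provided)
        if absent:
            missing[name] = absent
    return missing


# Hypothetical distribution: "example-game" needs a library that is not included.
sample = {
    "example-editor": ["libc6"],
    "example-game": ["libc6", "example-3d-lib"],
    "libc6": [],
}
print(missing_dependencies(sample))
```

A distributor could run such a check whenever a package is added or updated, so that gaps are found before users encounter them.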
User
A user obtains a distribution somehow and utilizes it. He may get it from a shop, download it or borrow it from a friend. The available options depend on the specific distribution and its license. The main concern of a user is the processing of his tasks. The following activities belong to this role:
1. Select distribution
2. Install basic computer system
3. Select tasks he wants to process
4. Select packages suitable for these tasks
5. Install selected software packages
6. Use selected software packages
7. Communicate with consultant about arising issues
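Before a selected package is installed (activity 5), its integrity should be checked, as described for packaging systems in section 4.3.1. A minimal Python sketch of such a check compares a digest of the downloaded package with a digest published by the distributor. The function names are hypothetical, and the use of SHA-256 is an assumption made for this illustration. Note that a digest alone only detects modified contents; verifying the origin of a package additionally requires a signature, which is not shown here.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the package contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(package_bytes: bytes, published_digest: str) -> bool:
    """True if the package contents match the digest published for it."""
    return sha256_of(package_bytes) == published_digest.strip().lower()


# Hypothetical package contents and the digest a distributor would publish.
package_bytes = b"example package contents"
published_digest = sha256_of(package_bytes)

print(is_intact(package_bytes, published_digest))         # unchanged package
print(is_intact(package_bytes + b"!", published_digest))  # modified package
```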
Consultant
It is quite normal that many problems and ideas arise when dealing with software components. As various parties are involved from creation to usage, identifying the origin of a problem is often the hardest part of solving it. A central point of access is therefore essential for unskilled users, because they are not able to locate the origin themselves and do not know which party they should contact. Although the distributor and the consultant could be the same organization, they do not have to be. Smaller companies might use a distribution produced by someone else, but offer the corresponding consultant service. The following activities belong to this role:
1. Communicate with user, packager and distributor about issues
2. Resolve received simple issues independently
3. Summarize, filter and complete other issues
4. Identify affected parties (developer, packager, distributor) for these issues
Integrated Service Provider
Since the participating parties could be spread all over the world, most of the communication will be done via the Internet. An advanced communication and information infrastructure is therefore required. The integrated service provider offers this service. Although one big homogeneous infrastructure would be a nice thing, it would be hard to establish. Interfaces between the different systems are therefore necessary. If they do not exist, the transfer has to be done manually. The following activities belong to this role:
1. Provide communication channels
2. Collect informational objects
3. Provide informational objects
4.3.3 Relations
As the subject of open source software is information in various forms, I have chosen the information exchange to illustrate the relations in figure 4.3 on page 75. Although some aspects are not considered from this point of view, it represents a major part of the actual relations. The simple provision of informational objects¹³ is represented by single-sided arrows. Putting something on your home page is a suitable example of the nature of this communication, but there are more efficient ways to publish this kind of information, as you know who is potentially interested in the provided objects (where the arrow points at), and it would be better to put them at a central point, so that interested persons do not have to search the whole Internet in order to find your information. Lines with arrows on both sides do not stand for the simple supply of information in both directions, but for a process of interactive exchange of knowledge, which normally results in an increase of knowledge on both sides, although the nature of this information usually differs. For instance, one person gets an answer to his question and the other learns about the lack of information the first one had.
¹³ ’Informational object’ is a term I use for any object that could be transferred by electronic communication systems. This could be source code, software, text, speech, etc.
Figure 4.3: Information Exchange between the Different Roles. [Diagram showing the roles Developer, Packager, Distributor, User and Consultant and the arrows between them.]
4.4 Software Production
I have described the improvement of open source software components¹⁴ and a framework¹⁵ in which the corresponding processes could take place. Figure 4.4 on page 76 shows how these activities work in this framework. The support system process is left out, as it corresponds to the role of the integrated service provider, which is an implicit part of the given activities.
Research
Research represents the activities of the developer. The provided experience is used in order to improve the subject of research: a component. This could be a software component or documentation. The produced or improved component is released and thereby passed on to integration. Although the developed component is normally created to be used, the developer also considers integration and usage a test run of his results.
¹⁴ See section 4.2 on page 65 for details about the improvement of open source software systems.
¹⁵ See section 4.3 on page 68 for details about the framework.
Figure 4.4: Activities of Software Production. [Diagram of the four activities Research, Integration, Usage and Experience.]
Integration
Integration covers the activities of the packager, the distributor and a certain part of the user role. The packager concentrates on single packages, but he has to consider other packages because of the corresponding dependencies. The distributor works on a complete collection of packages (sometimes several thousand) and combines them into one big potential software system. The user then selects some of these packages, integrates them into his computer system and thereby customizes them to fit his personal needs.
Usage
This activity is performed by the user. The user simply does what he wants and talks about arising issues.
Experience
The consultant collects all the issues that emerged from integration and usage. He filters out all parts that do not require interaction with the developer. The remaining matters are summarized, completed and passed on to the developer, either for improvement of the component or to gain the additional knowledge the consultant needs for independent solutions.
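The filtering and forwarding performed by the consultant can be sketched in a few lines of Python. The sketch is a simplification under stated assumptions: whether an issue is simple and which party is affected are judged by a person and merely recorded here, and the class, field and function names are hypothetical.

```python
from dataclasses import dataclass

# Parties an issue can be forwarded to, as named in the consultant role.
PARTIES = ("developer", "packager", "distributor")


@dataclass
class Issue:
    kind: str       # "bug", "improvement", "new feature", "question" or "comment"
    summary: str
    simple: bool    # assumption: decided by the consultant by hand
    affected: str   # party identified as the origin; ignored for simple issues


def triage(issues):
    """Return (issues resolved by the consultant, issues forwarded per party)."""
    resolved = []
    forwarded = {party: [] for party in PARTIES}
    for issue in issues:
        if issue.simple:
            resolved.append(issue.summary)
        elif issue.affected in forwarded:
            forwarded[issue.affected].append(issue.summary)
        else:
            raise ValueError(f"unknown party: {issue.affected}")
    return resolved, forwarded


# Hypothetical issues collected from integration and usage.
collected = [
    Issue("question", "how to install", True, ""),
    Issue("bug", "crash on start", False, "developer"),
    Issue("bug", "wrong search path", False, "packager"),
]
resolved, forwarded = triage(collected)
print(resolved)
print(forwarded)
```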
5 Technical Support for Open Source Projects
Open source software development is based on the Internet. Without its infrastructure, such large-scale development with so many independent participants all around the globe would be impossible. The communication and information infrastructure is therefore an essential part of the development environment of the different projects. As the Internet itself, with email, newsgroups, the World Wide Web and other basic technologies, can be considered an effective infrastructure, the open source movement used this generic functionality for its collaboration. Many projects still have not introduced more specialized services today, as they are suspicious of sophisticated systems that seem comfortable at first glance, but are actually less effective in the long run. For example, such systems could make the infrastructure more complex and thereby harder to maintain without adding extra value. Nevertheless, the situation has changed, and many projects have adopted some of the new systems or are planning fundamental changes to their infrastructure. Although each developer uses the support system like one big system, it consists of many different parts provided by several organizations. The support system is also maintained by several administrators, as it is distributed over many collaborating computer systems, and there is a fluid transition between the personal computer system of an individual developer and the connecting infrastructure. As each developer has his own favorite editor, debugging method, etc., I have concentrated in the following on the investigation of the infrastructure used and the related services.
5.1 Weak Points of Former Support Systems
To say that open source software production does not have any kind of problems would be wrong. Actually, there are many difficulties, and the developers have done an excellent job to achieve the current results in spite of all the obstacles they have to face. Nevertheless, many present-day problems in OSPs are caused by insufficient technical support. Many new tools and services have emerged out of this situation, but in order to understand current support systems and their job it is very helpful to know about the situation before their introduction and the problems that demanded their development. The following difficulties are some examples of the situation without good technical support.
5.1.1 Language
Although English is the language of the open source community, it is not the native language of all its members. This results in several problems. There are misunderstandings, and people could feel offended by others because their words could be interpreted in several ways. Another problem is that many people might simply not speak English well enough to follow a conversation. The single language therefore excludes several persons from participation.
Possible Support
1. Kalle Dalheimer of the KDE project¹ talked about such problems and suggested personal contact at events like developer meetings as one solution.
2. Basic automated translation systems like Babelfish can improve the situation to a certain degree.
3. Voluntary translation of important documents can help people to overcome the language barrier. This procedure requires well-coordinated management of the different documents, otherwise it would lead to a lot of confusion.
4. The formation of local user groups that communicate in their native language might be another solution for important projects. However, the source code and related documents should be written in one language in order to prevent total chaos.
5.1.2 Mailing Lists
The normal way to solve a problem with open source software is to post a short description to the associated mailing list, but it sometimes happens that the only answer is silence. In these cases people may think:
1. “Was the question too silly?”
2. “Have you asked something that was already answered a thousand times?”
3. “Is your domain filtered out and the mail has never reached the mailing list?”
4. “Does someone have to check your problem before he wants to answer?”
5. “Does nobody know a solution for your problem?”
6. “Does everybody think your question has already been answered?”
7. “Is something wrong with your mail account and the reply got lost?”
8. “Have you simply mailed to the wrong mailing list?”
9. “Is nobody interested in your problem?”
10. “Were the answers only sent to the mailing list but not to you personally and you have not subscribed as you do not want to handle one hundred mails a day?”
11. “Are there any other reasons you have not even thought of?”
It is a really uncomfortable situation when you only use software and some problem arises, but it must be even harder for someone who has contributed a significant part to ’hear’ this silence after suggesting a good idea. Marc Lehmann of the GIMP project gave a good example of this problem²: their developer mailing list was so crowded that up to ninety percent of the mails were of negligible value for the developers, as many users discussed their issues there. Eventually, important announcements were simply overlooked among all the other mails. After some frustrated developers threatened to split the project, the problem was resolved. Mailing lists are a very simple form of communication, and it does not require a lot of work to establish them. However, they have several weaknesses:
1. People do not get a confirmation of the arrival of their message when they are not subscribed. Although there are archives of most mailing lists, messages need several hours or days to become visible there.
2. People never know if someone has read or even opened their message.
3. Nobody knows if someone is dealing with the message. Double work or no attention might be the result.
4. The number of postings sometimes reaches an unbearable level (sometimes more than one hundred a day) and it is hard to process the messages in a reasonable way.
5. In order to be sure to receive all interesting mails about a certain issue you have to subscribe to a mailing list, but many people are only interested in one particular problem and do not want all the other mails.
6. Searching the archives for certain issues can be a hard job, especially without a specific catchword.
¹ KDE stands for K Desktop Environment (http://www.kde.org). Kalle Dalheimer mentioned these problems at the ’Wizards of OS’ conference in 1999, a congress in Berlin about Open Source Software (see http://www.mikro.org/wos for details).
Possible Support
1. Create several different mailing lists with exactly defined tasks.
2. Filter or restrict mailing lists in order to prevent the distribution of undesired postings.
3. Find another communication system that is more suitable for the situation.
5.1.3 Multiple Work
As OSPs coordinate so little³, people sometimes do the same work independently, and the different parties often do not even know about each other. This duplication consumes additional resources, but it also has the nice side effect that there are often several solutions to choose from. The choice between different alternatives in turn helps to improve the software’s quality in most cases. However, the concurrent research sometimes leads to conflicts and passionate discussions when someone feels left out of matters that are important to him. The argument about the SCSI generic driver of the Linux kernel is a good example of these conflicts [Schilling99][Gilbert00].
² Marc Lehmann described this problem at the ’Wizards of OS’ conference in 1999.
³ See section 3.1.6 on page 42 for a detailed description of the coordination of human actors in OSPs.
Possible Support
1. A central place for the registration of different projects might improve the projects’ knowledge of similar work.
2. A topic-oriented, project-independent communication platform might help to exchange ideas and research results, provided that all parties are willing to collaborate.
5.1.4 Pending Tasks
Because of the way human resources are managed, it sometimes happens that tasks are not processed for a long time. Participants of large projects in particular sometimes lose their overall view and simply do not know about pending tasks. In other cases the responsible person does not have the time to continue his work, and nobody knows that he has stopped working on the issue.
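How a support system might notice such forgotten tasks can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Task` record, its fields and the 30-day threshold are hypothetical and not taken from any existing tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Task:
    """Hypothetical task record; the field names are assumptions."""
    title: str
    assignee: Optional[str]  # None means nobody has taken the task
    last_activity: date      # date of the last recorded work on the task


def stale_tasks(tasks: List[Task], today: date,
                max_idle: timedelta = timedelta(days=30)) -> List[Task]:
    """Return the tasks without recorded activity for longer than max_idle."""
    return [t for t in tasks if today - t.last_activity > max_idle]


tasks = [
    Task("Fix printing bug", "alice", date(2000, 3, 1)),
    Task("Update manual", None, date(2000, 5, 20)),
    Task("Port to sparc", "bob", date(2000, 6, 10)),
]

for task in stale_tasks(tasks, today=date(2000, 6, 22)):
    # Unassigned tasks are announced to everybody instead of one person.
    target = task.assignee or "the project list"
    print(f"Reminder to {target}: '{task.title}' idle since {task.last_activity}")
```

Whether a reminder goes to the responsible person or to the whole project is a policy decision; the sketch merely separates the two cases.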
Possible Support
1. Reminder messages could be sent to the persons who are responsible for pending tasks.
2. Tasks that have not been assigned to anyone for a while can be announced as pending before they become a real problem.
3. A developer working on an issue that has turned out to be too difficult for him can ask for support.
5.1.5 No Written Conventions
OSPs normally do not have anything like laws or written conventions, but there are rules all participants are supposed to follow. This is a very uncomfortable situation for newcomers. It could be a good procedure to scare off people who are not fully committed to the project, but normally this is not intended.
Possible Support
1. Many conventions could be integrated into the support system, which can then carry out the corresponding actions itself. For instance, when a convention exists to forward a message concerning a particular task to everyone who has worked on it, the support system can send new messages about the issue to all of them, and the sender does not need to search old postings to find these persons.
5.1. Weak Points of Former Support Systems
2. The conventions can be written down and provided to the participants, as in the Debian project.
3. Project members could ask the support system for hints about certain conventions. This could make it easier for new participants to integrate into an existing group, though it requires the explicit formulation of the conventions.
5.1.6 Release Versions
There are a lot of different release versions of software components circulating on the Internet. Although it is normally easy to distinguish different versions by their ’version number’, it is sometimes very hard to figure out what the version number means. For instance, one announcement in the Linux Daily News⁴ contained the following phrase: “It turns out the 2.4.0-test1 is actually 2.3.99-pre10-pre3 [...]”. And people may find better examples of strange version numbers ...
So, what is all this about? Projects want to have self-explanatory version numbers, e.g. ’pre’ stands for pre-release. I have listed the major pieces of information included in version numbers below:
Numbers: Numbers like 2.3.20 usually mean major version 2, minor version 3 (a development branch, because 3 is odd), point release 20.
Development Branches: Many projects have different branches of development and therefore of release: ’stable’ for productive usage, ’experimental’ for developers, ’sparc’ for Sun SPARC stations, etc.
Work Status: OSPs provide as much information as possible to the community. Therefore they use different labels to inform the user about the work status of the corresponding release: alpha (heavy development), beta (features frozen), pre (last test run), RC (release candidate), final (that’s it); and there are more.
Code Names: Some projects use code names for their major releases, e.g. ’potato’ instead of 2.2.
Release Dates: Some projects use the release date as the version number. This has the disadvantage that different countries have different conventions for the order of day, month and year. For this reason it is sometimes not clear which version is the newest.
Custom Versions: Major contributors of some projects make their own versions available and mark them somehow (e.g. by the suffix “ac”) for others, so everyone knows that this is only a custom version and not an official release.
None of these conventions is standardized, and many projects use them differently. Therefore it is useful to find out about the versioning policy of the corresponding project in order to be sure of their meaning.
Possible Support
1. Standardized version numbers.
2. Store the meta information somewhere other than in the version number.
⁴ ’Linux Weekly News - Daily Updates’ on May 25th, 2000. (See http://lwn.net/daily/.)
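The numbering convention described in section 5.1.6 (major version, odd or even minor version, point release, optional work-status label) can be made explicit with a small parser. The following Python sketch is only an illustration: the `Version` type and both function names are hypothetical, and it covers only the three-number form with an optional label after a hyphen, not code names, release dates or custom suffixes such as “ac”.

```python
import re
from typing import NamedTuple, Optional


class Version(NamedTuple):
    major: int
    minor: int
    point: int
    label: Optional[str]  # work-status label such as "pre10-pre3", or None


# Three dot-separated numbers, optionally followed by "-" and a free-form label.
_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$")


def parse_version(text: str) -> Version:
    """Split a version string like '2.3.99-pre10-pre3' into its parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, point, label = match.groups()
    return Version(int(major), int(minor), int(point), label)


def is_development(version: Version) -> bool:
    """An odd minor number marks a development branch under this convention."""
    return version.minor % 2 == 1


for text in ("2.3.20", "2.4.0-test1", "2.3.99-pre10-pre3"):
    version = parse_version(text)
    print(text, "->", version, "development:", is_development(version))
```

Keeping the label in a separate field is one way to read the second suggestion above: the meta information stays available without having to be decoded from the number itself.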
5.2 Basic Support Tasks
In section 3.3.1 on page 49 I have illustrated the basic activities of the development work of OSPs: work, coordination, communication, documentation and decision-making. Since helping the project to succeed is the primary goal of the technical support system, supporting these activities is a major task. In addition, one further activity emerges from the technical environment: protection.
The presented activities (as action categories) give a good overview of the actual job of the technical support system, but when looking at concrete technical issues a sharp division is no longer useful, as the abstract tasks only appear in combination. For example, coordination is impossible without communication, and version management covers more or less all six categories. Therefore I will not consider these categories in the following, but concentrate on the presented problems to provide another view of the subject.
5.2.1 Protection of Stored Data and Performed Actions
Computer systems control a lot of valuable information, many expensive devices and several fee-based services. As most people do not like losing valuable goods, they prefer to protect these things against any damage or misuse, if possible. This issue becomes even more important for computer systems connected to the Internet, because it allows any other person linked to this network at the same time to interact with the system. Although many obstacles have to be overcome before someone gains control over your system via the Internet, it is a real threat, as recent attacks like the I-love-you virus have shown. This protection is the task of system security⁵. There are several different threats that have to be faced:
System Failures: Today, most computer systems contain bugs of some kind, and many of them result in data loss or other unpredictable, harmful behavior. Although most people do not care about technical perfection, they care a lot about their data and the time they have spent generating it. Security could mean the absence of any system failures, a state most computer users and software vendors would prefer, but it seems to be unreachable in the near future. However, accepting the shortcomings of current computer systems gives us the opportunity to work on strategies that reduce the resulting damage to a minimum. The simplest examples are frequent system backups and journaling file systems.
Bad Operation: Since the short description of certain tasks is sometimes hard to understand, users might command the execution of jobs that they would never have allowed if they had known what they actually mean. The average user normally does not really know what he is doing, but hopes it is the right thing: the good old trial-and-error method. Still, it is often the users’ own fault, as they do not read all the available documentation. However, this is understandable behavior, as they do not turn the computer on to read manuals for hours, but to get their job done. Therefore it would be best to make computer systems intuitive and foolproof by default and to allow the corresponding experts to get around the security mechanisms through special access (e.g. the Unix root account).
Insufficient Feedback: Any action a computer system is commanded to perform could fail for various reasons, e.g. no floppy disk in the drive while saving. This alone has to be accepted because it is sometimes the user’s problem (he forgot to put the floppy in the drive), but getting no reply at all, or even a confirmation of the execution (e.g. ’OK’), is a bad security policy. Any failure should be corrected or reported reliably to the responsible user.
Unauthorized Retrieval: So far, we have investigated incidents that are not caused on purpose. Nobody actually wanted to produce any damage, but there are also some ’bad guys’ out there, maybe only your little brother who is curious about the last email you have written to your girlfriend. Although the open source movement is based on information exchange, even the developers want to keep some information confidential, e.g. passwords, credit card numbers or some parts of their private life. Therefore it is important to secure a certain level of privacy where demanded and to make sure that nobody can obtain the protected data.
Data Manipulation: Another kind of attack is the manipulation of stored information, such as removal or modification. Most computer viruses belong to this group, as they modify certain system components and thereby cause malfunctions and/or data loss. Breaking into a computer system and erasing all or some stored files is another example. One of the most harmful actions is the secret modification of some frequently used components, e.g. a standard library, as this could produce a lot of damage before detection⁶.
The given problems are general computer problems, but they are also essential for the technical support of OSPs. The systems are connected to the Internet, which increases security risks dramatically because the systems become more complex (system failures, etc.) and can be attacked much more easily.
⁵ security (noun): “the state of being free from danger or threat” [Oxford98]
5.2.2 Observation of Actions
There are different strategies to make sure that nobody inside a technical support system is causing damage to the project. For instance, the software could monitor all performed actions and enforce their correctness according to the policy rules by refusing forbidden ones. Another possibility is to leave the executive part to a human being and merely inform someone about undesired activities. However, it is often difficult to give generally valid rules for what the different participants are allowed to do and what the software should refuse to process. Being on the safe side increases security, but it reduces the developers’ freedom and could chase them away from such a ’paranoid’ project. Therefore observation (e.g. by logging performed actions) is sometimes the most reasonable procedure, although it takes away a lot of privacy from the participants. In any case, the usual situation is a responsible developer who does not need to be controlled by a Big Brother and does not like to be. Still, there might be a few black sheep who can mess up everything, and only a little bit of observation of the system could stop them ... It is the job of each project to find its own balance between privacy and security, and the corresponding technical support should be prepared to enforce the agreed policy.
⁶ I have heard about an incident in a research institution that had to do a lot of calculations, where someone slightly modified nothing but the system constant π without being detected. So, all calculations based on π were inaccurate until the problem was discovered after several years. Although I cannot prove the truthfulness of this story, the possibility demonstrates the potential danger.
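The two strategies discussed in section 5.2.2, refusing forbidden actions versus merely recording them for a human to review, can be contrasted in a short Python sketch. Everything in it is an assumption made for illustration: the roles, the actions and the `POLICY` table are hypothetical, and a real project would have to agree on its own rules.

```python
from datetime import datetime, timezone

# Hypothetical policy: which role may perform which actions.
POLICY = {
    "maintainer": {"read", "commit", "release"},
    "developer": {"read", "commit"},
    "guest": {"read"},
}

# Every attempt is recorded here, whether it was allowed or not.
audit_log = []


def perform(user: str, role: str, action: str, enforce: bool = True) -> bool:
    """Log the attempted action; refuse it only if enforcement is switched on."""
    allowed = action in POLICY.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "allowed": allowed,
    })
    if enforce and not allowed:
        raise PermissionError(f"{user} ({role}) may not {action}")
    return allowed
```

With `enforce=False` the system only observes, and somebody has to read the log; with `enforce=True` it refuses the action itself. The balance between the two is exactly the privacy versus security decision each project has to make.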
5. Technical Support for Open Source Projects
5.2.3 Version Management of Stored Data
The task of version management is to record any modification to the project data and, on request, to restore the data to any state that can be specified accurately. There are many good reasons for the usage of version management systems in OSPs. I just want to give some examples:
1. When a chosen strategy to implement a feature or solve a problem turns out to be the wrong one, it is much easier to get the entire source code back into the previous state.
2. Version management systems normally provide a version history and the opportunity to store log messages. These features can be used as documentation of the development process, which can be useful for various purposes.
3. Concurrent revision control systems like the famous CVS⁷ help a lot to coordinate and merge distributed work. The administration of the various received results would be impossible for many projects without them.
5.2.4 Online-Offline Management
Large parts of the support system are not situated on the local computer system, the different computers are not permanently connected to each other, and developers might work on various systems (e.g. office and home), but they always use and modify the same project data. Version management is responsible for handling conflicts that could emerge from concurrent work on the same data, and security has to make sure that the distributed system itself is protected, but the distributed support system must also allow users to attach to and detach from it without difficulties. The work of the participant should be possible in both states, in particular without a connection, as many users have to pay for their connection to the Internet or use a voice phone line. Everything that is caused by the frequent disconnection belongs to this task. Some examples:
1. Data synchronization: Each time a developer goes online, the local and remote data are updated and thereby synchronized again.
2. Notification about the presence of participants for real-time communication.
5.2.5 Distribution of New Information
Information in its various forms (software, documents, URLs, etc.) is the most important resource for OSPs. Every day the knowledge available online grows, and a certain part of it is useful for certain projects. As soon as one participant discovers some useful information he can share it with others. However, the distribution raises several problems:
1. Which members are interested in the discovered resource?
2. What is the best way to notify interested persons?
3. How should this source of information be archived for later retrieval?
4. How do we prevent communication channels from getting flooded with such hints, considering there are thousands of developers searching the Internet and talking about what they find?
⁷ CVS stands for ’Concurrent Versions System’. See http://www.cvs.org for details.
5.2.6 Communication between Participants
The Internet itself provides several means of communication like email, newsgroups and Internet Relay Chat, but they are insufficient as an easy, efficient communication system. I think it would help OSPs a lot if they found better ways to communicate. However, using something more complex than the basic Internet infrastructure also introduces problems: for instance, all developers must install the extra software everywhere they want to communicate with the project from, whereas email is available everywhere. To illustrate what I mean, I will give some additional features for a theoretical replacement of a regular mailing list:
1. The opportunity to check whether your message has successfully entered the communication system
2. The ability to count how many persons have actually read your message (or at least looked at it)
3. Joining/leaving single discussion threads instead of entire mailing lists
4. Opening/closing discussions
5. Notification about new discussion issues (instead of all posted messages)
6. The possibility to view all open discussions
7. An intelligent automatic system to ’invite’ all affected participants to a discussion (maybe together with information distribution?)
8. Preparation for easy retrieval from an archive
Of course, there are many means of communication, but mailing lists seem to be the most important one. For the creation of a better communication system all others should be examined as well.
5.2.7 Issue Administration
Issues⁸ are basically ideas that require some work to be done. They play an essential role in OSPs as they are the major instrument for coordination. They are a simple but efficient means to give a project’s activities a direction and to get an overview of what has to be done in the future and of what is going on in the project. Therefore it is important to find a good way to handle and administer issues, as it helps to reduce work and increases efficiency. Technical support should help to make things transparent. Some examples:
1. who works on which problem
2. which group requires some help
3. what depends on a certain issue
4. who is familiar with this topic
5. where can I find more information
6. provide a place to comment
7. what progress has been made
8. when was the last modification (is the issue dead?)
⁸ The various functions of issues in OSPs have been described in section 4.3.1 on page 69.
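The questions listed for issue administration map quite directly onto the fields of a record. The following Python sketch shows one possible shape of such a record; the class, its field names and the 180-day threshold are hypothetical and serve only to make the list concrete.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Issue:
    """Hypothetical issue record; each field answers one of the questions above."""
    title: str
    last_modified: date                                   # when was the last modification
    assignees: List[str] = field(default_factory=list)   # who works on the problem
    help_wanted: bool = False                             # does the group require help
    dependents: List[str] = field(default_factory=list)  # what depends on this issue
    experts: List[str] = field(default_factory=list)     # who is familiar with the topic
    references: List[str] = field(default_factory=list)  # where to find more information
    comments: List[str] = field(default_factory=list)    # a place to comment
    progress: str = ""                                    # what progress has been made

    def is_dead(self, today: date, max_idle_days: int = 180) -> bool:
        """Treat an issue as dead if nobody has touched it for max_idle_days."""
        return (today - self.last_modified).days > max_idle_days


issue = Issue("Rewrite SCSI docs", date(2000, 1, 10), assignees=["dana"])
issue.comments.append("Draft of chapter 1 is online.")
print(issue.is_dead(today=date(2000, 8, 13)))
```

A fixed idle threshold is of course a simplification; a project might prefer to ask the assignee before declaring an issue dead.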
5.2.8 Automation of Frequent Processes
Most tasks mentioned above could be considered a special case of this category, but they are too important to be covered in this way. There is a lot of boring work in a project that has to be done frequently without anyone being interested in anything else than the result. This stuff could often be processed automatically by some technical support software, e.g. setting and checking all access permissions correctly. However, it is important to watch each of these tasks closely while transferring them to system control as automation that first seems to be comfortable often becomes annoying and disturbing in the long term. Automation should support and help the participants and not hinder them. Therefore it is normally the best solution to leave the decision whether they want to use a specific tool to the individual project members instead of activating it for the entire project, but the simple provision of an option for additional technical support is generally a good thing.
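As an illustration of the example mentioned above, checking access permissions, the following Python sketch lists every file or directory below a given root that any user of the system may modify. It is only a sketch under assumptions: it presumes POSIX-style permission bits, the function name is hypothetical, and it reports findings instead of changing anything, so that the decision stays with the project member.

```python
import os
import stat
from typing import List


def world_writable(root: str) -> List[str]:
    """Return the paths below root that are writable by every user."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # permission bits of symbolic links are not meaningful
            if mode & stat.S_IWOTH:
                findings.append(path)
    return sorted(findings)
```

Calling `world_writable(".")` in a project directory prints nothing by itself; the caller decides whether to warn, to fix the permissions, or to ignore the result.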
5.3 Technical Support for Distributed Development
This section investigates some current technical support solutions and their suitability for open source software development.
5.3.1 Groupware: Tools for Traditional Collaboration
Traditional Groupware⁹ is designed for large companies that have several collaborating branches or groups which prefer to collaborate via computers for some reason (different working hours, etc.). Because of this target group, some typical features can be associated with the design of these systems:
1. The social structures of the corporate environment are integrated into such systems, as they would not be adopted by the participants otherwise.
2. The collaboration model is based on existing, traditional working processes.
3. The virtual team resembles existing project teams in corporate environments.
4. The size of a virtual collaborating team is relatively small (2 - 20).
5. The ideal of the collaboration system is to provide an equivalent to the physical presence of the participants in the same room.
Unfortunately, OSPs use another collaboration method:
1. The number of participants varies from one to several thousand.
2. OSPs have a totally different social structure, as their members are volunteers and do not necessarily share a common environment like the same company, culture, profession, education or language.
3. Members come and go.
Therefore Groupware is not suitable to support OSS development in most cases.
⁹ Groupware (noun): “Computer mediated collaboration that increases the productivity or functionality of person-to-person processes.” [Coleman97, Section 1.2]
5.3.2 Present Means of Communication
I have looked at several OSPs and tried to identify their ways of communicating. The following list might not be complete, but at least gives a good impression:
Mailing Lists: The most common means of communication are mailing lists. It is amazing how much can be done with this simple method.
News: Newsgroups are similar to mailing lists.
Internet Relay Chat: Some projects use this text-based, real-time communication system for their conversations. One disadvantage is that the information is normally not recorded for later use.
Personal Mails: Sometimes it is easier to do some work privately and only present the result to the public. For this task regular mails are still the preferred medium.
Meetings: As most developers of OSPs are spread all over the globe, it is an expensive undertaking to meet personally. For this reason this is done very rarely.
Documentation: HOWTOs¹⁰, manuals and all other kinds of documentation are an important source of information for project members, although they might be intended for users in the first place.
Source Code: Reading the source code itself might be hard, but there is a lot of information in it. For a good programmer who is familiar with the work it is sometimes the most important source of information, especially when there is a large number of comments in it.
Special Tools: Investigating the log messages of the version control system (e.g. CVS), the bug reports of bug tracking systems or information archived by some other tools can be helpful.
Traditional Communication: Things like telephone or letters still exist. Maybe someone uses them.
Next Generation Communication: Video-conferencing, white boards, etc. may soon enter the open source scene.
¹⁰ HOWTOs is a common term for documents that explain ’HOW TO do something’, e.g. ’How to configure your sound card XYZ’.
5.3.3 Support for Open Source Software
Several services have started in the past year as a special support for the production of OSS. Most of them are still under heavy development and some have not left the ’beta phase’ yet. Still, the supplied services at the present day make the work of OSPs much easier as they provide an efficient infrastructure for this kind of development:
Opendesk: “Opendesk.com is an Application Service Provider¹¹ designed especially for small businesses [...]. When you create an Opendesk.com account, your company is given a secure web address where you can access all elements of your company Intranet: Intranet, Business Tools (software applications), Organizer, Storefront, File Storage.” [Opendesk, What is Opendesk.com]
WISE: “The Web-Integrated Software metrics Environment (WISE) is the first WWW-based project management and metrics system available on the WWW. WISE is a WWW-based tool that provides a framework for managing software development projects across the Web. Programmers and managers can log issue reports, track the status of issues, and view project metrics using standard WWW browsers. WISE provides a non-intrusive method to coordinate project activities and allow software development teams to view their progress and performance.” [WISE, The WISE Project Management System]
SourceForge: “SourceForge is a free hosting service for Open Source developers which offers, among other things, a CVS repository, mailing lists, bug tracking, message forums, task management software, web site hosting, permanent file archival, full backups, and total web-based administration.” [SourceForge, What is SourceForge?] “With about a third of the known open source projects being developed there so far, SourceForge has already become the single most important development center in the world of open source.” [VALinux00]
Bitkeeper: “A scalable configuration management system, supporting globally distributed development, disconnected operation, compressed repositories, change sets, and named lines of development (branches).” [Bitkeeper, What is BitKeeper?]
Asynchrony: “Asynchrony is an Internet community where software developers can connect with each other, create software together and share in the profits of products they develop. [...] We provide a virtual environment for programmers to find each other, create projects and collaborate. We market the resulting [open source or proprietary] software. The programmers receive the lion’s share of revenue. Millions of programmers around the world who currently program for free because they love to do it will be able to program for profit on the Asynchrony.com website.” [Asynchrony, What is Asynchrony?]
Cosource: “Cosource.com is a collaborative, reverse-auction web site enabling international consumers and developers of Open Source products to work together to fund development of innovative software solutions.” [Cosource, Introducing Cosource.com]
SourceXchange: “[S]ourceXchange is a new forum linking Open Source Software developers worldwide with intensifying commercial interest in Open Source software. [...] [S]ourceXchange is a wide-open marketplace to facilitate a dynamic exchange between buyers and sellers, a place where highly skilled Open Source developers supply their expertise to committed buyers with well-defined, financially-backed Open Source projects.” [Sourcexchange, What We Are]
¹¹ “Application Service Providers deliver, host and manage software applications on a remote server. Most ASPs rent applications on a per user basis. ASPs allow for the outsourcing of much of the technical expertise required to host and manage applications, freeing up time and resources for the customer.” [Opendesk, explanation for ASP]
6 Conclusion
Looking at open source software has shown that software usage and development are inseparable. Therefore the development process starts with the first related idea and ends with the removal of the software from the last hosting computer system. Another effect is the direct or indirect inclusion of all users in the development, though some of them might do no more than frequently update their software components. In order to handle these new circumstances, the major task of special development environments for open source software is the provision of a suitable information and communication infrastructure. This infrastructure must consider all special features of the entire process of open source software development, particularly the mentioned integration of users into the development process. Several basic structures have been identified that could help to create a complete open source model in the future. The following gives a short summary.
Social Background
The open source community is not governed by any central authority which represents its members or is entitled to decide and enforce any kind of rules. Therefore the open source community cannot be examined as an organization, but only as a social phenomenon including many different interacting organizations and individuals. Without a central management the community depends significantly on the social relations of the global Internet society. The resulting dependencies have a strong impact on activities inside the community and on its interaction with the rest of the world. For this reason I tried to identify the major social structures by investigating various aspects of the community like economy, law, philosophy of science, politics, history and software engineering.
The examination has helped to gain a basic understanding of the open source phenomenon. In particular, the role of intellectual property and the special application of the corresponding law by open source projects is the most important result. As the definition of the term ’open source software’ is based on the attached licenses, they turned out to be the only reasonable criterion to distinguish open source projects from other ones. These open source licenses should be seen as the legal representation of more complex models. The subject of these models is software and all related issues like development or usage. Additionally, it is important to understand that each license is derived from different underlying principles that might belong to various categories like economy, philosophy or politics.
Therefore the open source phenomenon should not be considered as one united movement of any kind, but as a collection of many individuals and parties who think differently and share an interest only in a small field. When covering all participating parties, the only general agreement appears to be the acceptance of open source software as a useful thing. Actually, some parties seem only to tolerate open source software for the moment because it does more good than harm to their own interests.
Open source projects (OSPs) as the organizational form of software developers turned out to be the basic element of the social network of the community. Although many economic activities are based on the projects’ results, OSPs themselves do not seem to follow economic principles for various reasons, e.g. the lack of business management. Considering OSPs as academic research appears to be a more suitable model.
Organization of Open Source Projects
OSPs are forced to adapt themselves to the special conditions of their activities, like distributed membership, no direct income and no strong management. These circumstances usually result in certain organizational structures, which have the following consequences:
1. Many committed users participate in OSPs. Actually, most projects are started by users who have not found the right component for their own needs. For this reason, there is little risk that the result will not satisfy its users. Additionally, releasing software as open source gives anyone the opportunity to make components fit his requirements by participating in the project or starting a competing one.
2. Precisely specifying the task of a software component prior to implementation has turned out to be difficult and expensive for software development in general, but specifications are often required for a business contract to clarify the given job. As OSPs do not get paid for producing any results, they do not need such specifications either. Instead, they use early and frequent software releases to approach the component’s actual task by continuous modification.
3. Any kind of OSP management needs the unanimous approval of the involved project members because they are volunteers. For this reason traditional management based on authority does not work, and existing management usually only suggests tasks and selects finished results, while any participant is actually free to do what he wants while working. Unmanaged activities are coordinated by social interaction supported by software tools and Internet services. Finally, competition on all levels eliminates inefficient procedures.
4. A relatively optimized coordination process of the individual developers, the projects and the open source community as a whole is one result. Since coordination is one of the major problems for the management of traditional software development, this is an important aspect.
5. Developers are usually motivated, self-reliant and creative because everything is their own choice.
6. Since communication takes place via the Internet, the entire development process is relatively transparent, though only skilled persons might understand the exchanged information.
Technical Support for Projects
Since technical support aids have been changing rapidly, the major effort lay in collecting and observing them. The investigation of development tools and Internet services led to three major subjects: experienced problems, required support tasks and emerging support services. Several support services were started during the last year. Their success, measured by the number of users, indicates that they provide an improved infrastructure for the open source community. Although the service providers have not yet found the optimal way to support the so-called ’open source model’, the services are continuously improved through frequent modifications and extensions. Additionally, most of the software used by the providers is itself open source, so related OSPs can in turn participate in its development to lobby for their own interests.
The Presented Model for Open Source Development
The complexity of software systems with their large number of components, the fast pace of development and the recent prospering of the Internet business demand a new theoretical framework for software in general. I have tried to take all these new circumstances into account in the presented model. Since an isolated investigation of the development process could not provide the required structures, the examination had to be extended: all phases of the software ’life-cycle’ have been added, e.g. usage and deployment. Additionally, software components were investigated as parts of computer systems that serve as multi-functional tools.
Considering software as a service rather than a product gives a better understanding of the nature of open source software. This perspective is based on the observation that software only exists in a transient state because it frequently requires modification.
Qualified users play an essential role in this model. They act as testers who report problems, provide creative suggestions and represent potential new project members who could help with development. They are therefore strongly involved in the development process from the very beginning of an OSP, and projects usually give users the opportunity to communicate directly with the developers.
However, more and more computer novices have been entering the open source community recently. The result is an increasing number of unskilled users who demand support but are not able to provide useful feedback. Thus, the quality of questions and comments decreases. Protecting the communication channels of OSPs from these distracting interactions was one of the reasons for introducing a mediating consultant in the presented model.
Acronyms
ALSA Advanced Linux Sound Architecture
API Application Programming Interface
BSD Berkeley Software Distribution
BTL Bell Telephone Laboratories
CD-ROM Compact Disc - Read Only Memory
CPU Central Processing Unit
CSRG Austin Common Standards Revision Group
CVS Concurrent Versions System
DEC Digital Equipment Corporation
EL/IX Linux Based Open Standards for Embedded Development
FEST Framework for European Services in Telemedicine
FHS Filesystem Hierarchy Standard
FSF Free Software Foundation
FSG Free Standards Group
FTP File Transfer Protocol
GE General Electric Company
GIMP GNU Image Manipulation Program
GNU GNU’s Not Unix
GPG GNU Privacy Guard
GPL GNU General Public License
HOWTO Document Explaining ’How to ...’
HP Hewlett-Packard
HTTP Hypertext Transfer Protocol
IBM International Business Machines
IDE Integrated Development Environment
IEC International Electrotechnical Commission
IEEE Institute of Electrical and Electronics Engineers
ISO International Organization for Standardization
ISP Internet Service Provider
IT Information Technology
KDE K Desktop Environment
LGPL GNU Library or ‘Lesser’ General Public License
LI18NUX Linux Internationalization Initiative
LSB Linux Standard Base
MIT Massachusetts Institute of Technology
MPL Mozilla Public License
MULTICS Multiplexed Information and Computing Service
NCSA National Center for Supercomputing Applications
NDA Non-Disclosure Agreement
ODP Open Distributed Processing
OSD Open Source Definition
OSP Open Source Project
OSS Open Source Software
POSIX Portable Operating System Interface
QPL Q Public License
RCS Revision Control System
SCSI Small Computer System Interface
SGI Silicon Graphics, Inc.
TCO Total Cost of Ownership
TINA Telecommunications Information Networking Architecture
TINA-C Telecommunications Information Networking Architecture Consortium
URL Uniform Resource Locator
WISE Web-Integrated Software Metrics Environment
German Abstract
Open source describes software that its author distributes while waiving essential copyright rights. This permits the software to be freely redistributed and modified. My starting point was the observation that the technical environment for using and developing open source software is inadequate. An examination of the existing tools for information technology support revealed fundamental differences from commercial software development. A subsequent review of classical models (e.g. ODP, TINA, ISO 9000) and their background in the humanities provided the basis for the further work. From this arose the objective pursued here: formulating a fundamental model for the information technology support of open source software. This thesis offers a first approach by examining the development, distribution and use of open source software in order to identify simple structures. The social background, the project and process organization, and the existing technical support are investigated, and the particular features found are placed in a general context. The resulting model is then explained and illustrated by example.