Abstract: This paper provides an exploration of the analysis and use of comparative case studies as an approach to understanding software processes in complex organizational settings. Case studies are well suited to capture and describe how software processes occur in real-world settings, what kinds of problems emerge, how they are addressed, and how software engineering tools, techniques, or concepts are employed. The overall purpose of comparative case analysis is to discover and highlight second- or higher-order phenomena or patterns that transcend the analysis of an individual case. Comparative case analysis provides a strategy that enables the development of more generalizable results and testable theories than individual or disjoint case studies alone can provide. This study incorporates an examination and review of four empirical studies of processes involved in developing, using, or evolving software systems that employ comparative case analyses. Finally, a meta-analysis of these four studies highlights the strengths and weaknesses of comparative case analyses when used to empirically examine and understand software processes.
Keywords: Empirical studies, software processes, case studies, software productivity, process-support environments
As such, if our goal is to (re-) design, formally model, simulate and enact new to-be software processes that use advanced technologies, then we need a systematic basis for understanding as-is software processes [S93a]. Similarly, if our goal is to develop and refine testable and predictive theories of software development, use, or evolution, then we need to empirically examine, describe and analyze how software processes occur, or will occur, in different organizational settings. These are the goals of comparative case analysis for understanding software processes presented in this paper.
This paper describes an empirical approach for investigating and understanding as-is software processes. The approach can be used to develop insights, heuristics, formal models or theories of as-is, to-be and here-to-there software processes. The approach is called comparative case analysis of software processes. As the name suggests, it employs a comparative analysis of multiple empirical studies of as-is software processes, where each study is treated as a unit case [cf. KKV96, L96, Y94]. It is particularly well suited to support the analysis of studies that primarily capture qualitative data, whether as descriptive, nominal or ordinal indicators, or as subjective data quantified using interval or ratio measures. An analytical framework follows in Section 2 that defines what comparative case analysis means and how it is used in this paper. It also describes how the unit of analysis examined in a study can be compared across multiple comparable cases, multiple analytic perspectives, multiple analytic modalities or multiple levels of analysis. However, in this paper, we will only examine empirical studies of software processes that employ multiple comparable cases or multiple levels of analysis.
In the remainder of the paper, four comparative case studies of software processes are described. The purpose is to help elucidate aspects of the methodology involved in comparative case analyses, since each of the four studies approaches the matter in a different manner. Following this is a meta-analysis of the four studies. The meta-analysis seeks to focus attention on the advantages and disadvantages of comparative analysis of as-is software processes as a strategy for developing an understanding of how software processes occur. Finally, the potential value derived from comparative case studies is examined in a discussion of how such analyses can contribute to the development and use of process support environments for software engineering, as well as other supporting technologies.
The overall purpose of comparative case analysis is to discover and highlight second- or higher-order phenomena or patterns that transcend the analysis and results (i.e., first-order phenomena) of a unit study [KKV94]. It can be used to frame research questions, assess the accuracy of data and the uncertainty of empirical inferences, discover root cause effects, and improve qualitative research studies [KKV94]. Comparison strategies that focus on sampling cases with common threads but with binary or n-ary variation in process "factors" serve an essential role in the development of empirically grounded substantive or formal theory [SC90]. Following such theoretical sampling strategies enables the construction of factorial or quasi-factorial research designs, but using qualitative studies to populate the design space, rather than quantitative studies, which are the traditional requirement for these designs. Finally, comparative case analysis can also lead to the construction of indexes for retrieving one or more case data narratives/records stored in a case asset base or repository that satisfy a given query, exhibit a particular lesson learned or match a specified pattern [KS96, L96, MLS92]. Such support can then be used to help reason about or provide relevant empirical data that might lead to subjective insight in the (re)design or improvement of selected software processes in a given setting [cf. KS96, NS97].
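The idea of populating a factorial design space with qualitative cases can be sketched in a few lines of Python. This is only an illustrative sketch: the factor names, their levels, and the case identifiers below are hypothetical and are not drawn from the studies discussed in this paper. It shows how cases tagged with one level per factor fall into design cells, and how unpopulated cells can be found.

```python
from itertools import product

# Hypothetical process "factors" and levels (illustrative assumptions only).
FACTORS = {
    "setting": ["academic", "industrial"],
    "teamwork": ["individual", "collaborative"],
}

# Hypothetical qualitative cases, each tagged with one level per factor.
CASES = [
    {"id": "case-A", "setting": "academic", "teamwork": "individual"},
    {"id": "case-B", "setting": "academic", "teamwork": "collaborative"},
    {"id": "case-C", "setting": "industrial", "teamwork": "collaborative"},
]


def design_space(factors):
    """Enumerate every cell of the full factorial design as a tuple of levels."""
    return list(product(*factors.values()))


def populate(factors, cases):
    """Map each design cell to the ids of the cases that fall into it."""
    cells = {cell: [] for cell in design_space(factors)}
    for case in cases:
        key = tuple(case[name] for name in factors)
        cells[key].append(case["id"])
    return cells


cells = populate(FACTORS, CASES)
# Cells with no case indicate where further theoretical sampling is needed.
empty = [cell for cell, ids in cells.items() if not ids]
```

A design with unpopulated cells corresponds to a quasi-factorial design; the list of empty cells suggests which additional cases would be worth seeking.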
Comparative case analysis can be used to develop deductive or inductive inferences, as well as empirically grounded theories of the phenomena or processes under study [KKV94, SC90]. The discipline of Computer Science in general, and Software Engineering in particular, suffers from a dearth of empirically grounded, predictive or refutable theory [T98]. The availability of a research method that can lead to the development of such theories represents an opportunity that can be considered for incorporation into empirical research endeavors. Thus, in this paper, we will examine one comparative study whose results led to the development of an inferential framework and testable, predictive theory pertaining to how software processes break down and how they can be repaired [M92, MS90, MS93, MS96].
Case studies are the most common form of empirical study or experimentation found in the software engineering research literature [ZW98]. However, most of these may be individual or stand-alone studies examining some software phenomenon from a given perspective. Comparative case analysis, on the other hand, requires two or more cases that are comparable along a number of dimensions [KKV96, SC90]. This will be elaborated below and then demonstrated. Multiple cases can be examined and compared along their unit of analysis, mode of analysis, level of analysis, and terms of analysis [vBE+82]. Each of these is explained below. Nonetheless, overall effort is streamlined when cases selected for comparison follow from an explicit research design that structures data collection and data analysis with comparison, theory building or theory testing in mind [KKV94, SC90].
Each of the four studies is described in terms covering the focal software
process, organizational settings, process granularity, data collection
methods, major findings, salient issues, and how comparative case analysis
helped. The software processes in focus include software maintenance, software
specification, overall software development, and software processes that
break down. The organizational settings are highlighted according to the
workplace, tasks within the focal process, roles and tasks participants
enact, software tools in use, and other resources in contention. Process
granularity is addressed in two ways: first in terms of the structural
aggregation or decomposition of a software process; and second, in terms
of the temporal dimension or timeline covered during process enactment.
Data collection methods characterize the research modality employed in
the study, as noted earlier. Major findings and salient issues attempt
to summarize the results of each study. Last, for the purpose of this paper,
each of the four studies is characterized in terms of how comparative case
analysis helped or hindered in achieving its results.
Each research modality collects and analyzes different kinds of data in different ways to different ends. For example, if quantitative results are sought, then one needs to choose a research modality that samples, collects, and analyzes data to that end. Controlled laboratory experiments or surveys can be used to measure the frequency and distribution of events, activities, objects or attribute values that characterize the enactment of a software process. However, such quantitative data does not readily explicate process structure, resource flow, or behavioral dynamics. In addition, using individual case studies to collect quantitative data, especially self-reported data, is usually unreliable and not indicative of the rigors required for research-grade case studies [Y94]. Qualitative field studies are well suited for capturing and analyzing the structure, flow, and other behavioral dynamics that characterize how a software process is performed. But individual qualitative studies of a software process instance are not reliably generalized into a pattern, class, or process model. A conflict between the apparent value of research modalities thus emerges between those that stress either quantitative or qualitative approaches, analyses and results. How can we move beyond this?
Two additional strategies have emerged for analyzing and comparing software process studies across modalities: triangulation techniques and computational techniques. Triangulation techniques [J75, K91, KKV96, SC90] enable qualitative and quantitative modalities to be juxtaposed to a common end within an overall research design strategy. For example, qualitative studies may be used to develop an understanding of how some software process occurs, and what internal dynamics or patterns may have causal significance. Quantitative methods and measurements may next follow to collect data documenting the frequency and distribution of the relevant variables revealed through the qualitative methods. Then, a final round of targeted field studies is performed to substantiate and explain the significance of statistical inferences derived through analysis. Such an effort would require study of a software process that is repeatedly performed within a given setting on a tractable timeframe (e.g., software inspection [KS96] or configuration management [G96]). Triangulation can therefore lead to (but not guarantee) high quality results by building on the relative strengths, while diminishing the relative weaknesses, of quantitative and qualitative research modalities. However, triangulation studies in software engineering are uncommon, require diverse research skills, take a long time, and may be expensive [vBE+82]. But this may be the price to pay when one wants to pursue definitive or comprehensive studies of how software processes are performed.
Computational techniques enable empirical studies of software processes
that have been modeled or formalized in a form suitable for interpretation,
simulation, or automated execution [CKO92, HB+94, M92, MS90, SM97]. By
computationally modeling an as-is software process that has been empirically
studied using qualitative or quantitative techniques, one can employ a
simulation test-bed as a "laboratory" for running controlled and measured
experiments, or for prototyping processes to engage and collect user feedback
[SM97]. This enables another form of comparative examination and triangulation.
However, the use of simulation of course raises questions regarding the
validity of the simulated models as a surrogate for real world phenomena.
In contrast, simulations also allow the construction of software process
conditions that might otherwise be difficult, costly, or hazardous for
people at work to perform. Nonetheless, it offers a potentially lower cost
alternative compared to conventional triangulation methods, albeit within
the caveats for validity. Beyond this, computational process models can
also be used to configure and drive automated software process execution
environments, which can capture and analyze process enactment records [GJ96,
HB+94, MS92, SM97]. Similarly, computational models of existing software
process histories or results have been incorporated in case-based reasoning
and case asset management systems, thereby also enabling another medium
for comparative analysis [KS96, L96, MLS92].
In early empirical studies of software system development, use, and evolution, six analytical perspectives were found in the literature [KS80]. Rather than restate these perspectives here, it is sufficient to note that in conducting a single empirical study, one can design the study to collect and comparatively analyze data conforming to each perspective [cf. KS80]. Thus, a single case study can be comparatively analyzed using the terms of analysis from multiple perspectives. Similarly, in re-analyzing a sample of published empirical studies of software processes, one can classify them in terms of the perspective used, as well as apply the other perspectives to see what may have been missed due to a blind spot [KS80]. Finally, in this paper, each of the four studies in Section 3 is analyzed using the terms of analysis drawn from a single, aggregate perspective. This perspective selectively combines terms from multiple perspectives, and sees computing systems in general, and software processes in particular, as an inter-linked web of people at work performing tasks using tools to produce products or services that use or consume other organizational resources along the way [KS82, MS90, MS96, SM97].
Overall, if one seeks to conduct a comparative case analysis that is
the most comprehensive, providing prediction or inference across micro-
through macro-level scope, then one needs to examine and compare a set of
cases that can span multiple levels of analysis. Because the four studies in
this paper examine and compare multiple cases of different unit types in
similar and different settings, an overall meta-analysis of these
studies will span multiple levels of analysis.
The software processes under study address those involved in use, maintenance, and evolution of the program suite over a period of years. CSRO and CSD were associated with different universities. In both settings, the participants involved in using and maintaining the program suite were computer scientists working as graduate students, research assistants, full-time research associates or visiting scientists, faculty, or systems support personnel. Participants of each type were interviewed, some multiple times. Most of the participants had advanced degrees in Computer Science, and none could be said to be a naïve or ignorant system user. To the contrary, mastering the use of the program suite was essential to many participants in order for them to get their work products--research reports and presentations--produced and disseminated. Numerous samples of such artifacts were collected and examined. The program suite included alternative text editors (Emacs, Vi, etc.) and document formatting packages (TeX/LaTeX, Scribe, and Troff/Nroff). Each of the researchers involved in this study had multiple opportunities to use these tools to produce documents, as well as to observe others using and maintaining the software. Thus, in this study, we have people with similar skills, expertise and interests, producing similar products, using similar software tools, in similar workplaces. But by comparing the structure and dynamics of how the software maintenance process occurred, the evolutionary trajectories were found to diverge [BS87]. Why? The analysis reveals that the tools themselves were not the cause of the observed divergence. Instead, the manner and circumstances that shaped how software maintenance was performed in each setting were found to be the cause.
Although both settings had similarly skilled and talented software people with similar professional interests, different participants had different career contingencies at play. These encouraged or discouraged participants from getting personally involved in software maintenance activities. At CSD, a small group of participants got very enthusiastic about mastering the internal and external operations of the program suite with its open source code. This in turn gave rise to a highly supportive and reactive approach to software maintenance tasks, where software maintainers and users worked together to their mutual benefit. Eventually, this interest overcame their other interests (research publication), and as a result they found their skills and interests would be better served in industrial positions. In contrast, at CSRO, maintenance activities on the same software programs as at CSD were considered "hack work" rather than something that would enhance the participants' work interests (publishing research reports) or career opportunities [KS80]. CSRO participants discouraged anyone but systems support personnel from engaging in any software maintenance activities unless unavoidable. However, the systems personnel were generally focused on maintaining research applications and network services. Thus, as participants they had little interest in maintaining the text program suite.
Thus, at CSRO and CSD, the process of software maintenance was facilitated
and made effective, or else discouraged and made less effective, depending
on the career contingencies of the participants doing maintenance activities.
This finding emerged from a comparative analysis of the two cases. If either
case by itself were analyzed, then the salience of career contingencies
on how software maintenance activities were performed would not be easily
observed.
Each team was assigned to develop and deliver informal and formal software system specifications within a two-week schedule. All teams were simultaneously trained in the specification concepts, techniques, and tools that would be used [CSS86], as well as instructed in the common specification development process that all were to follow. The process guideline (an informal process model [CKO92]) stipulated a set of seven tasks to be performed. These included developing a work plan, developing informal specifications, developing formal specifications, analyzing the formal specifications using the provided tools, and others [BS89]. The process guideline was designed by the author to enable the teams to complete the process working part-time on the effort. Each team also elected to perform what was later called a "pre-planning" activity to help make sure they understood what they were supposed to do, and how to divide the effort and responsibilities among themselves.
Data was collected through multiple observation sessions with each team in a variety of locations over the two-week period. There was effectively no participation or intervention with the individual teams or developers by the researchers. However, four scheduled 90-minute meetings involving all the participants and researchers were held during the two-week period. These meetings were used to solicit and discuss "technical" problems (e.g., tool operation commands, format of final deliverables) that participants encountered during the course of their effort. Follow-up structured interviews with participants from each team were also conducted after the specification effort was completed. The specification deliverables were also collected, reviewed, and independently processed with the tools to assess their quality (i.e., the presence of specification errors detected by the tools [CCS86]). Finally, each team member was asked to keep a diary to record the number of hours they spent working individually or together on process tasks. This would then be aggregated and used as a surrogate measure for team productivity [BS89].
One issue addressed in this study was the examination of the roles of the software specification tools, and of the specification process guideline, as contributors to the productivity and quality achieved by each team. A first-order result that appeared in the analysis was that the teams with the highest productivity (least time spent) had the lowest quality formal specifications (most errors detected), while the teams with the highest quality (fewest errors) had the lowest productivity (most time spent). Perhaps this is not surprising. However, it is only a circumstantial relationship, not a causal one. That is, we had no observational data that could account for this. For example, we found the ability to successfully use the provided specification tools or language notation had no distinctive relationship that might account for the variance. As such, it was a finding without significance. We did however find something significant elsewhere.
Part of the observational data collected was coded for how each team had organized and divided the tasks among its members. Six recurring teamwork structures were observed and coded, ranging from those analogous to chief programmer teams to those analogous to egoless programmer teams, with the other alternatives in between these extremes [BS89]. Interviews, observations, and review of each team's plans revealed that each team anticipated working according to different teamwork structures. Observational data also revealed that teams experienced unanticipated circumstances that resulted in them shifting teamwork structures independent of their plan. What then was found through comparative analysis was that each team had a pattern of shifts in teamwork structure throughout the duration of the specification process effort. Many were according to plan, yet some were unanticipated. A second-order pattern then emerged in the comparative analysis.
Depending on which pattern of planned teamwork structures a team followed, and when unanticipated shifts in work structure occurred in the specification process, one could see associations with the most productive teams and with the high quality teams. In short, the high productivity teams tended to have team members mostly work by themselves throughout the process, while the high quality teams spent lots of time working together. The more productive teams had trouble near the end of the schedule when they planned to put things together. By working alone, it seems their collective understanding of the software specification they were developing would drift or get out of sync. As they did not yet have an explicit software architecture to reference, their conceptual or logical interfaces to their component subsystems would not smoothly fit together. This would then become manifest when processing their formal specification. In contrast, the teams with high quality results spent many hours working together. Even teams that had planned to follow independent teamwork structures ran into unanticipated problems, which they confronted and addressed as a group. Subsequently, high quality teams were those who spent more time and effort making sure they understood the problems at hand, could re-sync concurrent work tasks, and could work together to get their tasks completed. Thus, a second-order pattern emerged that associates teamwork structures, and how teams intentionally or unintentionally shifted among them, with high productivity or high quality results. This meant we now had new insights into how to manage teams to be more productive or to produce higher quality results.
Similarly, we could identify how software specification environments in particular, and software engineering environments in general, should be designed to solicit and support processes that accommodate alternative teamwork structures that can easily shift under team control [BS89, CS91, MS92].
In contrast, if we only examine the performance of any one of the five
teams we could get a very different picture of what happened. For example,
we might conclude a high productivity team with low quality was careless
and ineffective, while a high quality team was careful but inefficient.
A view of a team in between that simultaneously maximized both productivity
and quality, but maximized neither overall, might have enacted the specification
process most effectively, but with somewhat disappointing results (low
productivity and low quality compared to the best in each category). Thus,
the point here is that conducting a study using one team could lead
to very different outcomes and results, compared to what was found through
comparative case analysis.
The eleven productivity studies all examined development efforts where software systems of at least 50,000 source lines of code (SLOC) were produced, while some studies examined the development of systems of greater than a million SLOC. Studies were conducted in computer systems manufacturers, banks, insurance companies, telecommunications companies, aerospace firms, government agencies, and others in the U.S., Japan, and Australia from 1975 to 1993. Nearly all the studies collected data over a period of a year or more. Software systems were developed in various high-level programming languages (Cobol, Fortran, PL/1, etc.), typically for mission-critical applications. Development teams ranged from fewer than 10 people to more than 1000. Various software tools were employed, and sometimes the choice of programming language and support tools was the subject of analysis in individual studies.
All eleven studies examined software productivity in terms of total development efforts. In comparing what was measured and how, one finding that became clear across the eleven studies is that there is no single or aggregate measure of software productivity that is common, nor one that provides clear insight into what affects productivity. At best, a few dozen variables need to be considered, but their interactions are unclear [S93b]. For example, a number of studies use SLOC or Function Points (FP) as a measure of the outcome of the overall process of software development. However, SLOC and FP characterize the size of a software program, not the process that produces it. As such, their validity as productivity metrics can be called into question [C80]. Unfortunately, SLOC and FP are also metrics that can be manipulated or corrupted by developers. This can happen when the measures are used to evaluate developer performance. Thus, as measures, they lack reliability [C80]. Beyond this, these measures emphasize collection of data that ignores upstream development activities (e.g., requirements analysis, system specification, architectural and detailed design, design reviews, etc.) and instead focus attention on the result of coding processes, which in turn are found to constitute as little as 15% of overall development activity. Unfortunately, the studies typically apportion their productivity measurements across the overall development process. As such, we have no causal basis for ascertaining whether or how individual upstream activities or their supporting tools or methods contribute to improving software development productivity.
The science of software productivity measurement is dismal. There is little or no replicability of studies, methods, or definition of measurements. Ironically, perhaps this is due to the fact that all of the eleven studies focused on the collection of quantitative data, from either self-reported sources or survey instruments. Thus, the analyses rely on snapshots of data, rather than the movies from which they were drawn. Alternatively, in studies where similar measurements were applied, for instance those using FP, comparative analysis finds contradictory outcomes [S93b]. As such, the eleven studies as a group do not report on any behavioral dynamics associated with any of the overall software development process efforts they studied. No qualitative studies or analyses were employed. Thus, an unvarying bias toward quantitative analysis is found in these eleven studies. The results of these eleven studies describe some of what and how much was developed, but not how it was developed.
Subsequently, a comparative analysis of the eleven software productivity studies reveals that we do not understand how the overall software development process facilitates or inhibits improved productivity. Similarly, we do not understand how the upstream or downstream processes affect productivity. Finally, as a community we do not seem to understand how to measure software productivity in a reliable, valid and replicable manner. Thus, we cannot have a theory for how software development processes can increase or decrease software productivity. Instead, all we have are individual cases with a single perspective that attempt to account for what might affect software productivity.
As such, perhaps a new study using comparative case analysis of multiple software processes in different settings across multiple modes, terms, or levels of analysis can begin to contribute to establishing such an empirically grounded, theoretical foundation. Other suggestions for empirical studies along these lines, as well as what kinds of research questions can be addressed about software productivity, are also found in the study [S93b] and elsewhere [vBE+82].
Software processes can break down or fail independent of workplace, process granularity, task at hand, who is involved, what they are producing or what resources they are using/consuming, and what software tools are involved. However, any of these may be the facilitator or cause of a given breakdown instance. Similarly, changes in any of these individually, or in combination, may be used in repairing, restoring, or realigning extant workflow in the process that ran into trouble. Software process breakdowns might thus be said to be a pervasive phenomenon occurring throughout the world of software engineering. Subsequently, ad hoc responses, while possibly effective in certain circumstances, are not what was sought in this study. Instead, focus is directed to the generalization of process breakdown instances into a taxonomic classification that respectively associates process breakdowns and repairs by type, each according to a separate class hierarchy [M92, MS91, MS93]. But how was this achieved?
Through a comparative analysis of the nine studies, preliminary models of recurring types of software process breakdowns and responses were identified. Each model was then formalized and represented as a class hierarchy. However, the hierarchies covering the nine cases were somewhat awkward, in that gaps and overlaps could be observed. Following from the cases, seven types of breakdowns could be identified [M92, MS93]. But in each case, the problem giving rise to a breakdown might be due to the people involved, the tools they were using, the resources needed or input to a task, the products or resources output by a task, the task structure itself, or the workflow within or between tasks. The insight this triggered was that a diagnostic matrix could be developed that associates problem type with causal source with response/repair activities.
Seven breakdown types by five causal sources by two kinds of task workflow can serve as a diagnostic classification scheme. This yields seventy possible categories (7 x 5 x 2) of process breakdowns. A given breakdown instance could then be diagnostically classified by breakdown, source, and workflow type. A similar analysis with process repairs identified 300 possible repair or response heuristics. Subsequent analysis, formalization, and some common sense insights indicated that these classification spaces could be selectively pruned, indexed and cross-linked. Pruning helps eliminate impossible or unattainable conditions [M92]. Indexing, process structure and resource flow linking provide a scheme and context for associating breakdowns and repairs. About 150 production rules account for the diagnostic indexing and inferential linking and navigation scheme [M92, MS93]. The nine cases can be accounted for by a subset of the pruned classification space. Thus, the remaining categories represent a space of classes of process breakdowns that are predicted to be possible.
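The arithmetic of the classification space can be illustrated with a minimal Python sketch. It is not the executable model of [M92]: the seven breakdown type names are not given in this paper, so placeholder labels are used, and the labels for the five causal sources and two workflow kinds are an assumed reading of the preceding description. The pruned category in the example is likewise hypothetical.

```python
from itertools import product

# Placeholder labels: the seven breakdown type names are not given here.
BREAKDOWN_TYPES = [f"breakdown-type-{i}" for i in range(1, 8)]
# Assumed labels for the five causal sources described in the text.
CAUSAL_SOURCES = ["people", "tools", "input resources",
                  "output products", "task structure"]
# Assumed labels for the two kinds of task workflow.
WORKFLOW_KINDS = ["within task", "between tasks"]


def classification_space():
    """Enumerate every (breakdown, source, workflow) category."""
    return list(product(BREAKDOWN_TYPES, CAUSAL_SOURCES, WORKFLOW_KINDS))


def classify(breakdown, source, workflow):
    """Return the diagnostic category for an instance, or raise if invalid."""
    category = (breakdown, source, workflow)
    if category not in classification_space():
        raise ValueError(f"unknown category: {category}")
    return category


def prune(space, impossible):
    """Drop categories judged impossible or unattainable."""
    return [category for category in space if category not in impossible]


space = classification_space()
# Hypothetical example of one category removed by pruning.
impossible = {("breakdown-type-1", "tools", "between tasks")}
pruned = prune(space, impossible)
```

Categories that remain after pruning but are not matched by any observed case correspond to the breakdown classes that the scheme predicts to be possible.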
As a whole, the taxonomic classification schemes and associated indexing and navigational rules provide an ontological model or theory for how software processes break down and how they can be repaired [M92, MS91, MS93]. To no surprise, it is a theory that cannot be stated on a single page of paper. It is instead a complex and extendible theory that is embodied in an executable computational model [M92, MS91, MS93]. Accordingly, its description and operation are significantly understated here and beyond the scope of this paper. Nonetheless, it is a theory that has been demonstrated with examples [M92] and operationalized [MS93], and in principle, it could be extended, refined, or subjected to refutation and replicability validations [SM97,T98].
Since the four comparative case studies were described in terms covering a focal software process, its organizational settings, process granularity, data collection methods, major findings, salient issues, and how comparative case analysis helped, we can use these as a basis for comparatively re-analyzing the results across the four studies. We can also re-analyze the four studies across the levels of analysis employed in each. This is done in order to address the goal of identifying the strengths and weaknesses of comparative case analysis as a strategy for understanding as-is software processes. However, it also is done to highlight how such a re-analysis can inform the development and formalization of a computational meta-model for representing software processes, as highlighted in Section 5 and described in greater detail elsewhere [MS96]. Nonetheless, effort is also made to make this meta-analysis simple and tractable.
Focal software processes: The four studies examined software processes that span the software life cycle. Thus, comparative case analysis can be applied to help in understanding existing individual or composite software processes for developing, using, or evolving software systems.
Organizational settings: The four studies examined software processes occurring in academic settings, as well as in a large variety of industrial and governmental settings. The studies do not specifically compare any software processes in both academic and industrial settings, but this is characteristic of the studies chosen rather than of the comparative analysis approach. Thus, it appears that comparative case analysis can be used to study software processes in single or multiple settings, wherever they occur.
Process granularity: Software processes of comparatively fine grain (e.g., software specification process), coarse grain (overall software development), or something in between can be examined. The granularity of process decomposition and precedence structure can be examined in terms that assess teamwork structures of small groups. However, processes whose granularity scales down to individual behavior or up to large teamwork structures have not been subjected to comparative case analysis. Thus, process granularity issues need to be addressed and carefully framed if comparative case analysis is to be performed.
Data collection methods: The four studies all employed qualitative research methods. This reflects a selection bias. Nonetheless, data collection and analysis methods appropriate for controlled laboratory experiments and survey research in support of meta-analysis techniques exist and have been applied in disciplines outside of Computer Science. These methods can be adopted for use in collecting quantifiable data from experiments or surveys addressing software processes to facilitate comparative analysis. Thus, both qualitative and quantitative data collection methods and research modalities can be employed in comparative analysis of software process instance histories.
Major findings and salient issues: Perhaps the most significant overall finding from the four studies is that comparative analysis revealed findings, patterns, or insights into the dynamics of software processes that could not be readily derived from any individual case within each study. Had only one case in each study been chosen for analysis from a single perspective, the findings or insights derived would not necessarily be indicative of the systemic behavior observed across multiple comparable cases. Beyond this, the four studies found: career contingencies can affect how software maintenance processes are performed; teamwork structures shift in patterns that influence productivity and quality; quantitative studies of software productivity are inconclusive; and empirical and theoretical results indicate that software process breakdowns can be diagnostically classified and heuristically repaired. Thus, comparative case analysis of as-is software processes can lead to a distinct class of results and insights compared to traditional individual case studies or quantitative studies. These results and insights in turn lead to new criteria for how software process support environments should be designed to operate in order to be more effective.
How comparative case analysis helps: Many realizable benefits were identified. These include the ability of comparative case analysis to enable the following capabilities. First, accommodate the collection and analysis of qualitative data of as-is software processes. Second, employ multiple, comparable cases to increase insights into higher-order phenomena not easily perceived in a single case. Third, increase the generalizability of results for theory building or use as heuristics. Fourth, selectively use other independently published empirical studies as units of comparative analysis. Fifth, support meta-analysis of studies employing quantitative data. Last, cast results in computational form via formal representation of case data, relations, pre-/post-conditions, trigger or alert rules, and value constraints that can be utilized within software process support environments [HB+94, KS96, MLS92, MS90, MS92, MS96, SM97]. As such, there appears to be a compelling basis for employing comparative case analysis as an analytical technique for understanding as-is software processes.
Focal software processes: None of the four studies address processes where software systems are developed using COTS methods, software components, or integrated development environments. Software maintenance and specification processes were examined in academic settings, not industrial settings. Software productivity studies did not examine development efforts using object-oriented languages and techniques. Therefore, questions can be raised as to how or whether processes addressing these matters can benefit from comparative case studies, and whether such studies will be conducted. However, one could also argue that there is nothing about such processes that would preclude their examination via comparative analysis, except for the absence of the required empirical studies.
Organizational settings: None of the four studies examined organizational settings where contracted software development or systems integration services were highlighted. Studies of software maintenance and specification teamwork processes involved well-educated computing participants. Would maintenance and specification processes fundamentally differ if less well-educated participants were involved? Would a different set of participants, tasks, software tools, and process inputs or outputs change the organizational setting, and thus the structure and behavioral dynamics of software processes observed? This seems likely. But the point of comparative case analyses that span multiple organizational settings is to help identify which conditions, events, or resource constraints impinge on the focal software processes, as well as how, when and why.
Process granularity: The studies of software maintenance and software productivity examine software processes at a coarse, "top-most" structural level and temporal timeline that lacks sufficient decomposition to directly benefit software process automation efforts. The results from these studies thus may only have value as management advice, insight, or as a point of departure for finer granularity research studies. However, the studies of software specification and software process breakdown exhibit sufficient granularity to inform the design and implementation of software process support environments [cf. CS91, MS92, SM97]. Thus, unless care is taken in selecting studies, comparative case analysis cannot guarantee the process granularity needed to inform the design of new software process support technologies.
Data collection methods: Due to a selection bias, none of the four studies rely on the collection and analysis of cases with quantitative data. The study of software productivity discredits the value of existing quantitative metrics for its measurement. Thus software process modeling techniques that rely on algebraic formulae or statistical analyses cannot be readily employed. Comparative studies employing such quantification metrics for measuring and understanding software processes should therefore be desirable.
Major findings and salient issues: Comparative case analysis is not a panacea. It is not a substitute for other research modalities. Comparative analysis is a secondary assessment of extant studies and empirical findings, not a primary study seeking to identify first-order phenomena that heretofore have not been addressed or discovered. Comparative analysis comes after the pioneering studies are done. It realizes the benefits of hindsight, without incurring the costs or risks of primary studies. Its results may be ironic observations rather than bold assertions. Finally, it enables the development of theory that stresses the salience of situated aspects of process structure and behavioral dynamics. This comes at the expense of identifying the frequency and distribution of events, conditions, and constraints.
How comparative case analysis has hindered or misled: The four studies display a selection bias toward the required use of comparable cases of as-is software processes, rather than independent cases of to-be software processes, or of process support technologies [e.g., GJ96]. Thus, we do not have much of an empirically grounded basis for comparing or evaluating the value of alternative software process technologies or approaches [cf. CL+96].
Conducting comparative case studies can be time-consuming and labor-intensive. It directs attention away from the development of software process technologies based on intuitive insight, technical prowess, or funding opportunities addressing what technical capabilities are possible or desirable. Qualitative data, analyses, and results from comparative analysis of as-is software processes may be cast in terms that do not shed a clear light on what software process modeling techniques should be used [cf. CKSI87]. Comparative case analysis appears to require analytical skills uncommon in the disciplines of Software Engineering or mainstream Computer Science, but perhaps more common in the Social Sciences. Thus, is this approach something that should be part of research in software engineering or software process?
Formal representation of software process data and relations does not readily conform to procedural process programming or conventional object-oriented modeling paradigms. Thus widely available data management, modeling, and workflow support tools may be of little use.
Thus how comparative case analysis fits into mainstream software engineering or software process research and practice is problematic. This seems plausible, unless you have been able to see how it might be applied, and can find value in how it can add to understanding software processes situated in complex organizational settings.
Significant insights can result from a comparative case analysis. Although this is true of any research modality, comparative analysis helps identify patterns and findings that would be easily missed when relying on an individual case study or on an unorchestrated sample of multiple cases. Since individual case studies are the predominant modality for empirical research and experimentation in Software Engineering and Computer Science, there is something to be gained through an analysis technique that overcomes a widespread shortfall in current research trends.
Comparative case analysis is an approach and methodology that crosses disciplinary boundaries. Thus, its role and fit within the software engineering and software process communities must be assessed, as must the value of such a crossing. Nonetheless, comparative analysis helps illuminate the behavioral and interactional dynamics of as-is software processes occurring in complex organizational settings.
Finally, this paper focuses on the use of comparative case analysis in assessing and understanding as-is software processes. Could it also be used to study to-be and here-to-there software processes, instead of just as-is software processes? This is both plausible and likely as a natural direction for process researchers to explore [CL+96]. Similarly, can comparative case analysis be used to study other issues in software engineering, such as alternative software architectures, or alternative techniques for developing the formal specifications of software systems? Again, this seems plausible and likely. Thus it seems we should be prepared to consider and incorporate techniques for comparative case analysis into the workbench of concepts, methods, and tools that can be employed to supplement empirical studies of software engineering principles and practices.
[BS87] S. Bendifallah and W. Scacchi. Understanding Software Maintenance Work. IEEE Trans. Software Engineering, Vol. SE-13(3), 311-323, March, 1987. Reprinted in Tutorial on Software Maintenance and Computers, D. Longstreet (ed.), IEEE Computer Society 1990.
[BS89] S. Bendifallah and W. Scacchi. Work Structures and Shifts: An Empirical Analysis of Software Specification Teamwork. Proc. 11th. International Conference on Software Engineering, Pittsburgh, PA, IEEE Computer Society, 260-270, 1989.
[CL+96] A. Christie, L. Levine, E. Morris, D. Zubrow, T. Belton, L. Proctor, D. Cordelle, J. Ferotin, J. Solvay, G. Segoti. Software Process Automation: Experiences from the Trenches, CMU/SEI-96-TR-013, Software Engineering Institute, Carnegie-Mellon University, Pittsburgh, PA, 1996. http://www.sei.cmu.edu/pub/documents/96.reports/pdf/tr013.96.pdf
[C80] B. Curtis. Measurement and Experimentation in Software Engineering. Proceedings of the IEEE, 68(9), 1103-1119, September 1980.
[CKO92] B. Curtis, M. Kellner, and J. Over. Process Modeling, Communications of the ACM, 35(9), 75-90, September 1992.
[CKSI87] B. Curtis, H. Krasner, V. Shen, and N. Iscoe. On Building Software Process Models Under the Lamppost, Proc. 9th. Intern. Conference on Software Engineering, Monterey, CA, IEEE Computer Society, 96-103, March 1987.
[CKI88] B. Curtis, H. Krasner and N. Iscoe. A Field Study of the Software Design Process for Large Systems. Communications of the ACM, 31(11), 1268-1287, November 1988.
[GJ96] P.K. Garg and M. Jazayeri (eds.), Process-Centered Software Engineering Environments, IEEE Computer Society Press, New York, 1996.
[G95] R.E. Grinter. Supporting Articulation Work Using Software Configuration Management Systems, Computer Supported Cooperative Work: The Journal of Collaborative Computing, 5(4), 447-465, 1995.
[GKC87] R. Guindon, H. Krasner and B. Curtis. Breakdowns and Processes during the Early Activities of Software Design by Professionals. In G.M. Olson, S. Sheppard, and E. Soloway (eds.), Empirical Studies of Programmers (Second Workshop), Ablex Publishing Co., 65-82, 1987.
[HB+94] G. Heineman, J. Botsford, J. Caldiera, G. Kaiser, M. Kellner, and N. Madhavji. Emerging Technologies that support a Software Process Life Cycle. IBM Systems Journal. 32(3):501-529, 1994.
[H97] M.M. Hunt. How Science Takes Stock: The Story of Meta-Analysis, SAGE Publications, Newbury, CA 1997.
[JS95] A. Jazzar and W. Scacchi. Understanding the Requirements for Information Systems Documentation, Proc. 1995 ACM Conference on Organizational Computing Systems, San Jose, CA. ACM Press, August 1995.
[J79] T.J. Jick. Mixing Qualitative and Quantitative Methods: Triangulation in Action. Administrative Sciences Quarterly, 24(4), 602-611, 1979.
[KKV94] G. King, R.O. Keohane, and S. Verba. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton University Press, Princeton, NJ, 1994.
[KS96] H. Kitano and H. Shimazu. The Experience Sharing Architecture: A Case Study in Corporate-Wide Case Based Software Quality Control, in D.B. Leake (ed.), Case-Based Reasoning: Experiences, Lessons, and Future Directions. AAAI Press/MIT Press, Menlo Park, CA, 235-268, 1996.
[KS80] R. Kling and W. Scacchi. Computing as Social Action: The Social Dynamics of Computing in Complex Organizations, in M. Yovits (ed.) Advances in Computers, Vol. 19, Academic Press, 249-327, 1980.
[L96] D.B. Leake (ed.), Case-Based Reasoning: Experiences, Lessons, and Future Directions. AAAI Press/MIT Press, Menlo Park, CA 1996.
[M92] P. Mi. Modeling and Analyzing the Software Process and Process Breakdowns. unpublished dissertation, Computer Science Dept., University of Southern California, Los Angeles, CA 1992.
[MS90] P. Mi and W. Scacchi, A Knowledge-Based Environment for Modeling and Simulating Software Engineering Processes. IEEE Trans. Data and Knowledge Engineering, Vol. 2(3), 283-294, September 1990. Reprinted in Nikkei Artificial Intelligence, Vol. 20(1), 176-191, January 1991 (in Japanese); also in Process-Centered Software Engineering Environments, P.K. Garg and M. Jazayeri (eds.), IEEE Computer Society, 119-130, 1996. http://www.usc.edu/dept/ATRIUM/Papers/Articulator.ps
[MS91] P. Mi and W. Scacchi. Modeling Articulation Work in Software Engineering Processes. Proc. 1st. Intern. Conference Software Processes, Redondo Beach, CA, IEEE Computer Society, 188-201, October 1991.
[MS92] P. Mi and W. Scacchi. Process Integration for CASE Environments. IEEE Software, 9(2), 45-53, March 1992.
[MS93] P. Mi and W. Scacchi. Articulation: An Integrative Approach to Diagnosis, Replanning, and Rescheduling, Proc. 8th. Annual Knowledge-Based Software Engineering Conference, Chicago, IL, IEEE Computer Society, 77-85, September 1993. http://www.usc.edu/dept/ATRIUM/Papers/Articulation.ps
[MSL93] P. Mi, M.J. Lee and W. Scacchi. A Knowledge-Based Software Process Library for Process-Driven Software Development. Proc. 7th. Knowledge-Based Software Engineering Conference. McLean, VA, IEEE Computer Society, 122-131, September 1992. http://www.usc.edu/dept/ATRIUM/Papers/Process_Asset_Library.ps
[MS96] P. Mi and W. Scacchi. A Meta-Model for Formulating Knowledge-Based Models of Software Development. Decision Support Systems, 17(3):313-330, 1996. http://www.usc.edu/dept/ATRIUM/Papers/Process_Meta_Model.ps
[S93b] W. Scacchi. Understanding Software Productivity: Towards a Knowledge-Based Approach, Intern. J. Software Engineering and Knowledge Engineering, 1(3), 293-321, 1993. Revised Version in D. Hurley (ed.), Software Engineering and Knowledge Engineering: Trends for the Next Decade, Volume 4, 1995. http://www.usc.edu/dept/ATRIUM/Papers/Software_Productivity.ps
[SM97] W. Scacchi and P. Mi. Process Life Cycle Engineering: A Knowledge-Based Approach and Environment. Intelligent Systems in Accounting, Finance, and Management, 6(1):83-107, 1997. http://www.usc.edu/dept/ATRIUM/Papers/Process_Life_Cycle.html
[SN97] W. Scacchi and J. Noll. Process-Driven Intranets: Life Cycle Support for Process Reengineering. IEEE Internet Computing, 1(5):42-49, 1997. http://www.usc.edu/dept/ATRIUM/Papers/PDI.pdf
[S88] A. Strauss. The Articulation of Project Work: An Organizational Process. The Sociological Quarterly, 29(2), 163-178, 1988.
[SC90] A. Strauss and J. Corbin. Basics of Qualitative Research: Grounded Theory Procedures and Techniques, SAGE Publications, Newbury Park, CA 1990.
[S86] L. Suchman. Plans and Situated Actions: The Problem of Human-Machine Communication. Cambridge University Press, Cambridge, UK, 1986.
[T98] W. Tichy. Should Computer Scientists Experiment More? Computer, 31(5), 32-40, May 1998.
[WR96] D. Wixon and J. Ramey (eds.). Field Methods Casebook for Software Design. John Wiley and Sons, New York, 1996.
[Y94] R. Yin. Case Study Research: Design and Methods. SAGE Publications, Newbury Park, CA 1994.
[ZW98] M.V. Zelkowitz and D.R. Wallace. Experimental Models for Validating Technology, Computer, 31(5), 23-31, May 1998.