Workshop on Developing Successful Object-Oriented Frameworks
OOPSLA '97, Atlanta, GA
Sunday, 5 October 1997
Todd Hansen, Steven Fraser, Craig Hilsenrath, Bill Opdyke, and Arthur Riel, organizers

Department of Computer Science
University of Illinois at Urbana-Champaign
1304 West Springfield Ave.
Urbana IL 61801
(217) 328-3523
email@example.com
The remainder of this tale takes a candid look at my experiences with this framework, and, in the process of doing so, attempts to cover as many of the workshop's laundry list of points of interest as it can. I've noted with satisfaction that the organizers of this workshop are insisting on war stories from real framework developers, and are requiring mere researchers to go to extra lengths to establish their bona fides. Now, this is not to say that I'm not also one of those framework researchers. Indeed, for the last several years, I've worn two hats: framework researcher and framework developer. I've been doing research on object-oriented frameworks for twelve years, and have written and spoken on the topic in a number of forums during this time. This paper, however, is my first attempt to describe my experiences, both good and bad, with trying to practice professionally what I had theretofore preached.
OSIRIS is a domain-specific object-oriented framework for building psychophysiological realtime experimental control and data acquisition applications. The basic idea is that one places electrodes on the heads of experimental subjects, and records their brain waves as they perform simple experimental tasks. These data are digitized using an A/D converter, and stored in digital form. The nature of the tasks, as well as the data acquisition strategy, varies from experiment to experiment. By virtue of the nature of experimental research itself, the requirements placed on the experimental software change rapidly. A particular study may be run and published over a period of weeks.
The challenge for the software designer is to write applications that encompass as wide a range of experimental possibilities as possible, and to make it easy to write new applications when existing ones are not sufficiently flexible to cope with new scientific needs.
The framework was constructed on and off over several years during the period from 1992 to 1996. Line-of-code metrics can be particularly misleading when one is discussing object-oriented frameworks. However, they can be useful up to a point as a gross indication of scale. The OSIRIS source files weigh in at 53877 lines of code. By contrast, the OWL 4.53 GUI framework itself is about 41000 lines long. To be fair, there is more internal documentation in OSIRIS than in OWL. Of course, I'd argue that when it comes to frameworks, shorter is better anyway, and can make a case that OSIRIS would be shorter were it more mature. A typical application, DIM, built using OSIRIS contains 8563 lines of code.
The framework has been used to construct four distinct, major applications, for researchers at three different universities. Each application, in turn, has evolved to encompass the needs of between one and five distinct families of experiments. Customary laboratory programming practices would have employed distinct programs for each experiment (not application), and each program would have included its own versions of much of the code in the framework as well.
Parameterization allowed me to push design decisions out onto the user. If I could think of more than one way to do something, or a way of dealing with a particular request that could obviously be generalized to encompass a range of likely future requests, I'd put this in a user-configurable table of parameters, and let the users change these themselves. Users would usually be delighted to find that a change that would otherwise have required a new program could be made on the spot by changing a parameter. I'd be pleased as well, of course, when my anticipation of the potential need for such generality spared me additional work.
This strategy also helped to forestall what I could see would otherwise become the bane of my existence: a phenomenon I later dubbed Metastisization [Foote 1988a], or the proliferation of many nearly, but not exactly, identical versions of the same applications. This proliferation usually came about as a result of expedient, cut-and-paste alterations to existing, working applications, often at the hands of clever researchers who were, nevertheless, relatively unskilled programmers. I could see that the maintenance burden associated with such unbounded proliferation could consume time that might be better spent writing (for instance) fancier displays for the new graphics hardware that was starting to emerge. This seemed like an ignominious fate. Surely, some combination of cleverness and laziness could triumph over this tedium and drudgery. (Of course, only later did I realize that the usual solution to this problem was to change jobs once these sorts of problems started to arise.)
My belief at the time was that there had to be a better way, and in 1981, at the National Computer Conference in Chicago, I came across something that looked like it might be the answer: object-oriented programming. It was hard to miss: Xerox had decided to unleash Smalltalk-80, and held a high-profile event to introduce their porting partners. Xerox was also showing the Star Workstation, which gave many attendees their first glimpses of the sort of windowed desktops that were destined to take over the world. Intel was showing the iAPX432, a super-CISC processor with object-oriented instructions, which ran a crude, Smalltalk-like language. Other vendors were showing bitmapped workstation displays for the first time. I returned home, and tried to find out all I could about objects, bitmapped workstations, and these processors of the future.
What I found out was that objects had some remarkable properties. You could use inheritance to share code and data that always stayed the same, and put just the things that changed out at the edges. Because of dynamic binding, you could use any object so constructed anywhere where its shared repertoire might be invoked. Here, perhaps, was the answer to my problem of how to avoid having to write pretty much the same program over and over again.
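As a minimal sketch of the two properties just described, the following C++ fragment keeps the invariant trial loop in a base class and pushes the parts that vary into subclasses. All names here (Experiment, ToneExperiment, WordExperiment, countEvents) are hypothetical illustrations, not classes from OSIRIS or from the original Smalltalk work, and the dialect is present-day C++ rather than what was available at the time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names throughout; none are taken from OSIRIS.
class Experiment {
public:
    virtual ~Experiment() = default;

    // Shared skeleton: the code that always stays the same.
    std::vector<std::string> run(int trials) {
        std::vector<std::string> log;
        log.push_back("begin " + name());
        for (int t = 0; t < trials; ++t) {
            log.push_back(presentStimulus(t));  // dynamically bound call
            log.push_back("acquire");
        }
        log.push_back("end " + name());
        return log;
    }

protected:
    // The things that change, pushed out to the edges (the subclasses).
    virtual std::string name() const = 0;
    virtual std::string presentStimulus(int trial) = 0;
};

class ToneExperiment : public Experiment {
protected:
    std::string name() const override { return "tone"; }
    std::string presentStimulus(int trial) override {
        return trial % 2 == 0 ? "low tone" : "high tone";
    }
};

class WordExperiment : public Experiment {
protected:
    std::string name() const override { return "word"; }
    std::string presentStimulus(int) override { return "show word"; }
};

// A client written once against Experiment works with any subclass.
std::size_t countEvents(Experiment& experiment, int trials) {
    return experiment.run(trials).size();
}
```

Here countEvents never needs to know which kind of experiment it was handed; adding a third experiment means writing one small subclass rather than another copy of the loop.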
Of course, mere mortals working with minicomputers had no access to the sorts of languages, tools and displays I'd seen at McCormick Place. Even so, I spent the next several years working on a battery of the psychophysiological applications we were constructing as part of a mini-OEM laboratory data acquisition system we'd decided to sell to other researchers. About two dozen of these $50,000/$100,000 DEC LSI/11-based Pearl systems were sold during this period.
The battery software needed to accommodate a wide range of potential research needs, which change quickly, and are, by definition, difficult to anticipate. I tried to engineer these programs to be as reusable as I could. I pushed as much sharable code as I could into library functions, and made them as customizable as I could by exposing parameter editors (which exposed hundreds of parameters). I also crafted the core of applications so that they shared the same code base, using a home-built preprocessor. By doing so, I was intentionally trying to crudely glean the benefits I thought object-oriented inheritance would give me, had I access to such tools. The preprocessor also produced a table of dynamic meta-information that allowed users and programs to access parameters by name at runtime. These tables tracked data types, legal ranges, and user help information. (Here were the roots of my eventual interest in object-oriented reflection.)
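The kind of runtime parameter table described above might be sketched as follows. This is an assumption-laden illustration, not the original preprocessor output: the names ParameterInfo and ParameterTable are hypothetical, only numeric parameters are modeled (the original tables also tracked data types), and the table is filled in by hand rather than generated.

```cpp
#include <map>
#include <string>

// Hypothetical meta-information for one parameter (numeric only here).
struct ParameterInfo {
    double value;
    double minimum;    // lowest legal value
    double maximum;    // highest legal value
    std::string help;  // text shown to the user
};

// Hypothetical table giving users and programs access by name at runtime.
class ParameterTable {
public:
    void define(const std::string& name, double initial,
                double minimum, double maximum, const std::string& help) {
        table_[name] = ParameterInfo{initial, minimum, maximum, help};
    }

    // Returns false, leaving the value untouched, if the name is
    // unknown or the new value falls outside the legal range.
    bool set(const std::string& name, double value) {
        auto it = table_.find(name);
        if (it == table_.end()) return false;
        if (value < it->second.minimum || value > it->second.maximum) return false;
        it->second.value = value;
        return true;
    }

    // Returns nullptr when no parameter has this name.
    const ParameterInfo* find(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, ParameterInfo> table_;
};
```

A parameter editor built on such a table can list names, show help text, and reject out-of-range entries without any knowledge of the application that owns the parameters.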
Over the next several years, I read as much O-O literature as I could, and even acquired an Apple Lisa so as to get my hands on Clascal, Object Pascal, and, eventually, an exceedingly slow version of Smalltalk. I also decided to resume my graduate education in earnest, and came across a junior professor, Ralph Johnson, who was eager to let me explore my object-oriented generic laboratory application idea in Smalltalk.
The initial result of this collaboration was a Smalltalk framework that implemented a simulation of the battery applications I'd built over the previous several years at work. Every line of this framework was presented in "Designing to Facilitate Change with Object-Oriented Frameworks" [Foote 1988a]. The principal conclusion of this work was that the idea of constructing a reusable, generic skeleton application for this domain out of dynamically sharable objects, that could serve as the nucleus for a family of related applications, really was feasible. By specializing different classes in the framework, specific applications could be derived from the generic core with relatively little code.
This desultory but amusing work also contained a lengthy discussion of how frameworks evolve. I was frankly surprised that, despite years of domain experience, my initial design for the framework evolved very substantially as I incorporated requirements from new applications, and as I exploited opportunities to glean yet more general objects from my code. I also didn't initially expect the degree to which inheritance seemed to be supplanted by delegation and forwarding to components as my design became more mature. However, such a strategy allowed the kinds of dynamic pluggability for objects that my old parameter editors had made possible for simpler data in my earlier laboratory applications. The emergence of distinct architectural identities for these objects, which would allow them to themselves serve as loci for specialization and evolution, enhanced their reuse potential. Some of these findings were discussed in [Johnson & Foote 1988].
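The shift from inheritance toward delegation and forwarding can be illustrated with a small sketch. Instead of subclassing a channel for every processing variant, the channel forwards to a component that can be replaced while the program runs. The names Filter, PassThrough, Gain, and Channel are hypothetical and chosen only for illustration; they are not drawn from the Smalltalk framework or from OSIRIS.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical component interface; a channel delegates to one of these.
class Filter {
public:
    virtual ~Filter() = default;
    virtual double apply(double sample) const = 0;
};

class PassThrough : public Filter {
public:
    double apply(double sample) const override { return sample; }
};

class Gain : public Filter {
public:
    explicit Gain(double factor) : factor_(factor) {}
    double apply(double sample) const override { return sample * factor_; }
private:
    double factor_;
};

// The channel is never subclassed; its behavior changes by plugging
// in a different component at runtime.
class Channel {
public:
    explicit Channel(std::unique_ptr<Filter> filter) : filter_(std::move(filter)) {}

    void plug(std::unique_ptr<Filter> filter) { filter_ = std::move(filter); }

    double record(double raw) {
        double processed = filter_->apply(raw);  // forwarded to the component
        samples_.push_back(processed);
        return processed;
    }

    const std::vector<double>& samples() const { return samples_; }

private:
    std::unique_ptr<Filter> filter_;
    std::vector<double> samples_;
};
```

Because the filter is an object rather than a superclass, it can be selected from a user-editable setting in much the same way a simple numeric parameter could.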
There is no substitute for this sort of detailed, multi-level documentation if a framework is to flourish. Developers seem to find it easier to provide the detailed anatomy, or reference, documentation than they do the catalogs and cookbooks. Good examples, and comprehensible overviews that give one a sense of how the pieces fit, and why I (as a potential client) might care are harder to come by.
The source code is the ultimate, irrefutable, utterly operational authority on how the framework behaves. The source has the final say as to how the framework works. If it doesn't work the way the client wants, he or she can extend or change it (at the cost of maintaining these enhancements). Genuinely good documentation is essential if a framework is to prosper and endure.
Note that while I have claimed that OSIRIS is a successful framework, I have not claimed that it has achieved significant popularity. There is adequate reference documentation for the low-level realtime portions of the OSIRIS framework, and there is, of course, commercial and even third-party documentation for OWL. However, I have been to date unable to find the time to document the framework, nor anyone who is interested in underwriting such an effort. One reason for this is that the perpetually looming possibility of major overhauls makes it easy to rationalize procrastination. This is, of course, a trap. The lack of a suitable patron to pay for the documentation is a rationale that stands up better to sober analysis.
Any discussion of framework economics must include the cost of producing the documentation, and the cost of keeping it up to date, and the cost of distributing it. The Web can certainly be of assistance with the latter two items.
I've found myself playing this role with the OSIRIS framework. Since there really hasn't been that much independent development undertaken with OSIRIS, this burden has thus far been tolerable.
There is, of course, the implication that the term "virtuoso" dictates that the developer who plays this role must be exceptionally skilled. Various investigators report differences of up to two orders of magnitude in the skill levels of software professionals. My impression has always been that some of these differences are overdrawn, and result from correctable deficiencies such as a lack of empowerment, poorly partitioned projects, inadequate training, domain inexperience, poor communication, and a lack of incentives. Sometimes the way to win a medal isn't by writing superior code.
I like to think I'm a pretty good software developer (who doesn't?), but, in any case, the solo style was imposed on me by necessity. I can, however, give confident, though subjective, testimony as to some of the advantages of this approach.
Autonomy: I controlled the product and the architecture (though not always the process). In particular, decisions regarding architecture and evolution were made in my head. It is well known that communication among team members is slow, and consumes an inordinate amount of time. As a solo act, these decisions were made intra-cranially. Thought is usually much faster than talk (except when one engages in an extended debate with oneself).
Most importantly, when I wanted to change an interface, I was, as both client and provider, able to weigh the cost of complying with an interface change vs. retaining backward compatibility myself. Further, as author of both the clients and the core framework, I was intimately familiar with the ramifications of such changes on both.
Style: As a solo act, you get to do it your way. I found that it's useful to code in such a way that your code is recognizable amidst the OWL code and sample templates I used. Code Ownership, as (for instance) Coplien has observed, contributes to pride in workmanship, accountability, easier maintenance, and conceptual and architectural integrity. I no more want someone else to edit a piece of code that I'm responsible for maintaining and improving than I want to arrive at work and find my office and desk rearranged. If somebody moves something, I want to know it, and I want to know why. With code, I'll often want to re-assimilate it my way.
I don't advocate this for collaborative code, of course, but as a solo architect-builder, I have the luxury of not dealing with it. Indeed, I'd be an advocate of stylistic standards for collaborative work that encourage a sense that a body of code is recognizably our code, and not the exclusive bailiwick of any one individual. (I've been contemplating writing a proto-pattern (to be named SCHOOL UNIFORMS) that would make this case, but need to collect more evidence in order to do so.)
A significant additional consequence of the solo approach is that one reaps what one sows. It's hard to retain a slash-and-burn mentality toward one's work when you know that you'll be the guy who has to clean up after your mistakes. Conversely, if you take the time to make your framework general, you are the guy on Easy Street the next time you need to build a new application.
There are, of course, some downsides to this approach. If one has developed any design blind-spots, there is no one else to spot them. If one has a tendency towards lily gilding, there is no one to arrest it. The fact that my clients are dependent on an irreplaceable architect-builder troubles them considerably more than it does me.
Who pays for refactoring? Who pays for architecture? Who benefits? Who owns what? How do you avoid giving away the store?
Especially if refactoring and architecture are expensive, how do you charge for them? My usual approach was to treat refactoring as a precursor to additional development in a particular part of the framework. This way, the current client, who was to be the first beneficiary of the change, paid for the time. (I've always billed time, and not deliverables, when doing this sort of work. I've also used license fees.) Of course, a case can be made that every subsequent client is the beneficiary of some of these reusable elements, and that the cost might be somehow amortized. Fortunately, in the small-potatoes world of OSIRIS, this never became a serious issue. It is easy to imagine these issues being a far more serious concern to larger projects.
Some of the refactoring in OSIRIS was done on my own time, as sweat equity investments in the framework. While I might have preferred that some benefactor underwrite this work, such effort is unencumbered by deliverable pressures, and gives one a basis for making certain ownership claims as well.
Another serious issue is ownership. If you deliver a framework with source and examples, may your colleagues go into competition with you? They have everything they need to do so. What of the issue of code that was developed for a particular client, but emerges as general, sharable code? In the academic world, some funding agencies consider code developed on their dime as government property. One needs to be aware of licensing, copyright, and ownership issues to handle these questions, especially as the stakes grow larger.
Proprietary client concerns are a problem too. A particular client may not be thrilled that your passion for crafting reusable artifacts based on his or her requirements will technically empower his or her competitors. So far, I've been able to keep such competitively sensitive material in the client applications, but it's not hard to imagine it becoming a more serious problem in a broader framework.
Scheduling refactoring is another key issue. Blending it with development, and treating it as a background task are two ways of dealing with it. Convincing management that a long term investment in architecture really will pay off down the road is an enduring difficulty. Ralph Johnson has estimated that about a third of a development group's time should be devoted to refactoring. This sounds about right to me.
Frameworks are hard to build
There is simply no denying that it takes time, skill, and resources to get these right. You can't just commission one. You have to make a commitment to pursuing, even cultivating frameworks to reap the reuse rewards. You can't design a really good one up-front, unless you can leverage pre-existing domain experience. I am bemused by the hubris of those who think they can. If you've worked in the area for a while, you may have the knowledge to discern the right abstractions already. If you are working in an area where frameworks and libraries exist already, then there are shoulders to stand on.
On the other hand, if you are working in a domain where architecture has been unexplored or undiscussed, you've got a long row to hoe. Initial analysis usually yields a decent, but superficial collection of objects that model the surface structure of the domain, which can grow down. Initial implementation efforts grow up in response to linguistic and resource driven forces. In between is the realm of the truly reusable objects, which become discernible only as successive efforts to redeploy them are made, and the code is refactored to embody these insights. It is from this process that reusable frameworks emerge. We've written on a number of occasions about this process [Foote 1988a][Johnson & Foote 1988][Foote & Opdyke 1994][Roberts & Johnson 1996].
Domain knowledge is essential
You need to find a skilled software architect who is willing to become extremely familiar with the domain in question, and spend time interacting with domain experts, in order to glean the right objects from amongst the pyrite. From the standpoint of the architect him or herself, this constitutes a significant commitment, and investment. Framework design is not a casual pursuit, and one needs to choose one's target domain wisely. (Now, if only I'd gotten involved with realtime securities trading...)
Over-design is a trap
Top-down frameworks can be structural straitjackets. You can paint yourself into corners with architecture that your purely functional pieces would not force you into. This is one reason why white-box phases are a natural part of the early evolution of a framework. Form (architecture) really does follow function (features, data, and code), and mature components emerge from a messy-kitchen architectural phase, where you scatter ingredients all over the counter before consigning them to their place in the stew.
Of course, it's a mistake to say that it's futile to attempt to design any architectural elements at all early in the lifecycle. Particularly in those cases where the architect has prior experience in the domain, reasonable initial conjectures can be made. My advice: take the obvious, don't pan for the clever. That can come later, when real opportunities to exploit such insights arise. Frameworks evolve, as do species, when they are stressed by the environment. They can even be cultivated by selecting successive challenges for them that seem representative of the range of applications they might be expected to ultimately encounter.
Reuse erodes structure, and structure can impede evolution. Consolidation and refactoring later in the lifecycle are essential to arrest these entropic tendencies. It is late in the lifecycle when the experience that can tell you what abstractions are necessary to straddle existing architectural requirements can be exploited.
C++ is hard to refactor
Having cut my object-oriented teeth on Smalltalk, I elected to use C++ to construct OSIRIS because using Smalltalk for the sorts of hard realtime tasks that my public demanded was impractical. In 1991, 286 and 386 era processors could not provide the RAM or CPU horsepower necessary to handle the response and processing requirements this domain posed under Smalltalk. There were licensing, cost, and training issues too. By contrast, C++ compilers were beginning to emerge with environments and performance that seemed promising. Existing client Fortran and C skills seemed more likely to map to a C++ environment than to Smalltalk. Porting existing computational functions to C++ would be relatively simple, and those that had already been coded in C would be immediately usable. (In hindsight, one of the reasons for C++'s success (and it was successful, whatever you may think of its prospects now) was its ability to allow C programs to become C++ programs without requiring a total rewrite.) Furthermore, I had sketched the design in Smalltalk, and, with this in hand, could proceed to re-implement my framework in C++ with my simulation as a roadmap.
By and large, this was, in fact, what I did, and, by and large, it worked. As such, this would seem to vindicate those who advocate prototyping in languages like Smalltalk, while building production code using blue-collar languages like C++. I could, in a different frame of mind, and for a different audience, extol the virtues of this strategy, but this is not my purpose at the moment.
Instead, in hindsight, I'm struck most by the difference in programming tactics the two languages seem to demand. My sense is that it took me between 3 and 5 times as long to construct the same functionality in C++ as in Smalltalk. These numbers are somewhat subjective, but are based in part on logs I kept for billing purposes during the construction of OSIRIS. In particular, refactoring C++ code is a much more tedious process than in Smalltalk. The combination of two factors makes this so: the declarative redundancy C++ needs to do type-checking, and the paucity of tools to assist in this task. The combination of the two makes one much less quick on one's feet when it comes to performing refactorings. Refactorings are changes to the system that enhance its architectural integrity and reusability rather than add capabilities or enhance performance. Refactoring is the means by which architecture emerges as a framework evolves.
Refactoring a C++ program entailed running around all over the system dealing with calls, and type declarations, and the like. More changes and more opportunities for inconsistency meant more chances to accidentally break something. Even a refactoring as simple as changing a member name might be so time consuming as to be avoided. The aggregate effect of this was that I found myself adopting a much more deliberate, conservative approach to refactoring than I had in Smalltalk. The eventual, cumulative effect was to change the character of the way the framework evolved. Architectural change took place in coarser, larger grained, more deliberate, less frequent increments than it did in Smalltalk. It became easy to defer more speculative exploration, and to defer obviously worthwhile change, for fear of collateral damage. One learned not to break the framework unless one had time to fix it. Indeed, there are, even now, a bevy of deferred refactoring opportunities in OSIRIS waiting for enough of my time to realize them.
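To make the point about declarative redundancy concrete, here is a small sketch of how many places a single member name can appear. The Sampler class and the samplesIn function are hypothetical, and what would normally be spread across a header, a source file, and any number of client files is compressed into one listing; the comments mark each site that a rename would have to touch by hand in the absence of tool support.

```cpp
// --- sampler.h (hypothetical): the declaration, edit site 1 ---
class Sampler {
public:
    explicit Sampler(int rateHz);
    int rateHz() const;  // renaming this member starts here
private:
    int rateHz_;
};

// --- sampler.cpp (hypothetical): the definition, edit site 2 ---
Sampler::Sampler(int rateHz) : rateHz_(rateHz) {}

int Sampler::rateHz() const {  // the name and signature are repeated
    return rateHz_;
}

// --- client code (hypothetical): every caller, edit sites 3 and up ---
int samplesIn(const Sampler& sampler, int seconds) {
    return sampler.rateHz() * seconds;  // each call site must change too
}
```

The compiler will report any site that was missed, but only after the fact, and every file that includes the header must then be rebuilt.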
I have thought it likely that had my framework research been conducted in C++ rather than Smalltalk, our conclusions would have been quite different. The power of refactoring and our ideas about reuse itself, and the emergence of black box components from white box inheritance-based precursors might have escaped us. The experience has given me insight into why the C++ development world seemed so bureaucratic, and even as to why the patterns movement has resonated so clearly in it. The point at which one goes from being able to handle things oneself, to where you need a team, would seem to come considerably sooner in C++. In the case of C++, better tools could still help a lot. (Java would seem to sit somewhere between C++ and Smalltalk with respect to both linguistic cumbersomeness and tool support.)
Certainly, there are other forces at work here. Foremost among them is that OSIRIS, unlike my Smalltalk simulation, was a real, working system facing real requirements. The world doesn't pull its punches to make your architecture work out better. With my simulation, some of the architectural heavy lifting had been borrowed from my battery experience. With OSIRIS, the novel domain challenges really have been new ones to me.
Multifunction components are more complex than single function components, so this complexity must be managed. It's harder to come up with tools that do a range of jobs well, rather than one. There are times when this shouldn't even be attempted. The art of framework construction is in being able to recognize how and when to exploit opportunities to factor common, reusable elements from among similar components, and consign that which distinguishes them to separate subclasses or components.
Frameworks really do work, but it's a long hard road to get there
My overarching conclusion, having been there, is that frameworks really do work. Reuse is real, and it's powerful. In my case, there is simply no way that I could have constructed and maintained the number of distinct, complicated applications I did without factoring the common parts into a framework. When you polish or repair a core element of a framework, everyone benefits, almost retroactively. This, in turn, amortizes the cost of this sort of effort across all its beneficiaries. Once one has distilled the generic essence of a domain, and embodied it in a framework, one really can deploy new applications in a fraction of the time it would otherwise take. (The line of code measurements discussed earlier are good rough indications of the scale of these efforts.) In my case, that made bidding certain jobs economically viable, where they otherwise would not have been. As an individual, I've found that my framework has been an incredible lever, and has enabled me to compete as a solo act in a domain where small armies are now being deployed.
How might one bottle and scale this? Small, commando teams of similarly inclined architect-builders could work. A skunk-works mentality seems to be helpful. This seems to be the model employed in so-called hyper-productive organizations. It's hard for me to imagine framework development being done in the big-shop assembly line style, without factoring the task along the grain of the domain into these sorts of teams.
For me, the reward is in being able to craft quality artifacts. I'm paid to produce working programs, but, I must confess, my consuming concern is with the artifacts themselves. The code is where I live, and when my nest is littered with spaghetti code squalor, it's not a pretty place to be. Reuse is motivated by a particular combination of sloth and craftsmanship. If there is one thing more satisfying than a thousand-line-of-code weekend, it's a negative-thousand-line-of-code weekend. These are not unusual occurrences during framework development. If there is anything worse than rewriting the same application over and over again, it's maintaining a bushel of them. Frameworks give us a way out of this quagmire. Ten years ago, I said that if my alternatives were to roll the same rock up the same hill every day or leave a legacy of polished, tested, general components as the result of my toil, I knew what my choice would be. I still believe that.
Brian Foote has over twenty years of professional programming and consulting experience in the realm of realtime scientific systems and applications. In addition, since 1985, he has also been engaged in research on object languages and frameworks, as well as software reuse, software evolution, reflection, patterns, and software architecture, and has taught and published in these areas. He is one of approximately 30 people to have attended every OOPSLA conference to date. This highly unusual combination of practical experience and research training converges in his enduring interest in where good code comes from.
Brian is currently a Visiting Research Programmer with the Department of Computer Science at the University of Illinois at Urbana-Champaign. He received his MS in Computer Science there in 1988, and his BS in 1977.
[Coplien 1994]
James O. Coplien
A Generative Development Process Pattern Language
First Conference on Pattern Languages of Programs (PLoP '94)
Monticello, Illinois, August 1994
Pattern Languages of Program Design
edited by James O. Coplien and Douglas C. Schmidt
Addison-Wesley, 1995

[Foote 1988a]
Brian Foote
Designing to Facilitate Change with Object-Oriented Frameworks
Master's Thesis, 1988 (advisor: Ralph Johnson)
University of Illinois at Urbana-Champaign

[Foote 1988b]
Brian Foote
Domain Specific Frameworks Emerge as a System Evolves
Workshop on Methodologies and Object-Oriented Programming
OOPSLA '88, San Diego, CA
Norman L. Kerth, organizer

[Foote 1988c]
Brian Foote
Designing Reusable Realtime Frameworks
Workshop on Realtime Systems
OOPSLA '88, San Diego, CA
John Gilbert, organizer

[Foote & Opdyke 1994]
Brian Foote and William F. Opdyke
Lifecycle and Refactoring Patterns that Support Evolution and Reuse
First Conference on Pattern Languages of Programs (PLoP '94)
Monticello, Illinois, August 1994
Pattern Languages of Program Design
edited by James O. Coplien and Douglas C. Schmidt
Addison-Wesley, 1995

[Johnson & Foote 1988]
Ralph E. Johnson and Brian Foote
Designing Reusable Classes
Journal of Object-Oriented Programming
Volume 1, Number 2, June/July 1988, pages 22-35

[Roberts & Johnson 1996]
Don Roberts and Ralph E. Johnson
Evolve Frameworks into Domain-Specific Languages
PLoP '96 submission