Pattern Refactoring Workshop

OOPSLA 2000

position paper by

Toby Sarver

Towards a Pattern Taxonomy

There is a lot of information about what all patterns have in common--generally a solution to a particular problem in a particular context. But it is difficult to define a taxonomy of patterns because one would need to define criteria for creating the taxonomy and for placing a pattern in one place in the taxonomy versus another. Of course, if this were easy, we wouldn't need an OOPSLA workshop to discuss it. Here are the kinds of patterns currently being produced by the patterns community:

Categories of Patterns	Comments	Types of Problems	Types of Solutions	Software Development Phase
Design patterns	related to general computer science concepts, application-independent	clarity of design, multiplication of classes, adaptability to changing requirements, etc.	factoring behavior, Class-Responsibility-Collaborator (CRC)	Detailed Design
Process or Organizational patterns	related to development or project management processes or techniques, or organization structures	productivity, effective and efficient communication	team building, software life-cycle, role assignment, communication prescriptions	Planning
Analysis patterns	usually application- or industry-specific	modeling the domain, completeness, integrating/balancing multiple goals, planning for common additional features	domain models, knowledge about what to include (e.g., logging & restart)	Analysis
Architectural patterns	related to how objects interact within or between architectural layers	architectural problems, adaptability to changing requirements, performance, modularity, coupling	inter-object calling patterns (similar to Design patterns), architectural decisions and criteria, packaging functionality	Early Design
Idioms	coding and project standards	doing common, well-understood operations in a new environment, or across a large group, readability, predictability	very specific to a language, platform, or environment	Implementation, Maintenance, Deployment

Suppose we had this pattern taxonomy fully specified. What would it look like? How would one use it? How would I place my brand-new pattern into the taxonomy? I suggest that we look at such a pattern taxonomy from a pragmatic perspective: the view of someone who is looking for a solution to a particular problem. If one could somehow access all known patterns organized into this taxonomy, how would that person think about browsing it? I would think that the first level should be based on target audience (role in the development process, e.g., Project Manager, CIO, architect, designer, etc.). The top layer might be Software Development Organization (CIO), Project Management, Software Development Processes, and Software Development Techniques. (Note that Project Management is separate from Software Development Processes because the former is about managing a software project and the latter is about developing a software product. Brian: I know you know this, but not everyone does.) Most of the patterns that are discussed by the patterns community lie in the last category, Software Development Techniques.

The subsequent levels should be based on the degree of abstraction in the problem description, and the breadth of applicability of the solution. For example, I would expect domain-specific patterns to be lower in the taxonomy than domain-independent ones. But one could get to the domain-specific patterns by browsing through a domain-independent pattern that solves the same or similar problem. That is, the domain-specific pattern would be an example of applying the domain-independent pattern to a given domain. One would expect that the software development phase to which a pattern applies would advance (go from "Planning" to "Maintenance") as one goes further down the taxonomy. This is reasonable because the problems, solutions, and software artifacts one encounters become more detailed and specific as one moves through the software development life-cycle (SDLC).

Problems with this Pattern Taxonomy

This "pattern taxonomy" is not strictly a taxonomy. It doesn't allow you to assign a kingdom, family, genus, and species to a given pattern based on its inherent features or properties (e.g., animal, vertebrate, mammal, primate, intelligent ==> Homo sapiens). Instead, it uses a relative measure of where the pattern lies on the continuum from abstract to specific. Also, the problems in creating a pattern taxonomy are, in my opinion, similar to those in creating a library of reusable components to be used across an organization. Software engineers would like to search such a library much as electrical engineers search for chips or components to be used on a PCB. But unfortunately, software engineering does not have an indisputable framework like the one electrical engineering has in the laws of physics.

Also, one rarely considers a pattern independent of other patterns; hence, the concept of a pattern language. A pattern author often references other patterns in the standard "Related Patterns" section, even if the pattern is not in a pattern language. Therefore, this (non)taxonomy would need some way of linking patterns together, both as declared by the author and as decided by the pattern (non)taxonomy maintainer.

The primary issue is, "What are the inherent properties of a pattern that we could use to build a Pattern Taxonomy?" The feature that I've latched onto, "level of abstraction," is relative and subjective. It also doesn't address qualitative features like prescriptive versus descriptive. Nor does it help novices use the taxonomy to discover a pattern by matching its problem description with the problem at hand. Since one objective of writing patterns is to encode knowledge for less-experienced people to use, it is counterproductive that this problem-matching skill is a prerequisite to using the taxonomy. I believe that one would have to express one's problem at many levels of abstraction in order to use this taxonomy effectively.

Now that I have effectively shredded the foundation for reasons to use the pattern taxonomy that I've proposed, I would like to suggest that these issues affect every such proposed pattern taxonomy. Until we can place patterns into a paradigm that extracts their inherent properties, we will not be able to build a real taxonomy. Such a paradigm would be similar to physics (electrical, chemical, physical properties) or biology (animal, vertebrate, mammal, primate, intelligent). In the meantime, the best we can build is a reasonable, browsable, interconnected hierarchy of patterns that can only be used by someone with medium to advanced problem-matching skills. I believe we could come up with something like the Dewey Decimal system (it wouldn't have to be expressed numerically) that would reasonably group patterns, but this is still not a true taxonomy, and it would only be a starting point for searching for the pattern that exactly addresses a given problem. Remember how you searched for a book about a certain topic (back when "card catalogs" were made up of 3x5 cards) in multiple Dewey Decimal categories?

Maybe there isn't a taxonomy of all patterns because some patterns are atomic and some are aggregate (like a molecule). I'd like to define an "atomic" pattern as one that uses only the concepts of object-oriented programming: Inheritance, Message Passing, Encapsulation, and Polymorphism. While it is possible to represent any pattern using only these concepts (that is, after all, the "Implementation" section), what makes a pattern "molecular" is that some identifiable sub-part of it looks like (maybe subjectively) some other pattern. Many pattern authors make this easy to identify by listing the component patterns in the "Related Patterns" section, or by explicitly building a pattern language.

Maybe the study of patterns should follow a chemical approach. The pattern community would identify the "Periodic Table of Patterns". (I'm not sure how it would be organized. What in patterns is analogous to atomic weight and valence numbers?) Certain molecular patterns or pattern languages would use these atomic patterns to build up a solution to a problem at some level of abstraction. The more remaining design decisions, the higher the level of abstraction. Some patterns seem to "fit together" easily, like ["Composite", GoF] and ["Visitor", GoF]. Is this like matching valence numbers on the outer shell of electrons? What can we determine about a molecular pattern by only knowing its components? Is there an analog to physical properties? Would it be fruitful to do cluster analysis in order to say things like, "Patterns A, B, and C are often found together, but B is rarely found with F"? I think that chemistry makes a promising paradigm for considering the body of patterns, as long as we don't try to take the analogy too far.
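The "found together" question can be made concrete with a simple co-occurrence count, which is the kind of input a cluster analysis would start from. The following is only a sketch in Python: the `designs` list is invented sample data, not a survey of real systems, and the letters A, B, C, and F are the placeholders from the paragraph above.

```python
from collections import Counter
from itertools import combinations

# Invented sample data: each set lists the patterns used in one design.
designs = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "C", "F"},
    {"B", "C"},
    {"A", "F"},
]

# Count how often each unordered pair of patterns appears in the same design.
pair_counts = Counter()
for patterns in designs:
    for pair in combinations(sorted(patterns), 2):
        pair_counts[pair] += 1

# A Counter reports 0 for pairs that never occur together.
for pair in [("A", "C"), ("B", "C"), ("B", "F")]:
    print(pair, pair_counts[pair])
```

With this sample, A and C share three designs while B and F share none. A real study would need a much larger body of designs and a proper clustering method on top of these counts.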

Refactoring Patterns

Since I have attempted to show that we cannot currently build an effective taxonomy of patterns, what is left is learning about comparing patterns for similarity and "relatedness". Two patterns might refer to the same domain, offer a similar solution to two different problems, or offer two different solutions to the same (or similar) problem. One pattern might be a more abstract or more specific version of another. A pattern could use another as a component (the standard definition of a pattern language). Multiple patterns can lie on a continuum. All of these relationships can be inferred whether the author points them out or not. I believe that looking at pairs or groups of related patterns is more instructive than looking at patterns in isolation, or even in the pattern language defined by their author. If nothing else, looking at a group of related patterns gives insight into how different people perceive a problem and formulate a solution.

Examples of how patterns relate to each other

This next section shows examples of how patterns relate to each other, or what we can learn from a group (possibly of size two) of related patterns. You can go directly to the Conclusion if you wish to skip the section; you will not be missing any of my central message.

Different levels of abstraction

I think that looking at two patterns where one is more abstract than the other is especially enlightening because the more abstract pattern is (should be) applicable to a broader group of problems, but the more specific pattern is (should be) easier to apply to a specific problem. The pattern [Sommerlad, "Command Processor", PLOP 2] is an application of the ["Command", GoF] pattern. The Command pattern describes a particular object design that encapsulates commands as first-class objects rather than a cascade of method calls (initiated by the user via a menu item, for example). The Command Processor pattern extends and applies this encapsulation to describe a system that has undo and redo capabilities that use Command as the unit of work. The Command pattern solves a problem (cascade of method calls), but doesn't prescribe a context in which to solve the problem. The Command Processor prescribes a more specific context for the problem.
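The difference in abstraction between the two patterns can be sketched in a few lines of Python. This is an illustration under assumptions, not code from either pattern description: `AppendText`, `CommandProcessor.run`, and the list used as a document are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


class Command(Protocol):
    """A request packaged as an object: the Command pattern's unit of work."""

    def execute(self) -> None: ...

    def undo(self) -> None: ...


@dataclass
class AppendText:
    """Hypothetical concrete command: append a string to a shared buffer."""

    buffer: List[str]
    text: str

    def execute(self) -> None:
        self.buffer.append(self.text)

    def undo(self) -> None:
        self.buffer.pop()


@dataclass
class CommandProcessor:
    """Runs commands and keeps the history needed for undo and redo."""

    done: List[Command] = field(default_factory=list)
    undone: List[Command] = field(default_factory=list)

    def run(self, command: Command) -> None:
        command.execute()
        self.done.append(command)
        self.undone.clear()  # a new command invalidates the redo history

    def undo(self) -> None:
        if self.done:
            command = self.done.pop()
            command.undo()
            self.undone.append(command)

    def redo(self) -> None:
        if self.undone:
            command = self.undone.pop()
            command.execute()
            self.done.append(command)


document: List[str] = []
processor = CommandProcessor()
processor.run(AppendText(document, "Hello"))
processor.run(AppendText(document, "World"))
processor.undo()  # document now holds only "Hello"
processor.redo()  # document holds "Hello" and "World" again
```

`Command` and `AppendText` on their own say nothing about where commands are stored or replayed. `CommandProcessor` adds that context, which is what makes it the more specific of the two.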

A meatier example of this would be to compare [Johnson and Woolf, "Type Object", PLOP 3] with [Buschmann, "Reflection", PLOP 2]. The objective of the Type Object pattern is for the program to be able to handle new types of objects without writing new classes. You can map this declaration of Type Objects and Type Classes to an object-oriented meta-data language, and treat the system that "reads" them as the base layer. In the example from the pattern, "Videotape" and "Rental Category" (I added the latter to represent the rental price of a category of videotapes) correspond to Type Classes, "Star Wars" would be a Type Object, and "Videotape 3235" would be an Object. The meta-data might read:

RentalCategory name: "NewArrivals"
	rate: 3.5 currency: "USD"
	period: 5 periodUnits: "Day";
VideoTapeClass name: "Star Wars"
	rCategory: "NewArrivals"
	MPAARating: "PG"
	description: "Long ago, in a galaxy far, far away..."
	purchasePrice: 68 purchaseCurrency: "USD"
//The lines that follow would most likely be in a database rather than a flat file
VideoTape number: 899923 class: "Star Wars" currentlyRented: false dueBack: 99/99/9999
VideoTape number: 899924 class: "Star Wars" currentlyRented: true dueBack: 12/07/1978
VideoTape number: 899925 class: "Star Wars" currentlyRented: false dueBack: 99/99/9999

The base layer would associate a VideoTape with a VideoTapeClass and apply whatever logic is appropriate. For example, searching for a VideoTape of Star Wars that is not checked out would entail going to the VideoTapeClass named "Star Wars", asking for its instances (or more likely going to a database and asking for VideoTapes with class "Star Wars"), then iterating through them until one is found where the "currentlyRented" attribute is "false". Then, when the rental transaction completes, this VideoTape would have "currentlyRented" set to "true" and "dueBack" set to five days from now. We get the rental period by looking at the RentalCategory of the VideoTapeClass.
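The search-and-rent logic just described might look roughly like this in Python. It is a sketch under assumptions: the class and field names mirror the meta-data listing, but the in-memory list stands in for the database, and `find_available` and `rent` are hypothetical helper names.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class RentalCategory:
    name: str
    rate: float
    currency: str
    period_days: int


@dataclass
class VideoTapeClass:
    """Each instance plays the Type Object role, e.g. "Star Wars"."""

    name: str
    category: RentalCategory
    mpaa_rating: str


@dataclass
class VideoTape:
    """Each instance is one physical tape, e.g. number 899923."""

    number: int
    tape_class: VideoTapeClass
    currently_rented: bool = False
    due_back: Optional[date] = None


def find_available(tapes: List[VideoTape], title: str) -> Optional[VideoTape]:
    """Return the first tape of the given title that is not rented out."""
    for tape in tapes:
        if tape.tape_class.name == title and not tape.currently_rented:
            return tape
    return None


def rent(tape: VideoTape, today: date) -> None:
    """Mark the tape as rented; the period comes from its RentalCategory."""
    tape.currently_rented = True
    period = tape.tape_class.category.period_days
    tape.due_back = today + timedelta(days=period)


new_arrivals = RentalCategory("NewArrivals", 3.5, "USD", 5)
star_wars = VideoTapeClass("Star Wars", new_arrivals, "PG")
tapes = [
    VideoTape(899923, star_wars),
    VideoTape(899924, star_wars, True, date(1978, 12, 7)),
    VideoTape(899925, star_wars),
]

tape = find_available(tapes, "Star Wars")
rent(tape, date(2000, 10, 15))  # due back five days later
```

Adding a new title or changing the rental period requires only new data, not a new class, which is the point of the Type Object pattern.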

Different top-level categories

In some cases, a pattern refers to another pattern which is in a completely different top-level category. For example, [Foote and Yoder, "Evolution, Architecture, and Metamorphosis", PLOP 2] refers to, and to some extent depends on, using rapid development techniques like incremental prototyping and continual evolution [McConnell, Rapid Development, Microsoft Press, 1996]. (I know that McConnell did not present the incremental prototyping SDLC as a pattern, but it can easily be presented as one.)

A specific application of multiple patterns

Sometimes a pattern is explicitly about applying a set of other patterns to a specific problem in a given domain. See [Duell, "Experience in Applying Design Patterns to Decouple Object Interactions in the INgage™ IP Prototype", The Patterns Handbook, Linda Rising (ed.)]. These types of patterns can be helpful in showing how to apply patterns and what decision process to use when matching a concrete problem with the problem descriptions of various patterns. They might also show novel ways of applying a given pattern, ways people might not have heretofore considered.

Patterns as points along a continuum

Sometimes a group of patterns relates to a particular concept or problem, and each pattern represents a point along a continuum. Let's look at the problem of choosing a particular class to carry out some behavior for us. At one level, we want to treat all such classes the same. For example, the Desktop would want to manipulate the Tools in a consistent manner, as in [Riehle and Zullighoven, "A Pattern Language for Tool Construction and Integration Based on the Tools and Materials Metaphor", PLOP]. At another level, we want different behavior based on the particular subclass chosen (e.g., a particular Tool for the given Material).

The continuum over which these solutions lie is the degree of and representation of the knowledge about which class is the correct one to instantiate. The lowest point in the continuum would be to distribute the knowledge throughout the system and call "new <theCorrectClass>(<params>)" directly. I haven't been able to find a pattern that advocates this approach. The next level would use ["Factory Method", GoF] to ask a Material to create a Tool. This distributes the knowledge only to the Material subclasses (or objects that implement the Material interface, depending on the design). However, if the context in which the program is running changes (e.g., moving from "NewWave" to "Motif"), one would have to rewrite these methods for each subclass. The next point on the continuum would be to use an ["Abstract Factory", GoF] to centralize the knowledge of which class to create. The next level is to use the Class Retrieval pattern from [Riehle, "Patterns for Encapsulating Class Trees", PLOP 2], where the knowledge is encoded as Class Specification and Class Clause. This would make it easy to say, e.g., "Use class MotifCalendar when the WindowingSystem = 'Motif' and the Material = 'AppBook'." The final point (that I can see) on the continuum is to use [Buschmann, "Reflection", PLOP 2] to represent the specification knowledge as meta-data, and the base layer interprets it at run-time to come up with the correct class. The advantage here is that you can change the meta-data and the program will change behavior without recompilation.
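The far end of this continuum, where the selection knowledge is data interpreted at run-time, can be sketched as follows. This is an illustration only and not the design from either cited pattern: `REGISTRY`, `RULES`, `retrieve_class`, and `NewWaveCalendar` are hypothetical names, and the rule table is hard-coded here where a real system might load it from a file.

```python
class Calendar:
    """Base class that client code programs against."""


class MotifCalendar(Calendar):
    pass


class NewWaveCalendar(Calendar):
    pass


# Classes that may be retrieved, keyed by name (hypothetical registry).
REGISTRY = {cls.__name__: cls for cls in (MotifCalendar, NewWaveCalendar)}

# The selection knowledge held as data rather than as code.
RULES = [
    ({"WindowingSystem": "Motif", "Material": "AppBook"}, "MotifCalendar"),
    ({"WindowingSystem": "NewWave", "Material": "AppBook"}, "NewWaveCalendar"),
]


def retrieve_class(rules, registry, context):
    """Return the first registered class whose clause matches the context."""
    for clause, class_name in rules:
        if all(context.get(key) == value for key, value in clause.items()):
            return registry[class_name]
    raise LookupError(f"no class specified for {context!r}")


context = {"WindowingSystem": "Motif", "Material": "AppBook"}
tool = retrieve_class(RULES, REGISTRY, context)()  # a MotifCalendar instance
```

Compared with rewriting a Factory Method in every Material subclass, moving to a different windowing system here means editing or extending `RULES` only.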

Conclusion

I hope I have convinced you that the software engineering industry is not yet ready to develop a true taxonomy of patterns. We do not yet have a way of obtaining the inherent properties of a pattern, which is required for building a taxonomy. One aspect of a taxonomy is that only one pattern would be able to hold a given "spot" in the taxonomy. We may have multiple patterns that provide a similar solution for a similar problem, yet we don't want to have to choose between them. Each pattern is valuable because it provides some unique insight into a problem, a solution, forces at work, or even just interactions with other patterns.

I think it will be interesting to discuss during the workshop what a "Dewey Decimal system"-like categorization of patterns should look like.

About the Author: Toby Sarver first attended OOPSLA in 1987 as a graduate student, and has been in the software engineering industry since 1988. He has been writing and using patterns since 1994 when he participated in Peter Coad's OOPSLA workshop on patterns. He has worked in many industries including transportation, financial services, software development tools (including reverse engineering, parser generators, and software metrics), and e-commerce. He currently uses and writes patterns as the eScenario Architect for HAHT Commerce, an e-commerce solutions company helping click-and-mortar companies with B2B sell-side and SCM integration.

Toby Sarver
HAHT Commerce, Inc.
400 Newton Road
Raleigh, NC 27615
tobys@haht.com